A Singular Perspective on Unique IDs: Q&A with UEFI Forum President Mark Doran
The far end of the value chain demands getting device driver loading right, with unique ID assignment crucial. How can the technology ecosystem help to alleviate some of the hurdles to this ongoing challenge?
Editor’s note: Back when MC Hammer’s “U Can’t Touch This” was in the top 10 on the Billboard charts, vendors assigned device IDs in a manner that Mark Doran, Intel Fellow and President, UEFI Forum, describes as “pretty fragile.” In a recent interview with EECatalog, Doran explained why the device ID assignment practices he has implemented at Intel have served the company well, discussed device ID history, and offered a prescription to ensure that operating systems can indeed touch the correct driver.
EECatalog: One of your roles is, and has been for more than a decade, the issuer of device IDs and president of the Unified Extensible Firmware Interface Forum, or the UEFI Forum, which manages the Advanced Configuration and Power Interface (ACPI) Specification and the Unified Extensible Firmware Interface (UEFI) Specification. Through these roles, you encourage the use of industry-standards based firmware and are in a distinctive position to outline the history of and need for unique device IDs.
Mark Doran, UEFI Forum: We have had, as an industry, several attempts at doing our driver matching homework. Our experience in the 90’s taught us that having vendors assign device IDs using a table method to look up the IDs and figure out which drivers to load was not ideal. For example, the IDs were often manually entered into the data structure tables incorrectly. This seemed problematic for the entire product ecosystem, as improper assignments could result in duplicate IDs. Also, driver matching systems were not robust enough to keep up with the increasing machine scale and the number of vendors participating in the ecosystem.
We are at the point now where device IDs, based on the definition in the ACPI specification, have become the primary way that we deal with devices that do not self-identify. Similarly, sometimes the Windows INF file (the file used by Microsoft Windows for the installation of software and drivers) is poorly constructed and the match process does not work. All in all, there are opportunities for error.
Unfortunately, unique vendor ID assignment is a very difficult thing to automate, as many people have to do their job with great precision for this to work. In the 90’s, using the Intel Architecture (IA) platform, we moved towards more capable buses for connecting devices. These buses had properties that allowed you to programmatically discover which devices were present in the platform and reveal the parameters when the operating system needed to find and load drivers.
The PCI Express® (PCIe®) architecture is a well-known bus interconnect with an extremely reliable set of algorithms that you can write in a single driver. Using this approach, the driver can identify which devices are present and figure out what they are based on the device IDs read from that physical hardware. In this scenario, loading drivers is much more reliable and automated.
You would think that, having “solved” this problem, we would be in much better shape. Sadly, this is not the case: with the advent of mobile devices, device IDs are still vulnerable to mismatched assignments.
In the mobile consumer electronics ecosystem there are numerous I/O devices that you find clustered around CPUs of any given processor architecture that originate from the ARM world vs. the Intel Architecture world.
The industry started building these mobile platforms using silicon formulations of system on chips (SoCs) without always including the PCIe architecture. Essentially, I/O devices, modems, GPS devices and touch sensor designs were becoming more reliant on SoC designs through bus interconnects that do not include a self-discovery/self-enumeration type of property.
EECatalog: Was the self-discovery property not included because of the need to make certain trade-offs?
Mark Doran, UEFI Forum: Indirectly yes, but it is more a question of a different evolution path. The silicon overhead (aka the number of gates you have to put into the PCI bus) is quite large. ARM started out building devices that were of a much smaller scale – think microcontrollers as opposed to CPUs for PCs. In the case of smaller devices, the amount of overhead and space it takes to include the PCIe interconnect bus may not make sense on an economic level.
The PCIe bus interconnect has all kinds of interesting properties including self-discovery. If you can afford incorporating this interconnect on a larger-scale machine, that is definitely one way to go, but on these smaller-scale devices it may not pencil out in terms of cost. Instead, the ARM approach is more about taking the I/O device and finding the simplest way to connect it to the rest of the processing and memory architecture. Since ARM’s target markets are much more focused on cost and keeping its gate count low, adding something like a PCIe interconnect might be considered overkill.
There is a growing ecosystem of I/O devices that are not attached through bus interconnects like PCIe and know nothing about self-discovery. Instead, they are connected by more primitive means and require manual enumeration of the platform. This brings us back to the concept of device IDs.
EECatalog: And what are you seeing as a result of this increase in I/O devices?
Mark Doran, UEFI Forum: I have been the keeper of Intel’s device ID space for quite some time. In the first decade of the 2000’s we gave out a handful—maybe 4 or 5—device IDs total through that entire period. If you start at 2010 and fast forward to today, I am issuing them at the rate of tens per month because of the SoC devices that are being connected in a way that is not programmatically discoverable by the operating system. In this situation, we have to describe the device presence in a machine in a way that allows it to figure out which device drivers it needs to load. Today, manufacturers are using ACPI specification defined device IDs to do it.
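For readers who have not seen one of these IDs, the following is a minimal sketch of the two string shapes the ACPI specification describes for hardware IDs: a PNP-style ID (three uppercase letters plus four hex digits) and an ACPI-style ID (four uppercase letters or digits plus four hex digits). The function name is hypothetical and the sample values are used only as illustrations; the check covers the shape of the string, not whether the ID was ever actually issued.

```python
import re

# PNP-style ID: three-letter vendor prefix + four hex digits, e.g. "PNP0A08"
PNP_ID = re.compile(r"[A-Z]{3}[0-9A-F]{4}")
# ACPI-style ID: four-character vendor prefix (A-Z, 0-9) + four hex digits
ACPI_ID = re.compile(r"[A-Z0-9]{4}[0-9A-F]{4}")


def classify_hardware_id(hid: str) -> str:
    """Return 'pnp', 'acpi', or 'invalid' based only on the string's shape."""
    if PNP_ID.fullmatch(hid):
        return "pnp"
    if ACPI_ID.fullmatch(hid):
        return "acpi"
    return "invalid"


for sample in ("PNP0A08", "ACPI0003", "pnp0a08"):
    print(sample, classify_hardware_id(sample))
```

A well-formed string is only the starting point; the uniqueness problems discussed in the rest of this interview arise with IDs that are perfectly valid in form.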
The volume and rate of using these IDs has grown significantly in the last few years—and it doesn’t seem to be slowing down. This is partially because the ARM ecosystem joined the UEFI Forum and has steered customers and licensees towards using industry standards-based firmware, i.e., ACPI and UEFI-based firmware. By doing so, more machines now participate in that ecosystem and they all want to benefit from being able to identify and describe their devices. Those of us who have been doing this for 15-20 years are seeing an uptick in the number of ecosystem participants since ARM began migrating to standards-based firmware.
A great number of new device manufacturers and vendors are starting to use ACPI as their firmware enumeration strategy. These manufacturers previously had no need for device IDs because they were not working on IA machines or using Windows, but now that Windows is showing up on ARM systems the landscape has changed.
While the increased number of participants is impressive, the process of assigning device IDs is still lagging. Ultimately, we need to make sure the correct driver is associated with the correct device ID when it shows up in the firmware description of the platform and that the OS recognizes it and loads the right driver every time. This is easier said than done.
With more and more participants entering the ecosystem each day who are not familiar with the device ID process, it’s becoming difficult to track who is assigning which numbers and how they are being assigned. There is a process that should be followed in order to avoid duplicate device ID allocations that could flood the market and make it nearly impossible to track. What’s happening now is that we are seeing all types of creative interpretations of how vendors could assign these IDs.
EECatalog: What problems could result from these “creative” interpretations?
Mark Doran, UEFI Forum: This is about the industry getting its act together so we don’t deliver a poor customer experience on the far end of the value chain. I am concerned about the economic inefficiency of discovering, at the last stage of validation in the life cycle of a machine, that some device driver wasn’t loaded properly because there was a typo in an INF file. If a machine manufacturer is doing their job properly in terms of validation, this should never occur in the first place.
Once upon a time, phones were shipped with an operating system payload that was captive or tied to a carrier, but with the advent of jailbreaking phones and users installing their own system software payloads, we faced a situation that we know and love from the PC world. First, you can buy a mobile platform and wipe it clean to create a blank machine that doesn’t have software on it. Then you can treat it as a sort of commodity device and install your own, after-market copies of Windows or Linux on it.
What I’ve just described is similar to a situation where you’re entirely dependent on the machine correctly describing itself so that the OS you install loads the correct drivers for the devices on it. For devices that are shipping with the OS payload captive on the machine, the manufacturer validation process should be able to verify everything correctly. The increased reliance on these IDs for helping to identify devices is already causing us to see a spike in bug reports that are found in validation, thereby requiring us to go back and do the extra work to correct the discrepancy that should not have occurred in the first place.
We’re not talking millions of dollars’ worth of validation, but in terms of improving the process, we need to reevaluate how things are handled, especially with the increased industry reliance on device IDs over self-enumerating and self-discovering buses.
EECatalog: What are some ways of addressing the problems you are describing?
Mark Doran, UEFI Forum: Each player involved in this operational life cycle needs to do their part and have a clear set of expectations about what their role is and how they fulfill that.
I know people who obtain a vendor ID prefix and don’t realize someone else at their company is using that same prefix for different devices. This causes confusion in the market, because now an operating system may have two drivers with the same device ID for two different types of devices. When the operating system finds a device that matches this particular device ID in some other platform it is running on, we run into issues of the system not knowing which one to load. That’s an obvious flaw in the process.
If we give you one of these vendor ID prefixes to issue device ID numbers as a company, we expect you to distribute individual device IDs in a way that ensures only ONE device is ever known by a given device ID number. Even this simple step in the process of vendor ID allocation is being misunderstood, which creates a daisy-chain reaction in the rest of the value chain.
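The collision described above can be sketched in a few lines. This is an illustration only: the function name, the `ABC` vendor prefix, and the driver names are hypothetical, and a real operating system's driver database is far more involved than a list of pairs.

```python
from collections import defaultdict


def find_id_collisions(driver_table):
    """Given (device_id, driver_name) pairs, return the IDs claimed by
    more than one driver, mapped to the sorted list of those drivers."""
    drivers_by_id = defaultdict(set)
    for device_id, driver_name in driver_table:
        drivers_by_id[device_id].add(driver_name)
    return {
        device_id: sorted(drivers)
        for device_id, drivers in drivers_by_id.items()
        if len(drivers) > 1
    }


# Two groups in the same (hypothetical) company reused ABC0001 for different devices.
table = [
    ("ABC0001", "touch_driver"),
    ("ABC0001", "gps_driver"),
    ("ABC0002", "modem_driver"),
]
print(find_id_collisions(table))  # {'ABC0001': ['gps_driver', 'touch_driver']}
```

When a collision like this reaches the field, the ID alone no longer tells the OS which driver to load.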
The point of these device IDs is to associate a driver with a particular device. There are vendors in the device space who will happily sell you a device, but do not know the first thing about drivers. They don’t have drivers and they don’t write them. Instead, they hand you a device and it is up to you to create a driver for it. In this scenario the questions become, “Whose vendor ID do I use?” “Is it the original device vendor or should I use the vendor ID of whatever company I used to write the particular driver?”
Let’s say that particular device is sold to a company that makes Windows-based products and one that is making Linux-based products. If they both assign a number out, you could end up with two aliases for the same physical device. Now imagine on the Windows machine somebody comes along after market with a couple of Linux devices and tries to load the Linux OS on that machine. That machine is going to have the wrong version of the device ID number as far as the Linux driver is concerned. The driver would work perfectly but is not loaded because the OS thinks that it has no driver that matches the device.
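The alias problem is the mirror image of a collision: one device, two IDs. The sketch below assumes, for illustration, that the OS picks a driver by exact match on the ID string reported by firmware; the function name, the `WINV0001` and `LNXV0001` IDs, and the driver name are all hypothetical.

```python
def select_driver(reported_id, drivers_by_id):
    """Return the driver registered for the reported ID, or None if no match."""
    return drivers_by_id.get(reported_id)


# The Linux driver was registered under the ID that the Linux product maker assigned.
linux_drivers = {"LNXV0001": "sensor_driver"}

# Firmware on the Windows-based machine reports the other alias for the same device.
print(select_driver("WINV0001", linux_drivers))  # None: a working driver is never loaded

# With one agreed ID for the device, the same lookup succeeds.
print(select_driver("LNXV0001", linux_drivers))  # sensor_driver
```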
I came across one company that was making both the devices and the drivers for them. So far, so good. However, this company was selling to a number of different customers, and for every customer it was assigning a different ID number. Now we’ve got all these devices that are in fact the same, but appear different judging by their device IDs. Now imagine you’ve got a security issue that affects the driver for that device. If all of the devices are in fact the same and known by the same device ID number, you can make one change to the driver that corresponds to that one ID, and now you’ve got everybody’s machine potentially fixed as soon as they do a driver update.
In a scenario, however, where the vendor is handing out different IDs for every single customer, you’ve got to spin a driver for every single customer in order to fix that. It complicates the problem of doing things like propagating critical security fixes for bugs in drivers out into the field. It is not that it did not work in the first place, it’s that if you did it right the first time, assigning the same ID to the same device always known by that same alias, all the maintenance operations would work much more efficiently.
EECatalog: What are best practices for avoiding device ID assignment problems?
Mark Doran, UEFI Forum: Inside Intel, we have a centralized website repository that keeps the master list of numbers that have been allocated. I maintain this list of IDs that describes what the ID was assigned to, in other words, the name of the device and what it does. I also track the person and the business group that requested the number, so that if we have a customer support problem come in about that particular device, we can find that ID number and locate somebody who understands how the device works.
“You must not assign your own device ID; you must ask for one created from a central pool,” is the message we’re pounding home. We also make the request process as automated and simple as possible to avoid people opting out and assigning their own random ID numbers.
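A central pool of this kind can be sketched as a small in-memory registry. This is not Intel's system; the class, its method names, the `ABCD` prefix, and the record fields are assumptions chosen to mirror the description above (device name, requester, business group), and a real registry would need persistent storage and access control.

```python
class DeviceIdRegistry:
    """Hypothetical central allocator: one ID per device, one device per ID."""

    def __init__(self, vendor_prefix):
        self.vendor_prefix = vendor_prefix
        self._records = {}        # device_id -> details of the allocation
        self._id_by_device = {}   # device_name -> device_id
        self._next_number = 1

    def request_id(self, device_name, requester, business_group):
        # A device that already has an ID keeps it, so no aliases are created.
        if device_name in self._id_by_device:
            return self._id_by_device[device_name]
        if self._next_number > 0xFFFF:
            raise RuntimeError("device ID space for this prefix is exhausted")
        device_id = f"{self.vendor_prefix}{self._next_number:04X}"
        self._next_number += 1
        self._records[device_id] = {
            "device": device_name,
            "requester": requester,
            "business_group": business_group,
        }
        self._id_by_device[device_name] = device_id
        return device_id

    def lookup(self, device_id):
        """Find who requested an ID, e.g. when a support issue comes in."""
        return self._records[device_id]


registry = DeviceIdRegistry("ABCD")
print(registry.request_id("touch controller", "alice", "mobile"))  # ABCD0001
print(registry.request_id("touch controller", "bob", "tablets"))   # ABCD0001 again
print(registry.request_id("gps receiver", "carol", "mobile"))      # ABCD0002
```

Returning the existing ID on a repeat request is one possible design choice; it enforces in code the rule that the same device is always known by the same ID.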
Although the specifications do not formally describe the process and distribution of device IDs inside the company, this is one methodology that has worked well for Intel and I’m sure there are others that could as well.
[Editor’s Note: Mark also described during the course of our conversation what he called “one other little interesting wrinkle.”]
Mark Doran, UEFI Forum: We’ve discussed two styles of ID, [Plug-and-Play] PNP and [Advanced Configuration and Power Interface] ACPI. However, there are also other dependent standards, the most well-known of which is the VESA [Extended Display Identification Data] EDID specification. VESA is the standards organization for video card and display devices. Its EDID specification defines a type of data structure that describes the properties of a monitor, LCD, LED, or OLED screen to the operating system; the refresh rate, pixel density, resolution of the screen, and things of that sort.
The EDID record includes a field that requires you to fill in the ID of the vendor who created the device. VESA chose to use the three-letter PNP vendor ID prefix because that was a well-known industry standard registered list of IDs that somebody else was managing for them. VESA effectively depends on the UEFI Forum to maintain that list now that the Forum has assumed that task from the former keeper, Microsoft.
Probably 25 percent of the requests I have had in the last two years are from panel or display device manufacturers. The folks who have wandered up to me in the course of the past 18 months and said, “Could we have an ID, please?” are those that we might never have heard from prior to the use of industry standards-based ways of doing device labeling.
It’s a good problem to have. It is making the work of the UEFI Forum that much more important because the display ecosystem can rely on the Forum to manage the vendor IDs. However, because of the increase in the number of new applicants for IDs, in the case of three-letter IDs, we’re not too far away from running out of IDs. The number of three-letter combinations you can have is somewhat limited.
We already started a conversation with the industry leaders who manage the VESA standard to suggest they allow use of the four-letter ID. We define those to use any of the 26 alpha characters plus 0 to 9. If you do the math, it turns out we have more than 1.2 million fresh IDs in the ACPI space available. I am loath to channel Bill Gates and the famous quip, “Who would ever need more than 512 Kbytes of memory?” but for the moment we have room to grow.
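The arithmetic behind those numbers is quick to check. The sketch below counts raw prefix combinations only, assuming three uppercase letters for the PNP style and four characters drawn from A to Z and 0 to 9 for the ACPI style, as described above. It is an upper bound and does not account for prefixes already issued or reserved.

```python
import string

letters = string.ascii_uppercase                    # 26 characters
letters_and_digits = letters + string.digits        # 36 characters

three_letter_prefixes = len(letters) ** 3
four_char_prefixes = len(letters_and_digits) ** 4

print(three_letter_prefixes)  # 17576
print(four_char_prefixes)     # 1679616
```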
Unless or until the EDID specification changes, if you are a device vendor and contacting the UEFI Forum for an ID and your intention is to use it for a display device, do not ask for a four-letter ID because that will not work. You absolutely need a three-letter one.
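The reason a four-letter ID cannot simply be dropped into an EDID record is visible in the field's layout. As commonly documented for EDID, the manufacturer ID occupies two bytes: three letters at five bits each (A = 1 through Z = 26), stored big-endian with the top bit left at zero. The sketch below follows that layout; the function names and the `ABC` sample prefix are hypothetical.

```python
def pack_edid_vendor(prefix: str) -> bytes:
    """Pack a three-letter vendor prefix into the 2-byte EDID manufacturer field."""
    if len(prefix) != 3 or not all("A" <= c <= "Z" for c in prefix):
        raise ValueError("EDID vendor ID must be exactly three letters A-Z")
    value = 0
    for c in prefix:
        # Each letter takes 5 bits: A=1, B=2, ..., Z=26.
        value = (value << 5) | (ord(c) - ord("A") + 1)
    return value.to_bytes(2, "big")


def unpack_edid_vendor(raw: bytes) -> str:
    """Recover the three-letter prefix from the 2-byte field."""
    value = int.from_bytes(raw[:2], "big")
    return "".join(
        chr(((value >> shift) & 0x1F) + ord("A") - 1) for shift in (10, 5, 0)
    )


packed = pack_edid_vendor("ABC")
print(packed.hex())                # 0443
print(unpack_edid_vendor(packed))  # ABC
```

Three 5-bit letters fill 15 of the 16 available bits, so there is no room for a fourth character, and the 5-bit encoding has no way to represent digits.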