A Tale of Two COTS
How the design process for safety-certifiable COTS differs from the typical COTS design process.
In this article we will discuss how the typical COTS board design process differs from that of a board designed to be safety certifiable. While the focus will be on Single Board Computers, the same principles apply to graphics and I/O modules.
The typical COTS SBC design process starts with the reference design supplied by Intel or AMD for an Intel Architecture design, or by NXP for a PowerPC design. These reference designs are often available before the public release of the latest microprocessor, to allow advance development of board-level products. The engineers then adapt the reference design to accommodate the bus structure of the target form factor, such as VME, VPX (6U or 3U), or VNX. And yes, there are still VME refreshes taking place for legacy designs. Additions to the schematic typically provide forward and backward compatibility with previous products on the market for life cycle considerations.
Once schematic capture is complete, component placement begins. Some companies do separate layouts for air-cooled and conduction-cooled variants; others use the same PCB layout and different metalwork for the two cooling methods. Air-cooled boards typically have both front panel and rear panel I/O, while rugged conduction-cooled boards support only rear panel I/O. Using the same PCB for both cooling schemes requires making the front panel I/O a population option at assembly, but you gain economies of scale by doing so.
When a new processor requires component placement substantially different from that of preceding products, modeling heat flow and the effects of shock and vibration is crucial to developing a stable product. Finite element analysis performed before trace layout starts can show whether the current placement can be properly cooled and whether the PCB deflection at its natural frequency is likely to damage the board. Large, dense FPGAs are subject to fracturing of the solder balls if the deflection is too extreme. Once placement is properly modeled, layout can proceed. Preference is usually given to the memory buses and other high-speed signal paths, which must frequently be tuned to obtain maximum signal integrity over the temperature range to which the board will be subjected.
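As a rough first-order check ahead of the finite element analysis described above, the peak displacement of a board vibrating sinusoidally at a given frequency can be estimated from the peak acceleration using the standard relation x = a / (2πf)². The sketch below is illustrative only: the function name and the input values are hypothetical, and it treats the board as a single point in simple harmonic motion, which a real FEA model does not.

```python
import math

G0 = 9.80665  # standard gravity, m/s^2


def peak_displacement_mm(accel_g: float, freq_hz: float) -> float:
    """Peak displacement in mm for sinusoidal motion.

    accel_g: peak acceleration in g (hypothetical input)
    freq_hz: vibration frequency in Hz, e.g. the board's natural frequency
    """
    if freq_hz <= 0:
        raise ValueError("freq_hz must be positive")
    omega = 2.0 * math.pi * freq_hz      # angular frequency, rad/s
    accel_ms2 = accel_g * G0             # convert g to m/s^2
    return accel_ms2 / omega ** 2 * 1000.0  # metres to millimetres


# Hypothetical example: 10 g peak response at a 100 Hz natural frequency
print(round(peak_displacement_mm(10.0, 100.0), 3), "mm")
```

Because displacement falls with the square of frequency, stiffening the board to raise its natural frequency reduces deflection quickly, which is one reason placement and mechanical support are examined before layout.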
Once the placement and layout are complete, PCB fabrication and assembly are commissioned and the debug process starts. With the board up and running, perhaps with some minor patch wires, serious environmental testing begins. Temperature cycling with guard bands, together with shock and vibration testing, ensures that the product will meet the published specifications. Preliminary testing for EMI and RFI is frequently done at this stage, with final testing done after the PCB is respun to its final form.
A good design process usually results in only a minor cleanup of the first build and the ability to go to production on the first respin. At that point any final regulatory testing is done and the product is released to manufacturing. Some companies maintain their own manufacturing while others use a contract manufacturer. While a contract manufacturer can bring economic benefits for the components on the Bill of Materials, you lose control over the scheduling of manufacturing, and contract manufacturers charge premiums for items such as component traceability and small lot runs. This is the typical COTS design process.
Different from the Start
When designing a COTS board that is to be safety certifiable, the design process differs in many ways right from the start. You must first determine whether you are going to be certifiable to DO-178 for software, DO-254 for hardware, or both. You must then determine which Design Assurance Level (DAL) you wish to achieve. There are five levels to choose from: from DAL-E, which is essentially a “don’t care” level, up to DAL-A, which is the most critical and the most difficult to accomplish. These must be carefully designated at the start of the process. Whether the deployment will be for commercial or military aviation, and which certification body will be granting the approval, also have a bearing on the planning process. With all of this information in place, the plan is formulated and the design requirements are finalized.
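The five levels can be pictured as an ordered scale. The sketch below pairs each DAL with the failure-condition category conventionally associated with it in the DO-178 and DO-254 guidance, and adds a helper, hypothetical and for illustration only, that checks whether a product's certifiable level is at least as rigorous as the level a program requires.

```python
from enum import Enum


class DAL(Enum):
    """Design Assurance Levels, most rigorous (A) to least (E)."""
    A = "Catastrophic"
    B = "Hazardous"
    C = "Major"
    D = "Minor"
    E = "No Safety Effect"


_ORDER = "ABCDE"  # position 0 is the most rigorous level


def at_least_as_rigorous(candidate: DAL, required: DAL) -> bool:
    """True if `candidate` meets or exceeds the rigor of `required`.

    Hypothetical helper, not part of any standard.
    """
    return _ORDER.index(candidate.name) <= _ORDER.index(required.name)


# A board certifiable to DAL-B would cover a DAL-C requirement, not DAL-A.
print(at_least_as_rigorous(DAL.B, DAL.C))  # True
print(at_least_as_rigorous(DAL.B, DAL.A))  # False
```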
The next phase is also very different from the standard COTS design. Component selection is based on the ability to prove certifiability. Here the latest and greatest components do not lend themselves to certification, particularly as the level rises to DAL-B or DAL-A. The willingness of the component supplier to support the certification process can be critical. Performance requirements are also critical at this stage, as the ability to utilize multiple cores on the microprocessor is just starting to emerge. Most certified designs to date require disabling all but one of the cores as well as disabling the L2 and L3 caches. This can substantially reduce the performance the processor is capable of achieving. Since certification requires determinism and headroom margins, this must be carefully assessed. While COTS hardware designs usually include many options, unused circuits should be eliminated from certifiable designs. A rigorous Failure Mode and Effects Analysis (FMEA) is usually performed to ensure a sound design.
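The headroom assessment mentioned above can be illustrated with simple arithmetic: compare the worst-case execution time of a task against the time budget of its frame, once with the processor fully enabled and once in the restricted configuration. This is a minimal sketch; the function names, the timing figures and the 30 percent margin are all hypothetical and not taken from any standard or product.

```python
def headroom_fraction(wcet_ms: float, frame_budget_ms: float) -> float:
    """Fraction of the frame left unused at worst-case execution time."""
    if frame_budget_ms <= 0:
        raise ValueError("frame_budget_ms must be positive")
    return 1.0 - wcet_ms / frame_budget_ms


def meets_margin(wcet_ms: float, frame_budget_ms: float,
                 required_margin: float) -> bool:
    """True if the remaining headroom is at least the required margin."""
    return headroom_fraction(wcet_ms, frame_budget_ms) >= required_margin


FRAME_MS = 10.0   # hypothetical frame budget
MARGIN = 0.30     # hypothetical required headroom (30 percent)

# Hypothetical worst-case times: caches enabled vs. L2/L3 caches disabled
for label, wcet in (("caches enabled", 6.0), ("caches disabled", 9.0)):
    ok = meets_margin(wcet, FRAME_MS, MARGIN)
    print(f"{label}: headroom {headroom_fraction(wcet, FRAME_MS):.0%}, ok={ok}")
```

In this made-up case the same workload passes with caches enabled and fails once they are disabled, which is why the performance assessment has to be made against the certifiable configuration rather than the processor's data sheet figures.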
The rest of the design process is somewhat similar to the normal hardware COTS procedure. The exception is that at each stage the designer must document and preserve the evidence that the design was done in accordance with the plan to achieve certifiability. Also, any changes or deviations must be evaluated for their impact on the ability to certify. Changes can result in significant backtracking and additional expense. Figure 1 shows an example of a DAL-C Certifiable COTS design.
The software design process to support the board is under similar restrictions. DO-178 compliance starts with choosing an operating system for which the OS vendor provides certification artifacts for the chosen processor architecture to the desired DAL. Typically, these operating systems have an ARINC 653 compliant architecture. There are a number of choices available; however, we will not name them in this article to avoid accidentally leaving one out. The device drivers developed by the board designer and the resulting Board Support Package require the same due diligence in documenting and preserving the evidence that the process complies with the plan. Every line of code must be proven to provide repeatable results, and there can be no unused functions present. Figure 2 shows an example of a DAL-B Certifiable COTS design.
In the final analysis, the costs of producing a COTS design and a safety-certifiable design are competitive, provided the consumer does not have to purchase the certification package. However, there is substantial added value in the design processes that go into producing a safety-certifiable product. As a result, if you are choosing products to produce a mission-critical computer, why would you trust your design to anything less than a mission computer built from safety-certifiable products?
Wayne McGee is the vice president of sales and general manager for North American Operations for Creative Electronic Systems SA. He has served in various senior management positions in his career and has more than 30 years of experience in the VME, CompactPCI, ATCA and VPX markets. McGee is also the chairperson for the VNX VITA-74 Marketing Alliance. Companies McGee has worked for include Motorola Computer Group, VMIC, SBS Technologies and GE Intelligent Platforms. He holds a BSEE from the University of South Carolina.