The Role of Functional Safety in Automotive Electronics

Electronics designers in mission-critical applications have long grappled with functional safety, broadly defined as the absence of unreasonable risk due to hazards caused by malfunctioning behavior of electrical and electronic systems. The advent of automotive applications involving autonomous operation presents a challenge unlike any other to the semiconductor industry. Increasing complexity at the component level demands a reliable, repeatable and verifiable process for demonstrating functional safety, while cost sensitivities at the system level preclude traditional solutions such as the replication and voting logic usually employed by aerospace and other industries.

However, where traditional industries relied on IEC 61508 and various industry-specific guidelines, the automotive industry crystallized its functional safety requirements in the form of the ISO 26262 standard. Now in its second edition, the standard defines a process and traceability flow for reducing systematic faults and prescribes architectural metrics to combat random hardware failures that occur during operation. Together, the two provide the basis for the all-important Automotive Safety Integrity Level (ASIL) classification of electronic components and systems.

While the ISO standard comprises multiple parts dealing with the end-to-end safety lifecycle of electrical and electronic systems, this article focuses on the impact of the standard and its hardware-oriented failure metrics on the integrated circuit (IC) development flow at each stage of the project, and outlines some of the key attributes of an effective solution.

The ISO 26262 standard defines work products derived from the Hazard Analysis and Risk Assessment (HARA), which results in safety goals with associated ASILs. Part 4, Clause 6 outlines the derivation of technical safety requirements from the safety goals and their allocation to hardware, software or both.

Of particular interest to designers are parts 4, 5 and 6 of the standard (Figure 1). Hardware designers begin their safety-oriented product development by deriving hardware-oriented requirements that satisfy the relevant technical safety requirements, typically through manual analysis and expert design judgment, recorded using a variety of front-end tools. Each feature or functional block and its associated ASIL requirement are recorded.

Figure 1: Parts 4, 5 and 6 of the ISO 26262 standard cover the derivation of the system safety requirements.

At its core, safety management is a three-stage problem –– safety analysis, safety verification and safety synthesis –– as shown in Figure 2.

Figure 2: The three stages of safety management are safety analysis, safety verification and safety synthesis.

Essentially, safety analysis is a “gap” analysis. This first stage takes as its inputs the IC-level Failure Mode and Effects Analysis (FMEA) or Fault Tree Analysis (FTA) and the design specifics, and is typically conducted by the safety architect early in the design.

The primary challenge is the lack of clarity in the design collateral, much of which is undetermined, unavailable or present only in abstract model form. Computing the probabilistic metric for random hardware failures (PMHF) and the diagnostic-coverage-based architectural metrics for single-point and latent faults is an imposing task: it requires base failure rates along with both structural information (Figure 3) about the IP block(s) and functional information regarding their utility, use model and safety criticality.
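As a rough illustration of what the architect must compute, the simplified sketch below evaluates the single-point and latent fault metrics from hypothetical per-block failure rates (FIT) and diagnostic-coverage estimates. The block names and numbers are invented for illustration; a real PMHF computation involves considerably more detail, such as mission profiles and exposure times.

```python
# Simplified sketch of the ISO 26262-5 hardware architectural metrics,
# assuming per-block FIT rates and diagnostic-coverage estimates are
# already known. All names and numbers below are illustrative.

def spfm(blocks):
    """Single-Point Fault Metric: 1 - (residual single-point FIT) / (total FIT)."""
    total = sum(b["fit"] for b in blocks)
    residual = sum(b["fit"] * (1.0 - b["dc_spf"]) for b in blocks)
    return 1.0 - residual / total

def lfm(blocks):
    """Latent Fault Metric over the faults that are not already single-point residual."""
    total = sum(b["fit"] for b in blocks)
    residual_spf = sum(b["fit"] * (1.0 - b["dc_spf"]) for b in blocks)
    latent = sum(b["fit"] * b["dc_spf"] * (1.0 - b["dc_latent"]) for b in blocks)
    return 1.0 - latent / (total - residual_spf)

blocks = [
    {"name": "cpu_core", "fit": 50.0,  "dc_spf": 0.99,  "dc_latent": 0.90},
    {"name": "sram",     "fit": 120.0, "dc_spf": 0.999, "dc_latent": 0.95},
    {"name": "glue",     "fit": 10.0,  "dc_spf": 0.60,  "dc_latent": 0.50},
]

print(f"SPFM = {spfm(blocks):.4f}")   # ASIL D target is >= 0.99
print(f"LFM  = {lfm(blocks):.4f}")    # ASIL D target is >= 0.90
```

Note how the poorly covered “glue” block, despite contributing the smallest FIT rate, dominates the residual term and drags the SPFM below the ASIL D threshold.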

Figure 3: Structural composition plays a key role in the Hardware Safety metrics.

Any effective solution to the analysis phase of the safety problem should possess two key attributes.

Structural Analytics –– One key decision the architect makes is estimating the diagnostic coverage for each functional unit. While the standard provides suggestions, reality is dictated by a variety of factors, including the ratio of:

  • Standard cells versus memory
  • Combinational versus sequential logic
  • Synthesis efficiency
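Because the overall figure blends these categories, even a crude area-weighted estimate beats a flat guess. The sketch below illustrates the arithmetic; the composition ratios and per-category coverage values are purely hypothetical, and in practice they would come from structural analysis of the actual netlist.

```python
# Illustrative area-weighted diagnostic-coverage estimate. The fractions
# and per-category coverage figures are hypothetical placeholders.

composition = {          # fraction of safety-relevant faults per category
    "memory":        0.45,  # typically covered by ECC
    "sequential":    0.30,  # covered by, e.g., lockstep comparison
    "combinational": 0.25,  # hardest to cover with in-field diagnostics
}
coverage = {"memory": 0.999, "sequential": 0.99, "combinational": 0.60}

dc = sum(composition[k] * coverage[k] for k in composition)
print(f"estimated overall DC = {dc:.3f}")
```

With these assumed ratios, the weakly covered combinational quarter of the design pulls the overall coverage down to roughly 90 percent, even though the memories are nearly perfectly covered.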

Figure 4: An optimal path to safety analysis moves consistently and seamlessly from black box to RTL to netlist to layout.

That’s why structural insight into the design is important to take the guesswork out of the equation. The accuracy of the failure metrics (Figure 2) depends on the efficacy of the structural analysis.

Multiple Levels of Abstraction –– Given that this is a foundational step of the design, most IP blocks are still in front-end form. Thus, any good solution must handle register transfer level (RTL) designs and/or black boxes.

The ability to move progressively from black box to RTL to netlist to layout provides the optimal path to safety analysis (Figure 4).

Once the gap analysis is complete, the safety synthesis phase provides the remedial action that closes the gap to the desired ASIL. Solutions span the gamut from system- to software- to hardware- to even package-level techniques, but they all fall into one of two categories: fault detection or fault tolerance. Both improve diagnostic coverage, but only fault tolerance reduces the effective PMHF metric.
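The distinction can be shown with two toy mechanisms, both invented for illustration: parity, which only detects a single-bit fault and relies on a downstream safety reaction, and triple modular redundancy (TMR), which masks the fault outright.

```python
# Minimal sketch of the two approaches. Parity *detects* a single-bit
# fault (the system must then react within the FTTI); triple modular
# redundancy (TMR) *tolerates* it by out-voting the faulty copy.

def parity_detects(word, stored_parity):
    """Fault detection: flag a mismatch, leave recovery to the safety reaction."""
    return (sum(word) % 2) != stored_parity

def tmr_vote(a, b, c):
    """Fault tolerance: bitwise majority masks a fault in any single copy."""
    return (a & b) | (b & c) | (a & c)

word = [1, 0, 1, 1]               # even-parity bit for this word is 1
faulty = [1, 0, 0, 1]             # single stuck-at-0 fault on bit 2
print(parity_detects(faulty, 1))  # True: the fault is flagged, not fixed

# TMR: the middle copy carries the same stuck-at-0 fault, yet the
# majority vote still returns the correct value.
print(bin(tmr_vote(0b1011, 0b1001, 0b1011)))
```

Parity leaves the residual failure rate intact until the system reacts, whereas the TMR voter never exposes the fault at all, which is why tolerance lowers the effective PMHF while detection alone does not.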

The safety synthesis phase (Figure 5) is owned jointly by the architect and the designer. Depending on the level of abstraction of the coverage mechanism, the designer could be at the system, platform, software or IC level. Regardless, an effective solution to the synthesis problem should show the following characteristics:

  • Scenario Analysis –– The ability to ask “what-if” questions is valuable. As an example, when inserting error-correcting codes (ECC) into the IC, the ability to estimate the resulting coverage prior to performing an actual synthesis is desirable.
  • Cost Estimation –– Every safety mechanism has a cost: a software mechanism steals cycles from the processor, while a hardware mechanism increases area and strains the timing budget. Designers need to estimate these costs and tune the solution toward the desired cost function.
  • Automatic Validation –– Any change to the functional design, in software or hardware, is disruptive. Thus, a mechanism that automatically checks for correct implementation, as well as preservation of original design intent, is a prerequisite. As an example, formal equivalence checking of the pre- and post-safety-insertion design is prized by hardware designers.

Figure 5: Designers must make tradeoffs at the safety synthesis phase.

The final phase of safety engineering is the domain of the hardware design verification engineer, whose task has become harder since, in the presence of a fault, every node in the design is a potential bug. Worse, there are multiple fault models that may require testing: stuck-at-1, stuck-at-0, bridging and opens, for example. The problem space explodes once time is introduced in the form of transient faults.

Figure 6: Safety verification is the final phase of the design. Multiple approaches can be used to verify the design.

Multiple approaches exist, with Figure 6 highlighting the pros and cons of each. Part 5 of ISO 26262 expresses a preference for fault simulation in the automotive sector. An exhaustive fault campaign, however, requires injecting millions of faults into the design, propagating them and checking whether they are detected within the applicable fault tolerant time interval (FTTI).

An effective fault simulation solution should possess most of the following characteristics:

  • Fault List Generation –– A safety verification engineer must identify the location and type of faults to inject as a first step. The ability to generate this automatically, perhaps from the safety analysis phase, solves a key hurdle.
  • Simulator Independence –– While conceptually a fault simulation is no different from a logic simulation with a force statement on a given node, the subtle differences among simulators in how they handle race conditions are a problem best avoided. When generating the safety stimulus and running the design under test (DUT), a standalone fault propagation engine is invaluable.
  • Auto Fault Classification and Disposition –– At the end of the fault campaign, the safety coverage report should classify each fault as detected, safe or undetected. The safety verification engineer is often confronted with a fourth category of unresolved faults, whose causes range from insufficient safety stimuli to black boxes to encrypted design segments. The ability to classify and resolve such corner cases, even if they form only a small portion of the overall fault space, is vital to a smooth fault campaign.
  • Speed –– The ability to provide an exponential speedup over standard simulator frameworks is important; for the reasons of enlarged state spaces described earlier, a linear speedup is insufficient. A new class of solutions that can exploit spatial and temporal state-space reduction is required to tackle fault verification in automotive electronics.
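To make the campaign flow concrete, the toy sketch below injects stuck-at faults on the internal nets of a two-bit comparator guarded by a duplicate-and-compare safety mechanism, then classifies each fault from the simulation results. The circuit, stimulus set and classification rules are all invented for illustration.

```python
# Toy fault campaign on a 2-bit equality comparator. Faults are stuck-at
# values forced on named internal nets; classification follows the
# detected / safe / undetected scheme of a coverage report. A fault that
# never propagates under the given stimuli is reported as "safe" here,
# though a real flow would flag it for further stimulus development.

def dut(a, b, fault=None):
    """Return (functional_output, safety_flag). A fault forces one net."""
    nets = {"eq0": int(a[0] == b[0]), "eq1": int(a[1] == b[1])}
    if fault:
        net, value = fault
        nets[net] = value               # inject the stuck-at fault
    out = nets["eq0"] & nets["eq1"]
    # safety mechanism: recompute independently and compare (duplication)
    golden = int(a[0] == b[0]) & int(a[1] == b[1])
    return out, int(out != golden)

stimuli = [((0, 0), (0, 0)), ((0, 1), (0, 1)), ((1, 0), (0, 1))]
faults = [(net, v) for net in ("eq0", "eq1") for v in (0, 1)]

report = {}
for fault in faults:
    classification = "safe"             # fault never propagated
    for a, b in stimuli:
        good, _ = dut(a, b)
        bad, flag = dut(a, b, fault)
        if bad != good:                 # fault corrupted the output...
            # ...detected only if the safety mechanism raised its flag
            classification = "detected" if flag else "undetected"
            break
    report[fault] = classification

print(report)
```

Even in this tiny example, two of the four faults never propagate under the chosen stimuli, a miniature version of the unresolved-fault problem described above.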

The move toward assisted and autonomous navigation in automotive electronics owes much to the industry’s focus on functional safety. The ISO 26262 standard is proof of the design community’s attention to ensuring the absence of “unreasonable risk due to hazards caused by malfunctioning behavior of electrical and electronic systems.” By integrating the tasks of functional safety requirements specification, safety analysis, synthesis and validation into a single workflow, the designer can automate the entire safety design process from requirements to certification.

