Top 3 Mistakes with Static Analysis for Embedded and Safety-Critical Development
The high cost of repairing defects shipped in embedded devices, paired with the increasing need to follow regulatory compliance initiatives for safety-critical embedded systems software (FDA, DO-178B/C, IEC 61508), has driven many organizations to adopt static analysis as a key part of their quality strategies.
Static analysis is one of the most effective and least burdensome industry-standard best practices for software quality. In fact, it is often explicitly recommended (e.g., per the FDA’s recommendation for infusion pumps) as one key component of a comprehensive quality strategy. When properly implemented, static analysis is a very powerful tool for exposing error-prone code. Finding and fixing such code from the earliest phases of the software development lifecycle has been proven to be a very effective (and cost-efficient) way to prevent defects from being shipped in the final product.
Static analysis is a critical component of a comprehensive quality process…but it is just one component. It’s important to remember that most effective quality processes involve a combination of test and analysis practices embedded throughout the SDLC. In addition to static analysis, an effective quality strategy covers practices such as:
- Unit testing (host and target)
- Regression testing
- Peer code review
- Coverage analysis
- Runtime error detection
- Requirements traceability
At Parasoft, we’ve been helping software development organizations implement and optimize static analysis since 1996. Over years of analyzing static analysis deployments across safety-critical, embedded, and enterprise software development organizations, we’ve determined which mistakes are most likely to result in failed static analysis initiatives. Here are the top three reasons why static analysis initiatives fail to deliver real value in embedded and safety-critical development environments, along with some critically important tips for avoiding these common pitfalls.
3. Starting With Too Many Rules
Some eager teams take the “big bang” approach to static analysis. With all the best intentions, they plan to invest considerable time and resources into carving out the ultimate static analysis implementation from the start, one so good that it will last them for years.
They assemble a team of their best engineers. They read stacks of programming best practices books. They vow to examine all of their reported defects and review the rule descriptions for all of the rules that their selected vendor provides.
I’ve found that teams who take this approach end up with too many rules at the start and too few actually enforced later on. It’s much better to start with a very small rule set and phase in more rules as you come into compliance with it.
Static analysis actually delivers better results if you don’t bite off more than you can chew. Performing static analysis is like having an experienced engineer look over the shoulder of an inexperienced engineer and give him tips as he writes code. If the experienced engineer is constantly harping on nitpicky issues in every few lines of code, the junior engineer will soon become overwhelmed and start filtering out all advice, good and bad. However, if the experienced engineer focuses on one or two issues that he knows are likely to cause serious problems, the junior engineer is much more likely to remember what advice he was given, start writing better code, and actually appreciate receiving this kind of feedback.
It’s the same for static analysis. Work incrementally—with an initial focus on truly critical issues—and you’ll end up teaching your engineers more and having them resent the process much less. Would you rather have a smaller set of rules that are followed, or a larger set that isn’t?
Out of the hundreds or sometimes even thousands of rules that are available with many static analysis tools, how do you know where to start? Here are a few simple questions to ask about each candidate rule:
- Would team leaders stop shipping if a violation of this rule were found?
- (In the beginning only) Does everyone agree that a violation of this rule should be fixed?
- Are there too many violations from this rule?
In regulated environments, the principle of enabling only the rules you will actually enforce is elevated to the status of a commandment. The more you get into the habit of frequently suppressing or ignoring rule violations, the more likely you are to have to tell an auditor or attorney why you ignored reports of an issue that ultimately caused a serious defect in the field. From a negligence perspective, it’s much safer to define a tight rule set and ensure that every violation is addressed than to have a large one that is loosely followed.
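The screening questions above can be sketched as a simple filter over candidate rules. This is a minimal illustration in Python, not any vendor's configuration format: the rule IDs, the field names, and the 50-violation cutoff for "too many violations" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateRule:
    rule_id: str           # hypothetical identifier, not a real vendor rule ID
    blocks_shipping: bool  # would team leaders stop shipping over a violation?
    team_agrees: bool      # does everyone agree a violation should be fixed?
    open_violations: int   # violations currently reported in the code base


def select_initial_rules(candidates, max_violations=50):
    """Return IDs of rules that pass all three screening questions.

    max_violations is an assumed cutoff for "too many violations";
    each team would choose its own value.
    """
    return [
        rule.rule_id
        for rule in candidates
        if rule.blocks_shipping
        and rule.team_agrees
        and rule.open_violations <= max_violations
    ]


candidates = [
    CandidateRule("null-deref", True, True, 12),
    CandidateRule("brace-style", False, False, 4300),
    CandidateRule("uninit-read", True, True, 240),
]
initial_rules = select_initial_rules(candidates)
print(initial_rules)
```

In this sketch, a rule such as the hypothetical "uninit-read" is deferred rather than discarded: it is important, but its current violation count is too high to enforce right away, so it would be phased in once the team is in compliance with the initial set.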
2. No Automated Process Enforcement
Without automated process enforcement, engineers are likely to perform static analysis sporadically and inconsistently—which is not only problematic from a compliance perspective, but also diminishes your ability to derive maximum defect-prevention value from static analysis. The more you can automate the tedious static analysis process, the less it will burden engineers and distract them from the more challenging tasks they truly enjoy. Plus, the added automation will help you achieve consistent results across the team and organization. Avoid the false economy of an automated run that still requires a manual triage process at the end. A tighter configuration will provide more value without the need for manual review and selection of what to fix.
Many organizations follow a multi-level automated process. Each day, as the engineer works on code in the development environment, he or she can run analysis on demand—or configure an automated analysis to run continuously in the background (like spell check does). Engineers clean these violations before adding new or modified code to source control.
Then, a server-based analysis can run as part of continuous integration, or on a nightly basis, to make sure nothing slipped through the cracks.
Assuming that you have a policy requiring that all violations from the designated rule set are cleaned before check-in, any violations reported at this level indicate that the policy is not being followed. If this occurs, don’t just have the engineers fix the reported problems. Take the extra step to figure out where the process is breaking down and how you can fix it (e.g., by fine-tuning the rule set or using suppressions judiciously).
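As a rough sketch of the server-level check described above, the following Python function treats any violation of the designated rule set as a build failure and groups the offenders by author, which can help show where the pre-check-in step is being skipped. The record layout (dicts with "rule", "file", and "author" keys) and the rule IDs are assumptions for illustration; a real analysis tool's report would need to be parsed into this shape first.

```python
from collections import Counter

# Hypothetical designated rule set; violations of other rules do not block.
REQUIRED_RULES = {"null-deref", "uninit-read"}


def gate(violations, required_rules=REQUIRED_RULES):
    """Return (exit_code, per-author counts) for required-rule violations.

    exit_code is 1 if any required rule is violated, otherwise 0.
    """
    blocking = [v for v in violations if v["rule"] in required_rules]
    by_author = Counter(v["author"] for v in blocking)
    return (1 if blocking else 0), by_author


# Assumed, already-parsed output of a nightly or CI analysis run.
report = [
    {"rule": "null-deref", "file": "pump.c", "author": "dev_a"},
    {"rule": "brace-style", "file": "ui.c", "author": "dev_b"},
    {"rule": "uninit-read", "file": "dose.c", "author": "dev_a"},
]
exit_code, by_author = gate(report)
print(exit_code, dict(by_author))
```

A CI job could pass `exit_code` to `sys.exit` so that the build fails automatically. The per-author counts are meant as a diagnostic for the process, not as a scoreboard.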
1. Lack of a Clear Policy
It’s common for organizations to overlook policy because they think that simply making the tool available is sufficient. It’s not. Even though static analysis (done properly) will save engineers time in the long run, they’re not going to be attracted to the extra work it adds upfront. If you really want to ensure that static analysis is performed as you expect—even when the team’s in crunch mode, scrambling to just take care of the essentials—policy is key.
Every team has a policy, whether or not it’s formally defined. You might as well codify the process and make it official. After all, it’s a lot easier to identify and diagnose problems with a formalized policy than an unwritten one.
Ideally, you want your policy to have a direct correlation to the problems you’re currently experiencing (and/or committed to preventing). This way, there’s a good rationale behind both the general policy and the specific ways that it’s implemented.
With these goals in mind, the policy should clarify:
- What teams need to perform static analysis
- What projects require static analysis
- What rules are required
- What degree of compliance is required
- When suppressions are allowed
- When violations in legacy code need to be fixed
- Whether you ship code with static analysis violations
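One way to codify such a policy is to record it as data, so that it can be reviewed, versioned, and checked by scripts. The sketch below is only an illustration in Python: the field names, default values, and the team and project names are hypothetical, and a real policy would likely carry more detail for each item.

```python
from dataclasses import dataclass


@dataclass
class StaticAnalysisPolicy:
    teams: set             # which teams must perform static analysis
    projects: set          # which projects require it
    required_rules: set    # which rules are required
    min_compliance: float = 1.0              # required fraction of passing checks
    suppressions_need_approval: bool = True  # when suppressions are allowed
    fix_legacy_when_touched: bool = True     # when legacy violations get fixed
    ship_with_violations: bool = False       # whether code may ship with violations

    def applies_to(self, team, project):
        """True if this team and project are both covered by the policy."""
        return team in self.teams and project in self.projects

    def is_compliant(self, passed_checks, total_checks):
        """True if the passing fraction meets the required degree of compliance."""
        if total_checks == 0:
            return True
        return passed_checks / total_checks >= self.min_compliance

    def may_ship(self, open_violations):
        """True if shipping is allowed with this many open violations."""
        return self.ship_with_violations or open_violations == 0


policy = StaticAnalysisPolicy(
    teams={"firmware"},
    projects={"infusion-pump"},
    required_rules={"null-deref", "uninit-read"},
)
print(policy.applies_to("firmware", "infusion-pump"), policy.may_ship(3))
```

With the assumed defaults, this policy demands full compliance and blocks shipping while any violation of the required rules remains open, which matches the tight, fully enforced rule set recommended earlier.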
Arthur Hicken, Evangelist for Parasoft, has been involved in automating various practices at Parasoft for almost 20 years. He has worked on projects including database development, the software development lifecycle, web publishing and monitoring, and integration with legacy systems.