Diagnosis of Embedded Software Using Program Spectra
Summary
1 Introduction
- Software reliability can generally be improved through extensive testing and debugging, but this is often in conflict with market conditions: software cannot be tested exhaustively, and of the bugs that are found, only those with the highest impact on the user-perceived reliability can be solved before the release.
- Testing reveals more bugs than can be solved, and debugging is a bottleneck for improving reliability.
- Locating a fault is an important step in actually solving it, and program spectra have successfully been applied for this purpose in several tools focusing on various application domains, such as Pinpoint [4], which focuses on large, dynamic on-line transaction processing systems, AMPLE [5], which focuses on object-oriented software, and Tarantula [9], which focuses on C programs.
- The remainder of this paper is organized as follows.
- In Section 2 the authors explain the diagnosis technique in more detail, and in Section 3 they discuss its applicability to embedded software in consumer electronics products.
2.1 Failures, Errors, and Faults
- As defined in [3], the authors use the following terminology.
- An error is the part of the total state of the system that may cause a failure.
- To illustrate these concepts, consider the C function in Figure 1.
- A failure occurs when applying RationalSort yields anything other than a sorted version of its input.
- In a software context, faults are often called bugs, and diagnosis is part of debugging.
2.2 Program Spectra
- A program spectrum [11] is a collection of data that provides a specific view on the dynamic behavior of software.
- As an example, a block count spectrum tells how often each block of code is executed during a run of a program.
- A block of code is a C language statement, where the authors do not distinguish between the individual statements of a compound statement, but where they do distinguish between the cases of a switch statement.
- Block 5, the RationalGT function body, is executed six times: once for every iteration of the inner loop.
- Besides block count and block hit spectra, many other forms of program spectra exist.
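To make the difference between a block count spectrum and a block hit spectrum concrete, here is a minimal sketch in C. It does not reproduce the RationalSort function of Figure 1; the integer bubble sort, the `HIT` macro, the block numbering, and all other names are assumptions chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_BLOCKS 4  /* hypothetical number of instrumented blocks */

/* Block count spectrum of the current run: one counter per block. */
static unsigned int block_count[NUM_BLOCKS];

/* Hypothetical instrumentation macro: record one execution of block b. */
#define HIT(b) (block_count[(b)]++)

/* Clear all counters at the start of a run. */
void spectrum_reset(void)
{
    for (size_t i = 0; i < NUM_BLOCKS; i++)
        block_count[i] = 0;
}

/* Instrumented bubble sort on integers (illustration only). */
void sort_ints(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    HIT(0);                               /* block 0: function body   */
    for (size_t i = n; i > 1; i--) {
        HIT(1);                           /* block 1: outer loop body */
        for (size_t j = 0; j + 1 < i; j++) {
            HIT(2);                       /* block 2: inner loop body */
            if (a[j] > a[j + 1]) {
                HIT(3);                   /* block 3: exchange        */
                int t = a[j];
                a[j] = a[j + 1];
                a[j + 1] = t;
            }
        }
    }
}

/* Derive the hit spectrum (executed or not) from the count spectrum. */
void spectrum_hits(unsigned char hits[NUM_BLOCKS])
{
    for (size_t i = 0; i < NUM_BLOCKS; i++)
        hits[i] = (unsigned char)(block_count[i] > 0);
}
```

The hit spectrum discards how often a block ran and keeps only whether it ran, which is the binary information that the diagnosis in Section 2.3 works with.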
2.3 Fault Diagnosis
- The hit spectra of M runs constitute a binary matrix, whose columns correspond to N different parts of the program (see Figure 2).
- In their case, these parts are blocks of code. Note that this is a slightly different notion than a basic block, which is a block of code that has no branch.
- The error vector, which records the runs in which an error was detected, corresponds to a hypothetical part of the program that is responsible for all observed errors.
- In the field of data clustering, resemblances between vectors of binary, nominally scaled data, such as the columns in their matrix of program spectra, are quantified by means of similarity coefficients (see, e.g., [8]).
- I3 is not sorted, but the denominators in this sequence happen to be equal, in which case no error occurs.
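As a hedged illustration of how a similarity coefficient ranks program parts, the sketch below computes the Jaccard coefficient between each column of the hit matrix and the error vector. It assumes the usual definition: the number of runs in which the block was hit and an error was detected, divided by the number of runs in which at least one of the two holds. The row-major matrix layout and the function names `jaccard` and `most_suspect` are assumptions, not taken from the paper.

```c
#include <assert.h>
#include <stddef.h>

/* Jaccard similarity between column j of an m-by-n hit matrix
 * (row-major, one row per run) and the error vector err[0..m-1]. */
double jaccard(const unsigned char *hits, const unsigned char *err,
               size_t m, size_t n, size_t j)
{
    size_t a11 = 0;  /* block hit,     error detected    */
    size_t a10 = 0;  /* block hit,     no error detected */
    size_t a01 = 0;  /* block not hit, error detected    */

    assert(j < n);
    for (size_t i = 0; i < m; i++) {
        int h = hits[i * n + j] != 0;
        int e = err[i] != 0;
        if (h && e)
            a11++;
        else if (h)
            a10++;
        else if (e)
            a01++;
    }
    size_t denom = a11 + a10 + a01;
    return denom ? (double)a11 / (double)denom : 0.0;
}

/* Index of the column most similar to the error vector
 * (the first such column in case of a tie). */
size_t most_suspect(const unsigned char *hits, const unsigned char *err,
                    size_t m, size_t n)
{
    size_t best = 0;
    assert(n > 0);
    for (size_t j = 1; j < n; j++)
        if (jaccard(hits, err, m, n, j) > jaccard(hits, err, m, n, best))
            best = j;
    return best;
}
```

For example, with three runs over three blocks, hit rows `1 1 0`, `1 0 1`, `1 1 1`, and error vector `1 0 1`, the second column is hit in exactly the failing runs and scores 1.0, so it would be inspected first.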
3 Relevance to Embedded Software
- The effectiveness of the diagnosis technique described in the previous section has already been demonstrated in several articles (see, e.g., [1], [4], [9]).
- Especially because of constraints imposed by the market, the conditions under which this software is developed are somewhat different from those for other software products. Moreover, concurrent systems are difficult to model.
- The technique improves insight into the run-time behavior.
- Profiling tools such as gcov are convenient for obtaining program spectra, but they are typically not available in a development environment for embedded software.
4.1 Platform
- The subject of their experiments is the control software in a particular product line of analog television sets.
- All audio and video processing is implemented in hardware, but the software is responsible for tasks such as decoding remote control input, displaying the on-screen menu, and coordinating the hardware (e.g., optimizing parameters for audio and video processing based on an analysis of the signals).
- Most teletext functionality is also implemented in software.
- Essentially, the run-time environment consists of several threads with increasing priorities, and for synchronization purposes, the work on these threads is organized in 315 logical threads inside the various components.
- The total available RAM in consumer sets is two megabytes, but in the special developer version that the authors used for their experiments, another two megabytes were available.
4.2 Faults
- The authors diagnosed two faults, one existing, and one that was seeded to reproduce an error from a different product line.
- The CPU load clearly increases around the 60th sample, when the teletext viewing starts, but never returns to its initial level after sample 90, when the authors switch back to TV mode.
- An existing fault in this functionality entails that searching in a page without visible content locks up the teletext system.
- The error concerns the values of two state variables, for which only specific combinations are allowed.
- The authors hardcoded a remote control key-sequence that injects this error on their test platform.
4.3 Implementation
- The authors wrote a small Koala component for recording and storing program spectra, and for transmitting them off the television set via the serial connection.
- The transmission is done on a low-priority thread while the CPU is otherwise idle, in order to minimize the impact on the timing behavior.
- Pending their transmission via the serial connection, their component caches program spectra in the extra memory available in their developer version of the hardware.
- For diagnosing the load problem the authors obtained hit spectra for the logical threads mentioned in Section 4.1, resulting in spectra of 315 binary flags.
- For the lock-up problem, the authors define a transaction as the computation in between two key-presses on the remote control.
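A hit spectrum of 315 binary flags fits in 40 bytes when packed one bit per logical thread. The sketch below shows one possible packing; the type `hit_spectrum` and the function names are assumptions for illustration and do not describe the authors' Koala component.

```c
#include <assert.h>
#include <string.h>

#define NUM_THREADS 315                            /* logical threads (Section 4.1) */
#define SPECTRUM_BYTES ((NUM_THREADS + 7) / 8)     /* 40 bytes for 315 flags        */

/* One hit spectrum: one bit per logical thread. */
typedef struct {
    unsigned char bits[SPECTRUM_BYTES];
} hit_spectrum;

/* Start a new transaction with all flags cleared. */
void spectrum_clear(hit_spectrum *s)
{
    memset(s->bits, 0, sizeof s->bits);
}

/* Record that logical thread `id` was active in this transaction. */
void spectrum_mark(hit_spectrum *s, unsigned int id)
{
    assert(id < NUM_THREADS);
    s->bits[id / 8] |= (unsigned char)(1u << (id % 8));
}

/* Return 1 if logical thread `id` was active, 0 otherwise. */
int spectrum_test(const hit_spectrum *s, unsigned int id)
{
    assert(id < NUM_THREADS);
    return (s->bits[id / 8] >> (id % 8)) & 1;
}
```

Packing the flags keeps each cached spectrum small, which matters when spectra wait in memory for transmission over a slow serial connection.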
4.4 Diagnosis
- For the load problem the authors used the scenario of Figure 3.
- The authors marked the last 60 spectra, for the second period of TV mode, as ‘failed,’ and those of earlier transactions as ‘passed.’
- In the first position was a logical thread related to teletext, whose activation is part of the problem, so in this case the authors can conclude that although the diagnosis is not perfect, the implied suggestion for investigating the problem is quite useful.
- For the lock-up problem, the authors used a proper error detection mechanism.
- On each key-press, when caching the current spectrum, a separate routine verifies the values of the two state variables, and marks the current spectrum as failed if they assume an invalid combination.
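The error detection step for the lock-up problem can be sketched as follows. The source does not name the two state variables or list their allowed combinations, so the enums, the validity rule, and the cache capacity below are purely hypothetical placeholders.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state variables; names and values are illustrative only. */
enum view_state   { VIEW_TV, VIEW_TELETEXT };
enum search_state { SEARCH_IDLE, SEARCH_ACTIVE };

/* Hypothetical rule: a search may only be active while teletext is shown. */
static int combination_is_valid(enum view_state v, enum search_state s)
{
    return !(v == VIEW_TV && s == SEARCH_ACTIVE);
}

#define MAX_SPECTRA 25   /* hypothetical cache capacity */

static unsigned char failed_flag[MAX_SPECTRA];  /* 1 = spectrum marked failed */
static size_t num_cached;

/* Called on each key-press, when the current spectrum is cached.
 * Returns the index of the cached spectrum, or -1 if the cache is full. */
int on_key_press(enum view_state v, enum search_state s)
{
    assert(num_cached <= MAX_SPECTRA);
    if (num_cached == MAX_SPECTRA)
        return -1;
    failed_flag[num_cached] = (unsigned char)!combination_is_valid(v, s);
    return (int)num_cached++;
}
```

Because the verdict is computed at the moment the spectrum is cached, no manual labeling of passed and failed transactions is needed, unlike in the load problem scenario.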
5 Discussion
- Especially the results for the lock-up problem have convinced the authors that program spectra, and their application to fault diagnosis, are a viable technique and a useful tool in the area of embedded software in consumer electronics.
- There are a number of issues with their implementation.
- Because of its rigorous design, the TV is still functioning properly, but everything runs much slower with the block-level instrumentation (e.g., changing channels now takes seconds).
- In their case the authors could store 25 spectra of 65,536 counters, which was already slowing down the scenarios with more than that number of transactions, but even with a more memory-efficient implementation, this inevitably becomes a problem with, for example, overnight testing.
- If an error detection mechanism is available, as in their experiments with the lock-up problem, then these four counters can be calculated on the fly, and the memory requirements become linear in the number of columns in the matrix of Figure 2.
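The on-the-fly calculation can be sketched as below, assuming that the four counters are the per-column counts of the four combinations of "block hit or not" and "error detected or not". The names, the fixed `NUM_COLUMNS`, and the use of the Jaccard coefficient are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_COLUMNS 3   /* hypothetical; one column per program part */

/* Per-column counters: aHE counts runs with hit flag H and error flag E. */
typedef struct {
    unsigned long a11, a10, a01, a00;
} column_counters;

static column_counters counters[NUM_COLUMNS];

/* Fold one spectrum into the counters once its verdict is known.
 * The spectrum itself does not need to be stored afterwards. */
void counters_update(const unsigned char hits[NUM_COLUMNS], int failed)
{
    for (size_t j = 0; j < NUM_COLUMNS; j++) {
        if (hits[j]) {
            if (failed) counters[j].a11++; else counters[j].a10++;
        } else {
            if (failed) counters[j].a01++; else counters[j].a00++;
        }
    }
}

/* Jaccard coefficient of column j, computed from the counters alone. */
double counters_jaccard(size_t j)
{
    assert(j < NUM_COLUMNS);
    unsigned long denom = counters[j].a11 + counters[j].a10 + counters[j].a01;
    return denom ? (double)counters[j].a11 / (double)denom : 0.0;
}
```

Storage is four counters per column regardless of the number of transactions, so long-running scenarios such as overnight testing no longer require keeping every spectrum in memory.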
7 Conclusion
- The authors applied the technique to a large-scale industrial test case in the area of embedded software in consumer electronics devices.
- In addition to confirming established effectiveness results, their experiments indicate that the technique lends itself well to application in the resource-constrained environments that are typical for the development of embedded software.
- While their current experiments focus on development-time debugging, they open corridors to further applications, such as run-time recovery by rebooting only those parts of a system whose activities correlate with detected errors.
Citations
686 citations
Cites background from "Diagnosis of Embedded Software Usin..."
...It can easily be integrated with existing testing procedures, and because of the relatively small overhead with respect to CPU time and memory requirements, it lends itself well for application within resource-constrained environments [24]....
[...]
...In addition to our benchmark studies on the Siemens set, we have also evaluated spectrum-based fault localization on a large-scale industrial code (embedded software in consumer electronics, [24])....
[...]
443 citations
Cites background from "Diagnosis of Embedded Software Usin..."
...It can easily be integrated with existing testing procedures, and because of the relatively small overhead with respect to CPU time and memory requirements, it lends itself well for application within resource-constrained environments (Zoeteweij et al., 2007)....
[...]
353 citations
Cites methods from "Diagnosis of Embedded Software Usin..."
...As an illustration, near-zero wasted effort is measured in experiments with SFL on a 0.5 MLOC industrial software product, reported in [40], where the problem reports (tests) typically focus on a particular anomaly (small C)....
[...]
87 citations
81 citations
Cites methods from "Diagnosis of Embedded Software Usin..."
..., with higher accuracy, (ii) integrating MBSD with spectra-based approaches to focus the debugging process [27], [28], and (iii) providing simple user interaction for incremental specification of complex program behaviour....
[...]
References
9,439 citations
"Diagnosis of Embedded Software Usin..." refers background in this paper
...As an example, the Jaccard similarity coefficient (see also [8]) expresses the similarity sj of column j and the error vector as the number of positions in which these vectors share an entry 1 (i....
[...]
4,695 citations
"Diagnosis of Embedded Software Usin..." refers methods in this paper
...As defined in [3], we use the following terminology....
[...]
4,335 citations
2,199 citations
"Diagnosis of Embedded Software Usin..." refers background in this paper
..., [6]), where a diagnosis is obtained by logical inference from a formal model of the system, combined with a set of run-time observations....
[...]
Frequently Asked Questions (13)
Q2. What is the role of the software in the television set?
All audio and video processing is implemented in hardware, but the software is responsible for tasks such as decoding remote control input, displaying the on-screen menu, and coordinating the hardware (e.g., optimizing parameters for audio and video processing based on an analysis of the signals).
Q3. Why are the program spectra transmitted on a low-priority thread?
The transmission is done on a low-priority thread while the CPU is otherwise idle, in order to minimize the impact on the timing behavior.
Q4. How many times is the body of the conditional statement executed?
To sort their example array, three exchanges must be made, and block 4, the body of the conditional statement, is executed three times.
Q5. How many binary flags did the authors obtain for the load problem?
For diagnosing the load problem the authors obtained hit spectra for the logical threads mentioned in Section 4.1, resulting in spectra of 315 binary flags.
Q6. How many megabytes of RAM are available in consumer sets?
The total available RAM in consumer sets is two megabytes, but in the special developer version that the authors used for their experiments, another two megabytes were available.
Q7. What did the results for the lock-up problem show?
Especially the results for the lock-up problem have convinced the authors that program spectra, and their application to fault diagnosis, are a viable technique and a useful tool in the area of embedded software in consumer electronics.
Q8. How many counters can be calculated on the fly?
If an error detection mechanism is available, as in their experiments with the lock-up problem, then these four counters can be calculated on the fly, and the memory requirements become linear in the number of columns in the matrix of Figure 2.
Q9. How many lines of code does the control software consist of?
The software itself consists of approximately 450K lines of C code, which is configured from a much larger (several MLOC) code base of Koala software components [12].
Q10. Why are the design and implementation of embedded software complicated?
Their design and implementation are complicated by factors that can largely be abstracted away from in other software systems, such as deadlock prevention, and timing constraints involved in, e.g., writing to the graphics display only in those fractions of a second that the screen is not being refreshed.
Q11. What is the useful tool for obtaining program spectra?
Profiling tools such as gcov are convenient for obtaining program spectra, but they are typically not available in a development environment for embedded software.
Q12. What is the CPU load in the TV set?
A known problem with the specific version of the control software that the authors had access to, is that after teletext viewing, the CPU load when watching television (TV mode) is approximately 10% higher than before teletext viewing.
Q13. What is the CPU load in the teletext viewing?
The CPU load clearly increases around the 60th sample, when the teletext viewing starts, but never returns to its initial level after sample 90, when the authors switch back to TV mode.