Proceedings Article

Diagnosis of Embedded Software Using Program Spectra

Q: What have the authors contributed in "Diagnosis of embedded software using program spectra∗" ?

In this paper the authors discuss the application of a specific automated debugging technique, namely software fault localization through the analysis of program spectra, in the area of embedded software in high-volume consumer electronics products. The authors discuss why the technique is particularly well suited for this application domain, and through experiments on an industrial test case they demonstrate that it can lead to highly accurate diagnoses of realistic errors.

Q: How many times is the body of the conditional statement executed?

To sort their example array, three exchanges must be made, and block 4, the body of the conditional statement, is executed three times.

Q: How many lines of code does the control software consist of?

The software itself consists of approximately 450K lines of C code, which is configured from a much larger (several MLOC) code base of Koala software components [12].

Q: What complicates the design and implementation of embedded software?

Their design and implementation are complicated by factors that can largely be abstracted away from in other software systems, such as deadlock prevention, and timing constraints involved in, e.g., writing to the graphics display only in those fractions of a second that the screen is not being refreshed.

Peter Zoeteweij¹, Rui Abreu¹, Rob Golsteijn², Arjan J.C. van Gemund¹

Delft University of Technology¹, NXP Semiconductors²

26 Mar 2007-pp 213-220

TL;DR: This paper discusses the application of a specific automated debugging technique, namely software fault localization through the analysis of program spectra, in the area of embedded software in high-volume consumer electronics products, and demonstrates that it can lead to highly accurate diagnoses of realistic errors.


Abstract: Automated diagnosis of errors detected during software testing can improve the efficiency of the debugging process, and can thus help to make software more reliable. In this paper we discuss the application of a specific automated debugging technique, namely software fault localization through the analysis of program spectra, in the area of embedded software in high-volume consumer electronics products. We discuss why the technique is particularly well suited for this application domain, and through experiments on an industrial test case we demonstrate that it can lead to highly accurate diagnoses of realistic errors.


Summary

1 Introduction

Software reliability can generally be improved through extensive testing and debugging, but this is often in conflict with market conditions: software cannot be tested exhaustively, and of the bugs that are found, only those with the highest impact on the user-perceived reliability can be solved before the release.
Testing reveals more bugs than can be solved, and debugging is a bottleneck for improving reliability.
Locating a fault is an important step in actually solving it, and program spectra have successfully been applied for this purpose in several tools focusing on various application domains, such as Pinpoint [4], which focuses on large, dynamic on-line transaction processing systems, AMPLE [5], which focuses on object-oriented software, and Tarantula [9], which focuses on C programs.
The remainder of this paper is organized as follows.
In Section 2 the authors explain the diagnosis technique in more detail, and in Section 3 they discuss its applicability to embedded software in consumer electronics products.

2.1 Failures, Errors, and Faults

As defined in [3], the authors use the following terminology.
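A failure is an event that occurs when delivered service deviates from correct service. An error is the part of the total state of the system that may cause a failure. A fault is the cause of an error in the system.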
To illustrate these concepts, consider the C function in Figure 1.
A failure occurs when applying RationalSort yields anything other than a sorted version of its input.
In a software context, faults are often called bugs, and diagnosis is part of debugging.

2.2 Program Spectra

A program spectrum [11] is a collection of data that provides a specific view on the dynamic behavior of software.
As an example, a block count spectrum tells how often each block of code is executed during a run of a program.
A block of code is a C language statement, where the authors do not distinguish between the individual statements of a compound statement, but where they do distinguish between the cases of a switch statement.
Block 5, the RationalGT function body, is executed six times: once for every iteration of the inner loop.
Besides block count/hit spectra, many other forms of program spectra exist.

2.3 Fault Diagnosis

The hit spectra of M runs constitute a binary matrix, whose columns correspond to N different parts of the program (see Figure 2).
In their case, these parts are blocks of C code, a slightly different notion than a basic block, which is a block of code that has no branch.
In some of the runs an error is detected; this information constitutes another column vector, the error vector, which corresponds to a hypothetical part of the program that is responsible for all observed errors.
In the field of data clustering, resemblances between vectors of binary, nominally scaled data, such as the columns in their matrix of program spectra, are quantified by means of similarity coefficients (see, e.g., [8]).
I3 is not sorted, but the denominators in this sequence happen to be equal, in which case no error occurs.

3 Relevance to Embedded Software

The effectiveness of the diagnosis technique described in the previous section has already been demonstrated in several articles (see, e.g., [1], [4], [9]).
Especially because of constraints imposed by the market, the conditions under which this software is developed are somewhat different from those for other software products.
As a black-box diagnosis technique, it can be applied without any additional modeling effort, which matters because concurrent systems are difficult to model.
The technique also improves insight into the run-time behavior.
Profiling tools such as gcov are convenient for obtaining program spectra, but they are typically not available in a development environment for embedded software.

4.1 Platform

The subject of their experiments is the control software in a particular product line of analog television sets.
All audio and video processing is implemented in hardware, but the software is responsible for tasks such as decoding remote control input, displaying the on-screen menu, and coordinating the hardware (e.g., optimizing parameters for audio and video processing based on an analysis of the signals).
Most teletext functionality is also implemented in software.
Essentially, the run-time environment consists of several threads with increasing priorities, and for synchronization purposes, the work on these threads is organized in 315 logical threads inside the various components.
The total available RAM in consumer sets is two megabytes, but in the special developer version that the authors used for their experiments, another two megabytes were available.

4.2 Faults

The authors diagnosed two faults, one existing, and one that was seeded to reproduce an error from a different product line.
The CPU load clearly increases around the 60th sample, when the teletext viewing starts, but never returns to its initial level after sample 90, when the authors switch back to TV mode.
An existing fault in this functionality entails that searching in a page without visible content locks up the teletext system.
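A likely cause for the lock-up is an inconsistency in the values of two state variables in different components, for which only specific combinations are allowed.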
The authors hardcoded a remote control key-sequence that injects this error on their test platform.

4.3 Implementation

The authors wrote a small Koala component for recording and storing program spectra, and for transmitting them off the television set via the serial connection.
The transmission is done on a low-priority thread while the CPU is otherwise idle, in order to minimize the impact on the timing behavior.
Pending their transmission via the serial connection, their component caches program spectra in the extra memory available in their developer version of the hardware.
For diagnosing the load problem the authors obtained hit spectra for the logical threads mentioned in Section 4.1, resulting in spectra of 315 binary flags.
For the lock-up problem, the authors define a transaction as the computation in between two key-presses on the remote control.

4.4 Diagnosis

For the load problem the authors used the scenario of Figure 3.
The authors marked the last 60 spectra, for the second period of TV mode, as ‘failed,’ and those of earlier transactions as ‘passed.’
In the resulting ranking, the logical thread that the developers had identified as the actual cause of the load problem was in the second position out of 315. In the first position was a logical thread related to teletext, whose activation is part of the problem, so in this case the authors conclude that although the diagnosis is not perfect, the implied suggestion for investigating the problem is quite useful.
For the lock-up problem, the authors used a proper error detection mechanism.
On each key-press, when caching the current spectrum, a separate routine verifies the values of the two state variables, and marks the current spectrum as failed if they assume an invalid combination.

5 Discussion

Especially the results for the lock-up problem have convinced the authors that program spectra, and their application to fault diagnosis, are a viable technique and useful tool in the area of embedded software in consumer electronics.
There are a number of issues with their implementation.
Because of its rigorous design, the TV is still functioning properly, but everything runs much slower with the block-level instrumentation (e.g., changing channels now takes seconds).
In their case the authors could store 25 spectra of 65,536 counters, which was already slowing down the scenarios with more than that number of transactions, but even with a more memory-efficient implementation, this inevitably becomes a problem with, for example, overnight testing.
If an error detection mechanism is available, like in their experiments with the lock-up problem, then the four counters a_pq(j) used by the similarity coefficient of Section 2.3 can be calculated on the fly, and the memory requirements become linear in the number of columns in the matrix of Figure 2.

7 Conclusion

The authors applied software fault localization through the analysis of program spectra to a large-scale industrial test case in the area of embedded software in consumer electronics devices.
In addition to confirming established effectiveness results, their experiments indicate that the technique lends itself well to application in the resource-constrained environments that are typical for the development of embedded software.
While their current experiments focus on development-time debugging, they open corridors to further applications, such as run-time recovery by rebooting only those parts of a system whose activities correlate with detected errors.


Diagnosis of Embedded Software using Program Spectra∗

Peter Zoeteweij, Rui Abreu, Arjan J.C. van Gemund
Embedded Software Lab, Delft University of Technology, The Netherlands
{p.zoeteweij,r.f.abreu,a.j.c.vangemund}@tudelft.nl

Rob Golsteijn
Innovation Center Eindhoven, NXP Semiconductors, The Netherlands
rob.golsteijn@nxp.com

Abstract

Automated diagnosis of errors detected during software testing can improve the efficiency of the debugging process, and can thus help to make software more reliable. In this paper we discuss the application of a specific automated debugging technique, namely software fault localization through the analysis of program spectra, in the area of embedded software in high-volume consumer electronics products. We discuss why the technique is particularly well suited for this application domain, and through experiments on an industrial test case we demonstrate that it can lead to highly accurate diagnoses of realistic errors.

Keywords: diagnosis, program spectra, automated debugging, embedded systems, consumer electronics.

1 Introduction

Software reliability can generally be improved through extensive testing and debugging, but this is often in conflict with market conditions: software cannot be tested exhaustively, and of the bugs that are found, only those with the highest impact on the user-perceived reliability can be solved before the release. In this typical scenario, testing reveals more bugs than can be solved, and debugging is a bottleneck for improving reliability. Automated debugging techniques can help to reduce this bottleneck.

The subject of this paper is a particular automated debugging technique, namely software fault localization through the analysis of program spectra [11]. These can be seen as projections of execution traces that indicate which parts of a program were active during various runs of that program. The diagnosis consists of analyzing the extent to which the activity of specific parts correlates with errors detected in the different runs.

∗ This work has been carried out as part of the TRADER project under the responsibility of the Embedded Systems Institute. This project is partially supported by the Netherlands Ministry of Economic Affairs under the BSIK03021 program.

Locating a fault is an important step in actually solving it, and program spectra have successfully been applied for this purpose in several tools focusing on various application domains, such as Pinpoint [4], which focuses on large, dynamic on-line transaction processing systems, AMPLE [5], which focuses on object-oriented software, and Tarantula [9], which focuses on C programs.

In this paper, we discuss the applicability of the technique to embedded software, and specifically to embedded software in high-volume consumer electronics products. Software has become an important factor in the development, marketing, and user-perception of these products, and the typical combination of limited computing resources, complex systems, and tight development deadlines makes the technique a particularly attractive means for improving product reliability.

To support our argument, we report the outcome of two experiments, where we diagnosed two different errors occurring in the control software of a particular product line of television sets from a well-known international consumer electronics manufacturer. In both experiments, the technique is able to locate the (known) faults that cause these errors quite well, and in one case, this implies an accuracy of a single statement in approximately 450K lines of code.

The remainder of this paper is organized as follows. In Section 2 we explain the diagnosis technique in more detail, and in Section 3 we discuss its applicability to embedded software in consumer electronics products. In Section 4 we describe our experiments, and in Section 5 we discuss how our current implementation can be improved. In Section 6 we discuss related work. We conclude in Section 7.

2 Preliminaries

In this section we introduce program spectra, and describe how they are used for diagnosing software faults. First we introduce the necessary terminology.

    void RationalSort(int n, int *num, int *den)
    {                                              /* block 1 */
        int i,j,temp;
        for ( i=n-1; i>=0; i-- ) {                 /* block 2 */
            for ( j=0; j<i; j++ ) {                /* block 3 */
                if (RationalGT(num[j], den[j],
                               num[j+1], den[j+1])) {   /* block 4 */
                    temp = num[j];
                    num[j] = num[j+1];
                    num[j+1] = temp; } } }
    }

Figure 1. A faulty C function for sorting rational numbers

2.1 Failures, Errors, and Faults

As defined in [3], we use the following terminology.

• A failure is an event that occurs when delivered service deviates from correct service.
• An error is the part of the total state of the system that may cause a failure.
• A fault is the cause of an error in the system.

To illustrate these concepts, consider the C function in Figure 1. It is meant to sort, using the bubble sort algorithm, a sequence of n rational numbers whose numerators and denominators are passed via parameters num and den, respectively. There is a fault (bug) in the swapping code of block 4: only the numerators of the rational numbers are swapped. The denominators are left in their original order.

A failure occurs when applying RationalSort yields anything other than a sorted version of its input. An error occurs after the code inside the conditional statement is executed, while den[j] ≠ den[j+1]. Such errors can be temporary: if we apply RationalSort to the sequence ⟨…⟩, an error occurs after the first two numerators are swapped. However, this error is “canceled” by later swapping actions, and the sequence ends up being sorted correctly. Faults do not automatically lead to errors either: no error will occur if the input is already sorted, or if all denominators are equal.

The purpose of diagnosis is to locate the faults that are the root cause of detected errors. As such, error detection is a prerequisite for diagnosis. As a rudimentary form of error detection, failure detection can be used, but in software more powerful mechanisms are available, such as pointer checking, array bounds checking, deadlock detection, etc.

In a software context, faults are often called bugs, and diagnosis is part of debugging. Computer-aided techniques such as the one we consider here are known as automated debugging.
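The distinction between fault, error, and failure can be made concrete with a minimal sketch. The code below reproduces the faulty function of Figure 1 and adds a rudimentary failure detector that only checks whether the output is in non-decreasing order. The comparison function RationalGT is not shown in the paper, so its body here is an assumption (it presumes positive denominators), and the helper names is_sorted and sort_and_check are hypothetical.

```c
#include <stdbool.h>

/* Assumed implementation: a/b > c/d for positive denominators b and d. */
static bool RationalGT(int a, int b, int c, int d)
{
    return (long long)a * d > (long long)c * b;
}

/* The faulty sort of Figure 1: only the numerators are swapped. */
void RationalSort(int n, int *num, int *den)
{
    int i, j, temp;
    for (i = n - 1; i >= 0; i--) {
        for (j = 0; j < i; j++) {
            if (RationalGT(num[j], den[j], num[j + 1], den[j + 1])) {
                temp = num[j];          /* fault: den[j] and den[j+1] */
                num[j] = num[j + 1];    /* are left in place          */
                num[j + 1] = temp;
            }
        }
    }
}

/* Rudimentary failure detection: is the sequence in non-decreasing order? */
bool is_sorted(int n, const int *num, const int *den)
{
    for (int j = 0; j + 1 < n; j++)
        if (RationalGT(num[j], den[j], num[j + 1], den[j + 1]))
            return false;
    return true;
}

/* Sorts in place and returns true if a failure is detected. */
bool sort_and_check(int n, int *num, int *den)
{
    RationalSort(n, num, den);
    return !is_sorted(n, num, den);
}
```

With the illustrative input 4/1, 2/2, 0/1 (chosen for this sketch), the first swap produces an error that later swaps cancel, so no failure is detected. With 3/1, 2/2, 4/3, 1/4 the output is out of order and the check reports a failure. Note that an ordering check alone does not verify that the output is a permutation of the input.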

2.2 Program Spectra

A program spectrum [11] is a collection of data that provides a specific view on the dynamic behavior of software. This data is collected at run-time, and typically consists of a number of counters or flags for the different parts of a program. As such, recording a program spectrum is a light-weight analysis compared to other run-time methods, such as, e.g., dynamic slicing [10].

As an example, a block count spectrum tells how often each block of code is executed during a run of a program. In this paper, a block of code is a C language statement, where we do not distinguish between the individual statements of a compound statement, but where we do distinguish between the cases of a switch statement.¹ Suppose that the function RationalSort of Figure 1 is used to sort the sequence ⟨…⟩, which it happens to do correctly. This would result in the following block count spectrum, where block 5 refers to the body of the RationalGT function, which has not been shown in Figure 1.

    block   1  2  3  4  5
    count   1  4  6  3  6

Block 1, the body of the function RationalSort, is executed once. Blocks 2 and 3, the bodies of the two loops, are executed four and six times, respectively. To sort our example array, three exchanges must be made, and block 4, the body of the conditional statement, is executed three times. Block 5, the RationalGT function body, is executed six times: once for every iteration of the inner loop.
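A block count spectrum of this kind can be collected with a counter per block, incremented at the start of each block. The following is a minimal single-threaded sketch: the COUNT macro, the block_count array, and reset_spectrum are hypothetical names, and the body of RationalGT is again an assumption, since the paper does not show it.

```c
#include <stdbool.h>
#include <string.h>

#define NUM_BLOCKS 5

/* block_count[b - 1] holds the execution count of block b. */
unsigned block_count[NUM_BLOCKS];

#define COUNT(b) (block_count[(b) - 1]++)

/* Assumed implementation: a/b > c/d for positive denominators. */
static bool RationalGT(int a, int b, int c, int d)
{
    COUNT(5);
    return (long long)a * d > (long long)c * b;
}

/* Figure 1 with instrumentation; the fault in block 4 is kept. */
void RationalSort(int n, int *num, int *den)
{
    int i, j, temp;
    COUNT(1);
    for (i = n - 1; i >= 0; i--) {
        COUNT(2);
        for (j = 0; j < i; j++) {
            COUNT(3);
            if (RationalGT(num[j], den[j], num[j + 1], den[j + 1])) {
                COUNT(4);
                temp = num[j];
                num[j] = num[j + 1];
                num[j + 1] = temp;
            }
        }
    }
}

/* Clears the spectrum before the next run. */
void reset_spectrum(void)
{
    memset(block_count, 0, sizeof block_count);
}
```

For an illustrative four-element input that needs three exchanges and has equal denominators, such as 3/1, 2/1, 1/1, 4/1 (chosen for this sketch), the recorded counts for blocks 1 to 5 are 1, 4, 6, 3, and 6, which matches the spectrum in the table above.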

If we are only interested in whether a block is executed or not, we can use binary flags instead of counters. In this case, the block count spectra revert to block hit spectra. Besides block count/hit spectra, many other forms of program spectra exist. See [7] for an overview. In this paper we will work with block hit spectra, and hit spectra for logical threads used in the software of our test case (see Section 4.1).

2.3 Fault Diagnosis

The hit spectra of M runs constitute a binary matrix, whose columns correspond to N different parts of the program (see Figure 2). In our case, these parts are blocks of C code. In some of the runs an error is detected. This information constitutes another column vector, the error vector. This vector corresponds to a hypothetical part of the program that is responsible for all observed errors. Fault localization essentially consists of identifying the part whose column vector resembles the error vector most.

¹ This is a slightly different notion than a basic block, which is a block of code that has no branch.

Figure 2. The ingredients of fault diagnosis (an M × N binary matrix of spectra with entries x_ij, one row per spectrum and one column per part; an error vector with one entry e_i per spectrum; and a similarity coefficient s_j per part)

In the field of data clustering, resemblances between vectors of binary, nominally scaled data, such as the columns in our matrix of program spectra, are quantified by means of similarity coefficients (see, e.g., [8]). As an example, the Jaccard similarity coefficient (see also [8]) expresses the similarity $s_j$ of column $j$ and the error vector as the number of positions in which these vectors share an entry 1 (i.e., the block was exercised and the run has failed), divided by this same number plus the number of positions in which the vectors have different entries:

$$ s_j = \frac{a_{11}(j)}{a_{11}(j) + a_{01}(j) + a_{10}(j)} \qquad (1) $$

where $a_{pq}(j) = |\{\, i \mid x_{ij} = p \wedge e_i = q \,\}|$, and $p, q \in \{0, 1\}$.

Under the assumption that a high similarity to the error vector indicates a high probability that the corresponding parts of the software cause the detected errors, the calculated similarity coefficients rank the parts of the program with respect to their likelihood of containing the faults.

To illustrate the approach, suppose that we apply the RationalSort function to the input sequences I1 = ⟨⟩, I2 = ⟨…⟩, I3 = ⟨…⟩, I4 = ⟨…⟩, I5 = ⟨…⟩, and I6 = ⟨…⟩. I1, I2, and I6 are already sorted, and lead to passed runs. I3 is not sorted, but the denominators in this sequence happen to be equal, in which case no error occurs. I4 is the example from Section 2.1: it is not sorted, and an error occurs during its execution, but this error goes undetected. Only for I5 the program fails. The calculated result is ⟨…⟩ instead of ⟨…⟩, which is a clear indication that an error has occurred.

The block hit spectra for these runs are as follows (‘1’ denotes a hit), where block 5 corresponds to the body of the RationalGT function, which has not been shown in Figure 1.

            block
    input   1  2  3  4  5   error
    I1      1  0  0  0  0   0
    I2      1  1  0  0  0   0
    I3      1  1  1  1  1   0
    I4      1  1  1  1  1   0
    I5      1  1  1  1  1   1
    I6      1  1  1  0  1   0

For this data, the calculated Jaccard coefficients are $s_1 = 1/6$, $s_2 = 1/5$, $s_3 = 1/4$, $s_4 = 1/3$, and $s_5 = 1/4$, which (correctly) identifies block 4 as the most likely location of the fault.
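The calculation behind this ranking can be sketched as follows. The function computes the Jaccard coefficient of Equation (1) for every column of a row-major hit matrix and returns the index of the highest-ranked column. The function name jaccard_rank and the arrays example_x and example_e are hypothetical. The example data assumes six runs as described in the text, with all blocks hit and no error detected for both I3 and I4, which is what the description of those two inputs implies.

```c
#include <stddef.h>

/* Hit spectra of the six example runs, one row per run (assumed layout). */
static const unsigned char example_x[6 * 5] = {
    1, 0, 0, 0, 0,   /* I1 */
    1, 1, 0, 0, 0,   /* I2 */
    1, 1, 1, 1, 1,   /* I3 */
    1, 1, 1, 1, 1,   /* I4 */
    1, 1, 1, 1, 1,   /* I5 */
    1, 1, 1, 0, 1    /* I6 */
};

/* Error vector: only the run on I5 fails. */
static const unsigned char example_e[6] = { 0, 0, 0, 0, 1, 0 };

/*
 * Computes s[j] = a11 / (a11 + a01 + a10) for each of the n columns of the
 * m-by-n matrix x (row-major) against the error vector e of length m.
 * Returns the index of the column with the highest coefficient; ties are
 * resolved in favor of the lowest index.
 */
size_t jaccard_rank(size_t m, size_t n, const unsigned char *x,
                    const unsigned char *e, double *s)
{
    size_t best = 0;
    for (size_t j = 0; j < n; j++) {
        unsigned a11 = 0, a10 = 0, a01 = 0;
        for (size_t i = 0; i < m; i++) {
            int hit = x[i * n + j] != 0;
            int err = e[i] != 0;
            if (hit && err)
                a11++;          /* part active, run failed  */
            else if (hit)
                a10++;          /* part active, run passed  */
            else if (err)
                a01++;          /* part inactive, run failed */
        }
        unsigned denom = a11 + a10 + a01;
        s[j] = denom ? (double)a11 / denom : 0.0;
        if (s[j] > s[best])
            best = j;
    }
    return best;
}
```

Calling jaccard_rank(6, 5, example_x, example_e, s) on this data returns index 3, that is, block 4, with s[3] equal to 1/3.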

3 Relevance to Embedded Software

The effectiveness of the diagnosis technique described in the previous section has already been demonstrated in several articles (see, e.g., [1], [4], [9]). In this paper we present the benefits and discuss the issues specifically related to debugging embedded software in consumer electronics products. Especially because of constraints imposed by the market, the conditions under which this software is developed are somewhat different from those for other software products:

• To reduce unit costs, and often to ensure portability of the devices, the software runs on non-commodity hardware, and computing resources are limited.
• As a consequence, many facilities that developers of non-embedded software have come to rely on are absent, or are available only in rudimentary forms. Examples are profiling tools that give insight into the dynamic behavior of systems.
• At the same time, the systems are highly concurrent, and operate at a low level of abstraction from the hardware. Therefore, their design and implementation are complicated by factors that can largely be abstracted away from in other software systems, such as deadlock prevention, and timing constraints involved in, e.g., writing to the graphics display only in those fractions of a second that the screen is not being refreshed.
• On top of challenges that the entire software industry has to deal with, such as geographically distributed development organizations, the strong competition between manufacturers of consumer electronics makes it absolutely vital that release deadlines are met.
• Although important safety mechanisms, such as short-circuit detection, are sometimes implemented in software, for a large part of the functionality there are no personal risks involved in transient failures.

Consequently, it is not uncommon that consumer electronics products are shipped with several known software faults outstanding. To a certain extent, this also holds for other software products, but the combination of the complexity of the systems, the tight constraints imposed by the market, and the relatively low impact of the majority of possible system failures creates a unique situation. Instead of aiming for correctness, the goal is to create a product that is of value to customers, despite its imperfections, and to bring the reliability to a commercially acceptable level (also compared to the competition) before a product must be released.

The technique of Section 2 can help to reach this goal faster, and may thus reduce the time-to-market, and lead to more reliable products. Specific benefits are the following.

• As a black-box diagnosis technique, it can be applied without any additional modeling effort. This effort would be hard to justify under the market conditions described above. Moreover, concurrent systems are difficult to model.
• The technique improves insight into the run-time behavior. For embedded software in consumer electronics, this is often lacking, because of the concurrency, but also because of the decentralized development.
• We expect that the technique can easily be integrated with existing testing procedures, such as overnight playback of recorded usage scenarios. In addition to the information that errors have occurred in some scenarios, this gives a first indication of the parts of the software that are likely to be involved in these errors. In the large, geographically distributed development organizations that we are dealing with, it may also help to identify which teams of developers to contact.
• Last but not least, the technique is light-weight, which is relevant because of the non-commodity hardware and limited computing resources. All that is needed is some memory for storing program spectra, or for calculating the similarity coefficients on the fly (which reduces the space complexity from O(M × N) to O(N), see Section 5). Profiling tools such as gcov are convenient for obtaining program spectra, but they are typically not available in a development environment for embedded software. However, the same data can be obtained through source code instrumentation.

While none of these benefits are unique, their combination makes program spectrum analysis an attractive technique for diagnosing embedded software in consumer electronics.
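The O(N) variant mentioned in the last benefit can be sketched as follows, under the assumption that the pass/fail verdict of a run is known as soon as its spectrum is complete. Instead of storing the full M × N matrix, four counters a[j][p][q] are kept per part and updated after every run, from which the Jaccard coefficient can be computed at any time. The names N_PARTS, record_transaction, and jaccard are hypothetical, and the sketch ignores concurrency.

```c
#include <stdbool.h>
#include <stddef.h>

#define N_PARTS 5   /* hypothetical number of parts (columns) */

/* a[j][p][q]: number of runs in which part j had hit flag p and verdict q. */
static unsigned a[N_PARTS][2][2];

/* Folds one completed spectrum into the counters; memory use stays O(N). */
void record_transaction(const bool hit[N_PARTS], bool failed)
{
    for (size_t j = 0; j < N_PARTS; j++)
        a[j][hit[j] ? 1 : 0][failed ? 1 : 0]++;
}

/* Jaccard coefficient of part j, computed from the running counters. */
double jaccard(size_t j)
{
    unsigned denom = a[j][1][1] + a[j][0][1] + a[j][1][0];
    return denom ? (double)a[j][1][1] / denom : 0.0;
}
```

The design choice here is to keep all four counters rather than only the three that the Jaccard coefficient needs, so that other similarity coefficients over the same counters could be evaluated without recording the runs again.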

4 Experiments

In this section we describe our experience with applying the techniques of Section 2 to an industrial test case.

4.1 Platform

The subject of our experiments is the control software in a particular product line of analog television sets. All audio and video processing is implemented in hardware, but the software is responsible for tasks such as decoding remote control input, displaying the on-screen menu, and coordinating the hardware (e.g., optimizing parameters for audio and video processing based on an analysis of the signals). Most teletext² functionality is also implemented in software.

The software itself consists of approximately 450K lines of C code, which is configured from a much larger (several MLOC) code base of Koala software components [12].

The control processor is a MIPS running a small multi-tasking operating system. Essentially, the run-time environment consists of several threads with increasing priorities, and for synchronization purposes, the work on these threads is organized in 315 logical threads inside the various components. Threads are preempted when work arrives for a higher-priority thread.

The total available RAM in consumer sets is two megabytes, but in the special developer version that we used for our experiments, another two megabytes were available. In addition, the developer sets have a serial connection, and a debugger interface for manual debugging on a PC.

4.2 Faults

We diagnosed two faults, one existing, and one that was seeded to reproduce an error from a different product line.

Load Problem. A known problem with the specific version of the control software that we had access to is that after teletext viewing, the CPU load when watching television (TV mode) is approximately 10% higher than before teletext viewing. This is illustrated in Figure 3, which shows the CPU load for the following scenario: one minute TV mode, 30 s teletext viewing, and one minute of TV mode. The CPU load clearly increases around the 60th sample, when the teletext viewing starts, but never returns to its initial level after sample 90, when we switch back to TV mode.

Figure 3. CPU load measured per second (horizontal axis: sample, 0 to 160; vertical axis: load in %)

Teletext Lock-up Problem. Another product line of television sets provides a function for searching in teletext pages. An existing fault in this functionality entails that searching in a page without visible content locks up the teletext system. A likely cause for the lock-up is an inconsistency in the values of two state variables in different components, for which only specific combinations are allowed. We hardcoded a remote control key-sequence that injects this error on our test platform.

² A standard for broadcasting information (e.g., news, weather, TV guide) in text pages, very popular in Europe.

4.3 Implementation

We wrote a small Koala component for recording and storing program spectra, and for transmitting them off the television set via the serial connection. The transmission is done on a low-priority thread while the CPU is otherwise idle, in order to minimize the impact on the timing behavior. Pending their transmission via the serial connection, our component caches program spectra in the extra memory available in our developer version of the hardware.

For diagnosing the load problem we obtained hit spectra for the logical threads mentioned in Section 4.1, resulting in spectra of 315 binary flags. We approached the lock-up problem at a much finer granularity, and obtained block hit spectra for practically all blocks of code in the control software, resulting in spectra of over 60,000 flags.

The hit spectra for the logical threads are obtained by manually instrumenting a centralized scheduling mechanism. For the block hit spectra we automatically instrumented the entire source code using the Front [2] parser generator.

In Section 2.3 we use program spectra for different runs of the software, but for embedded software in consumer electronics, and indeed for most interactive systems, the concept of a run is not very useful. Therefore we record the spectra per transaction, instead of per run, and we use two different notions of a transaction for the two different faults that we diagnosed:

• for the load problem, we use a periodic notion of a transaction, and record the spectra per second.
• for the lock-up problem, we define a transaction as the computation in between two key-presses on the remote control.

4.4 Diagnosis

For the load problem we used the scenario of Figure 3. We marked the last 60 spectra, for the second period of TV mode, as ‘failed,’ and those of earlier transactions as ‘passed.’ In the ranking that follows from the analysis of Section 2.3, the logical thread that had been identified by the developers as the actual cause of the load problem was in the second position out of 315. In the first position was a logical thread related to teletext, whose activation is part of the problem. So although the diagnosis is not perfect, the implied suggestion for investigating the problem is quite useful.
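The ranking step can be sketched on toy data. Section 2.3 is not part of this excerpt, so the sketch assumes that components are ranked by the Jaccard similarity coefficient between each component's hit column and the error vector; the function name `jaccard_ranking` and the data are illustrative only.

```python
def jaccard_ranking(spectra, errors):
    """Rank components by the Jaccard similarity between their hit column
    and the error vector (1 = failed transaction, 0 = passed).
    Assumption: Jaccard is the coefficient used; swap in another if not."""
    num_components = len(spectra[0])
    scores = []
    for j in range(num_components):
        a11 = a10 = a01 = 0
        for row, failed in zip(spectra, errors):
            if row[j] and failed:
                a11 += 1   # hit in a failed transaction
            elif row[j]:
                a10 += 1   # hit in a passed transaction
            elif failed:
                a01 += 1   # not hit in a failed transaction
        denom = a11 + a10 + a01
        scores.append(a11 / denom if denom else 0.0)
    order = sorted(range(num_components), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy data: five transactions, three components, last two marked failed.
spectra = [
    [1, 0, 0],
    [1, 0, 1],
    [1, 0, 0],
    [1, 1, 1],
    [1, 1, 1],
]
errors = [0, 0, 0, 1, 1]
ranking, scores = jaccard_ranking(spectra, errors)
```

In this toy case component 1 is active only in failed transactions and ranks first, while component 0, which is active in every transaction, ranks last.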

For the lock-up problem, we used a proper error detection mechanism. On each key-press, when caching the current spectrum, a separate routine verifies the values of the two state variables, and marks the current spectrum as failed if they assume an invalid combination. Although this is a special-purpose mechanism, including and regularly checking high-level assert-like statements about correct behavior is a valid means to increase the error-awareness of systems.
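Such an assert-like check might look as follows. This is only a sketch: the excerpt does not name the two state variables or their valid combinations, so `VALID_COMBINATIONS`, the state names, and `label_spectrum` are all hypothetical.

```python
# Hypothetical state names: the excerpt does not name the two variables.
VALID_COMBINATIONS = {
    ("tv_mode", "teletext_idle"),
    ("teletext_mode", "teletext_active"),
}


def label_spectrum(spectrum, ui_state, teletext_state, log):
    """Store the spectrum of the transaction that just ended, together with
    a failed flag that is True when the two state variables disagree."""
    failed = (ui_state, teletext_state) not in VALID_COMBINATIONS
    log.append((list(spectrum), failed))
    return failed


log = []
label_spectrum([1, 0, 1], "tv_mode", "teletext_idle", log)        # consistent
label_spectrum([1, 1, 0], "teletext_mode", "teletext_idle", log)  # inconsistent
errors = [failed for _, failed in log]
```

The resulting `errors` list plays the role of the passed/failed labelling that had to be assigned by hand for the load problem.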

Using a very simple scenario of 23 key-presses that essentially (1) verifies that the TV and teletext subsystems function correctly, (2) triggers the error injection, and (3) checks that the teletext subsystem is no longer responding, we immediately got a good diagnosis of the detected error: the first two positions in the total ranking of over 60,000 blocks pointed directly to our error injection code. Adding another three key-presses to exonerate an uncovered branch in this code made the diagnosis perfect: the exact statement that introduced the state inconsistency was located out of approximately 450K lines of source code.

5 Discussion

The results for the lock-up problem in particular have convinced us that program spectra, and their application to fault diagnosis, are a viable technique and a useful tool in the area of embedded software in consumer electronics. However, there are a number of issues with our implementation.

First, we cannot claim that we have not altered the timing behavior of the system. Because of its rigorous design, the TV is still functioning properly, but everything runs much slower with the block-level instrumentation (e.g., changing channels now takes seconds). One reason is that currently, we collect block count spectra at byte resolution, and convert to block hit spectra off-line. Updating the counters in a multi-threaded environment requires a critical section for every executed block, which is hugely expensive. Fortunately, the count information is not used, and we believe we can implement a binary flag update without a critical section.
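The difference between the two kinds of update can be sketched as follows. The sketch is single-threaded and illustrative only: the helper names are hypothetical, and saturating the byte counter at 255 is an assumption, since the text does not say how counter overflow is handled. A plausible reason why the flag update needs no critical section is that every writer stores the same value, so the order of the stores does not matter, whereas the counter update is a read-modify-write that can lose increments when threads interleave.

```python
def bump_count(counts, block):
    # Read-modify-write: needs mutual exclusion when threads interleave.
    # Saturating at 255 (byte resolution) is an assumption of this sketch.
    counts[block] = min(counts[block] + 1, 255)


def set_flag(flags, block):
    # Idempotent store: every writer stores the same value.
    flags[block] = 1


def counts_to_hits(counts):
    """Off-line conversion: a block was hit if its counter is non-zero."""
    return [1 if c else 0 for c in counts]


counts = [0, 0, 0]
flags = [0, 0, 0]
for block, times in [(1, 3), (2, 300)]:
    for _ in range(times):
        bump_count(counts, block)
        set_flag(flags, block)

hits = counts_to_hits(counts)
```

Both routes end in the same hit spectrum, which is why dropping the counters loses nothing that the diagnosis uses.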

Second, we cache the spectra of passed transactions, and transmit them off the system during CPU idle time. Be-


Frequently Asked Questions (13)

Q1. What have the authors contributed in "Diagnosis of embedded software using program spectra∗" ?

In this paper the authors discuss the application of a specific automated debugging technique, namely software fault localization through the analysis of program spectra, in the area of embedded software in high-volume consumer electronics products. The authors discuss why the technique is particularly well suited for this application domain, and through experiments on an industrial test case they demonstrate that it can lead to highly accurate diagnoses of realistic errors.

Q2. What is the role of the software in determining the performance of a particular product?

All audio and video processing is implemented in hardware, but the software is responsible for tasks such as decoding remote control input, displaying the on-screen menu, and coordinating the hardware (e.g., optimizing parameters for audio and video processing based on an analysis of the signals).

Q3. What is the reason for the transmission of a program?

The transmission is done on a low-priority thread while the CPU is otherwise idle, in order to minimize the impact on the timing behavior.

Q4. How many times is the body of the conditional statement executed?

To sort their example array, three exchanges must be made, and block 4, the body of the conditional statement, is executed three times.

Q5. How many binary flags did the authors obtain for the load problem?

For diagnosing the load problem the authors obtained hit spectra for the logical threads mentioned in Section 4.1, resulting in spectra of 315 binary flags.

Q6. How many megabytes of RAM are available in consumer sets?

The total available RAM in consumer sets is two megabytes, but in the special developer version that the authors used for their experiments, another two megabytes were available.

Q7. What is the way to solve the lock-up problem?

Especially the results for the lock-up problem have convinced us that program spectra, and their application to fault diagnosis are a viable technique and useful tool in the area of embedded software in consumer electronics.

Q8. How many counters can be calculated on the fly?

If an error detection mechanism is available, like in their experiments with the lock-up problem, then these four counters can be calculated on the fly, and the memory requirements become linear in the number of columns in the matrix of Figure 2.

Q9. How many lines of code is used in the teletext2 program?

The software itself consists of approximately 450K lines of C code, which is configured from a much larger (several MLOC) code base of Koala software components [12].

Q10. What are the main reasons why the techniques are complicated?

Their design and implementation are complicated by factors that can largely be abstracted away from in other software systems, such as deadlock prevention, and timing constraints involved in, e.g., writing to the graphics display only in those fractions of a second that the screen is not being refreshed.

Q11. What is the useful tool for obtaining program spectra?

Profiling tools such as gcov are convenient for obtaining program spectra, but they are typically not available in a development environment for embedded software.

Q12. What is the CPU load in the TV set?

A known problem with the specific version of the control software that the authors had access to, is that after teletext viewing, the CPU load when watching television (TV mode) is approximately 10% higher than before teletext viewing.

Q13. What is the CPU load in the teletext viewing?

The CPU load clearly increases around the 60th sample, when the teletext viewing starts, but never returns to its initial level after sample 90, when the authors switch back to TV mode.
