Book ChapterDOI

Applying Jlint to Space Exploration Software

11 Jan 2004-Vol. 2937, pp 297-308
TL;DR: The results show that a few analysis techniques are sufficient to avoid almost all false positives in the multi-threading warnings, and these techniques include investigating all possible callers and a few code idioms.
Abstract: Java is a very successful programming language which is also becoming widespread in embedded systems, where software correctness is critical. Jlint is a simple but highly efficient static analyzer that checks a Java program for several common errors, such as null pointer exceptions, and overflow errors. It also includes checks for multi-threading problems, such as deadlocks and data races. The case study described here shows the effectiveness of Jlint in finding certain faults, including multi-threading problems. Analyzing the reasons for false positives in the multi-threading warnings gives an insight into design patterns commonly used in multi-threaded code. The results show that a few analysis techniques are sufficient to avoid almost all false positives. These techniques include investigating all possible callers and a few code idioms. Verifying the correct application of these patterns is still crucial, because their correct usage is not trivial.

Summary (2 min read)

1 Introduction

  • Therefore an automated tool which can detect faults easily, preferably early in the lifecycle of software, can be very useful to find defects.
  • Among similar tools geared towards Java, it is one of the most suitable with respect to ease of use (no annotations required) and free availability (the tool is Open Source) [1].

1.3 Outline

  • This text is organized as follows: Section 2 describes Jlint and how it was used for this project.
  • Sections 3 and 4 show the results of applying Jlint to space exploration program code.
  • Design patterns which are common among these two projects are analyzed in Section 5.
  • Section 6 summarizes the results and concludes.

2.1 Tool description

  • Typical warnings about possible faults issued by Jlint are null pointer dereferences, array bounds overflows, and value overflows.
  • The latter may occur if one multiplies two 32-bit integer values without converting them to 64 bits first.
  • Many warnings that Jlint issues are code guidelines.
  • Jlint also includes many analyses for multi-threaded programs.
  • Some of Jlint's warnings for multi-threaded programs are overly cautious.
  • Possible data race warnings for method calls or variable accesses do not necessarily imply a data race.
  • The reasons for such false positives are both difficulties inherent to static analysis, such as pointer aliasing across method calls, and limitations in Jlint itself, where its algorithms could be refined with known techniques.

2.2 Warning review process

  • During the review process, Jlint's warnings were categorized to see whether they refer to the same problem.
  • Such situations constitute calls to the same method from different callers, the same variable used in different contexts, or the same design pattern applied throughout the class.
  • In a separate count, counting the number of distinct problems rather than individual warnings, all such cases were counted once.
  • Furthermore, the time required for this process was recorded.
  • Note that the review activity was often interrupted by other activities such as writing this paper.

3.2 Jlint evaluation

  • The 30 deadlock warnings all referred to the same two classes.
  • There were two sets of warnings, the first set containing ten, the second one 20 warnings.
  • The first ten warnings, all of them false positives, showed incomplete loops in the call graph.
  • Another lock was used that makes a deadlock possible.
  • Therefore these warnings referred to actual faults in the code.

Results:

  • While reviewing the multi-threading warnings was time-consuming due to the complex interactions in the code, it was feasible and helped to highlight the critical parts of the source code.
  • The effort was justifiable for a project of this complexity.

3.3 Comparison to other projects

  • The eleven new bugs found by Jlint were a great success, even considering that the seven deadlocks correspond to two classes where other deadlocks have been known to occur.
  • Jlint reported different methods than those reported in other analyses.

4 DS1

  • The second case study consisted of an attitude control system and a fault protection system for the Deep Space 1 (DS1) space craft.
  • For the DS1 code base, it took 0.17 seconds to check the entire code base on the same PowerPC G4 with a clock frequency of 500 MHz.

4.1 Description of DS1

  • DS1 was a technology-testing mission, which was launched October 24 1998, and which ended its primary mission in September 1999.
  • DS1 contained and tested twelve new kinds of space-travel technologies, for example, ion propulsion and artificial intelligence for autonomous control.
  • The attitude-control system monitors and controls the space craft's attitude, that is, its position in 3-dimensional space.
  • The fault-protection system monitors the operation of the space craft and initiates corrective actions in case errors occur.
  • The original C code was re-designed in Java, using best practices in object-oriented design.

4.2 Jlint evaluation

  • This study indicates that four design patterns prevail in cases where code is apparently not thread-safe: synchronization of all callers, use of read-only values, thread-local copies of data, and the use of thread-safe container classes.
  • Some of the data race warnings for the Rover code pointed out cases where it was attempted to use the read-only pattern, but the use was not carried out consistently throughout the project.
  • Such a small mistake violates the property that guarantees thread-safety.
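
Three of the four patterns named above (read-only values, thread-local copies, and thread-safe containers) can be illustrated with a minimal sketch. All names in it (`Waypoint`, `PatternDemo`, `collect`) are hypothetical and are not taken from the Rover or DS1 code; the sketch only assumes the standard `java.util.Collections.synchronizedList` wrapper as one possible thread-safe container.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Read-only pattern: every field is final and assigned once in the constructor,
// so instances can be shared between threads without locking.
final class Waypoint {
    private final double x;
    private final double y;

    Waypoint(double x, double y) {
        this.x = x;
        this.y = y;
    }

    double x() { return x; }
    double y() { return y; }
}

// Hypothetical driver, not part of the analyzed projects.
final class PatternDemo {
    /** Starts the given number of threads and returns how many results were stored. */
    static int collect(int threads, int perThread) {
        final Waypoint shared = new Waypoint(1.0, 2.0);
        // Thread-safe container: each add() is synchronized internally.
        final List<Double> results = Collections.synchronizedList(new ArrayList<>());

        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                // Thread-local copy: a local variable is visible to this thread only.
                double localSum = shared.x() + shared.y();
                for (int j = 0; j < perThread; j++) {
                    results.add(localSum + j);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return results.size();
    }
}
```

If one of the `final` modifiers in `Waypoint` were dropped and a setter added in a single place, the read-only guarantee would no longer hold, which is the kind of inconsistent use the summary describes.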

6 Conclusions

  • An analysis of the false positives showed that in apparently thread-unsafe code, four common design patterns ensure thread-safety in all cases.
  • Static analysis tools should therefore be extended with specific algorithms geared towards these patterns to reduce false positives.
  • Furthermore, these patterns were not always applied correctly and are still a significant source of programming errors.
  • This calls for tools that verify the correct application of these patterns, thereby pointing out even more subtle errors than previously possible.


Applying Jlint to Space Exploration Software
Cyrille Artho¹ and Klaus Havelund²
¹ Computer Systems Institute, ETH Zurich, Switzerland
² Kestrel Technology, NASA Ames Research Center, Moffett Field, California, USA
Abstract. Java is a very successful programming language which is also becoming widespread in embedded systems, where software correctness is critical. Jlint is a simple but highly efficient static analyzer that checks a Java program for several common errors, such as null pointer exceptions, and overflow errors. It also includes checks for multi-threading problems, such as deadlocks and data races. The case study described here shows the effectiveness of Jlint in finding certain faults, including multi-threading problems. Analyzing the reasons for false positives in the multi-threading warnings gives an insight into design patterns commonly used in multi-threaded code. The results show that a few analysis techniques are sufficient to avoid almost all false positives. These techniques include investigating all possible callers and a few code idioms. Verifying the correct application of these patterns is still crucial, because their correct usage is not trivial.
1 Introduction
Java is becoming more widespread in the area of embedded systems, either as a scaled-down “Micro Edition” [20] or by having real-time extensions [6, 5]. In such systems, software failures are very costly, because the software cannot always be replaced on a running system, and failures may have expensive or even catastrophic consequences. These costs are obviously prohibitively high when a software-related problem causes the failure of a space craft [14]. Therefore an automated tool which can detect faults easily, preferably early in the lifecycle of software, can be very useful to find defects. One tool that allows fault detection easily, even in incomplete systems, is Jlint. Among similar tools geared towards Java, it is one of the most suitable with respect to ease of use (no annotations required) and free availability (the tool is Open Source) [1].
1.1 The Java programming language
Java is a modern, object-oriented programming language that has had a large success in the past few years. It was one of the first languages where the source code was not compiled to machine code, but to a different form, the bytecode. This bytecode runs in a dedicated environment, the virtual machine. In order to guarantee the integrity of the system, each class file containing bytecode is checked prior to execution [11, 19, 21].
The Java language allows each object to have any number of fields, which are attributes of each object. These may be static, i.e., shared among all instances of a certain class, or dynamic, i.e., each instance has its own fields. In contrast to that, local variables are thread-local and only visible within one method.
Java allows inheritance: a method of a given class may be overridden by a method of the same name in a subclass. Similarly, fields in a subclass shadow those with the same name in the superclass. In general, these mechanisms work well for small code examples but are dangerous in larger projects. Methods overriding other methods must ensure they do not violate invariants of the superclass. Similar problems occur with variable shadowing. The programmer is not always aware that a variable with the same name already exists on a different level, such as the superclass.
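The difference between overriding and shadowing can be seen in a minimal sketch. The classes `Base`, `Derived`, and `ShadowingDemo` are hypothetical and not taken from the analyzed code; they only show that a method call is resolved by the run-time type while a field access is resolved by the static type of the reference.

```java
// Hypothetical classes illustrating method overriding versus field shadowing.
class Base {
    String name = "base";            // field, resolved by the static type

    String describe() {              // method, resolved by the run-time type
        return "Base";
    }
}

class Derived extends Base {
    String name = "derived";         // shadows Base.name, does not replace it

    @Override
    String describe() {              // overrides Base.describe
        return "Derived";
    }
}

final class ShadowingDemo {
    /** Reads the field through a reference of static type Base. */
    static String fieldViaBase(Base b) {
        return b.name;
    }

    /** Calls the method through a reference of static type Base. */
    static String methodViaBase(Base b) {
        return b.describe();
    }

    public static void main(String[] args) {
        Base b = new Derived();
        System.out.println(fieldViaBase(b));   // the field declared in Base is read
        System.out.println(methodViaBase(b));  // the method declared in Derived runs
    }
}
```

A `Derived` object therefore carries two fields called `name`, and which one is read depends on the declared type of the reference, which is easy to overlook in a large class hierarchy.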
In order to prevent incorrect programs from corrupting the system, Java’s virtual machine has various safety mechanisms built in. Each variable access is guarded against manipulating memory outside the allocated area. In particular, pointers must not be null when dereferenced, and array indices must be in a valid range. If these properties are violated, an exception is thrown indicating a programming error. This is a highly undesirable behavior in most cases. Ideally, such errors should be prevented by static analysis, rather than caught at run-time.
Furthermore, Java offers mechanisms to write multi-threaded programs. The two key mechanisms are locking primitives, using the synchronized keyword, and inter-thread synchronization with the wait and notify methods. A method or block which is declared synchronized is only entered after the exclusive lock for that critical section has been obtained. Lock usage for shared data is specified by the programmer. Incorrect lock usage using too many locks may lead to deadlocks. For example, if two threads each wait on a lock held by the other thread, neither thread can continue its execution. On the other hand, if a value is accessed with insufficient lock protection, data races may occur: two threads may access the same value concurrently, and the results of the operations are no longer deterministic.
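The data race described above can be made concrete with a small sketch. The classes `Counter` and `RaceDemo` are hypothetical; the example assumes nothing beyond the standard `synchronized` keyword and `java.lang.Thread`.

```java
// Hypothetical counter used to contrast unprotected and lock-protected access.
final class Counter {
    private int value;

    // No lock: the read-modify-write of value++ can interleave between threads,
    // so concurrent calls may lose updates (a data race).
    void incrementUnsafe() {
        value++;
    }

    // The object's lock is held for the whole read-modify-write sequence.
    synchronized void incrementSafe() {
        value++;
    }

    synchronized int get() {
        return value;
    }
}

final class RaceDemo {
    /** Runs incrementSafe() from several threads and returns the final count. */
    static int runSafe(int threads, int perThread) {
        final Counter counter = new Counter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.incrementSafe();
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }
}
```

With `incrementSafe` the final count always equals `threads * perThread`; replacing it with `incrementUnsafe` makes the result depend on the thread schedule, which is the non-determinism the text refers to.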
Java’s message passing mechanisms for threads are also a source of problems. A call to wait allows a thread to suspend until a condition becomes true, which must be signaled by notify by another thread. When calling wait, the calling thread must ensure that it owns the lock it waits on, and also release any other locks before the call. Otherwise, remaining locks held are unavailable to other threads, which may in turn block when trying to obtain them. This can prevent them from calling notify, which would allow the waiting thread to release its lock. This situation is also a deadlock.
1.2 Related work
Much effort has gone into fault-finding in Java programs, single-threaded and multi-threaded. The approaches can be separated into static checkers, which check a program at compile-time and try to approximate its run-time behavior, and dynamic checkers, which try to catch and analyze anomalies during program execution.
Several static analysis tools exist that examine a program for faults such as null pointer dereferences or data races. The ESC/Java [9] tool is, like Jlint, also based on static analysis, or more generally on theorem proving. It, however, requires annotation of the program. While it is more precise than Jlint, it is not nearly as fast and requires a large effort from the user to fully exploit the power of this tool [9].
Dynamic tools have the advantage of having more precise information available in the execution trace. The Eraser algorithm [22], which has been implemented in the Visual Threads tool [12] to analyze C and C++ programs, is an example of such an algorithm that examines a program execution trace for locking patterns and variable accesses in order to predict potential data races. It also checks for deadlocks and several other errors.
The Java PathExplorer tool (JPaX) [16] performs deadlock analysis and the Eraser data race analysis on Java programs. It furthermore recently has been extended with the high-level data race detection algorithm described in [3]. This algorithm analyzes how collections of variables are accessed by multiple threads.
More heavyweight dynamic approaches include model checking, which explores all possible schedules in a program. Recently, model checkers have been developed that apply directly to programs (instead of just models thereof). This includes the Java PathFinder system (JPF) developed by NASA [15, 24], and similar systems [10, 8, 17, 4, 23]. Such systems, however, suffer from the state space explosion problem. In [13] we describe an extension of Java PathFinder which performs data race analysis (and deadlock analysis) in simulation mode, whereafter the model checker is used to demonstrate whether the data race (deadlock) warnings are real or not.
This paper focuses on applying Jlint [2] to the software for detecting errors statically. Jlint uses static analysis and abstract interpretation to find difficult errors at compile-time. A similar case study with Jlint has been made before, applying it to large projects [2]. The difference to this case study is that the other case study had scalability in mind. Jlint had been applied to packages containing several hundred thousand lines of code, generating hundreds of warning messages. Because of this, the warnings had been evaluated selectively, omitting some hard-to-check deadlock warnings. In this case study, an effort was made to analyze every single warning and also see what kinds of design patterns cause false positives.
1.3 Outline
This text is organized as follows: Section 2 describes Jlint and how it was used for this project. Sections 3 and 4 show the results of applying Jlint to space exploration program code. Design patterns which are common among these two projects are analyzed in Section 5. Section 6 summarizes the results and concludes.
2 Jlint
2.1 Tool description
Jlint checks Java code and finds bugs, inconsistencies and synchronization problems by performing a data flow analysis, abstract interpretation, and building the lock graph. It issues warnings about potential problems. These warnings do not imply that an actual error exists. This makes Jlint unsound as a program prover. Moreover, Jlint can also miss errors, making it incomplete. The reason for this is that the goal was to make Jlint practical, scalable, and possible to implement in a short time.
Footnote (on the term “design patterns” used in Section 1.2): Design patterns commonly denote compositions of objects in software. In this paper, the notion of composition is different. It includes lock patterns and sometimes only applies to a small part of the program. In that context, we also use the term “code idiom”.
Typical warnings about possible faults issued by Jlint are null pointer dereferences, array bounds overflows, and value overflows. The latter may occur if one multiplies two 32-bit integer values without converting them to 64 bits first. Many warnings that Jlint issues are code guidelines: A local variable should never have the same name as a field of the same class or a superclass. When a method of a given name is overridden, all its variants should be overridden, in order to guarantee a consistent behavior of the subclass.
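The value overflow mentioned above can be reproduced in a few lines. The class `OverflowDemo` and its method names are hypothetical; the sketch only relies on standard Java integer arithmetic, where an `int` multiplication wraps around at 32 bits before the result is widened.

```java
// Hypothetical helper contrasting 32-bit and 64-bit multiplication.
final class OverflowDemo {
    // Both operands are int, so the product is computed in 32 bits and may
    // wrap around; only the already wrapped result is widened to long.
    static long multiplyNarrow(int a, int b) {
        return a * b;
    }

    // Casting one operand first makes the multiplication itself 64-bit.
    static long multiplyWide(int a, int b) {
        return (long) a * b;
    }

    public static void main(String[] args) {
        int a = 100_000;
        int b = 100_000;
        System.out.println(multiplyNarrow(a, b)); // prints 1410065408 (wrapped)
        System.out.println(multiplyWide(a, b));   // prints 10000000000
    }
}
```

The two methods differ only in where the conversion to 64 bits happens, which is why this kind of fault is easy to miss in a manual review and well suited to a static check.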
Jlint also includes many analyses for multi-threaded programs. Some of Jlint’s warnings for multi-threaded programs are overly cautious. For instance, possible data race warnings for method calls or variable accesses do not necessarily imply a data race. The reasons for such false positives are both difficulties inherent to static analysis, such as pointer aliasing across method calls, and limitations in Jlint itself, where its algorithms could be refined with known techniques.
2.2 Warning review process
Jlint gives fairly descriptive warnings for each problem found. The context given is limited to the class in which the error occurs, the line number, and fields used or methods called. This is always sufficient to find the source of simple warnings, which concern sequential properties such as null pointer dereferences. These warnings are easy to review and were considered in a first pass. The other warnings, concerning multi-threading problems, take much more time to consider, and were evaluated in a second phase.
The review process essentially checks whether the problems described in the warnings can actually occur at run-time. In simple cases, warnings may be ruled out given the algorithmic properties of the program. Complex cases include reviewing callers to the method in question.
Data race and deadlock warnings fall in this category. They require constructing a part of the call graph including locks owned by callers when a method is called. If it can be ensured that all calls to non-synchronized shared methods are made only through methods that already employ lock protection then there cannot be a data race.
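The caller-based argument above can be sketched as follows. The classes `Account` and `CallerSyncDemo` are hypothetical and not taken from the analyzed code. The private method `adjust` is not synchronized, yet every path that reaches it already holds the same lock, so a review of all callers rules out a data race.

```java
// Hypothetical class: the shared field is modified only in adjust(),
// and adjust() is reachable only through callers that hold `lock`.
final class Account {
    private final Object lock = new Object();
    private int balance;

    // Not synchronized. Looked at in isolation, this access to a shared field
    // appears unprotected; it is safe only because every caller holds `lock`.
    private void adjust(int delta) {
        balance += delta;
    }

    void deposit(int amount) {
        synchronized (lock) {
            adjust(amount);
        }
    }

    void withdraw(int amount) {
        synchronized (lock) {
            adjust(-amount);
        }
    }

    int balance() {
        synchronized (lock) {
            return balance;
        }
    }
}

final class CallerSyncDemo {
    /** Each thread deposits 2 and withdraws 1 per round; returns the final balance. */
    static int run(int threads, int rounds) {
        final Account account = new Account();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < rounds; j++) {
                    account.deposit(2);
                    account.withdraw(1);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return account.balance();
    }
}
```

The argument breaks as soon as one caller uses a different lock or none at all, so the check has to cover every call site and confirm that the lock object is the same in each of them.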
This review process can be rather time-consuming and took up to twelve minutes for one problem instance in the experiments carried out. Many warnings occur in similar contexts, so warnings referring to the same problem can usually be easily confirmed as duplicates. This part of the review process was not yet automated in any way but could be automated to a large extent with known techniques. Both case studies were done without prior knowledge of the program code. It can be assumed that the time to review the warnings is shorter for the author of the code, especially when reviewing data race or deadlock warnings.
Footnote (on “shared methods”): Methods that access a shared field are also considered “shared” in this context. The lock used for ensuring mutual exclusion must be the same lock for all calls.
During the review process, Jlint’s warnings were categorized to see whether they refer to the same problem. Such situations constitute calls to the same method from different callers, the same variable used in different contexts, or the same design pattern applied throughout the class. In a separate count, counting the number of distinct problems rather than individual warnings, all such cases were counted once. Furthermore, the time required for this process was recorded. Note that the review activity was often interrupted by other activities such as writing this paper. We believe this reduced the overall time required because manual code reviews require much attention, and cannot be carried out in one run without a degradation of the concentration required.
3 First case study: Rover code
The first case study is a software module, called the Executive, for controlling the movement of the planetary wheeled rover K9, developed at NASA Ames Research Center. The run time for analyzing the code with Jlint was 0.10 seconds on a PowerPC G4 with a clock frequency of 500 MHz.
3.1 Description of the Rover project
K9 is a hardware platform for experimenting with rover technology for exploration of the Martian surface. The Executive is a software module for controlling the rover, and is essentially an interpreter of plans, where a plan is a special form of a program. Plans are constructed from high-level constructs, such as sequential composition and conditionals, but no while loops. The effect of while loops is achieved by assuming that plans are generated on the fly during rover operation as environment conditions change. The lowest level nodes of a plan are tasks to be directly executed by the rover hardware. A node in a plan can be further constrained by a set of conditions, which when failing during execution, cause the Executive to abort the execution of the subsequent sibling nodes, unless specified otherwise through options. Examples of conditions are pre-conditions and post-conditions, as well as invariants to be maintained during the execution of the node. The examined Executive consists of 7,300 lines of Java code. This code was extracted by a colleague from the original rover code, written in 35,000 lines of C++.
The code is highly multi-threaded, and hence poses a risk of concurrency errors. The Java version of the code was extracted as part of a different project, the purpose of which was to compare various formal methods, such as model checking, static analysis, runtime analysis, and simple testing [7]. The code contained a number of seeded errors.
3.2 Jlint evaluation
Jlint issues 249 warnings when checking the Rover code. Table 1 summarizes Jlint’s output. The first two columns show each type of problem and how many warnings Jlint generated for it. The third, fourth and fifth columns show the result of the manual source code analysis: how many actual, distinct faults, or at least serious problems, in the code were found, how many warnings described such actual faults, and how many were considered to be false positives. The last column shows the time spent on

Citations
Proceedings ArticleDOI
12 Jun 2008
TL;DR: The design of Parfait is presented, a static layered program analysis framework for bug checking, designed for scalability and precision by improving false positive rates and scale to millions of lines of code.
Abstract: We present the design of Parfait, a static layered program analysis framework for bug checking, designed for scalability and precision by improving false positive rates and scale to millions of lines of code. The Parfait framework is inherently parallelizable and makes use of demand driven analyses. In this paper we provide an example of several layers of analyses for buffer overflow, summarize our initial implementation for C, and provide preliminary results. Results are quantified in terms of correctly-reported, false positive and false negative rates against the NIST SAMATE synthetic benchmarks for C code.

53 citations


Cites methods from "Applying Jlint to Space Exploration..."

  • ...Tools that support both timing and state, and input validation and representation bugs include: ESC [8], a Modula-3 and Java checker that uses a theorem prover (Simplify) to reason about the semantics of language constructs, driven by annotations in the code; Coverity [10, 11], a C, C++ and Java checker based on "may belief" analysis; Jlint [1, 2], a checker of Java class files that is based on data flow and abstract interpretation; ......

    [...]

  • ...The program initializes two buffers: the stack buffer buf is initialized to the "AAA...A" string with a trailing C end-of-string character, and the heap buffer buf2 is initialized to the input data provided as the third parameter to the program (argv[2]), after allocating data from the heap of size equal to the second parameter (argv[1])....

    [...]

  • ...In the example there are two user inputs: the length of the data (argv[1]) and the string of data (argv[2])....

    [...]

Proceedings ArticleDOI
24 Mar 2009
TL;DR: The Software Quality Assessment and Visualization Toolset (SQuAVisiT), a flexible tool for visual software analytics that allows for integration of multiple programming languages and variety of analysis and visualization tools, is presented.
Abstract: We present the Software Quality Assessment and Visualization Toolset (SQuAVisiT), a flexible tool for visual software analytics. Visual software analytics supports analytical reasoning about software systems facilitated by interactive visual interfaces. In particular, SQuAVisiT assists software developers, maintainers and assessors in performing quality assurance and maintenance tasks. Flexibility of SQuAVisiT allows for integration of multiple programming languages and variety of analysis and visualization tools. SQuAVisiT has been successfully applied in a number of case studies, ranging from hundreds to thousands KLOC, from homogeneous to heterogeneous systems.

26 citations


Cites background from "Applying Jlint to Space Exploration..."

  • ...Remainder of this paper is organized as follows....

    [...]

Proceedings ArticleDOI
09 Nov 2008
TL;DR: A non-null annotations inferencer for the Java bytecode language and a substantial improvement in the precision is shown and, despite being a whole-program analysis, production applications can be analyzed within minutes.
Abstract: We present a non-null annotations inferencer for the Java bytecode language. We previously proposed an analysis to infer non-null annotations and proved its soundness and completeness with respect to a state of the art type system. This paper proposes extensions to our former analysis in order to deal with the Java bytecode language. We have implemented both analyses and compared their behaviour on several benchmarks. The results show a substantial improvement in the precision and, despite being a whole-program analysis, production applications can be analyzed within minutes.

18 citations

Journal ArticleDOI
TL;DR: This work proposes a novel approach for generating random benchmarks for evaluating program analysis and testing tools and compilers that uses stochastic parse trees, where language grammar production rules are assigned probabilities that specify the frequencies with which instantiations of these rules will appear in the generated programs.
Abstract: Benchmarks are heavily used in different areas of computer science to evaluate algorithms and tools. In program analysis and testing, open-source and commercial programs are routinely used as benchmarks to evaluate different aspects of algorithms and tools. Unfortunately, many of these programs are written by programmers who introduce different biases, not to mention that it is very difficult to find programs that can serve as benchmarks with high reproducibility of results. We propose a novel approach for generating random benchmarks for evaluating program analysis and testing tools and compilers. Our approach uses stochastic parse trees, where language grammar production rules are assigned probabilities that specify the frequencies with which instantiations of these rules will appear in the generated programs. We implemented our tool for Java and applied it to generate a set of large benchmark programs of up to 5Mlines of code each with which we evaluated different program analysis and testing tools and compilers. The generated benchmarks let us independently rediscover several issues in the evaluated tools. Copyright © 2014 John Wiley & Sons, Ltd.

13 citations


Cites methods from "Applying Jlint to Space Exploration..."

  • ...Like FindBugs, JLint [41] applies syntactic bug patterns and dataflow analysis on AUT bytecode, but it is not easy to expand [21]....


    [...]

DissertationDOI
01 Jan 2005
TL;DR: A new kind of generic analysis has been implemented in the JNuke framework presented here, which can utilize the same algorithm in both a static and dynamic setting by abstracting differences between the two scenarios into a corresponding environment.
Abstract: Multi-threaded programming gives rise to errors that do not occur in sequential programs. Such errors are hard to find using traditional testing. In this context, verification of the locking and data access discipline of a program is very promising, as it finds many kinds of errors quickly, without requiring a user-defined specification. Run-time verification utilizes such rules in order to detect possible failures, which do not have to actually occur in a given program execution. A common such failure is a data race, which results from inadequate synchronization between threads during access to shared data. Data races do not always result in a visible failure and are thus hard to detect. Traditional data races denote direct accesses to shared data. In addition to this, a new kind of high-level data race is introduced, where accesses to sets of data are not protected consistently. Such inconsistencies can lead to further failures that cannot be detected by other algorithms. Finally, data races leave other errors untouched which concern atomicity. Atomicity relates to sequences of actions that have to be executed atomically, with no other thread changing the global program state such that the outcome differs from serial execution. A data-flow-based approach is presented here, which detects stale values, where local copies of data are outdated. The latter property can be analyzed efficiently using static analysis. In order to allow for comparison between static and dynamic analysis, a new kind of generic analysis has been implemented in the JNuke framework presented here. This generic analysis can utilize the same algorithm in both a static and dynamic setting. By abstracting differences between the two scenarios into a corresponding environment, common structures such as analysis logics and context can be used twofold. The architecture and other implementation aspects of JNuke are also described in this work.

13 citations

References
Proceedings ArticleDOI
16 May 1999
TL;DR: This paper describes a verification method that requires little or no specialized knowledge in model construction and allows us to extract models mechanically from the source of software applications, securing accuracy.
Abstract: Formal verification methods are used only sparingly in software development. The most successful methods to date are based on the use of model checking tools. To use such tools, the user must first define a faithful abstraction of the application (the model), specify how the application interacts with its environment, and then formulate the properties that it should satisfy. Each step in this process can become an obstacle. To complete the verification process successfully often requires specialized knowledge of verification techniques and a considerable investment of time. In this paper we describe a verification method that requires little or no specialized knowledge in model construction. It allows us to extract models mechanically from the source of software applications, securing accuracy. Interface definitions and property specifications have meaningful defaults that can be adjusted when the checking process becomes more refined. All checks can be executed mechanically, even when the application itself continues to evolve. Compared to conventional software testing, the thoroughness of a check of this type is unprecedented.

133 citations

01 Sep 2001
TL;DR: The Java PathExplorer (JPaX) as mentioned in this paper is a tool for monitoring the execution of Java programs, which can be used during program testing to gain increased information about program executions, and can potentially furthermore be applied during operation to survey safety critical systems.
Abstract: We present recent work on the development of Java PathExplorer (JPaX), a tool for monitoring the execution of Java programs. JPaX can be used during program testing to gain increased information about program executions, and can potentially furthermore be applied during operation to survey safety critical systems. The tool facilitates automated instrumentation of a program's byte code, which will then emit events to an observer during its execution. The observer checks the events against user provided high-level requirement specifications, for example temporal logic formulae, and against lower level error detection procedures, usually concurrency related such as deadlock and data race algorithms. High level requirement specifications together with their underlying logics are defined in rewriting logic using Maude, and then can either be directly checked using Maude rewriting engine, or be first translated to efficient data structures and then checked in Java.

114 citations

Book ChapterDOI
30 Aug 2000
TL;DR: This paper describes the automatic runtime checking for multithreaded applications incorporated in Visual Threads, Compaq's runtime debugging and analysis tool for multi-threaded applications.
Abstract: Multithreaded applications are notoriously difficult to design and build while avoiding defects. Many of Compaq’s customers need to employ threads to implement high-performance, scalable applications that address their needs in business and science. In order to ensure their success using threads, Compaq provides a runtime debugging and analysis tool for multithreaded applications called Visual Threads. This paper describes the automatic runtime checking for multithreaded applications incorporated in Visual Threads.

102 citations

Proceedings ArticleDOI
Cyrille Artho, Armin Biere
27 Aug 2001
TL;DR: Applying Jlint2 to various large software packages, including commercial packages from Trilogy, found 12 faults, two of which related to multi-threading, and the statistical analysis proves that these extensions are relevant and useful.
Abstract: Static analysis is a tremendous help when trying to find faults in complex software. Writing multi-threaded programs is difficult, because the thread scheduling increases the program state space exponentially, and an incorrect thread synchronization produces faults that are hard to find. Program checkers have become sophisticated enough to find faults in real, large-scale software. In particular, Jlint, a very fast Java program checker, can check packages in a highly automated manner. The original version, Jlint1, still lacked full support for synchronization statements in Java. We extended Jlint1's model to include synchronizations on arbitrary objects, and named our version Jlint2. Our statistical analysis proves that these extensions are relevant and useful. Applying Jlint2 to various large software packages, including commercial packages from Trilogy, found 12 faults, two of which related to multi-threading.

99 citations

Book
07 Aug 2013
TL;DR: The study consisted of a controlled experiment where three technologies were compared to traditional testing with respect to their ability to find seeded errors in a prototype Mars Rover controller and confirmed the belief that advanced tools can outperform testing when trying to locate concurrency errors.
Abstract: We report on a study to determine the maturity of different verification and validation technologies (V&V) applied to a representative example of NASA flight software. The study consisted of a controlled experiment where three technologies (static analysis, runtime analysis and model checking) were compared to traditional testing with respect to their ability to find seeded errors in a prototype Mars Rover controller. What makes this study unique is that it is the first (to the best of our knowledge) controlled experiment to compare formal methods based tools to testing on a realistic industrial-size example, where the emphasis was on collecting as much data on the performance of the tools and the participants as possible. The paper includes a description of the Rover code that was analyzed, the tools used, as well as a detailed description of the experimental setup and the results. Due to the complexity of setting up the experiment, our results cannot be generalized, but we believe it can still serve as a valuable point of reference for future studies of this kind. It confirmed our belief that advanced tools can outperform testing when trying to locate concurrency errors. Furthermore, the results of the experiment inspired a novel framework for testing the next generation of the Rover.

94 citations