Applying Jlint to Space Exploration Software

Cyrille Artho, +1 more
- Vol. 2937, pp 297-308
Abstract
Java is a very successful programming language which is also becoming widespread in embedded systems, where software correctness is critical. Jlint is a simple but highly efficient static analyzer that checks a Java program for several common errors, such as null pointer exceptions, and overflow errors. It also includes checks for multi-threading problems, such as deadlocks and data races. The case study described here shows the effectiveness of Jlint in finding certain faults, including multi-threading problems. Analyzing the reasons for false positives in the multi-threading warnings gives an insight into design patterns commonly used in multi-threaded code. The results show that a few analysis techniques are sufficient to avoid almost all false positives. These techniques include investigating all possible callers and a few code idioms. Verifying the correct application of these patterns is still crucial, because their correct usage is not trivial.


Cyrille Artho¹ and Klaus Havelund²
¹ Computer Systems Institute, ETH Zurich, Switzerland
² Kestrel Technology, NASA Ames Research Center, Moffett Field, California USA
1 Introduction
Java is becoming more widespread in the area of embedded systems, either as a scaled-down “Micro Edition” [20] or with real-time extensions [6,5]. In such systems, software failures are very costly, because the software cannot always be replaced on a running system, and failures may have expensive or even catastrophic consequences. These costs are obviously prohibitively high when a software-related problem causes the failure of a spacecraft [14]. Therefore an automated tool which can detect faults easily, preferably early in the lifecycle of software, can be very useful to find defects. One tool that allows fault detection easily, even in incomplete systems, is Jlint. Among similar tools geared towards Java, it is one of the most suitable with respect to ease of use (no annotations required) and free availability (the tool is Open Source) [1].
1.1 The Java programming language
Java is a modern, object-oriented programming language that has had a large success in the past few years. It was one of the first languages where the source code was not compiled to machine code, but to a different form, the bytecode. This bytecode runs in a dedicated environment, the virtual machine. In order to guarantee the integrity of the system, each class file containing bytecode is checked prior to execution [11, 19, 21].
The Java language allows each object to have any number of fields, which are attributes of each object. These may be static, i.e., shared among all instances of a certain class, or dynamic, i.e., each instance has its own fields. In contrast to that, local variables are thread-local and only visible within one method.
Java allows inheritance: a method of a given class may be overridden by a method of the same name. Similarly, fields in a subclass shadow those with the same name in the superclass. In general, these mechanisms work well for small code examples but are dangerous in larger projects. Methods overriding other methods must ensure they do not violate invariants of the superclass. Similar problems occur with variable shadowing. The programmer is not always aware that a variable with the same name already exists on a different level, such as the superclass.
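The shadowing hazard described above can be illustrated with a minimal sketch. The classes `Base` and `Derived` and the field `limit` are hypothetical names chosen for illustration and do not come from the analyzed code.

```java
// Hypothetical classes for illustration only; not taken from the analyzed code.
class ShadowingDemo {
    static class Base {
        int limit = 10;                      // field in the superclass
        int getLimit() { return limit; }     // always reads Base.limit
    }

    static class Derived extends Base {
        int limit = 99;                      // shadows Base.limit, does not replace it
    }

    // Field access is resolved by the static type of the reference.
    static int limitViaBaseReference() {
        Base b = new Derived();
        return b.limit;                      // reads Base.limit
    }

    static int limitViaDerivedReference() {
        Derived d = new Derived();
        return d.limit;                      // reads Derived.limit
    }

    // The inherited method is unaware of the shadowing field.
    static int limitViaInheritedMethod() {
        return new Derived().getLimit();     // reads Base.limit
    }

    public static void main(String[] args) {
        System.out.println(limitViaBaseReference());     // 10
        System.out.println(limitViaDerivedReference());  // 99
        System.out.println(limitViaInheritedMethod());   // 10
    }
}
```

The same object yields two different values for `limit` depending on how it is accessed, which is the kind of inconsistency a shadowing warning is meant to draw attention to.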
In order to prevent incorrect programs from corrupting the system, Java’s virtual machine has various safety mechanisms built in. Each variable access is guarded against manipulating memory outside the allocated area. In particular, pointers must not be null when dereferenced, and array indices must be in a valid range. If these properties are violated, an exception is thrown indicating a programming error. This is a highly undesirable behavior in most cases. Ideally, such errors should be prevented by static analysis, rather than caught at run-time.
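As a small illustration of these run-time checks, the sketch below triggers both exceptions on purpose and reports which one was thrown. The class and method names are hypothetical; in real code the goal is to rule out such accesses beforehand rather than to catch the exception.

```java
// Illustrative sketch; names are hypothetical.
class RuntimeCheckDemo {
    // Dereferences s; the virtual machine throws if s is null.
    static String describeLength(String s) {
        try {
            return "length " + s.length();
        } catch (NullPointerException e) {
            return "NullPointerException";
        }
    }

    // Reads values[index]; the virtual machine checks the index range.
    static String describeElement(int[] values, int index) {
        try {
            return "element " + values[index];
        } catch (ArrayIndexOutOfBoundsException e) {
            return "ArrayIndexOutOfBoundsException";
        }
    }

    public static void main(String[] args) {
        System.out.println(describeLength("rover"));             // length 5
        System.out.println(describeLength(null));                // NullPointerException
        System.out.println(describeElement(new int[3], 2));      // element 0
        System.out.println(describeElement(new int[3], 3));      // ArrayIndexOutOfBoundsException
    }
}
```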
Furthermore, Java offers mechanisms to write multi-threaded programs. The two key mechanisms are locking primitives, using the synchronized keyword, and inter-thread synchronization with the wait and notify methods. A method or block which is declared synchronized is only entered after the exclusive lock for that critical section has been obtained. Lock usage for shared data is specified by the programmer. Incorrect lock usage using too many locks may lead to deadlocks. For example, if two threads each wait on a lock held by the other thread, neither thread can continue its execution. On the other hand, if a value is accessed with insufficient lock protection, data races may occur: two threads may access the same value concurrently, and the results of the operations are no longer deterministic.
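A minimal sketch of lock protection for a shared value, assuming a hypothetical `Counter` class: because both accessors are synchronized on the same object, two threads incrementing concurrently always produce the same total. Removing the synchronized keyword from `increment` would permit lost updates, so the result would no longer be deterministic.

```java
// Illustrative sketch; Counter is a hypothetical class, not part of the analyzed code.
class LockingDemo {
    static class Counter {
        private int value;                            // shared data, guarded by this

        synchronized void increment() { value++; }    // critical section
        synchronized int get() { return value; }
    }

    // Runs two threads that each increment the shared counter perThread times.
    static int countWithTwoThreads(int perThread) {
        Counter counter = new Counter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.increment();
            }
        };
        Thread first = new Thread(work);
        Thread second = new Thread(work);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(countWithTwoThreads(100_000)); // 200000
    }
}
```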
Java’s message passing mechanisms for threads are also a source of problems. A call to wait allows a thread to suspend until a condition becomes true, which must be signaled by notify by another thread. When calling wait the calling thread must ensure that it owns the lock it waits on, and also release any other locks before the call. Otherwise, remaining locks held are unavailable to other threads, which may in turn block when trying to obtain them. This can prevent them from calling notify which would allow the waiting thread to release its lock. This situation is also a deadlock.
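The wait and notify protocol described above is commonly written as a guarded loop. The following sketch uses a hypothetical `Mailbox` class: the waiting thread owns the lock on the mailbox when it calls wait, holds no other locks, and re-checks its condition after every wake-up.

```java
// Illustrative sketch; Mailbox is a hypothetical class.
class WaitNotifyDemo {
    static class Mailbox {
        private String message;                  // guarded by this

        synchronized void put(String m) {
            message = m;
            notify();                            // wake up a thread waiting on this mailbox
        }

        synchronized String take() throws InterruptedException {
            while (message == null) {            // re-check the condition after each wake-up
                wait();                          // releases the lock on this while suspended
            }
            String m = message;
            message = null;
            return m;
        }
    }

    // Passes text from a producer thread to the calling thread through a mailbox.
    static String exchange(String text) {
        Mailbox box = new Mailbox();
        Thread producer = new Thread(() -> box.put(text));
        producer.start();
        try {
            String received = box.take();
            producer.join();
            return received;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(exchange("ping"));    // ping
    }
}
```

If `take` were called while the caller held a second lock that `put` also needed, the producer could never reach notify, which is the deadlock scenario the paragraph describes.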
1.2 Related work
Much effort has gone into fault-finding in Java programs, single-threaded and multi-threaded. The approaches can be separated into static checkers, which check a program at compile-time and try to approximate its run-time behavior, and dynamic checkers, which try to catch and analyze anomalies during program execution.
Several static analysis tools exist that examine a program for faults such as null pointer dereferences or data races. The ESC/Java [9] tool is, like Jlint, also based on static analysis, or more generally on theorem proving. It, however, requires annotation of the program. While it is more precise than Jlint, it is not nearly as fast and requires a large effort from the user to fully exploit the power of this tool [9].
Dynamic tools have the advantage of having more precise information available in the execution trace. The Eraser algorithm [22], which has been implemented in the Visual Threads tool [12] to analyze C and C++ programs, is an example of such an algorithm that examines a program execution trace for locking patterns and variable accesses in order to predict potential data races. It also checks for deadlocks and several other errors.
The Java PathExplorer tool (JPaX) [16] performs deadlock analysis and the Eraser data race analysis on Java programs. It has furthermore recently been extended with the high-level data race detection algorithm described in [3]. This algorithm analyzes how collections of variables are accessed by multiple threads.
More heavyweight dynamic approaches include model checking, which explores all possible schedules in a program. Recently, model checkers have been developed that apply directly to programs (instead of just models thereof). This includes the Java PathFinder system (JPF) developed by NASA [15,24], and similar systems [10, 8, 17, 4, 23]. Such systems, however, suffer from the state space explosion problem. In [13] we describe an extension of Java PathFinder which performs data race analysis (and deadlock analysis) in simulation mode, whereafter the model checker is used to demonstrate whether the data race (deadlock) warnings are real or not.
This paper focuses on applying Jlint [2] to the software for detecting errors statically. Jlint uses static analysis and abstract interpretation to find difficult errors at compile-time. A similar case study with Jlint has been made before, applying it to large projects [2]. The difference to this case study is that the other case study had scalability in mind. Jlint had been applied to packages containing several hundred thousand lines of code, generating hundreds of warning messages. Because of this, the warnings had been evaluated selectively, omitting some hard-to-check deadlock warnings. In this case study, an effort was made to analyze every single warning and also see what kinds of design patterns cause false positives.¹
1.3 Outline
This text is organized as follows: Section 2 describes Jlint and how it was used for this project. Sections 3 and 4 show the results of applying Jlint to space exploration program code. Design patterns which are common among these two projects are analyzed in Section 5. Section 6 summarizes the results and concludes.
2 Jlint
2.1 Tool description
Jlint checks Java code and finds bugs, inconsistencies and synchronization problems by performing a data flow analysis, abstract interpretation, and building the lock graph. It issues warnings about potential problems. These warnings do not imply that an actual error exists. This makes Jlint unsound as a program prover. Moreover, Jlint can also miss errors, making it incomplete. The reason for this is that the goal was to make Jlint practical, scalable, and possible to implement in a short time.
¹ Design patterns commonly denote compositions of objects in software. In this paper, the notion of composition is different. It includes lock patterns and sometimes only applies to a small part of the program. In that context, we also use the term “code idiom”.
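The idea behind a lock graph can be sketched as follows. This is a simplified illustration under stated assumptions, not Jlint's implementation: an edge from lock A to lock B records that B is acquired while A is held, and a cycle in the graph indicates a potential deadlock. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Simplified illustration of a lock graph; not Jlint's actual data structure.
class LockGraphDemo {
    // edges.get(a) contains b if lock b is acquired while lock a is held.
    private final Map<String, Set<String>> edges = new HashMap<>();

    void acquiredWhileHolding(String held, String acquired) {
        edges.computeIfAbsent(held, k -> new HashSet<>()).add(acquired);
        edges.computeIfAbsent(acquired, k -> new HashSet<>());
    }

    // Depth-first search; reaching a node that is already on the current path means a cycle.
    boolean hasCycle() {
        Set<String> done = new HashSet<>();
        Set<String> onPath = new HashSet<>();
        for (String node : edges.keySet()) {
            if (visit(node, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    private boolean visit(String node, Set<String> done, Set<String> onPath) {
        if (onPath.contains(node)) return true;
        if (done.contains(node)) return false;
        onPath.add(node);
        for (String next : edges.get(node)) {
            if (visit(next, done, onPath)) return true;
        }
        onPath.remove(node);
        done.add(node);
        return false;
    }

    // Locks always taken in the order A, B, C: no cycle.
    static boolean consistentOrderHasCycle() {
        LockGraphDemo graph = new LockGraphDemo();
        graph.acquiredWhileHolding("A", "B");
        graph.acquiredWhileHolding("B", "C");
        return graph.hasCycle();
    }

    // One code path takes A then B, another takes B then A: cycle.
    static boolean oppositeOrderHasCycle() {
        LockGraphDemo graph = new LockGraphDemo();
        graph.acquiredWhileHolding("A", "B");
        graph.acquiredWhileHolding("B", "A");
        return graph.hasCycle();
    }

    public static void main(String[] args) {
        System.out.println(consistentOrderHasCycle()); // false
        System.out.println(oppositeOrderHasCycle());   // true
    }
}
```

A cycle only shows that a deadlock is possible under some schedule, which is one reason why warnings derived from such a graph still need review.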
Typical warnings about possible faults issued by Jlint are null pointer dereferences, array bounds overflows, and value overflows. The latter may occur if one multiplies two 32 bit integer values without converting them to 64 bit first.
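The value overflow case can be reproduced in a few lines. In the sketch below (hypothetical method names), the first method multiplies in 32 bit arithmetic and widens the already truncated result, while the second converts one operand to 64 bit before multiplying.

```java
// Illustrative sketch; method names are hypothetical.
class OverflowDemo {
    // The product is computed in 32 bit arithmetic and wraps around before it is widened.
    static long multiplyAsInt(int a, int b) {
        return a * b;
    }

    // Converting one operand first makes the whole multiplication 64 bit.
    static long multiplyAsLong(int a, int b) {
        return (long) a * b;
    }

    public static void main(String[] args) {
        System.out.println(multiplyAsInt(100_000, 100_000));   // 1410065408
        System.out.println(multiplyAsLong(100_000, 100_000));  // 10000000000
    }
}
```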
Many warnings that Jlint issues are code guidelines: A local variable should never have the same name as a field of the same class or a superclass. When a method of a given name is overridden, all its variants should be overridden, in order to guarantee a consistent behavior of the subclass.
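Both guidelines can be illustrated with a short sketch. All class and method names below are hypothetical. The first class hides a field behind a local variable of the same name; the second pair overrides only one of two overloaded variants, so the subclass behaves inconsistently.

```java
// Illustrative sketch; all names are hypothetical.
class GuidelineDemo {
    static class Sensor {
        int reading = 5;

        int update(int raw) {
            int reading = raw * 2;   // local variable hides the field; the field is not updated
            return reading;
        }
    }

    static class Logger {
        String log(int code) { return "base:" + code; }
        String log(String text) { return "base:" + text; }
    }

    static class QuietLogger extends Logger {
        @Override
        String log(int code) { return "quiet"; }   // only one of the two variants is overridden
    }

    static int fieldAfterUpdate() {
        Sensor sensor = new Sensor();
        sensor.update(21);
        return sensor.reading;       // still the initial value
    }

    public static void main(String[] args) {
        System.out.println(fieldAfterUpdate());              // 5
        Logger logger = new QuietLogger();
        System.out.println(logger.log(7));                   // quiet
        System.out.println(logger.log("7"));                 // base:7
    }
}
```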
Jlint also includes many analyses for multi-threaded programs. Some of Jlint’s warnings for multi-threaded programs are overly cautious. For instance, possible data race warnings for method calls or variable accesses do not necessarily imply a data race. The reasons for such false positives are both difficulties inherent to static analysis, such as pointer aliasing across method calls, and limitations in Jlint itself, where its algorithms could be refined with known techniques.
2.2 Warning review process
Jlint gives fairly descriptive warnings for each problem found. The context given is limited to the class in which the error occurs, the line number, and fields used or methods called. This is always sufficient to find the source of simple warnings, which concern sequential properties such as null pointer dereferences. These warnings are easy to review and were considered in a first pass. The other warnings, concerning multi-threading problems, take much more time to consider, and were evaluated in a second phase.
The review process essentially checks whether the problems described in the warnings can actually occur at run-time. In simple cases, warnings may be ruled out given the algorithmic properties of the program. Complex cases include reviewing callers to the method in question.
Data race and deadlock warnings fall in this category. They require constructing a part of the call graph including locks owned by callers when a method is called. If it can be ensured that all calls to non-synchronized shared methods are made only through methods that already employ lock protection then there cannot be a data race.²
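The caller check described above can be sketched as a set intersection over call sites. This is an illustration of the manual review criterion under stated assumptions, not an algorithm taken from Jlint: each hypothetical `CallSite` records the locks a caller holds when it invokes a non-synchronized shared method, and the method counts as protected only if at least one lock is held at every call site.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustration of the review criterion; CallSite and all lock names are hypothetical.
class CallerReviewDemo {
    static final class CallSite {
        final String caller;
        final Set<String> locksHeld;

        CallSite(String caller, String... locksHeld) {
            this.caller = caller;
            this.locksHeld = new HashSet<>(Arrays.asList(locksHeld));
        }
    }

    // Locks that are held at every call site of the shared method.
    static Set<String> commonLocks(List<CallSite> sites) {
        Set<String> common = null;
        for (CallSite site : sites) {
            if (common == null) {
                common = new HashSet<>(site.locksHeld);
            } else {
                common.retainAll(site.locksHeld);
            }
        }
        return common == null ? new HashSet<>() : common;
    }

    // True if all known callers hold one and the same lock; false for an empty list.
    static boolean protectedByOneLock(List<CallSite> sites) {
        return !commonLocks(sites).isEmpty();
    }

    static boolean allCallersLocked() {
        return protectedByOneLock(Arrays.asList(
                new CallSite("run", "queueLock"),
                new CallSite("stop", "queueLock", "stateLock")));
    }

    static boolean oneCallerUnlocked() {
        return protectedByOneLock(Arrays.asList(
                new CallSite("run", "queueLock"),
                new CallSite("reset")));
    }

    public static void main(String[] args) {
        System.out.println(allCallersLocked());   // true
        System.out.println(oneCallerUnlocked());  // false
    }
}
```

Two callers that each hold some lock, but not the same one, also fail the check, because their lock sets have an empty intersection.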
This review process can be rather time-consuming and took up to twelve minutes for one problem instance in the experiments carried out. Many warnings occur in similar contexts, so warnings referring to the same problem can usually be easily confirmed as duplicates. This part of the review process was not yet automated in any way but could be automated to a large extent with known techniques. Both case studies were done without prior knowledge of the program code. It can be assumed that the time to review the warnings is shorter for the author of the code, especially when reviewing data race or deadlock warnings.
² Methods that access a shared field are also considered “shared” in this context. The lock used for ensuring mutual exclusion must be the same lock for all calls.
During the review process, Jlint’s warnings were categorized to see whether they refer to the same problem. Such situations constitute calls to the same method from different callers, the same variable used in different contexts, or the same design pattern applied throughout the class. In a separate count, counting the number of distinct problems rather than individual warnings, all such cases were counted once. Furthermore, the time required for this process was recorded. Note that the review activity was often interrupted by other activities such as writing this paper. We believe this reduced the overall time required because manual code reviews require much attention, and cannot be carried out in one run without a degradation of the concentration required.
3 First case study: Rover code
The first case study is a software module, called the Executive, for controlling the movement of the planetary wheeled rover K9, developed at NASA Ames Research Center. The run time for analyzing the code with Jlint was 0.10 seconds on a PowerPC G4 with a clock frequency of 500 MHz.
3.1 Description of the Rover project
K9 is a hardware platform for experimenting with rover technology for exploration of the Martian surface. The Executive is a software module for controlling the rover, and is essentially an interpreter of plans, where a plan is a special form of a program. Plans are constructed from high-level constructs, such as sequential composition and conditionals, but no while loops. The effect of while loops is achieved by assuming that plans are generated on the fly during rover operation as environment conditions change. The lowest level nodes of a plan are tasks to be directly executed by the rover hardware. A node in a plan can be further constrained by a set of conditions, which when failing during execution, cause the Executive to abort the execution of the subsequent sibling nodes, unless specified otherwise through options. Examples of conditions are pre-conditions and post-conditions, as well as invariants to be maintained during the execution of the node. The examined Executive consists of 7,300 lines of Java code. This code was extracted by a colleague from the original rover code, written in 35,000 lines of C++.
The code is highly multi-threaded, and hence provides a risk for concurrency errors. The Java version of the code was extracted as part of a different project, the purpose of which was to compare various formal methods, such as model checking, static analysis, runtime analysis, and simple testing [7]. The code contained a number of seeded errors.
3.2 Jlint evaluation
Jlint issues 249 warnings when checking the Rover code. Table 1 summarizes Jlint’s output. The first two columns show each type of problem and how many warnings Jlint generated for it. The third, fourth and fifth columns show the result of the manual source code analysis: how many actual, distinct faults, or at least serious problems, in the code were found, how many warnings described such actual faults, and how many were considered to be false positives. The last column shows the time spent on
References
Book ChapterDOI

A Temporal Logic of Nested Calls and Returns

TL;DR: This work introduces a temporal logic of calls and returns (CaRet) for specification and algorithmic verification of correctness requirements of structured programs and presents a tableau construction that reduces the model checking problem to the emptiness problem for a Buchi pushdown system.
Book

The Java Virtual Machine Specification

Tim Lindholm, +1 more
TL;DR: In this article, the authors present a detailed overview of the Java Virtual Machine, including the internal structure of the class file format, the internal form of Fully Qualified Class and Interface names, and the implementation of new class instances.
Journal ArticleDOI

Eraser: a dynamic data race detector for multithreaded programs

TL;DR: A new tool, called Eraser, is described, for dynamically detecting data races in lock-based multithreaded programs, which uses binary rewriting techniques to monitor every shared-memory reference and verify that consistent locking behavior is observed.
Journal ArticleDOI

Model checking programs

TL;DR: A verification and testing environment for Java, called Java PathFinder (JPF), which integrates model checking, program analysis and testing, and uses state compression to handle big states and partial order and symmetry reduction, slicing, abstraction, and runtime analysis techniques to reduce the state space.
Proceedings ArticleDOI

Eraser: a dynamic data race detector for multi-threaded programs

TL;DR: Eraser as mentioned in this paper uses binary rewriting techniques to monitor every shared memory reference and verify that consistent locking behavior is observed in lock-based multi-threaded programs, which can be used to detect data races.