Applying Jlint to Space Exploration Software

doi:10.1007/978-3-540-24622-0_24

Cyrille Artho¹ and Klaus Havelund²

¹ Computer Systems Institute, ETH Zurich, Switzerland

² Kestrel Technology, NASA Ames Research Center, Moffett Field, California USA

Abstract. Java is a very successful programming language which is also becoming widespread in embedded systems, where software correctness is critical. Jlint is a simple but highly efficient static analyzer that checks a Java program for several common errors, such as null pointer exceptions and overflow errors. It also includes checks for multi-threading problems, such as deadlocks and data races. The case study described here shows the effectiveness of Jlint in finding certain faults, including multi-threading problems. Analyzing the reasons for false positives in the multi-threading warnings gives an insight into design patterns commonly used in multi-threaded code. The results show that a few analysis techniques are sufficient to avoid almost all false positives. These techniques include investigating all possible callers and a few code idioms. Verifying the correct application of these patterns is still crucial, because their correct usage is not trivial.

1 Introduction

Java is becoming more widespread in the area of embedded systems, both as a scaled-down “Micro Edition” [20] and with real-time extensions [6,5]. In such systems, software failures are very costly, because the software cannot always be replaced on a running system, and failures may have expensive or even catastrophic consequences. These costs are obviously prohibitively high when a software-related problem causes the failure of a spacecraft [14]. Therefore an automated tool that can detect faults easily, preferably early in the software lifecycle, can be very useful. One tool that allows easy fault detection, even in incomplete systems, is Jlint. Among similar tools geared towards Java, it is one of the most suitable with respect to ease of use (no annotations required) and free availability (the tool is Open Source) [1].

1.1 The Java programming language

Java is a modern, object-oriented programming language that has had a large success in the past few years. It was one of the first languages where the source code was not compiled to machine code, but to a different form, the bytecode. This bytecode runs in a dedicated environment, the virtual machine. In order to guarantee the integrity of the system, each class file containing bytecode is checked prior to execution [11, 19, 21].

The Java language allows each object to have any number of fields, which are attributes of each object. These may be static, i.e., shared among all instances of a certain class, or dynamic, i.e., each instance has its own fields. In contrast to that, local variables are thread-local and only visible within one method.

Java allows inheritance: a method of a given class may be overridden in a subclass by a method of the same name. Similarly, fields in a subclass shadow those with the same name in the superclass. In general, these mechanisms work well for small code examples but are dangerous in larger projects. Methods overriding other methods must ensure they do not violate invariants of the superclass. Similar problems occur with variable shadowing. The programmer is not always aware that a variable with the same name already exists on a different level, such as the superclass.
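
The difference between overriding and shadowing can be made concrete with a small sketch. The classes `Sensor` and `LongRangeSensor` below are hypothetical names chosen for illustration; they do not come from the analyzed code. The sketch shows that a shadowed field is selected by the static type of the reference, while an overridden method is selected by the run-time type of the object.

```java
// Hypothetical classes for illustration only.
class Sensor {
    int range = 10;                        // field declared in the superclass

    int getRange() { return range; }       // always reads Sensor.range
    String describe() { return "sensor"; }
}

class LongRangeSensor extends Sensor {
    int range = 500;                       // shadows Sensor.range; both fields exist

    @Override
    String describe() { return "long-range sensor"; }  // overrides Sensor.describe
}

class ShadowingDemo {
    public static void main(String[] args) {
        LongRangeSensor lr = new LongRangeSensor();
        Sensor s = lr;                     // same object, static type Sensor

        System.out.println(lr.range);      // 500: field chosen by static type LongRangeSensor
        System.out.println(s.range);       // 10: field chosen by static type Sensor
        System.out.println(s.getRange());  // 10: inherited method reads Sensor.range
        System.out.println(s.describe());  // long-range sensor: dynamic dispatch
    }
}
```

A programmer who expects `s.range` to yield 500 has run into exactly the kind of confusion that a shadowing warning is meant to flag.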

In order to prevent incorrect programs from corrupting the system, Java’s virtual machine has various safety mechanisms built in. Each variable access is guarded against manipulating memory outside the allocated area. In particular, pointers must not be null when dereferenced, and array indices must be in a valid range. If these properties are violated, an exception is thrown indicating a programming error. This is a highly undesirable behavior in most cases. Ideally, such errors should be prevented by static analysis, rather than caught at run-time.

Furthermore, Java offers mechanisms to write multi-threaded programs. The two key mechanisms are locking primitives, using the synchronized keyword, and inter-thread synchronization with the wait and notify methods. A method or block which is declared synchronized is only entered after the exclusive lock for that critical section has been obtained. Lock usage for shared data is specified by the programmer. Incorrect lock usage using too many locks may lead to deadlocks. For example, if two threads each wait on a lock held by the other thread, both threads cannot continue their execution. On the other hand, if a value is accessed with insufficient lock protection, data races may occur: two threads may access the same value concurrently, and the results of the operations are no longer deterministic.
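
The deadlock scenario described above can be sketched as follows. The `Account` class and its methods are hypothetical and serve only as an illustration. `transferUnordered` shows the problematic nesting in which two threads may each hold one lock while waiting for the other; `transferOrdered` shows one common remedy, acquiring the locks in a fixed global order (here assumed to be the order of a numeric `id`).

```java
// Hypothetical account class; names are illustrative, not from the paper.
class Account {
    final int id;            // assumed unique; used only to define a global lock order
    private int balance;

    Account(int id, int balance) { this.id = id; this.balance = balance; }

    synchronized int balance() { return balance; }

    // Deadlock-prone: a.transferUnordered(b, ..) and b.transferUnordered(a, ..)
    // running concurrently can each hold one lock and wait for the other.
    void transferUnordered(Account to, int amount) {
        synchronized (this) {
            synchronized (to) {
                balance -= amount;
                to.balance += amount;
            }
        }
    }

    // Safer: every thread locks the account with the smaller id first.
    void transferOrdered(Account to, int amount) {
        Account first = this.id < to.id ? this : to;
        Account second = this.id < to.id ? to : this;
        synchronized (first) {
            synchronized (second) {
                balance -= amount;
                to.balance += amount;
            }
        }
    }
}

class LockOrderDemo {
    // Runs two threads that transfer in opposite directions and returns both balances.
    static int[] run() {
        Account a = new Account(1, 1000);
        Account b = new Account(2, 1000);
        Thread t1 = new Thread(() -> { for (int i = 0; i < 10000; i++) a.transferOrdered(b, 1); });
        Thread t2 = new Thread(() -> { for (int i = 0; i < 10000; i++) b.transferOrdered(a, 1); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return new int[] { a.balance(), b.balance() };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println(r[0] + " " + r[1]);  // 1000 1000
    }
}
```

Because both balances are only modified while both locks are held, the updates are also free of data races; removing the synchronized blocks would make the final balances nondeterministic.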

Java’s message passing mechanisms for threads are also a source of problems. A call to wait allows a thread to suspend until a condition becomes true, which must be signaled by notify by another thread. When calling wait, the calling thread must ensure that it owns the lock it waits on, and also release any other locks before the call. Otherwise, remaining locks held are unavailable to other threads, which may in turn block when trying to obtain them. This can prevent them from calling notify, which would allow the waiting thread to release its lock. This situation is also a deadlock.
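
A minimal sketch of the wait and notify idiom is given below, assuming a hypothetical one-slot `Mailbox` class. Each method owns the lock of the mailbox (via synchronized) when it calls wait, holds no other lock at that point, and re-checks its condition in a loop after waking up.

```java
// Hypothetical one-slot mailbox; a sketch of the wait/notify idiom.
class Mailbox {
    private Integer slot;    // null means the mailbox is empty

    synchronized void put(int value) throws InterruptedException {
        while (slot != null) {
            wait();          // releases the Mailbox lock while suspended
        }
        slot = value;
        notifyAll();         // wake up threads blocked in take()
    }

    synchronized int take() throws InterruptedException {
        while (slot == null) {
            wait();
        }
        int value = slot;
        slot = null;
        notifyAll();         // wake up threads blocked in put()
        return value;
    }
}

class MailboxDemo {
    // Sends the numbers 1..100 through the mailbox and returns their sum.
    static int run() {
        Mailbox box = new Mailbox();
        int[] sum = new int[1];
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < 100; i++) sum[0] += box.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();
        try {
            for (int i = 1; i <= 100; i++) box.put(i);
            consumer.join();   // join makes the consumer's writes visible here
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return sum[0];
    }

    public static void main(String[] args) {
        System.out.println(run());  // 5050
    }
}
```

If `put` were called from inside a block synchronized on some second lock, that lock would stay held during wait, which is the deadlock risk described above.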

1.2

Much effort has gone into fault-finding in Java programs, single-threaded and multi-threaded. The approaches can be separated into static checkers, which check a program at compile-time and try to approximate its run-time behavior, and dynamic checkers, which try to catch and analyze anomalies during program execution.

Several static analysis tools exist that examine a program for faults such as null pointer dereferences or data races. The ESC/Java [9] tool is, like Jlint, also based on static analysis, or more generally on theorem proving. It, however, requires annotation of the program. While it is more precise than Jlint, it is not nearly as fast and requires a large effort from the user to fully exploit the power of this tool [9].

Dynamic tools have the advantage of having more precise information available in the execution trace. The Eraser algorithm [22], which has been implemented in the Visual Threads tool [12] to analyze C and C++ programs, is an example of such an algorithm: it examines a program execution trace for locking patterns and variable accesses in order to predict potential data races. The Visual Threads tool also checks for deadlocks and several other errors.
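
The core idea of checking locking patterns against variable accesses can be sketched as a lock set refinement. The class below is a strongly simplified illustration in the spirit of Eraser, not the implementation used by Visual Threads or any other tool: it omits the handling of initialization and read-sharing, and it assumes that locks are numbered 0 to 63 so that a set of locks fits into one `long` bit mask. All names are hypothetical.

```java
// Simplified lock set refinement for ONE shared variable (illustrative only).
class LocksetSketch {
    private long candidates = ~0L;   // initially every lock is a candidate

    // Records one access by a thread holding the given set of locks.
    // Returns true while at least one lock has been held on every access so far.
    boolean access(long locksHeld) {
        candidates &= locksHeld;     // keep only the locks common to all accesses
        return candidates != 0;      // empty set: no lock consistently protects the variable
    }

    // Helper that builds a bit mask from lock numbers in the range 0..63.
    static long locks(int... ids) {
        long mask = 0;
        for (int id : ids) mask |= 1L << id;
        return mask;
    }

    public static void main(String[] args) {
        LocksetSketch v = new LocksetSketch();
        System.out.println(v.access(locks(0, 1)));  // true: candidates {0, 1}
        System.out.println(v.access(locks(1)));     // true: candidates {1}
        System.out.println(v.access(locks(2)));     // false: potential data race
    }
}
```

The check reports a potential race even if the conflicting accesses never overlapped in the observed run, which is why such an analysis can predict races rather than only detect them.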

The Java PathExplorer tool (JPaX) [16] performs deadlock analysis and the Eraser data race analysis on Java programs. It has furthermore recently been extended with the high-level data race detection algorithm described in [3]. This algorithm analyzes how collections of variables are accessed by multiple threads.

More heavyweight dynamic approaches include model checking, which explores all possible schedules in a program. Recently, model checkers have been developed that apply directly to programs (instead of just models thereof). This includes the Java PathFinder system (JPF) developed by NASA [15,24], and similar systems [10, 8, 17, 4, 23]. Such systems, however, suffer from the state space explosion problem. In [13] we describe an extension of Java PathFinder which performs data race analysis (and deadlock analysis) in simulation mode, whereafter the model checker is used to demonstrate whether the data race (deadlock) warnings are real or not.

This paper focuses on applying Jlint [2] to space exploration software in order to detect errors statically. Jlint uses static analysis and abstract interpretation to find difficult errors at compile-time. A similar case study with Jlint has been made before, applying it to large projects [2]. The difference to this case study is that the other case study had scalability in mind. Jlint had been applied to packages containing several hundred thousand lines of code, generating hundreds of warning messages. Because of this, the warnings had been evaluated selectively, omitting some hard-to-check deadlock warnings. In this case study, an effort was made to analyze every single warning and also see what kinds of design patterns cause false positives.¹

1.3 Outline

This text is organized as follows: Section 2 describes Jlint and how it was used for this project. Sections 3 and 4 show the results of applying Jlint to space exploration program code. Design patterns which are common among these two projects are analyzed in Section 5. Section 6 summarizes the results and concludes.

2 Jlint

2.1 Tool description

Jlint checks Java code and finds bugs, inconsistencies and synchronization problems by performing a data flow analysis, abstract interpretation, and building the lock graph. It issues warnings about potential problems. These warnings do not imply that an actual error exists. This makes Jlint unsound as a program prover. Moreover, Jlint can also miss errors, making it incomplete. The reason for this is that the goal was to make Jlint practical, scalable, and possible to implement in a short time.

¹ Design patterns commonly denote compositions of objects in software. In this paper, the notion of composition is different. It includes lock patterns and sometimes only applies to a small part of the program. In that context, we also use the term “code idiom”.

Typical warnings about possible faults issued by Jlint are null pointer dereferences, array bounds overflows, and value overflows. The latter may occur if one multiplies two 32-bit integer values without converting them to 64 bits first.
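
The value overflow mentioned above can be reproduced with a short sketch. The class and method names are hypothetical. In `multiplyNarrow`, the product is computed in 32-bit arithmetic and only the already truncated result is widened to `long`; in `multiplyWide`, one operand is converted to 64 bits before the multiplication.

```java
class OverflowDemo {
    // The multiplication happens in int; widening to long comes too late.
    static long multiplyNarrow(int a, int b) {
        return a * b;
    }

    // Casting one operand first makes the whole multiplication 64-bit.
    static long multiplyWide(int a, int b) {
        return (long) a * b;
    }

    public static void main(String[] args) {
        System.out.println(multiplyNarrow(100000, 100000));  // 1410065408
        System.out.println(multiplyWide(100000, 100000));    // 10000000000
    }
}
```

The first result is the true product reduced modulo 2^32, which is silent at run-time and therefore a good candidate for a static warning.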

Many warnings that Jlint issues are code guidelines: A local variable should never have the same name as a field of the same class or a superclass. When a method of a given name is overridden, all its variants should be overridden, in order to guarantee a consistent behavior of the subclass.
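
Both guidelines can be illustrated with a small sketch. All class names below are hypothetical. `Odometer.reset` declares a local variable that hides the field of the same name, so the field is never reset. `PrefixLogger` overrides only one of the two `log` variants, so the other variant silently keeps the behavior of the superclass.

```java
// Guideline 1: a local variable with the same name as a field.
class Odometer {
    int distance = 42;

    void reset() {
        int distance = 0;   // declares a new local variable; the field keeps its value
    }
}

// Guideline 2: only one variant of an overloaded method is overridden.
class Logger {
    final StringBuilder out = new StringBuilder();

    void log(String msg) { out.append(msg); }
    void log(String msg, int level) { out.append(level).append(':').append(msg); }
}

class PrefixLogger extends Logger {
    @Override
    void log(String msg) { out.append("[rover] ").append(msg); }
    // log(String, int) is not overridden and therefore bypasses the prefix.
}

class GuidelineDemo {
    public static void main(String[] args) {
        Odometer o = new Odometer();
        o.reset();
        System.out.println(o.distance);  // 42: the field was not reset

        PrefixLogger p = new PrefixLogger();
        p.log("a");
        p.log("b", 2);
        System.out.println(p.out);       // [rover] a2:b
    }
}
```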

Jlint also includes many analyses for multi-threaded programs. Some of Jlint’s warnings for multi-threaded programs are overly cautious. For instance, possible data race warnings for method calls or variable accesses do not necessarily imply a data race. The reasons for such false positives are both difficulties inherent to static analysis, such as pointer aliasing across method calls, and limitations in Jlint itself, where its algorithms could be refined with known techniques.
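
One of the multi-threading analyses is based on the lock graph. A common way to use such a graph, sketched below under our own assumptions and not taken from Jlint’s source, is to add an edge from lock a to lock b whenever some code acquires b while holding a, and to report a potential deadlock if the graph contains a cycle. The class `LockGraph` and the numbering of locks are hypothetical.

```java
// Minimal lock graph sketch (illustrative; not Jlint's implementation).
class LockGraph {
    private final boolean[][] edge;   // edge[a][b]: b is acquired while a is held

    LockGraph(int locks) { edge = new boolean[locks][locks]; }

    void addNesting(int held, int acquired) {
        if (held != acquired) {       // Java locks are reentrant, so ignore self-nesting
            edge[held][acquired] = true;
        }
    }

    // A cycle means the locks can be acquired in conflicting orders.
    boolean hasCycle() {
        int[] state = new int[edge.length];  // 0 = unvisited, 1 = on current path, 2 = done
        for (int v = 0; v < edge.length; v++) {
            if (state[v] == 0 && dfs(v, state)) return true;
        }
        return false;
    }

    private boolean dfs(int v, int[] state) {
        state[v] = 1;
        for (int w = 0; w < edge.length; w++) {
            if (!edge[v][w]) continue;
            if (state[w] == 1) return true;                  // back edge closes a cycle
            if (state[w] == 0 && dfs(w, state)) return true;
        }
        state[v] = 2;
        return false;
    }

    public static void main(String[] args) {
        LockGraph g = new LockGraph(3);
        g.addNesting(0, 1);
        g.addNesting(1, 2);
        System.out.println(g.hasCycle());  // false
        g.addNesting(2, 0);
        System.out.println(g.hasCycle());  // true
    }
}
```

A cycle found this way is only a potential deadlock: if all paths involved are guarded by a common outer lock, or never run concurrently, the warning is a false positive of the kind discussed above.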

2.2 Warning review process

Jlint gives fairly descriptive warnings for each problem found. The context given is limited to the class in which the error occurs, the line number, and fields used or methods called. This is always sufficient to find the source of simple warnings, which concern sequential properties such as null pointer dereferences. These warnings are easy to review and were considered in a first pass. The other warnings, concerning multi-threading problems, take much more time to consider, and were evaluated in a second phase.

The review process essentially checks whether the problems described in the warnings can actually occur at run-time. In simple cases, warnings may be ruled out given the algorithmic properties of the program. Complex cases include reviewing callers to the method in question.

Data race and deadlock warnings fall in this category. They require constructing a part of the call graph including locks owned by callers when a method is called. If it can be ensured that all calls to non-synchronized shared methods are made only through methods that already employ lock protection, then there cannot be a data race.²
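
The caller-based argument can be illustrated with a sketch. The `Telemetry` class is hypothetical and not taken from the analyzed code. Its `record` method is not synchronized and accesses a shared field, which is the kind of method a static checker may warn about; it is nevertheless safe because every caller holds the same lock. The run-time check with `Thread.holdsLock` is our own addition to make that assumption explicit.

```java
// Hypothetical class: a non-synchronized method that is only reached through
// callers holding the same lock.
class Telemetry {
    private final Object lock = new Object();
    private int samples;

    // Not synchronized; safe only if every caller already holds 'lock'.
    private void record() {
        if (!Thread.holdsLock(lock)) {
            throw new IllegalStateException("caller must hold lock");
        }
        samples++;
    }

    void recordOne() {
        synchronized (lock) { record(); }
    }

    void recordMany(int n) {
        synchronized (lock) {
            for (int i = 0; i < n; i++) record();
        }
    }

    int samples() {
        synchronized (lock) { return samples; }
    }
}

class TelemetryDemo {
    // Two threads update the counter through different lock-protected callers.
    static int run() {
        Telemetry t = new Telemetry();
        Thread t1 = new Thread(() -> { for (int i = 0; i < 5000; i++) t.recordOne(); });
        Thread t2 = new Thread(() -> t.recordMany(5000));
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            throw new RuntimeException(e);
        }
        return t.samples();
    }

    public static void main(String[] args) {
        System.out.println(run());  // 10000
    }
}
```

Reviewing such a warning by hand means checking each caller of `record` for the enclosing synchronized block, which is the partial call graph construction described above.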

This review process can be rather time-consuming and took up to twelve minutes for one problem instance in the experiments carried out. Many warnings occur in similar contexts, so warnings referring to the same problem can usually be easily confirmed as duplicates. This part of the review process was not yet automated in any way but could be automated to a large extent with known techniques. Both case studies were done without prior knowledge of the program code. It can be assumed that the time to review the warnings is shorter for the author of the code, especially when reviewing data race or deadlock warnings.

² Methods that access a shared field are also considered “shared” in this context. The lock used for ensuring mutual exclusion must be the same lock for all calls.

During the review process, Jlint’s warnings were categorized to see whether they refer to the same problem. Such situations include calls to the same method from different callers, the same variable used in different contexts, or the same design pattern applied throughout the class. In a separate count, counting the number of distinct problems rather than individual warnings, all such cases were counted once. Furthermore, the time required for this process was recorded. Note that the review activity was often interrupted by other activities such as writing this paper. We believe this reduced the overall time required because manual code reviews require much attention, and cannot be carried out in one run without a degradation of the concentration required.

3 First case study: Rover code

The first case study is a software module, called the Executive, for controlling the movement of the planetary wheeled rover K9, developed at NASA Ames Research Center. The run time for analyzing the code with Jlint was 0.10 seconds on a PowerPC G4 with a clock frequency of 500 MHz.

3.1 Description of the Rover project

K9 is a hardware platform for experimenting with rover technology for exploration of the Martian surface. The Executive is a software module for controlling the rover, and is essentially an interpreter of plans, where a plan is a special form of a program. Plans are constructed from high-level constructs, such as sequential composition and conditionals, but no while loops. The effect of while loops is achieved by assuming that plans are generated on the fly during rover operation as environment conditions change. The lowest level nodes of a plan are tasks to be directly executed by the rover hardware. A node in a plan can be further constrained by a set of conditions, which when failing during execution, cause the Executive to abort the execution of the subsequent sibling nodes, unless specified otherwise through options. Examples of conditions are pre-conditions and post-conditions, as well as invariants to be maintained during the execution of the node. The examined Executive consists of 7,300 lines of Java code. This code was extracted by a colleague from the original rover code, written in 35,000 lines of C++. The code is highly multi-threaded, and hence provides a risk for concurrency errors. The Java version of the code was extracted as part of a different project, the purpose of which was to compare various formal methods, such as model checking, static analysis, runtime analysis, and simple testing [7]. The code contained a number of seeded errors.

3.2 Jlint evaluation

Jlint issues 249 warnings when checking the Rover code. Table 1 summarizes Jlint’s output. The first two columns show each type of problem and how many warnings Jlint generated for it. The third, fourth and fifth columns show the result of the manual source code analysis: how many actual, distinct faults, or at least serious problems, in the code were found, how many warnings described such actual faults, and how many were considered to be false positives. The last column shows the time spent on

References

A Temporal Logic of Nested Calls and Returns

The Java Virtual Machine Specification

Eraser: a dynamic data race detector for multithreaded programs

Model checking programs
