
Doug A. Bowman
bowman@vt.edu
Department of Computer Science
Virginia Polytechnic Institute & State University

Donald B. Johnson
Larry F. Hodges
{donny, hodges}@cc.gatech.edu
Graphics, Visualization, and Usability Center
Georgia Institute of Technology

Presence, Vol. 10, No. 1, February 2001, 75–95
© 2001 by the Massachusetts Institute of Technology
Testbed Evaluation of Virtual Environment Interaction Techniques

Abstract
As immersive virtual environment (VE) applications become more complex, it is clear that we need a firm understanding of the principles of VE interaction. In particular, designers need guidance in choosing three-dimensional interaction techniques. In this paper, we present a systematic approach, testbed evaluation, for the assessment of interaction techniques for VEs. Testbed evaluation uses formal frameworks and formal experiments with multiple independent and dependent variables to obtain a wide range of performance data for VE interaction techniques. We present two testbed experiments, covering techniques for the common VE tasks of travel and object selection/manipulation. The results of these experiments allow us to form general guidelines for VE interaction and to provide an empirical basis for choosing interaction techniques in VE applications. Evaluation of a real-world VE system based on the testbed results indicates that this approach can produce substantial improvements in usability.
1 Introduction

Applications of immersive virtual environments (VEs) are becoming both more diverse and more complex. This complexity is evident not only in the number of polygons being rendered in real time, the resolution of texture maps, or the number of users immersed in the same virtual world, but also in the interaction between the user(s) and the environment. Users need to navigate freely through a three-dimensional space, manipulate virtual objects with six degrees of freedom, or control attributes of a simulation, among many other things.

However, interaction in three dimensions is not well understood (Herndon, van Dam, & Gleicher, 1994). Users have difficulty controlling multiple degrees of freedom simultaneously, interacting in a volume rather than on a surface, and understanding 3-D spatial relationships. These problems are magnified in an immersive VE, because standard input devices such as mice and keyboards may not be usable (if the user is standing, for example), the display resolution is often low (limiting the ability to display text, for example), and 3-D depth cues may be in conflict with one another (accommodation and convergence, for example).

Therefore, the design of interaction techniques (ITs) and user interfaces for VEs must be done with extreme care to produce useful and usable systems. Because there is a lack of empirical data regarding VE interaction techniques,
we emphasize the need for formal evaluation of ITs, leading to easily applied guidelines and principles.

In particular, we have found testbed evaluation to be a powerful and useful tool to assess VE interaction. Testbeds are representative sets of tasks and environments, and the performance of ITs can be quantified by running them through the various parts of a testbed. Testbed evaluations are distinguished from other types of formal experiments because they combine multiple tasks, multiple independent variables, and multiple response measures to obtain a more complete picture of the performance characteristics of an IT, and because they produce application-independent results.

In this paper, we present our experience with this type of evaluation. We begin by discussing related work and the design and evaluation methodology of which testbed evaluation is a part. We then present two testbed experiments evaluating interaction techniques for the tasks of travel and selection/manipulation of virtual objects. The results of these experiments were applied to the design of a complex VE application. We conclude with a discussion of the merits of this type of evaluation.
2 Related Work

Most ITs for immersive VEs have been developed in an ad hoc fashion or to meet the requirements of a particular application. Such techniques may be very useful, but they need to be evaluated formally. Work has focused on a small number of “universal” VE tasks, such as travel (Koller, Mine, & Hudson, 1996; Ware & Osborne, 1990) and object selection and manipulation (Pierce et al., 1997; Poupyrev, Billinghurst, Weghorst, & Ichikawa, 1996).

Evaluation of VE interaction has for the most part been limited to usability studies (for example, Bowman, Hodges, & Bolter, 1998). Such evaluations test complete applications with a series of predefined user tasks. Usability studies can be a useful tool for the iterative design of applications, but we feel that lower-level assessments are necessary due to the newness of this research area.

Another methodology that has been applied to VE interaction is usability engineering (Hix et al., 1999). This technique uses expert evaluation, guidelines, and multiple design iterations to achieve a usable interface. Again, it is focused on a particular application and not on ITs in general.

A number of guidelines for 3D/VE interaction have been published (such as Kaur, 1998). Guidelines can be very useful to the application developer as an easy way to check for potential problems. Unfortunately, most current guidelines for VEs are either too general and therefore difficult to apply, or taken only from experience and intuition rather than from empirical results.

Testbeds for virtual environments are not new. The VEPAB project (Lampton et al., 1994) produced a battery of tests to evaluate performance in VEs, including tests of user navigation. Unlike our work, however, the tasks involved were not based on a formal framework of technique components and other factors affecting performance. The work most closely related to the current research is the manipulation assessment testbed (VRMAT) developed by Poupyrev, Weghorst, Billinghurst, and Ichikawa (1997).
3 Methodology

How does one design and validate testbeds for VE interaction? It is important that these testbeds represent generalized tasks and environments that can be found in real VE applications. Also, we need to understand ITs at a low level and standardize the measurement of performance. For these reasons, we base our testbeds on a systematic, formal framework for VE interaction techniques (Bowman & Hodges, 1999). In this section, we briefly discuss the pieces of this methodology that are relevant to the current work.

3.1 Taxonomies

The first step in creating a formal framework for design and evaluation is to establish a taxonomy of interaction techniques for each of the universal interaction tasks. (We use the word taxonomy in both of its accepted senses: “the science of classification” and “a specific classification.”) Taxonomies partition the tasks into separable subtasks, each of which represents a decision that must be made by the designer of a technique. In this sense, a taxonomy is the product of a careful task analysis. For each of the lowest-level subtasks, technique components (parts of an interaction technique that complete that subtask) may be listed. Figure 1 presents a taxonomy for the tasks of selection and manipulation, including two levels of subtasks and multiple technique components for each of the lowest-level subtasks. We have also created two taxonomies for the task of travel.
The taxonomies must come from a deep and thorough understanding of the interaction task and the techniques that have been proposed for it. Therefore, some initial qualitative evaluation of techniques and/or design of new techniques for the task is almost always required before a useful taxonomy can be constructed.

Let us consider a simple example. Suppose the interaction task is to change the color of a virtual object. (Of course, this task could also be considered a combination of other interaction tasks: select an object, select a color, and give the “change color” command.) A taxonomy for this task would include several subtasks. Selecting an object whose color is to change, choosing the color, and applying the color are subtasks that are directly task-related. On the other hand, we might also include aspects such as the color model used or the feedback given to the user, which would not be applicable for this task in the physical world, but which are important considerations for an IT.
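To make the combinatorial structure of such a taxonomy concrete, here is a minimal Python sketch. The subtask names and candidate components are illustrative inventions for the color-change example, not the paper's actual taxonomies; the point is only that a complete technique is one component per lowest-level subtask.

```python
from itertools import product

# Hypothetical taxonomy for the "change object color" task: each subtask
# maps to the technique components that could complete it.
taxonomy = {
    "select object": ["ray-casting", "occlusion", "virtual hand"],
    "choose color":  ["3D color-space widget", "palette menu"],
    "apply color":   ["automatic on choice", "explicit apply command"],
}

# A complete technique picks one component per lowest-level subtask, so the
# design space is the Cartesian product of the component lists.
subtasks = list(taxonomy)
techniques = [dict(zip(subtasks, combo))
              for combo in product(*(taxonomy[s] for s in subtasks))]
print(len(techniques))  # 3 * 2 * 2 = 12 candidate techniques
```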
We do not claim that any given taxonomy represents the “correct” partitioning of the task. Different users have different conceptions of the subtasks that are carried out to complete a task. Rather, we see our taxonomies as practical tools that we use as a framework for design and evaluation. Therefore, we are concerned only with the utility of a taxonomy for these tasks, and not with its “correctness.” In fact, we have developed two possible taxonomies for the task of travel, both of which have been useful in determining different aspects of performance. Rules and guidelines have been set forth for creating proper taxonomies (Fleishman & Quaintance, 1984), but we felt that the categorical structure of these taxonomies did not lend itself as well to design and evaluation as the simple task analysis, because they do not allow guided design or evaluation at the subtask level.

Taxonomies have many desirable properties. First, they can be verified by fitting known techniques into them in the process of categorization. Second, they can be used to design new techniques quickly, by combining one component for each of the lowest-level subtasks. More relevant to testbed evaluation, they provide a framework for assessing techniques at a more fine-grained level. Rather than evaluating two techniques for the object-coloring task as wholes, then, we can evaluate six components (one component for each of the three main subtasks in each of the two techniques). This may lead to models of performance that allow us to predict that a new combination of these
components would perform better than either of the techniques that were tested.

Figure 1. Taxonomy of selection/manipulation techniques.
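One way such a predictive model might work (our illustration, not a model from the paper) is additively: if testbed runs yield a mean completion time for each component, a never-tested combination can be ranked by summing its components' times. A toy sketch with invented numbers, under that additive assumption:

```python
# Invented per-component mean times (seconds) from hypothetical testbed runs.
component_time = {
    "ray-casting": 1.2, "virtual hand": 1.8,            # "select object"
    "palette menu": 0.9, "3D color-space widget": 1.5,  # "choose color"
}

def predicted_time(components):
    """Additive model: a technique's estimated completion time is the sum
    of the measured mean times of its components."""
    return sum(component_time[c] for c in components)

# Rank a combination that was never tested as a whole:
print(predicted_time(["ray-casting", "palette menu"]))            # 2.1
print(predicted_time(["virtual hand", "3D color-space widget"]))  # 3.3
```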
3.2 Performance Metrics

Quantifying the performance of VE interaction techniques is a difficult task, because performance is not well defined. It is relatively simple to measure and quantify time for task completion and accuracy, but these are not the only requirements of real VE applications.

VE developers are also concerned with notions such as the naturalism of the interaction (how closely it mimics the real world) and the degree of presence the user feels. Usability-related issues such as ease of use, ease of learning, and user comfort are also important to an interface’s success. Finally, task-related performance, such as spatial orientation during navigation or expressiveness of manipulation, is often required.

We should remember that the reason we wish to find good ITs is so that our applications will be more usable, and that VE applications have many different requirements. In many applications, speed and accuracy are not the main concerns, and therefore they should not always be the only response variables in our evaluations.

Also, more than any other computing paradigm, virtual environments involve the user—his or her senses and body—in the task. Thus, it is essential that we focus on user-centric performance measures. If an IT does not make good use of the skills of the human being, or if it causes fatigue or discomfort, it will not provide overall usability despite its performance in other areas.

Therefore, in our work, we take a broad view of performance and attempt to measure multiple performance variables, covering a wide range of application and user requirements, during testbed evaluation. For those factors that are not objectively measurable, standard questionnaires (for example, Kennedy, Lane, Berbaum, and Lilienthal (1993) for simulator sickness, and Witmer and Singer (1998) for presence) or subject self-reports may need to be used.
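As an illustration of what recording multiple performance measures per trial could look like, here is a hypothetical record combining objective measures with questionnaire-based ones. The field names and structure are our own sketch, not the paper's instrumentation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TrialResult:
    """Measures recorded for one testbed trial (illustrative field names)."""
    technique: str
    completion_time_s: float                 # objective: speed
    placement_error_cm: float                # objective: accuracy
    presence_score: Optional[float] = None   # e.g. Witmer & Singer (1998) PQ
    sickness_score: Optional[float] = None   # e.g. Kennedy et al. (1993) SSQ
    comfort_rating: Optional[int] = None     # subject self-report, 1-7 scale

trial = TrialResult("virtual hand", completion_time_s=4.2,
                    placement_error_cm=1.5, comfort_rating=6)
```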
3.3 Outside Factors Influencing Performance

The interaction technique is not the sole determinant of performance in a VE application. Rather, there are multiple interacting factors. In particular, we have identified four categories of outside factors that may influence performance: characteristics of the task (such as the required accuracy), the environment (such as the number of objects), the user (such as spatial ability), and the system (such as stereo versus biocular viewing).

In our testbed experiments, we consider these factors explicitly, varying those we feel to be most important and holding the others constant. This leads to a much richer understanding of performance. Often there are too many possible outside factors to evaluate in a single experiment; in this case, pilot studies can help to eliminate some factors.
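A small sketch of how such an experimental design could be generated: the factors chosen for explicit variation are crossed into a full factorial set of conditions, while the remaining factors are pinned to constant levels. Factor names and levels below are illustrative, not the paper's actual design.

```python
from itertools import product

# Outside factors we choose to vary explicitly (levels are illustrative):
varied = {
    "technique":         ["ray-casting", "go-go", "virtual hand"],
    "required_accuracy": ["low", "high"],   # task characteristic
    "object_count":      [8, 64],           # environment characteristic
}
# Factors held constant across every condition:
held_constant = {"viewing": "stereo", "environment_size": "room-scale"}

conditions = [dict(zip(varied, levels), **held_constant)
              for levels in product(*varied.values())]
print(len(conditions))  # 3 * 2 * 2 = 12 experimental conditions
```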
3.4 Testbed Evaluation

Our experimental evaluations of VE interaction techniques have taken many forms, from simple observational user studies (Bowman & Hodges, 1997), to usability evaluation (Bowman, Hodges et al., 1998), to formal experiments (Bowman, Koller, & Hodges, 1997). However, none of these methods is able to examine a wide range of task conditions while also producing quantitative, general results. Therefore, we propose the use of testbed evaluation as the final stage in the analysis of interaction techniques for universal VE interaction tasks. This method addresses the issues discussed above through the creation of testbeds—environments and tasks that involve all of the important aspects of a task, that test each component of a technique, that consider outside influences (factors other than the interaction technique) on performance, and that have multiple performance measures.

As an example, consider a proving ground for automobiles. In this special environment, cars are tested in cornering, braking, acceleration, and other tasks, over multiple types of terrain, and in various weather conditions. Task completion time is not the only performance variable considered. Rather, many quantitative and qualitative results are collected, such as accuracy, distance, passenger comfort, and the user’s perception of the “feel” of the steering.
3.5 Application of Testbed Results

Testbed evaluation produces a set of results that characterize the performance of an interaction technique for a specified task. Performance is given in terms of multiple performance metrics, with respect to various levels of outside factors. These results become part of a performance database for the interaction task, with more information being added to the database each time a new technique is run through the testbed.

Testbed evaluation is not an end unto itself. Rather, its goal is to produce applications with high levels of performance. Thus, the last step in our methodology is to apply the performance results to VE applications, with the goal of making them more useful and usable. To choose interaction techniques for applications appropriately, we must understand the interaction requirements of the application. We cannot simply declare one technique the best, because the technique that is best for one application will not be optimal for another application with different requirements. For example, a VE training system will require a travel technique that maximizes the user’s spatial awareness, but will not require one that maximizes point-to-point speed. In a battle-planning system, on the other hand, speed of travel may be the most important requirement.

Therefore, applications need to specify their interaction requirements before the correct ITs can be chosen. This specification is done in terms of the performance metrics that we have already defined as part of our formal framework. Once the requirements are in place, we can use the performance results from testbed evaluation to recommend ITs that meet those requirements. These ITs, having been formally verified, should increase the user’s performance levels and the application’s usability.
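A minimal sketch of how this matching step could be automated, assuming a performance database of per-metric scores and application requirements expressed as metric weights. All technique names and numbers are invented for illustration; the paper does not specify this mechanism.

```python
# Hypothetical per-metric scores from testbed evaluation (higher = better).
performance_db = {
    "gaze-directed steering": {"speed": 0.6, "spatial_awareness": 0.8},
    "pointing":               {"speed": 0.9, "spatial_awareness": 0.5},
}

def recommend(requirements):
    """requirements maps metric name -> weight expressing the application's
    priorities; returns the technique with the best weighted score."""
    def score(metrics):
        return sum(weight * metrics.get(metric, 0.0)
                   for metric, weight in requirements.items())
    return max(performance_db, key=lambda t: score(performance_db[t]))

# A training system weights spatial awareness; a battle-planning one, speed.
print(recommend({"spatial_awareness": 1.0, "speed": 0.2}))  # gaze-directed steering
print(recommend({"speed": 1.0}))                            # pointing
```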
4 Experiments

We present two experiments that bring together the components of the formal methodology. The first testbed is designed to evaluate selection and manipulation techniques; the second is for travel techniques. Each testbed is a set of tasks and environments that measures the performance of various combinations of technique components and outside factors on multiple performance metrics.

Each testbed was designed to test any technique that could be created from its respective taxonomy. However, exhaustive testbeds would be too immense to carry out. Therefore, our testbeds have been simplified to assess conditions based on a target application (see section 5). Nevertheless, the tasks and environments are not biased towards any particular set of techniques, and others can be tested at any time with no loss of generality. For both testbeds, the tasks used are simple and general.
4.1 Selection and Manipulation Testbed

We designed and implemented a limited testbed that can evaluate selection and manipulation techniques in a number of what we consider to be the most important conditions. This analysis of importance is based on our experiences with real applications, our more informal study of selection and manipulation, and the requirements of our target application.

The testbed was designed to support the testing of any technique that can be created from the taxonomy. The tasks and environments are not biased towards any particular set of techniques. We have evaluated nine techniques, but others can be tested at any time with no loss of generality.

In the selection phase, the user selects the correct object from a group of objects. In the manipulation phase, the user places the selected object within a target at a given position and orientation. Figure 2 shows an example trial. The user is to select the darker box in the center of the 3 × 3 array of boxes, and then place it between the two wooden targets in the manipulation phase.
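For concreteness, a trial in such a testbed might be parameterized along the lines of the following sketch. The fields are our guesses at the relevant parameters, not the paper's actual trial specification.

```python
from dataclasses import dataclass

@dataclass
class ManipulationTrial:
    """One selection/manipulation trial (illustrative parameterization)."""
    technique: str
    array_size: int            # e.g. 3 for a 3 x 3 array of selectable boxes
    target_distance_m: float   # how far the selected object must be carried
    required_accuracy: str     # placement tolerance: "low" or "high"

trial = ManipulationTrial(technique="go-go", array_size=3,
                          target_distance_m=2.0, required_accuracy="high")
```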

References

Bowman, D. A., & Hodges, L. F. (1997). An evaluation of techniques for grabbing and manipulating remote objects in immersive virtual environments. In Proceedings of the ACM Symposium on Interactive 3D Graphics.

Bowman, D. A., Koller, D., & Hodges, L. F. (1997). Travel in immersive virtual environments: An evaluation of viewpoint motion control techniques. In Proceedings of the IEEE Virtual Reality Annual International Symposium (VRAIS).

Kennedy, R. S., Lane, N. E., Berbaum, K. S., & Lilienthal, M. G. (1993). Simulator Sickness Questionnaire: An enhanced method for quantifying simulator sickness. International Journal of Aviation Psychology, 3(3), 203–220.

Poupyrev, I., Billinghurst, M., Weghorst, S., & Ichikawa, T. (1996). The go-go interaction technique: Non-linear mapping for direct manipulation in VR. In Proceedings of the ACM Symposium on User Interface Software and Technology (UIST).

Witmer, B. G., & Singer, M. J. (1998). Measuring presence in virtual environments: A presence questionnaire. Presence: Teleoperators and Virtual Environments, 7(3), 225–240.
Frequently Asked Questions (2)

Q1. What are the contributions in this paper?

The paper contributes testbed evaluation, a systematic approach for assessing interaction techniques for VEs; two testbed experiments covering the common VE tasks of travel and object selection/manipulation; general guidelines for VE interaction derived from the results; and an empirical basis for choosing interaction techniques in VE applications.

Q2. What future work do the authors propose?

The authors would like to extend the approach to make it more rigorous and systematic, and to make the testbeds and experimental results more readily available to VE developers and researchers. Very few potential users of most VE applications could be considered experts, so the current results are useful, but an understanding of how performance changes over time would add value. A set of guidelines based on the results is part of the answer to this problem, but the authors feel it would also be useful to create an automated design guidance system that suggests interaction techniques by matching the requirements of a VE application to the testbed results.