Assessing repository technology: where do we go from here?

Abstract
Three sample information retrieval systems, archie, autoLib, and Wide Area Information Servers (WAIS), are compared with regard to their expressiveness and usefulness, first in the general context of information retrieval, and then as prospective software reuse repositories. While the representational capabilities of these systems are limited, they provide a useful foundation for future repository efforts, particularly from the perspective of repository distribution and coherent user interface design.

N93:12391
Assessing Repository Technology:
Where Do We Go From Here?*
David Eichmann†
Software Reuse Repository Lab (SoRReL)
Dept. of Statistics and Computer Science
West Virginia University
Send correspondence to:
David Eichmann
SoRReL
Dept. of Statistics and Computer Science
West Virginia University
Morgantown, WV 26506
email: eichmann@cs.wvu.wvnet.edu
* to appear in the International Journal of Software Engineering and Knowledge Engineering.
† This work was supported in part by NASA as part of the Repository Based Software Engineering project,
cooperative agreement NCC-9-16, project no. RICIS SE.43, subcontract no. 089, and in part by a grant from
MountainNet Inc.
1 - Introduction
As information becomes an increasingly important sector of the global economy, the way in
which we access that information, and thereby the way in which we access and structure knowledge,
becomes a critical concern. The engineering of knowledge is quickly becoming an area of
research in its own right, independent of its parent disciplines of artificial intelligence, database
systems, and information retrieval; consider the title of the journal that you now hold in your hands.
Wegner recognized the value of knowledge engineering in his landmark article on the role of capital
in software development:
"Knowledge engineering is a body of techniques for managing the complexity of knowledge ... it is
capital-intensive in the sense that reusability is a primary consideration in the development of books,
expert systems, and other structures for the management and use of knowledge." [10, p. 33]
Just as Wegner observed that the products of software engineering are capital, so are the products
of knowledge engineering a form of capital. Identification, structure, and locatability are critical to
the enabling of this knowledge capital. Innovation in this area is driven from two diverse perspectives,
the traditional perspective of researchers and a not-so-traditional perspective of what might
be referred to as an information underground.
The goal of this information underground is not necessarily an extension of the state of the art,
but a rather more pragmatic development of an informational infrastructure [4]. The prototypes
resulting from this type of work propagate quickly over the Internet, immediately generating large
numbers of users. Even while still experimental, systems that provide distinct benefit frequently
need to limit access in order to maintain reasonable system performance for other users of the
underlying platforms.
My reference to this community as an underground is calculated, for even within the computer
science community (let alone the academic or commercial communities as a whole), only a small
percentage of individuals are aware of such information systems. This article was spurred by my
interest in software repositories, a number of conversations that I've had in recent months, and the
benefit I think can be gained by widening the forum for such systems to a larger audience.
In particular, it is interesting to evaluate these systems as an enabling technology for software
reuse repositories. Repositories, and by implication, information retrieval mechanisms, play a critical
role in successful reuse. This statement disagrees with the conventional wisdom [9], that reuse
is a social and managerial issue, and not a technical one. A closer examination of the conventional
wisdom leads to a recognition that without a repository with substantial representational capability
many of the social and managerial requirements cannot be supported.
This paper surveys a number of interesting information server projects, with an eye towards
enabling technologies. Section 2 lays down a typical scenario in which such systems are used.
Sample sessions for three systems appear in section 3, and an analysis appears in section 4. I conclude
with remarks on the potential of future systems.
2 - A Scenario and User Profile
Consider a programmer involved in a research project in some reasonably sized university. I
choose this context not only for its personal familiarity, but also because
• such projects typically take place in facilities with rich local and wide area network connectivity;
• programmers typically have a personal workstation with substantial display capabilities (e.g.,
  X Windows); and
• there are strong incentives in avoiding the redevelopment of capabilities available from other
  projects, either local or remote.
In effect, the development environment is one which is typical, or will be within the next few years.
In addition, the social infrastructure and equipment infrastructure for a successful reuse program
are present, if not an explicit charter for reuse, or a true repository.
Our programmer is now faced with a dilemma: aware that there is a strong likelihood that a
needed tool or component already exists somewhere out on the network, but uncertain as to where
to begin the search in the thousands of systems that currently make up the Internet, or even how to
identify the needed artifact. Until recently the only choices included asking acquaintances for advice
(although the study by Schwartz and Wood [7] demonstrated the amazing potential for even
ad hoc mechanisms such as this), poring over intermittently posted electronic digest news articles
for likely sounding names, or manually searching a few sites maintained by volunteers and accessible
through anonymous ftp. Obviously, our programmer is ripe for recruitment as a client of the
services provided by the information underground.
3 - Example Repositories
Early in the evolution of the Internet, system administrators began adapting file transfer facilities
into what today is referred to as anonymous ftp, comprised of publicly accessible accounts, a
limited file space, and a restricted command set. These facilities, while amazingly popular as a
dissemination tool, presume a fair amount of user knowledge, not the least of which being where to
look for the sought-after artifact. This section describes three information systems, archie, WAIS,
and autoLib. Each of these systems has a distinct design focus: anonymous ftp access in archie,
document retrieval/display in WAIS, and a limited form of electronic library in autoLib. However,
the resulting systems have much in common, and their look and feel has several similarities. These
systems were selected for discussion because they were designed primarily as information retrieval
systems, rather than as software repository systems.
3.1 - archie
The archie system is "an on-line resource directory service for an internetworked environment"
[3]. While archie isn't truly a repository per se, since it doesn't actually contain the artifacts that it
classifies, when treated as a whole with the diverse anonymous ftp sites that it references, it does
fit into our discussion. Archie grew out of the efforts of Emtage and Deutsch to automate the creation
and referencing of previously hand-maintained lists of anonymous ftp sites. A demon peri-

References

Relevance weighting of search terms
TL;DR: This paper examines statistical techniques for exploiting relevance information to weight search terms using information about the distribution of index terms in documents in general and shows that specific weighted search methods are implied by a general probabilistic theory of retrieval.

The implementation of POSTGRES
TL;DR: The design and implementation decisions made for the three-dimensional data manager POSTGRES are discussed, and attention is restricted to the DBMS backend functions.

Software reuse myths
TL;DR: This paper analyzes nine commonly believed software reuse myths and reveals certain technical, organizational, and psychological software engineering research issues and trends.

Capital-Intensive Software Technology
TL;DR: Each section of this four-part article deals with a different aspect of capital-intensive software technology and presents an integrated view of the subject.