
Showing papers on "Semantic Web Stack published in 1998"


Proceedings ArticleDOI
01 May 1998
TL;DR: This investigation shows that although the process by which users of the Web create pages and links is very difficult to understand at a “local” level, it results in a much greater degree of orderly high-level structure than has typically been assumed.
Abstract: The World Wide Web grows through a decentralized, almost anarchic process, and this has resulted in a large hyperlinked corpus without the kind of logical organization that can be built into more traditionally-created hypermedia. To extract meaningful structure under such circumstances, we develop a notion of hyperlinked communities on the WWW through an analysis of the link topology. By invoking a simple, mathematically clean method for defining and exposing the structure of these communities, we are able to derive a number of themes: the communities can be viewed as containing a core of central, “authoritative” pages linked together by “hub pages,” and they exhibit a natural type of hierarchical topic generalization that can be inferred directly from the pattern of linkage. Our investigation shows that although the process by which users of the Web create pages and links is very difficult to understand at a “local” level, it results in a much greater degree of orderly high-level structure than has typically been assumed.
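The "simple, mathematically clean method" for separating authoritative pages from the pages that point at them can be sketched as a mutually reinforcing iteration in the style of HITS; the function below is our illustration of that idea, not the paper's code.

```python
def hubs_and_authorities(links, iterations=50):
    """links: dict mapping each page to the set of pages it links to."""
    pages = set(links) | {q for qs in links.values() for q in qs}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # A page's authority score sums the hub scores of pages linking to it.
        auth = {p: sum(hub[q] for q in pages if p in links.get(q, ()))
                for p in pages}
        # A page's hub score sums the authority scores of pages it links to.
        hub = {p: sum(auth[q] for q in links.get(p, ())) for p in pages}
        # Normalize so the scores stay bounded across iterations.
        for d in (auth, hub):
            norm = sum(v * v for v in d.values()) ** 0.5 or 1.0
            for p in d:
                d[p] /= norm
    return hub, auth
```

On a toy graph where two pages both link to a third, the shared target ends up with the top authority score, matching the intuition of a "core of central, authoritative pages."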

905 citations


Proceedings Article
01 Jul 1998
TL;DR: The goal of the research described here is to automatically create a computer understandable world wide knowledge base whose content mirrors that of the World Wide Web, and several machine learning algorithms for this task are described.
Abstract: The World Wide Web is a vast source of information accessible to computers, but understandable only to humans. The goal of the research described here is to automatically create a computer understandable world wide knowledge base whose content mirrors that of the World Wide Web. Such a knowledge base would enable much more effective retrieval of Web information, and promote new uses of the Web to support knowledge-based inference and problem solving. Our approach is to develop a trainable information extraction system that takes two inputs: an ontology defining the classes and relations of interest, and a set of training data consisting of labeled regions of hypertext representing instances of these classes and relations. Given these inputs, the system learns to extract information from other pages and hyperlinks on the Web. This paper describes our general approach, several machine learning algorithms for this task, and promising initial results with a prototype system.
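The abstract names two inputs: an ontology of classes and relations, and labeled regions of hypertext. A minimal sketch of what those inputs might look like (all class names, relations, and URLs below are illustrative, not from the paper):

```python
# An ontology defining the classes and relations of interest.
ontology = {
    "classes": ["Person", "Department", "Course"],
    "relations": [("Person", "member_of", "Department"),
                  ("Course", "taught_by", "Person")],
}

# Labeled training regions: (page, text span, class label).
training_data = [
    ("http://cs.example.edu/~jdoe", "Jane Doe", "Person"),
    ("http://cs.example.edu/db101", "Databases 101", "Course"),
]

def classes_related_to(cls):
    """Classes reachable from cls via a declared relation."""
    return sorted({o for s, _, o in ontology["relations"] if s == cls})
```

A learner trained on such data would extract new instances of these classes and relations from unseen pages; the structures above only fix the vocabulary it works with.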

766 citations


Journal ArticleDOI
Ora Lassila
TL;DR: This paper considers how the Resource Description Framework, with its focus on machine-understandable semantics, has the potential for saving time and yielding more accurate search results.
Abstract: The sheer volume of information can make searching the Web frustrating. The paper considers how the Resource Description Framework, with its focus on machine-understandable semantics, has the potential for saving time and yielding more accurate search results. RDF, a foundation for processing metadata, provides interoperability between applications that exchange machine understandable information on the Web.
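RDF's core data model is a set of (subject, predicate, object) triples, which is what makes the metadata machine-understandable. A minimal sketch, with illustrative URIs and Dublin Core-style property names that are not taken from the paper:

```python
# Metadata as RDF-style triples: (subject, predicate, object).
triples = [
    ("http://example.org/page1", "dc:creator", "Ora Lassila"),
    ("http://example.org/page1", "dc:title", "RDF and Metadata"),
    ("http://example.org/page2", "dc:creator", "Ora Lassila"),
]

def query(triples, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Find every page a given author created -- a search by meaning,
# not by keyword occurrence.
pages = [s for s, _, _ in query(triples, p="dc:creator", o="Ora Lassila")]
```

Because applications agree on the triple model, metadata written by one tool can be queried by another, which is the interoperability the abstract describes.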

192 citations


01 Jan 1998
TL;DR: This paper describes how SHOE, a set of Simple HTML Ontological Extensions, can be used to discover implicit knowledge from the World-Wide Web through the use of context, inheritance and inference.
Abstract: This paper describes how SHOE, a set of Simple HTML Ontological Extensions, can be used to discover implicit knowledge from the World-Wide Web (WWW). SHOE allows authors to annotate their pages with ontology-based knowledge about page contents. In previous papers, we discussed how the semantic knowledge provided by SHOE allows users to issue queries that are much more sophisticated than keyword search techniques, including queries that require retrieval of information from many sources. Here, we expand upon this idea by describing how SHOE’s ontologies allow agents to understand more than what is explicitly stated in Web pages through the use of context, inheritance, and inference. We use examples to illustrate the usefulness of these features to Web agents and query engines.
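The inheritance feature can be sketched in a few lines: a page annotated with a specific class also answers queries about its superclasses. The class hierarchy and URL below are invented for illustration and do not come from the SHOE papers.

```python
# A toy is-a hierarchy (child class -> parent class).
subclass_of = {"Dog": "Mammal", "Mammal": "Animal"}

def instance_of(cls, target):
    """True if cls equals target or is a transitive subclass of it."""
    while cls is not None:
        if cls == target:
            return True
        cls = subclass_of.get(cls)
    return False

# Page annotations: URL -> asserted class.
annotations = {"http://example.org/rex.html": "Dog"}

# A query for "Animal" retrieves the page even though the page
# only explicitly says "Dog" -- knowledge implicit via inheritance.
matches = [url for url, cls in annotations.items()
           if instance_of(cls, "Animal")]
```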

61 citations


Journal ArticleDOI
TL;DR: The aim is to demonstrate how the solution of basic problems in the design of digital libraries can benefit from human science theories and models when applying digital libraries to industrial engineering design.
Abstract: several issues that call for cross-disciplinary cooperation with such human science disciplines as psychology, sociology, and information science. Therefore, an international research network called the Network for Engineering Design and Human Science was established in 1996 with participation by universities in Canada, Denmark, England, and the U.S., and Danfoss A/S, a Danish manufacturer. The aim is to demonstrate how the solution of basic problems in the design of digital libraries can benefit from human science theories and models when applying digital libraries to industrial engineering design. Let industrial designers learn from the experts in the human sciences as well as from their companies' collective memories. In industrial design, a product is simultaneously an object in the domains of marketing, technical design, manufacturing, user application, maintenance, and recycling (see Figure 1). Groups of designers with different professional backgrounds, domain expertise, and departmental affiliations cooperate across the organization, meeting in project groups to make design decisions about the product. Decisions cannot be made without detailed information about the product in the context of these various domains. Each participant is responsible for a particular knowledge domain of importance to the ultimate design, and work takes place among several departments and companies, all distributed geographically. This cooperation requires the non-trivial exploration of colleagues' work-domain knowledge and retrieval of information about previ

51 citations


Proceedings ArticleDOI
26 May 1998
TL;DR: A framework in which different caching and replication strategies can be devised independently per Web document, and a prototype in Java is developed to demonstrate the feasibility of implementing different strategies for different Web objects.
Abstract: Despite the extensive use of caching techniques, the Web is overloaded. While the caching techniques currently used help some, it would be better to use different caching and replication strategies for different Web pages, depending on their characteristics. We propose a framework in which such strategies can be devised independently per Web document. A Web document is constructed as a worldwide, scalable distributed Web object. Depending on the coherence requirements for that document, the most appropriate caching or replication strategy can subsequently be implemented and encapsulated by the Web object. Coherence requirements are formulated from two different perspectives: that of the Web object, and that of clients using the Web object. We have developed a prototype in Java to demonstrate the feasibility of implementing different strategies for different Web objects.
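The key design point is that each Web object encapsulates its own coherence strategy behind a uniform client interface. A strategy-pattern sketch of that idea; the class and method names are ours, not the paper's:

```python
class Strategy:
    def fetch(self, obj):
        raise NotImplementedError

class AlwaysVerify(Strategy):
    """Suits rapidly changing documents: revalidate with the origin
    on every access."""
    def fetch(self, obj):
        obj.revalidations += 1
        return obj.content

class CacheForever(Strategy):
    """Suits immutable documents: never revalidate."""
    def fetch(self, obj):
        return obj.content

class WebObject:
    def __init__(self, content, strategy):
        self.content = content
        self.strategy = strategy
        self.revalidations = 0
    def read(self):
        # Clients see one interface; the coherence policy differs per object.
        return self.strategy.fetch(self)
```

A stock quote page would be built with `AlwaysVerify`, a static logo with `CacheForever`; the client code calling `read()` is identical in both cases.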

46 citations


Book ChapterDOI
30 Mar 1998
TL;DR: It is noted that some structural properties can be identified with semantic properties of the data and provide measures for comparison between HTML documents.
Abstract: When we describe a Web page informally, we often use phrases like “it looks like a newspaper site”, “there are several unordered lists” or “it's just a collection of links”. Unfortunately, no Web search or classification tools provide the capability to retrieve information using such informal descriptions that are based on the appearance, i.e., structure, of the Web page. In this paper, we take a look at the concept of structurally similar Web pages. We note that some structural properties can be identified with semantic properties of the data and provide measures for comparison between HTML documents.
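One simple way to make "looks like a collection of links" computable is to compare tag-frequency profiles of pages; the cosine measure below is our illustrative choice, and the paper's actual measures may differ.

```python
import math
import re
from collections import Counter

def tag_profile(html):
    """Count opening tags, ignoring attributes and text content."""
    return Counter(m.group(1).lower()
                   for m in re.finditer(r"<\s*([a-zA-Z][a-zA-Z0-9]*)", html))

def similarity(a, b):
    """Cosine similarity between the tag-frequency vectors of two pages."""
    pa, pb = tag_profile(a), tag_profile(b)
    dot = sum(pa[t] * pb[t] for t in pa)
    na = math.sqrt(sum(v * v for v in pa.values()))
    nb = math.sqrt(sum(v * v for v in pb.values()))
    return dot / (na * nb) if na and nb else 0.0
```

Two pages built from unordered lists score high against each other and low against a plain paragraph page, regardless of their text, which matches the informal "structure, not content" descriptions above.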

41 citations


Proceedings Article
01 Jan 1998
TL;DR: A declarative query language is proposed that would allow resource discovery on the Internet with interactive and progressively refined inquiries and consents to the discovery of knowledge within the content of the documents and the structure of the hyperspace.
Abstract: There is a massive increase of information available on electronic networks. This profusion of resources on the World Wide Web gave rise to considerable interest in the research community. Traditional information retrieval techniques have been applied to the document collection on the Internet, and a myriad of search engines and tools have been proposed and implemented. However, the effectiveness of these tools is not satisfactory. None of them is capable of discovering knowledge from the Internet. We propose a declarative query language that would allow resource discovery on the Internet with interactive and progressively refined inquiries. The language also consents to the discovery of knowledge within the content of the documents and the structure of the hyperspace.

40 citations


Proceedings Article
01 Jan 1998
TL;DR: This paper introduces a database system called FreeNet that facilitates the description and exploration of finite binary relations and describes the design and implementation of Lexical FreeNet, a semantic network that mixes WordNet-derived semantic relations with data-derived and phonetically-derived relations.
Abstract: The study of lexical semantics has produced a systematic analysis of binary relationships between content words that has greatly benefited lexical search tools and natural language processing algorithms. We first introduce a database system called FreeNet that facilitates the description and exploration of finite binary relations. We then describe the design and implementation of Lexical FreeNet, a semantic network that mixes WordNet-derived semantic relations with data-derived and phonetically-derived relations. We discuss how Lexical FreeNet has aided in lexical discovery, the pursuit of linguistic and factual knowledge by the computer-aided exploration of lexical relations.

1 Motivation

This paper discusses Lexical FreeNet, a database system designed to enhance lexical discovery. By this we mean the pursuit of linguistic and factual knowledge with the computer-aided exploration of lexical relations. Lexical FreeNet is a semantic network that leverages WordNet and other knowledge and data sources to facilitate the discovery of nontrivial lexical connections between words and concepts. A semantic network allied with the proper user interface can be a useful tool in its own right. By organizing words semantically rather than alphabetically, WordNet provides a means by which users can navigate its vocabulary logically, establishing connections between concepts and not simply character sequences. Exploring the WordNet hyponym tree starting at the word mammal, for instance, reveals to us that aardvarks are mammals; exploring WordNet's meronym relation at the word mammal reveals to us that mammals have hair. From these two explorations we can accurately conclude that aardvarks have hair. Lexical exploration need not be limited to one step at a time, however. Viewing a semantic network as a computational structure awaiting graph-theoretic queries gives us the freedom to demand services beyond mere lookup.
"Does the aardvark have hair?", "What is the closest connection between aardvarks and hair?", or "How interchangeably can the words aardvark and anteater be used?" are all reasonable questions with answers staring us in the face. Of course, the idea of finding shortest paths in semantic networks (through so-called activation-spreading or intersection search) is not new. But these questions have typically been asked of very limited graphs, networks for domains far narrower than the lexical space of English, say. We feel that formalizing how WordNet can be employed for this broader sort of lexical discovery is a good start. We also feel that it is necessary first to enrich the network with information that, as we shall see, cannot be easily gleaned from WordNet's current battery of relations. The very large electronic corpora and wide variety of linguistic resources that today's computing technology has enabled will in turn enable this. The remainder of this paper is organized as follows. We shall first describe in Section 2 the FreeNet database system for the expression and analysis of relational data. In Section 3 we'll describe the design and construction of an instance of this database called Lexical FreeNet. We'll conclude by providing examples of applications of Lexical FreeNet to lexical discovery.

36 citations


Journal ArticleDOI
TL;DR: A new formal model of query and computation on the Web is presented, focusing on two important aspects that distinguish the access to Web data from the access to a standard database system: the navigational nature of the access and the lack of concurrency control.

27 citations


Proceedings ArticleDOI
01 May 1998
TL;DR: The requirements of a tourism hypermedia system resulting from ethnographic studies of tourist advisers are presented, and it is concluded that an open semantic hypermedia (SH) approach is appropriate.
Abstract: Web-based Public Information Systems of the kind common in tourism do not satisfy the needs of the customer because they do not offer a sufficiently flexible linking environment capable of emulating the mediation role of a tourist adviser. We present the requirements of a tourism hypermedia system resulting from ethnographic studies of tourist advisers, and conclude that an open semantic hypermedia (SH) approach is appropriate. We present a novel and powerful SH prototype based on the use of a semantic model expressed as a terminology. The terminological model is implemented by a Description Logic, GRAIL, capable of the automatic and dynamic multi-dimensional classification of concepts, and hence the web pages they describe. We show how GRAIL-Link has been used within the TourisT hypermedia system and conclude with a discussion.

Book ChapterDOI
24 Aug 1998
TL;DR: Issues in web join, a new operator that combines information from two web tables to yield a new web table, are discussed.
Abstract: Recently, there has been increasing interest in data models and query languages for unstructured data in the World Wide Web. When web data is harnessed in a web warehouse, new and useful information can be derived through appropriate information manipulation. In our web warehousing project, we introduce a new operator called the web join. Like its relational counterpart, web join combines information from two web tables to yield a new web table. This paper discusses various issues in web join such as join semantics, joinability, and join evaluation.
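In the relational spirit the chapter invokes, a web join pairs tuples from two web tables that agree on a key. The sketch below uses lists of dicts keyed by "url" as a deliberate simplification of the paper's web-table model; all data is illustrative.

```python
def web_join(table_a, table_b, key="url"):
    """Combine tuples from two web tables that agree on the join key,
    yielding a new web table."""
    index = {row[key]: row for row in table_b}
    return [{**row, **index[row[key]]}
            for row in table_a if row[key] in index]

papers = [{"url": "http://a.org", "topic": "caching"}]
authors = [{"url": "http://a.org", "author": "Smith"},
           {"url": "http://b.org", "author": "Jones"}]
joined = web_join(papers, authors)
```

"Joinability" in this simplified picture reduces to the two tables sharing a key attribute; the paper's own semantics over linked document structures is richer than this.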

Journal ArticleDOI
15 Dec 1998
TL;DR: A new type of tags for denoting the semantics of data stored in HTML pages, implemented as HTML comments, superimposes on HTML pages semistructured objects in the style of the OEM model.
Abstract: Current query languages for the Web (e.g., W3QL, WebLog and WebSQL) explore the structure of the Web. However, usually, the structure of the Web has little to do with the semantics of the data. Therefore, it is practically difficult to pose database queries over the Web. We introduce a new type of tags for denoting the semantics of data stored in HTML pages. These semantic tags (implemented as HTML comments) superimpose on HTML pages semistructured objects in the style of the OEM model. The paper discusses two implemented tools for fully utilizing the semantics. The first is a visualization tool for displaying both the HTML reading of Web pages and the OEM reading of Web pages. The second tool is a query language, similar to LOREL, that can query the HTML structure and/or the OEM reading. The above formalism and tools provide data-modeling capabilities for the Web that fit its heterogeneous nature. Real database queries, taking the OEM point of view, can be formulated, including queries about the schema as well as queries about the HTML structure of Web pages. Therefore, the query language is not restricted to portions of the Web in which semantic tags are used.
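The mechanism, machine-readable annotations hidden in HTML comments that a browser ignores but a query tool can read, can be sketched as follows. The `sem:` comment syntax here is invented for illustration; the paper defines its own tag format.

```python
import re

page = """<html><body>
<!-- sem: book.title = "Foundations of Databases" -->
<h1>Foundations of Databases</h1>
<!-- sem: book.year = "1995" -->
</body></html>"""

def extract_objects(html):
    """Collect sem: annotations into nested objects, OEM-style:
    object name -> {attribute: value}."""
    objects = {}
    pattern = r'<!--\s*sem:\s*(\w+)\.(\w+)\s*=\s*"([^"]*)"\s*-->'
    for obj, attr, value in re.findall(pattern, html):
        objects.setdefault(obj, {})[attr] = value
    return objects
```

A browser renders only the `<h1>`, while a query processor sees a structured `book` object, which is the dual HTML/OEM reading the abstract describes.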

Book ChapterDOI
21 Sep 1998
TL;DR: This work studies the problem of semantic caching of Web queries and develops a caching mechanism for conjunctive Web queries based on signature files that extends this processing to more complex cases of semantic intersection.
Abstract: In digital libraries accessing distributed Web-based bibliographic repositories, performance is a major issue. Efficient query processing requires an appropriate caching mechanism. Unfortunately, standard page-based as well as tuple-based caching mechanisms designed for conventional databases are not efficient on the Web, where keyword-based querying is often the only way to retrieve data. Therefore, we study the problem of semantic caching of Web queries and develop a caching mechanism for conjunctive Web queries based on signature files. We propose two implementation choices. A first algorithm copes with the relation of semantic containment between a query and the corresponding cache items. A second algorithm extends this processing to more complex cases of semantic intersection. We report results of experiments and show how the caching mechanism is successfully realized in the Knowledge Broker system.
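The signature-file idea can be sketched compactly: each conjunctive keyword query gets a bit signature, and semantic containment between a cached query and a new, more restrictive one becomes a cheap bitwise superset test. The hash choice and signature width below are illustrative, and real signature files must also handle the false positives that bit collisions cause.

```python
import zlib

def signature(keywords, bits=64):
    """Set one bit per keyword (deterministic CRC32 hash)."""
    sig = 0
    for kw in keywords:
        sig |= 1 << (zlib.crc32(kw.encode()) % bits)
    return sig

def may_contain(cache_sig, query_sig):
    """A cached result for query Q can answer any query that adds
    extra conjuncts to Q: every bit of the cached signature must
    appear in the new query's signature."""
    return cache_sig & query_sig == cache_sig
```

A cache entry for `semantic AND caching` can thus answer `semantic AND caching AND web` by filtering its stored results, without contacting the remote repository.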

Proceedings Article
01 Jan 1998
TL;DR: In this scheme, tasks for query translation/capability mapping (named as query naturalization) between wrappers and web sources and tasks for semantic caching are seamlessly integrated, resulting in easier query optimization.
Abstract: A semantic caching scheme suitable for web database environments is proposed. In our scheme, tasks for query translation/capability mapping (named as query naturalization) between wrappers and web sources and tasks for semantic caching are seamlessly integrated, resulting in easier query optimization. A semantic cache consists of three components: 1) semantic view, a description of the contents in the cache using sub-expressions of the previous queries, 2) semantic index, an index for the tuple IDs that satisfy the semantic view, and 3) physical storage, a storage containing the tuples (or objects) that are shared by all semantic views in the cache. Types of matching between the native query and cache query are discussed. Algorithms for finding the optimal match of the input query in the semantic cache and for cache replacement are presented. The proposed techniques are being implemented in a cooperative web database (CoWeb) prototype at UCLA.
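The three components named in the abstract map directly onto a small data structure. The sketch below follows those component names but simplifies everything else (exact-match lookup only, where the paper also discusses partial matches and replacement):

```python
class SemanticCache:
    def __init__(self):
        self.views = {}    # semantic view: query sub-expression -> view id
        self.index = {}    # semantic index: view id -> tuple IDs in the view
        self.storage = {}  # physical storage: tuple ID -> tuple, shared by views

    def insert(self, view_expr, tuples):
        vid = len(self.views)
        self.views[view_expr] = vid
        tids = []
        for t in tuples:
            tid = hash(t)
            self.storage[tid] = t  # identical tuples are stored only once
            tids.append(tid)
        self.index[vid] = tids

    def lookup(self, view_expr):
        """Exact-match lookup of a previous query's sub-expression."""
        if view_expr not in self.views:
            return None
        return [self.storage[tid] for tid in self.index[self.views[view_expr]]]
```

Sharing tuples through `storage` is what lets overlapping semantic views coexist without duplicating their common results.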

01 Jan 1998
TL;DR: Since the functionality of wrappers and mediators is integrated into a single declarative language, the development of advanced applications based on the Web as an information source is significantly simplified.

Abstract: Languages supporting deduction and object orientation seem particularly promising for querying and reasoning about the structure and contents of the Web, and for the integration of information from heterogeneous sources. Florid, an implementation of the deductive object-oriented language F-logic, has been extended to provide a declarative semantics for querying the Web. This extension allows extraction and restructuring of data from the Web and a seamless integration with local data. Since the functionality of wrappers and mediators is integrated into a single declarative language, the development of advanced applications based on the Web as an information source is significantly simplified. This claim is substantiated using a comprehensive example.

Journal ArticleDOI
TL;DR: The goal of the Empirical Web Analysis (EWA) project was to investigate the discrepancy between commercial and academic Web design, paying special attention to these new features of industry related Web pages.
Abstract: Frequent users of the Web will have noticed an emerging discrepancy between university Web pages and commercial sites. An abundance of animated GIFs, image maps, fancy background images, frames and advanced font handling are characteristic of industry related Web pages. Web designers in academia still remain closer to the original purpose of HTML: to delineate the structure and semantics of a document and not its presentation in a browser. The goal of the Empirical Web Analysis (EWA) project was to investigate the discrepancy between commercial and academic Web design, paying special attention to these new features.

Proceedings ArticleDOI
04 Nov 1998
TL;DR: With the release and widespread support of XML (extensible markup language) and the development of MathML, Web pages not only can display mathematics and equations in TeX-like fashion, but, beyond that, retain the meaning of the equations so that they can be opened and processed by a variety of mathematical software applications.
Abstract: One of the ironies of the World Wide Web (WWW or simply the Web) is that even though it was initially conceived and implemented for use by physicists, it provided no special capabilities for mathematics and equations. With the release and widespread support of XML (extensible markup language) and the development of MathML, Web pages not only can display mathematics and equations in TeX-like fashion, but, beyond that, retain the meaning of the equations so that they can be opened and processed by a variety of mathematical software applications. The Web thus can expand the scope of its inherent intense interactivity to include equations and mathematics, as well as text and multimedia.
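The point that MathML retains the meaning of an equation rather than just its appearance can be made concrete with a tiny fragment: the markup below encodes "x squared" as structure a program can walk. The element names (`msup`, `mi`, `mn`) are MathML's; the builder code is our illustration.

```python
import xml.etree.ElementTree as ET

# Build the MathML presentation tree for x^2.
math = ET.Element("math")
msup = ET.SubElement(math, "msup")    # superscript: base then exponent
ET.SubElement(msup, "mi").text = "x"  # mi = identifier
ET.SubElement(msup, "mn").text = "2"  # mn = number

markup = ET.tostring(math, encoding="unicode")

# Unlike a rendered image of the formula, software can recover
# the base and the exponent from the structure.
base, exponent = math.find("msup")
```

This recoverable structure is what lets the same Web page be both displayed in TeX-like quality and opened for processing by mathematical software.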

Proceedings ArticleDOI
01 Jun 1998
TL;DR: The IDEA Web Laboratory (Web Lab), a Web-based software design environment available on the Internet, is presented, which demonstrates a novel approach to the software production process on the Web.
Abstract: With the spreading of the World Wide Web as a uniform and ubiquitous interface to computer applications and information, novel opportunities are offered for introducing significant changes in all organizations and their processes. This demo presents the IDEA Web Laboratory (Web Lab), a Web-based software design environment available on the Internet, which demonstrates a novel approach to the software production process on the Web.

Journal Article
TL;DR: NetQL as discussed by the authors is a query language designed for local structure-based queries, which not only exploits the topology of web pages given by hyperlinks, but also supports queries involving information inside pages.
Abstract: With the increasing importance of the World Wide Web as an information repository, how to locate documents of interest becomes more and more significant. The current practice is to send keywords to search engines. However, these search engines lack the capability to take the structure of the Web into consideration. We thus present a novel query language, NetQL, and its implementation, for accessing the World Wide Web. Rather than working on global full-text search, NetQL is designed for local structure-based queries. It not only exploits the topology of web pages given by hyperlinks, but also supports queries involving information inside pages. A novel approach to extract information from web pages is presented. In addition, the methods to control the complexity of query processing are also addressed in this paper.

Journal ArticleDOI
Yves Poumay
22 May 1998-Science

Proceedings Article
01 Jan 1998

Proceedings Article
01 Jan 1998


Journal Article
TL;DR: This paper discusses and shows how two web operators are used to associate related web information from the WWW and also from multiple web tables in a web warehouse.
Abstract: Web information coupling refers to an association of topically related web documents. This coupling is initiated explicitly by a user in a web warehouse specially designed for web information. Web information coupling provides the means to derive additional, useful information from the WWW. In this paper, we discuss and show how two web operators, i.e., global web coupling and local web coupling, are used to associate related web information from the WWW and also from multiple web tables in a web warehouse. This paper discusses various issues in web coupling such as coupling semantics, coupling compatibility, and coupling evaluation.
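The two operators can be sketched over the same simplified web-table layout (lists of dicts); the function names and predicates below are ours, standing in for the paper's operator definitions.

```python
def global_coupling(web, query):
    """Build a web table directly from a set of WWW documents by
    retaining those matching the user's coupling query."""
    return [doc for doc in web if query(doc)]

def local_coupling(table_a, table_b, related):
    """Pair documents drawn from two existing web tables in the
    warehouse whose contents are related."""
    return [(a, b) for a in table_a for b in table_b if related(a, b)]

web = [{"url": "a", "topic": "genome"}, {"url": "b", "topic": "cars"}]
genome_table = global_coupling(web, lambda d: d["topic"] == "genome")
other_table = [{"url": "c", "topic": "genome"}]
pairs = local_coupling(genome_table, other_table,
                       lambda a, b: a["topic"] == b["topic"])
```

Global coupling populates the warehouse from the Web; local coupling then relates documents across the web tables already materialized there.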