Journal•ISSN: 1432-1300

International Journal on Digital Libraries

Springer Science+Business Media

About: International Journal on Digital Libraries is an academic journal published by Springer Science+Business Media. The journal publishes mainly in the areas of digital libraries and computer science. Its ISSN is 1432-1300. Over its lifetime, the journal has published 509 publications, which have received 13,889 citations.

Topics: Digital library, Computer science, Metadata, Digital preservation, The Internet

Papers

Journal Article

The Lorel Query Language for Semistructured Data

Serge Abiteboul¹, Dallan Quass¹, Jason G. McHugh¹, Jennifer Widom¹, Janet L. Wiener¹•Institutions (1)

Stanford University¹

01 Apr 1997-International Journal on Digital Libraries

TL;DR: The main novelties of the Lorel language are the extensive use of coercion to relieve the user from the strict typing of OQL, which is inappropriate for semistructured data; and powerful path expressions, which permit a flexible form of declarative navigational access and are particularly suitable when the details of the structure are not known to the user.

Abstract: We present the Lorel language, designed for querying semistructured data. Semistructured data is becoming more and more prevalent, e.g., in structured documents such as HTML and when performing simple integration of data from multiple sources. Traditional data models and query languages are inappropriate, since semistructured data often is irregular: some data is missing, similar concepts are represented using different types, heterogeneous sets are present, or object structure is not fully known. Lorel is a user-friendly language in the SQL/OQL style for querying such data effectively. For wide applicability, the simple object model underlying Lorel can be viewed as an extension of the ODMG data model and the Lorel language as an extension of OQL. The main novelties of the Lorel language are: (i) the extensive use of coercion to relieve the user from the strict typing of OQL, which is inappropriate for semistructured data; and (ii) powerful path expressions, which permit a flexible form of declarative navigational access and are particularly suitable when the details of the structure are not known to the user. Lorel also includes a declarative update language. Lorel is implemented as the query language of the Lore prototype database management system at Stanford. Information about Lore can be found at http://www-db.stanford.edu/lore. In addition to presenting the Lorel language in full, this paper briefly describes the Lore system and query processor. We also briefly discuss a second implementation of Lorel on top of a conventional object-oriented database management system, the O2 system.

1,257 citations

Journal Article

Automatic recognition of multi-word terms: the C-value/NC-value method

Katerina T. Frantzi¹, Sophia Ananiadou¹, Hideki Mima²•Institutions (2)

University of Manchester¹, University of Tokyo²

01 Aug 2000-International Journal on Digital Libraries

TL;DR: This paper presents a domain-independent method for the automatic extraction of multi-word terms from machine-readable special language corpora using C-value/NC-value, which enhances the common statistical measure of frequency of occurrence for term extraction, making it sensitive to a particular type of multi-word term, the nested terms.

Abstract: Technical terms (henceforth called terms) are important elements for digital libraries. In this paper we present a domain-independent method for the automatic extraction of multi-word terms from machine-readable special language corpora. The method (C-value/NC-value) combines linguistic and statistical information. The first part, C-value, enhances the common statistical measure of frequency of occurrence for term extraction, making it sensitive to a particular type of multi-word term, the nested terms. The second part, NC-value, gives: 1) a method for the extraction of term context words (words that tend to appear with terms); 2) the incorporation of information from term context words into the extraction of terms.
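
To illustrate how a frequency measure can be made sensitive to nested terms, the following is a minimal sketch of the C-value score as it is commonly cited from this paper; the abstract itself does not give the formula, so treat the exact form as an assumption. A candidate's frequency is reduced by the average frequency of the longer candidates that contain it, then weighted by the base-2 logarithm of its length in words. The function names and the example counts are hypothetical.

```python
import math

def contains(longer, shorter):
    """True if `shorter` occurs as a contiguous word sequence in a strictly longer term."""
    n = len(shorter)
    return len(longer) > n and any(
        longer[i:i + n] == shorter for i in range(len(longer) - n + 1)
    )

def c_values(freq):
    """Score candidate terms; `freq` maps tuples of words to corpus frequencies."""
    scores = {}
    for term, f in freq.items():
        # Longer candidate terms in which this term appears nested.
        containers = [t for t in freq if contains(t, term)]
        adjusted = f
        if containers:
            # Subtract the average frequency of the containing terms.
            adjusted = f - sum(freq[t] for t in containers) / len(containers)
        # Weight by term length in words (log2), as in the commonly cited formula.
        scores[term] = math.log2(len(term)) * adjusted
    return scores

# Hypothetical counts, for illustration only.
example = {("soft", "contact", "lens"): 5, ("contact", "lens"): 8}
scores = c_values(example)
```

In this sketch the nested candidate ("contact", "lens") is credited only for the occurrences not explained by the longer term that contains it. Single-word candidates score zero because log2(1) = 0, which fits a method aimed at multi-word terms.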

849 citations

Journal Article

Research-paper recommender systems: a literature survey

Joeran Beel, Bela Gipp¹, Stefan Langer², Corinna Breitinger³•Institutions (3)

University of Konstanz¹, Otto-von-Guericke University Magdeburg², Linnaeus University³

01 Nov 2016-International Journal on Digital Libraries

TL;DR: Several actions could improve the research landscape: developing a common evaluation framework, agreement on the information to include in research papers, a stronger focus on non-accuracy aspects and user modeling, a platform for researchers to exchange information, and an open-source framework that bundles the available recommendation approaches.

Abstract: In the last 16 years, more than 200 research articles were published about research-paper recommender systems. We reviewed these articles and present some descriptive statistics in this paper, as well as a discussion about the major advancements and shortcomings and an overview of the most common recommendation concepts and approaches. We found that more than half of the recommendation approaches applied content-based filtering (55 %). Collaborative filtering was applied by only 18 % of the reviewed approaches, and graph-based recommendations by 16 %. Other recommendation concepts included stereotyping, item-centric recommendations, and hybrid recommendations. The content-based filtering approaches mainly utilized papers that the users had authored, tagged, browsed, or downloaded. TF-IDF was the most frequently applied weighting scheme. In addition to simple terms, n-grams, topics, and citations were utilized to model users' information needs. Our review revealed some shortcomings of the current research. First, it remains unclear which recommendation concepts and approaches are the most promising. For instance, researchers reported different results on the performance of content-based and collaborative filtering. Sometimes content-based filtering performed better than collaborative filtering and sometimes it performed worse. We identified three potential reasons for the ambiguity of the results. (A) Several evaluations had limitations. They were based on strongly pruned datasets, few participants in user studies, or did not use appropriate baselines. (B) Some authors provided little information about their algorithms, which makes it difficult to re-implement the approaches. Consequently, researchers use different implementations of the same recommendation approaches, which might lead to variations in the results. (C) We speculated that minor variations in datasets, algorithms, or user populations inevitably lead to strong variations in the performance of the approaches. Hence, finding the most promising approaches is a challenge. As a second limitation, we noted that many authors neglected to take into account factors other than accuracy, for example overall user satisfaction. In addition, most approaches (81 %) neglected the user-modeling process and did not infer information automatically but let users provide keywords, text snippets, or a single paper as input. Information on runtime was provided for 10 % of the approaches. Finally, few research papers had an impact on research-paper recommender systems in practice. We also identified a lack of authority and long-term research interest in the field: 73 % of the authors published no more than one paper on research-paper recommender systems, and there was little cooperation among different co-author groups. We concluded that several actions could improve the research landscape: developing a common evaluation framework, agreement on the information to include in research papers, a stronger focus on non-accuracy aspects and user modeling, a platform for researchers to exchange information, and an open-source framework that bundles the available recommendation approaches.

648 citations

Journal Article

Querying the World Wide Web

Alberto O. Mendelzon¹, George A. Mihaila¹, Tova Milo²•Institutions (2)

University of Toronto¹, Tel Aviv University²

01 Apr 1997-International Journal on Digital Libraries

TL;DR: This paper proposes a query language, WebSQL, that takes advantage of multiple index servers without requiring users to know about them, and that integrates textual retrieval with structure and topology-based queries.

Abstract: The World Wide Web is a large, heterogeneous, distributed collection of documents connected by hypertext links. The most common technology currently used for searching the Web depends on sending information retrieval requests to "index servers". One problem with this is that these queries cannot exploit the structure and topology of the document network. The authors propose a query language, WebSQL, that takes advantage of multiple index servers without requiring users to know about them, and that integrates textual retrieval with structure and topology-based queries. They give a formal semantics for WebSQL using a calculus based on a novel "virtual graph" model of a document network. They propose a new theory of query cost based on the idea of "query locality," that is, how much of the network must be visited to answer a particular query. Finally, they describe a prototype implementation of WebSQL written in Java.

402 citations

Journal Article

Content-based image indexing and searching using Daubechies' wavelets

James Z. Wang¹, Gio Wiederhold¹, Oscar Firschein¹, Sha Xin Wei¹•Institutions (1)

Stanford University¹

03 Mar 1998-International Journal on Digital Libraries

TL;DR: This paper describes WBIIS (Wavelet-Based Image Indexing and Searching), a new image indexing and retrieval algorithm with partial sketch image searching capability for large image databases, which performs much better in capturing coherence of image, object granularity, local color/texture, and bias avoidance than traditional color layout algorithms.

Abstract: This paper describes WBIIS (Wavelet-Based Image Indexing and Searching), a new image indexing and retrieval algorithm with partial sketch image searching capability for large image databases. The algorithm characterizes the color variations over the spatial extent of the image in a manner that provides semantically meaningful image comparisons. The indexing algorithm applies a Daubechies' wavelet transform for each of the three opponent color components. The wavelet coefficients in the lowest few frequency bands, and their variances, are stored as feature vectors. To speed up retrieval, a two-step procedure is used that first does a crude selection based on the variances, and then refines the search by performing a feature vector match between the selected images and the query. For better accuracy in searching, two-level multiresolution matching may also be used. Masks are used for partial-sketch queries. This technique performs much better in capturing coherence of image, object granularity, local color/texture, and bias avoidance than traditional color layout algorithms. WBIIS is much faster and more accurate than traditional algorithms. When tested on a database of more than 10 000 general-purpose images, the best 100 matches were found in 3.3 seconds.
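
The two-step retrieval procedure described in the abstract can be sketched roughly as follows. This is a simplified illustration, not the WBIIS implementation: it assumes each image is summarized by a single variance value and a short feature vector, and the function name, the `variance_tolerance` parameter, and the Euclidean distance are all assumptions made for the example.

```python
import math

def two_step_search(query, database, variance_tolerance=0.5, top_k=3):
    """Return the names of the best matches for `query`.

    `query` and each database entry are dicts with a 'variance' (float)
    and a 'features' (list of floats) key. Both are hypothetical stand-ins
    for the stored wavelet coefficients and their variances.
    """
    # Step 1: crude selection, keeping only images whose variance is
    # within a relative tolerance of the query's variance.
    limit = variance_tolerance * query["variance"]
    candidates = [
        (name, entry)
        for name, entry in database.items()
        if abs(entry["variance"] - query["variance"]) <= limit
    ]
    # Step 2: refine by ranking the survivors on feature-vector distance.
    ranked = sorted(
        candidates,
        key=lambda item: math.dist(item[1]["features"], query["features"]),
    )
    return [name for name, _ in ranked[:top_k]]

# Hypothetical toy data, for illustration only.
images = {
    "a": {"variance": 1.1, "features": [0.1, 0.0]},
    "b": {"variance": 0.9, "features": [1.0, 1.0]},
    "c": {"variance": 5.0, "features": [0.0, 0.0]},
}
matches = two_step_search({"variance": 1.0, "features": [0.0, 0.0]}, images)
```

The point of the cheap first pass is that the more expensive vector comparison runs only on a reduced candidate set. In the toy data, image "c" is dropped in step 1 even though its feature vector matches the query exactly, which shows the trade-off a coarse filter introduces.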

346 citations

Performance

Metrics

Papers: 517

Citations: 13,889

No. of papers from the Journal in previous years
Year	Papers
2023	31
2022	25
2021	38
2020	24
2019	25
2018	27