
Showing papers presented at "ACM international conference on Digital libraries in 2007"


Book ChapterDOI
13 Feb 2007
TL;DR: An implementation framework called MESSIF is described that, thanks to its open and modular design, provides a number of modules ranging from basic storage management, through wide support for distributed processing, to automatic collection of performance statistics.
Abstract: Similarity search has become a fundamental computational task in many applications. One of the mathematical models of similarity - the metric space - has drawn the attention of many researchers, resulting in several sophisticated metric-indexing techniques. An important part of research in this area is typically a prototype implementation and subsequent experimental evaluation of the proposed data structure. This paper describes an implementation framework called MESSIF that eases the task of building such prototypes. It provides a number of modules ranging from basic storage management, through wide support for distributed processing, to automatic collection of performance statistics. Due to its open and modular design, it is also easy to implement additional modules if necessary. MESSIF also offers several ready-to-use generic clients for controlling and testing the index structures.

68 citations
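The metric-space abstraction MESSIF builds on can be illustrated with a minimal sketch: an index stores objects and answers range and k-nearest-neighbour queries through a user-supplied metric. The class below is a hypothetical sequential-scan baseline in Python - the simplest structure a framework like MESSIF lets you compare sophisticated indexes against - and is not MESSIF's actual API.

```python
import heapq

class MetricIndex:
    """Minimal sequential-scan metric index (illustrative sketch)."""

    def __init__(self, distance):
        self.distance = distance  # any metric d(x, y)
        self.objects = []

    def insert(self, obj):
        self.objects.append(obj)

    def range_query(self, query, radius):
        """All objects within `radius` of `query`."""
        return [o for o in self.objects if self.distance(query, o) <= radius]

    def knn_query(self, query, k):
        """The k objects nearest to `query`."""
        return heapq.nsmallest(k, self.objects,
                               key=lambda o: self.distance(query, o))

# Example: 2-D points under the Euclidean metric.
euclid = lambda a, b: ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
idx = MetricIndex(euclid)
for p in [(0, 0), (1, 1), (5, 5), (2, 2)]:
    idx.insert(p)
print(idx.range_query((0, 0), 1.5))  # -> [(0, 0), (1, 1)]
print(idx.knn_query((0, 0), 2))      # -> [(0, 0), (1, 1)]
```

Because the index only ever calls `distance`, the same code works for any metric - edit distance on strings, a color-histogram distance on images - which is exactly the generality the metric-space model buys.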


Book ChapterDOI
13 Feb 2007
TL;DR: IFLA and ICOM are merging their core ontologies, an important step towards semantic interoperability of metadata schemata across all archives, libraries and museums, which opens new prospects for advanced global information integration services.
Abstract: Even though the Dublin Core Metadata Element Set is well accepted as a general solution, it fails to describe more complex information assets and their cross-correlation. These include data from political history, history of arts and sciences, archaeology or observational data from natural history or geosciences. Therefore IFLA and ICOM are merging their core ontologies, an important step towards semantic interoperability of metadata schemata across all archives, libraries and museums. It opens new prospects for advanced global information integration services. The first draft of the combined model was published in June 2006.

48 citations


Book ChapterDOI
13 Feb 2007
TL;DR: The DelosDLMS combines text and audio-visual searching, offers new information visualization and relevance feedback tools, provides novel interfaces, allows retrieved information to be annotated and processed, integrates and processes sensor data streams, and is easily configured and adapted while being reliable and scalable.
Abstract: DelosDLMS is a prototype of a next-generation Digital Library (DL) management system. It is realized by combining various specialized DL functionalities provided by partners of the DELOS network of excellence. Currently, DelosDLMS combines text and audio-visual searching, offers new information visualization and relevance feedback tools, provides novel interfaces, allows retrieved information to be annotated and processed, integrates and processes sensor data streams, and finally, from a systems engineering point of view, is easily configured and adapted while being reliable and scalable. The prototype is based on the OSIRIS/ISIS platform, a middleware environment developed by ETH Zurich and now being extended at the University of Basel.

32 citations


Book ChapterDOI
13 Feb 2007
TL;DR: This paper describes the evaluation methodology developed in INEX, with particular focus on how evaluation metrics and the notion of relevance are treated.
Abstract: Evaluating the effectiveness of XML retrieval requires building test collections where the evaluation paradigms are provided according to criteria that take into account structural aspects. The INitiative for the Evaluation of XML retrieval (INEX) was set up in 2002, and aimed to establish an infrastructure and to provide means, in the form of large test collections and appropriate scoring methods, for evaluating the effectiveness of content-oriented XML retrieval. This paper describes the evaluation methodology developed in INEX, with particular focus on how evaluation metrics and the notion of relevance are treated.

27 citations


Book ChapterDOI
13 Feb 2007
TL;DR: The medium/long-term aim is to create a large-scale Digital Library System (DLS) of scientific data which supports services for the creation, interpretation and use of multidisciplinary and multilingual digital content.
Abstract: Information Retrieval system evaluation campaigns produce valuable scientific data, which should be preserved carefully so that they can be available for further studies. A complete record should be maintained of all analyses and interpretations in order to ensure that they are reusable in attempts to replicate particular results or in new research and so that they can be referred to or cited at any time. In this paper, we describe the data curation approach for the scientific data produced by evaluation campaigns. The medium/long-term aim is to create a large-scale Digital Library System (DLS) of scientific data which supports services for the creation, interpretation and use of multidisciplinary and multilingual digital content.

25 citations


Book ChapterDOI
13 Feb 2007
TL;DR: XS2OWL, a model and system that enables the automatic transformation of XML Schemas into OWL-DL, is developed; it also enables the consistent transformation of the derived knowledge from OWL-DL back into XML constructs that obey the original XML Schemas.
Abstract: The domination of XML on the Internet for data exchange has led to the development of standards with XML Schema syntax for several application domains. Advanced semantic support, provided by domain ontologies and Semantic Web tools like logic-based reasoners, is still very useful for many applications. To provide it, interoperability between XML Schema and OWL is necessary, so that XML Schemas can be converted to OWL. This way, the semantics of the standards can be enriched with domain knowledge encoded in OWL domain ontologies, and further semantic processing may take place. In order to achieve interoperability between XML Schema and OWL, we have developed XS2OWL, a model and a system, presented in this paper, that enable the automatic transformation of XML Schemas into OWL-DL. XS2OWL also enables the consistent transformation of the derived knowledge (individuals) from OWL-DL back into XML constructs that obey the original XML Schemas.

23 citations
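To make the direction of such a transformation concrete, here is a deliberately tiny sketch of an XML Schema to OWL mapping: each simple-typed `xs:element` becomes an `owl:DatatypeProperty` whose range is the corresponding XSD datatype. The mapping table and Turtle output are illustrative assumptions only; XS2OWL's real model covers far more constructs (complex types, cardinalities, back-conversion of individuals).

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"
# Hypothetical, simplified datatype map; the real mapping is much richer.
XSD2OWL = {"xs:string": "xsd:string", "xs:integer": "xsd:integer",
           "xs:date": "xsd:date"}

def schema_to_owl(xsd_text):
    """Emit one owl:DatatypeProperty (in Turtle) per simple-typed
    xs:element found in the schema."""
    root = ET.fromstring(xsd_text)
    lines = []
    for el in root.iter(XS + "element"):
        name, xstype = el.get("name"), el.get("type")
        if name and xstype in XSD2OWL:
            lines.append(f":{name} a owl:DatatypeProperty ; "
                         f"rdfs:range {XSD2OWL[xstype]} .")
    return "\n".join(lines)

xsd = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="title" type="xs:string"/>
  <xs:element name="year" type="xs:integer"/>
</xs:schema>"""
print(schema_to_owl(xsd))
```

Running this prints one Turtle triple per element (`:title` ranging over `xsd:string`, `:year` over `xsd:integer`), which is the shape of output a schema-to-ontology converter produces.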


Book ChapterDOI
13 Feb 2007
TL;DR: The first main contribution of this paper is the investigation of theoretical issues in task inference, and it is shown how the use of the Personal Ontology helps in computing simple task inference.
Abstract: The goal of DELOS Task 4.8 Task-centered Information Management is to provide the user with a Task-centered Information Management system (TIM), which automates the user's most frequent activities by exploiting the collection of personal documents. In previous work we explored the issue of managing personal data by enriching them with semantics according to a Personal Ontology, i.e. a user-tailored description of her domain of interest. Moreover, we proposed a task specification language and a top-down approach to task inference, where the user specifies the main aspects of the tasks using forms of declarative scripting. Recently, we have addressed new challenging issues related to the inference of TIM users' tasks. More precisely, the first main contribution of this paper is the investigation of theoretical issues in task inference. In particular, we show how the use of the Personal Ontology helps in computing simple task inference. The second contribution is an architecture for the system that implements simple task inference. We are currently implementing a TIM prototype based on the architecture presented in this paper.

21 citations


Book ChapterDOI
13 Feb 2007
TL;DR: A general methodology for gathering and mining information from Web log files is proposed, derived by abstracting from the analysis of logs that use a well-defined standard format, such as the Extended Log File Format proposed by the W3C.
Abstract: In this paper, a general methodology for gathering and mining information from Web log files is proposed. A series of tools to retrieve, store, and analyze the data extracted from log files have been designed and implemented. The aim is to derive general methods by abstracting from the analysis of logs that use a well-defined standard format, such as the Extended Log File Format proposed by the W3C. The methodology has been tested on the Web log files of The European Library portal; the experimental analyses led to personal, technical, geographical and temporal findings about usage and traffic load. Considerations about more accurate tracking of users and user profiles, and better management of crawler accesses using authentication, are also presented.

18 citations
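Logs in the W3C Extended Log File Format are self-describing: a `#Fields:` directive names the columns, and each subsequent non-directive line is a space-separated record. A minimal parser along those lines is sketched below (real logs may additionally quote values containing spaces, which this sketch ignores; the sample entries are invented).

```python
def parse_extended_log(text):
    """Parse a W3C Extended Log File Format log into a list of dicts,
    using the most recent '#Fields:' directive as the column names."""
    fields, records = [], []
    for line in text.splitlines():
        if line.startswith("#Fields:"):
            fields = line.split()[1:]          # column names
        elif line and not line.startswith("#"):  # skip other directives
            records.append(dict(zip(fields, line.split())))
    return records

log = """#Version: 1.0
#Fields: date time c-ip cs-method cs-uri-stem sc-status
2007-02-13 10:15:01 192.0.2.7 GET /portal/search 200
2007-02-13 10:15:04 192.0.2.7 GET /portal/record/42 404
"""
recs = parse_extended_log(log)
print(recs[0]["c-ip"], recs[1]["sc-status"])  # -> 192.0.2.7 404
```

With records in this dict form, the temporal and geographical analyses the abstract mentions reduce to grouping on the `date`/`time` and `c-ip` fields.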


Book ChapterDOI
13 Feb 2007
TL;DR: A semantic relatedness measure for OWL domain ontologies that leads to the semantic ranking of ontological, grammatically-related structures is proposed and evaluated.
Abstract: We present in this paper the design and implementation of the OntoNL Framework, a natural language interface generator for knowledge repositories, as well as a natural language system for interactions with multimedia repositories which was built using the OntoNL Framework. The system allows users to specify natural language requests about the multimedia content with rich semantics that result in digital content delivery. We propose and evaluate a semantic relatedness measure for OWL domain ontologies that leads to the semantic ranking of ontological, grammatically-related structures. This procedure is used to disambiguate natural language expressions in a particular domain of context and to represent them in an ontology query language; the ontology query language that we use is SPARQL. The construction of the queries is automated and dependent on the semantic relatedness measurement of ontology concepts. We also present the results of experimentation with the system.

15 citations
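The automated query construction described above can be pictured as template filling: once disambiguation has selected an ontology class and property for the user's request, a SPARQL query is assembled around them. The template, URIs and function names below are hypothetical illustrations, not OntoNL's own.

```python
def build_sparql(cls, prop, keyword):
    """Assemble a SPARQL query retrieving instances of `cls` whose
    `prop` value matches `keyword` (case-insensitively)."""
    return (
        "SELECT ?item WHERE {\n"
        f"  ?item a <{cls}> ;\n"
        f"        <{prop}> ?v .\n"
        f'  FILTER regex(str(?v), "{keyword}", "i")\n'
        "}"
    )

# A request like "videos about goals", after disambiguation against a
# hypothetical multimedia ontology:
q = build_sparql("http://example.org/mm#Video",
                 "http://example.org/mm#title", "goal")
print(q)
```

The point of ranking by semantic relatedness is to choose which class/property pair gets slotted into such a template when the natural language request is ambiguous.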


Book ChapterDOI
13 Feb 2007
TL;DR: The proposed reference architecture is validated by describing the structure of two current systems, DILIGENT and DRIVER, which face the problem of delivering large-scale digital libraries in two different contexts and with diverse technologies.
Abstract: A reference architecture for a given domain provides an architectural template which can be used as a starting point for designing the software architecture of a system in that domain. Despite the popularity of tools and systems commonly termed "Digital Library", very few attempts exist to set the foundations governing their development, making integration and reuse of third-party assets and results very difficult. This paper presents a reference architecture for the Digital Library domain, characterised by many multidisciplinary and distributed players, both resource providers and consumers, whose requirements evolve over time. The paper validates this reference architecture by describing the structure of two current systems, DILIGENT and DRIVER, which face the problem of delivering large-scale digital libraries in two different contexts and with diverse technologies.

14 citations


Book ChapterDOI
13 Feb 2007
TL;DR: This work describes in detail the new MPEG-7 Perceptual 3D Shape Descriptor and provides a set of tests with different 3D object databases, mainly the Princeton Shape Benchmark.
Abstract: In this work, we describe in detail the new MPEG-7 Perceptual 3D Shape Descriptor and provide a set of tests with different 3D object databases, mainly the Princeton Shape Benchmark. For this purpose we created a function library called Retrieval-3D and fixed some bugs in the MPEG-7 eXperimentation Model (XM). We explain how to match the Attributed Relational Graph (ARG) of every 3D model with the modified nested Earth Mover's Distance (mnEMD). Finally, we compare our results with the best found in the literature, including the first MPEG-7 3D descriptor, i.e. the Shape Spectrum Descriptor.
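The mnEMD used above is a graph-matching variant of the Earth Mover's Distance. The underlying transport idea is easiest to see in one dimension, where the EMD between two equal-mass histograms reduces to summing absolute cumulative differences; the toy function below conveys only that intuition, not the paper's nested graph formulation.

```python
def emd_1d(p, q):
    """Earth Mover's Distance between two equal-mass 1-D histograms:
    the sum of absolute running (cumulative) differences, since mass
    moved past a bin boundary costs one unit per bin travelled."""
    assert abs(sum(p) - sum(q)) < 1e-9, "histograms must have equal mass"
    total, carry = 0.0, 0.0
    for pi, qi in zip(p, q):
        carry += pi - qi      # mass still to be moved past this bin
        total += abs(carry)   # moving it one bin costs |carry|
    return total

print(emd_1d([0, 1, 0], [0, 0, 1]))  # -> 1.0  (one unit moved one bin)
print(emd_1d([1, 0, 0], [0, 0, 1]))  # -> 2.0  (one unit moved two bins)
```

Unlike bin-by-bin distances (e.g. L1 on histograms), the EMD rewards near misses - mass in an adjacent bin costs little - which is why it suits perceptual shape comparison.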

Book ChapterDOI
13 Feb 2007
TL;DR: The paper presents the ISIS/OSIRIS system, which consists of a generic infrastructure for the reliable execution of distributed service-based applications (OSIRIS) and a set of dedicated Digital Library application services (ISIS) that provide, among others, content-based search in multimedia collections.
Abstract: Future information spaces such as Digital Libraries require new infrastructures that allow various kinds of functions to be used and combined in a unified and reliable way. The paradigm of service-oriented architectures (SOA) allows application functionality to be provided in a modular, self-contained way and to be combined individually. The paper presents the ISIS/OSIRIS system, which consists of a generic infrastructure for the reliable execution of distributed service-based applications (OSIRIS) and a set of dedicated Digital Library application services (ISIS) that provide, among others, content-based search in multimedia collections.

Book ChapterDOI
13 Feb 2007
TL;DR: Results from experiments using human labellers are presented to assist in genre characterisation and the prediction of obstacles which need to be overcome by an automated system, and to contribute to the process of creating a solid testbed corpus for extending automated genre classification and testing metadata extraction tools across genres.
Abstract: This paper examines genre classification of documents and its role in enabling the effective automated management of digital documents by digital libraries and other repositories. We have previously presented genre classification as a valuable step toward achieving automated extraction of descriptive metadata for digital material. Here, we present results from experiments using human labellers, conducted to assist in genre characterisation and the prediction of obstacles which need to be overcome by an automated system, and to contribute to the process of creating a solid testbed corpus for extending automated genre classification and testing metadata extraction tools across genres. We also describe the performance of two classifiers based on image and stylistic modeling features in labelling the data resulting from the agreement of three human labellers across fifteen genre classes.

Book ChapterDOI
13 Feb 2007
TL;DR: The aim of this work is to integrate different archival realities in order to provide a single point of public access to archival information, through a non-intrusive, flexible and scalable solution that preserves the archives' identity and autonomy.
Abstract: We present a solution to the problem of sharing metadata between different archives spread across a geographic region, in particular the archives of the Italian Veneto Region. Initially we analyze the Veneto Region information system, based on a domain gateway system called the "SIRV-INTEROP project", and propose a solution for providing advanced services on top of the regional archives. We deal with these issues in the context of the SIAR - Regional Archival Information System - project. The aim of this work is to integrate different archival realities in order to provide a single point of public access to archival information. Moreover, we propose a non-intrusive, flexible and scalable solution that preserves the archives' identity and autonomy.

Book ChapterDOI
13 Feb 2007
TL;DR: A semantic recommender system called ITem Recommender, able to disambiguate documents before using them to learn the user profile, is developed, together with a Conference Participant Advisor service, which relies on the profiles learned by ITem Recommender to build a personalized conference program.
Abstract: This paper describes the possible use of advanced content-based recommendation methods in the area of Digital Libraries. Content-based recommenders analyze documents previously rated by a target user and build a profile that is then exploited to recommend new interesting documents. One of the main limitations of traditional keyword-based approaches is that they are unable to capture the semantics of the user's interests, due to natural language ambiguity. We developed a semantic recommender system, called ITem Recommender, able to disambiguate documents before using them to learn the user profile. The Conference Participant Advisor service relies on the profiles learned by ITem Recommender to build a personalized conference program, in which relevant talks are highlighted according to the participant's interests.

Book ChapterDOI
13 Feb 2007
TL;DR: The DELOS Digital Preservation Testbed is used to evaluate various alternatives with respect to specific requirements for the preservation of master theses and presents the results.
Abstract: Digital preservation has turned into a pressing challenge for institutions with an obligation to preserve digital objects over years. A range of tools exist today to support the variety of preservation strategies, such as migration or emulation. Heterogeneous content, complex preservation requirements and goals, and untested tools make the selection of a preservation strategy very difficult. The Austrian National Library will have to preserve electronic theses and dissertations provided as PDF files and is thus investigating potential preservation solutions. The DELOS Digital Preservation Testbed is used to evaluate various alternatives with respect to specific requirements. It provides an approach for making informed and accountable decisions on which solution to implement in order to preserve digital objects for a given purpose. We analyse the performance of various preservation strategies with respect to the specified requirements for the preservation of master theses and present the results.

Book ChapterDOI
13 Feb 2007
TL;DR: A feasibility study investigating possible approaches to extend The European Library with multilingual information access finds that both approaches address the specific characteristics of TEL well, and that there is considerable potential for a combination of the two alternatives.
Abstract: A feasibility study was conducted within the confines of the DELOS Network of Excellence with the aim of investigating possible approaches to extend The European Library (TEL) with multilingual information access, i.e. the ability to use queries in one language to retrieve items in different languages. TEL uses a loose coupling of different search systems, and deals with very short information items. We address these two characteristics with two different approaches: the "isolated query translation" approach, and the "pseudo-translated expanded records" approach. The former approach has been studied together with its implications on the user interface, while the latter approach has been evaluated using a test collection of over 150,000 records from the TEL central index. We find that both approaches address the specific characteristics of TEL well, and that there is considerable potential for a combination of the two alternatives.

Book ChapterDOI
13 Feb 2007
TL;DR: This work outlines basic results on the problem and shows how intensions can be exploited for carrying out basic tasks on collections, establishing a connection between Digital Library management and data integration.
Abstract: Digital Library collections are an abstraction mechanism, endowed with an extension and an intension, similarly to predicates in logic. The extension of a collection is the set of objects that are members of the collection at a given point in time, while the intension is a description of the meaning of the collection, that is, the peculiar property that the members of the collection possess and that distinguishes the collection from other collections. This view reconciles the many types of collections found in Digital Library systems, but raises several problems, among which is how to automatically derive the intension from a given extension. This problem must be solved, for example, for the creation of a collection from a set of documents. We outline basic results on the problem and then show how intensions can be exploited to carry out basic tasks on collections, establishing a connection between Digital Library management and data integration.

Book ChapterDOI
13 Feb 2007
TL;DR: Novel approaches to reliable DSM within a DL infrastructure are presented, including information filtering operators, a declarative query engine called MXQuery, and efficient operator checkpointing to maintain high result quality of DSM.
Abstract: Data Stream Management (DSM) addresses the continuous processing of sensor data. DSM requires the combination of stream operators, which may run on different distributed devices, into stream processes. Due to recent advances in sensor technologies and wireless communication, the amount of information generated by DSM will increase significantly. In order to deal efficiently with this streaming information, Digital Library (DL) systems have to merge with DSM systems. Especially in healthcare, the continuous monitoring of patients at home (telemonitoring) will generate a significant amount of information stored in an e-health digital library (the electronic patient record). In order to stream-enable DL systems, we present an integrated data stream management and Digital Library infrastructure in this work. A vital requirement for healthcare applications, however, is that this infrastructure provides a high degree of reliability. In this paper, we present novel approaches to reliable DSM within a DL infrastructure. In particular, we propose information filtering operators, a declarative query engine called MXQuery, and efficient operator checkpointing to maintain high result quality of DSM. Furthermore, we present a demonstrator implementation of the integrated DSM and DL infrastructure, called OSIRIS-SE. OSIRIS-SE supports flexible and efficient failure handling to ensure complete and consistent continuous data stream processing and execution of DL processes even in the case of multiple failures.

Book ChapterDOI
13 Feb 2007
TL;DR: From a technical perspective, the paper addresses how services can be made available in a distributed way, how distributed P2P infrastructures for the management of EHRs can be evaluated, and how novel content-based access can be provided for multimedia EHRs.
Abstract: Digital Libraries (DLs) in eHealth are composed of electronic artefacts that are generated and owned by different healthcare providers. A major characteristic of eHealth DLs is that information is under the control of the organisation where the data has been produced. The electronic health record (EHR) of a patient therefore consists of a set of distributed artefacts and cannot be materialised for organisational reasons; rather, the EHR is a virtual entity. The virtual integration of an EHR is done by encompassing services provided by specialised application systems into processes. This paper reports, from an application point of view, on national and European attempts to standardise electronic health records. From a technical perspective, the paper addresses how services can be made available in a distributed way, how distributed P2P infrastructures for the management of EHRs can be evaluated, and how novel content-based access can be provided for multimedia EHRs.

Book ChapterDOI
13 Feb 2007
TL;DR: Three research results achieved during the first three years of activities carried out under the DELOS Network of Excellence are reported, including a method that opens the way to 3D object retrieval based on similarity of object parts.
Abstract: In this work, we report on three research results achieved during the first three years of activities carried out under Task 3.8 of the DELOS Network of Excellence. First, two approaches to 3D object description and matching for the purpose of 3D object retrieval have been defined. An approach based on curvature correlograms is used to globally represent and compare 3D objects according to the local similarity of their surface curvature. In contrast, a view-based approach using spin image signatures captures local and global information of 3D models by using a large number of views of the object, which are then grouped according to their similarities. These approaches have been integrated in task prototypes and are now being integrated into the DELOS DLMS. To open the way to 3D object retrieval based on similarity of object parts, a method for the automatic decomposition of 3D objects has been defined. This approach exploits Reeb graphs in order to capture topological information identifying the main protrusions of a 3D object.

Book ChapterDOI
13 Feb 2007
TL;DR: An expert evaluation for user requirement elicitation of an annotation system - the Digital Library Annotation Service (DiLAS), which facilitates collaborative information access and sharing - is described; the preliminary result is a set of requirements that will inform the next stage of DiLAS.
Abstract: We describe an expert evaluation for user requirement elicitation of an annotation system - the Digital Library Annotation Service (DiLAS) - that facilitates collaborative information access and sharing. An analytical evaluation was conducted as a Participatory Group Evaluation, which involved a presentation, beyond the written papers, of the objectives and rationale behind the development of the prototype. The empirical evaluation of DiLAS consisted of two experiments. The first was a bottom-up evaluation of the usability of the interface using a qualitative approach. The second stage of our evaluation moved towards a broader work context with a User- and Work-Centred Evaluation involving an entire collaborative task situation, which required knowledge sharing on a common real-life work task. This paper describes the first stage in an iterative evaluation process, and the preliminary result is a set of requirements that will inform the next stage of DiLAS.

Book ChapterDOI
13 Feb 2007
TL;DR: The objective of the work reported here is to provide an automatic, context-of-capture categorization, structure detection and segmentation of news broadcasts employing a multimodal semantic based approach.
Abstract: The objective of the work reported here is to provide an automatic, context-of-capture categorization, structure detection and segmentation of news broadcasts employing a multimodal semantic based approach. We assume that news broadcasts can be described with context-free grammars that specify their structural characteristics. We propose a system consisting of two main types of interoperating units: The recognizer unit consisting of several modules and a parser unit. The recognizer modules (audio, video and semantic recognizer) analyze the telecast and each one identifies hypothesized instances of features in the audiovisual input. A probabilistic parser analyzes the identifications provided by the recognizers. The grammar represents the possible structures a news telecast may have, so the parser can identify the exact structure of the analyzed telecast.

Book ChapterDOI
13 Feb 2007
TL;DR: A novel approach is presented that processes image similarity search queries using a technique inspired by text retrieval and, thanks to the possibility of using inverted files to support similarity searching, offers higher performance in terms of efficiency.
Abstract: Image similarity is typically evaluated using low-level features such as color histograms, textures, and shapes. Image similarity search algorithms require computing the similarity between the low-level features of the query image and those of the images in the database. Even if state-of-the-art access methods for similarity search reduce the set of features to be accessed and compared to the query, similarity search still has a high cost. In this paper we present a novel approach that processes image similarity search queries using a technique inspired by text retrieval. We propose an approach that automatically indexes images using visual terms chosen from a visual lexicon. Each visual term represents a typology of visual regions, according to various criteria. The visual lexicon is obtained by analyzing a training set of images to infer which typologies of visual regions are relevant. We have defined a weighting schema and a matching schema that, respectively, associate visual terms with images and compare images by means of the associated terms. We show that the proposed approach does not lose performance, in terms of effectiveness, with respect to other methods in the literature, and at the same time offers higher performance, in terms of efficiency, given the possibility of using inverted files to support similarity searching.
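The text-retrieval analogy can be sketched directly: treat each image as a bag of visual-term ids and reuse an inverted file with tf-idf weighting. Everything below (the term ids, file names, and exact weighting) is an illustrative assumption, not the paper's actual weighting or matching schema.

```python
import math
from collections import defaultdict

class VisualIndex:
    """Inverted file over 'visual terms': each image is a bag of term
    ids (e.g. region-cluster labels from a visual lexicon), scored
    with tf-idf exactly as in text retrieval."""

    def __init__(self, images):
        self.n = len(images)
        self.postings = defaultdict(list)   # term -> [(image_id, tf)]
        for img_id, terms in images.items():
            for t in set(terms):
                self.postings[t].append((img_id, terms.count(t)))

    def search(self, query_terms):
        """Rank images by accumulated tf-idf over the query's terms;
        only postings lists of the query terms are ever touched."""
        scores = defaultdict(float)
        for t in set(query_terms):
            posting = self.postings.get(t, [])
            if not posting:
                continue
            idf = math.log(self.n / len(posting))
            for img_id, tf in posting:
                scores[img_id] += tf * idf
        return sorted(scores.items(), key=lambda kv: -kv[1])

# Visual terms as small ints (cluster ids from a hypothetical lexicon).
idx = VisualIndex({"sunset.jpg": [1, 1, 2], "forest.jpg": [2, 3, 3],
                   "beach.jpg": [1, 2, 4]})
print(idx.search([1, 1, 4]))
```

The efficiency claim in the abstract comes from exactly this access pattern: scoring reads only the postings lists of the query's visual terms instead of comparing low-level features against every database image.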

Book ChapterDOI
13 Feb 2007
TL;DR: The work of the group for the DELOS NoE during 2006 is presented, covering two approaches to 3D-model indexing and analysis: a view-based approach and a structural approach.
Abstract: 3D-mesh models are widely used to represent real objects in synthesized scenes for multimedia or cultural heritage applications, medical or military simulations, video games and so on. Indexing and analyzing these 3D data is a key issue in enabling effective usage of 3D-mesh models for designers and even for final users. The research of our group mainly focuses on these problems. In this paper, we present the work of our group for the DELOS NoE during the year 2006. We have worked on two approaches to 3D-model indexing and analysis: a view-based approach and a structural approach. View-based approaches to 3D-model indexing are a very intuitive way to retrieve 3D models from wide collections by using 2D natural views (as humans naturally do to represent 3D objects). Structural analysis of a 3D-mesh model gives a structural decomposition of the object from a raw boundary representation of it; this decomposition can then be used to segment or index 3D models.

Book ChapterDOI
13 Feb 2007
TL;DR: This paper proposes to use the Daffodil system as an experimental framework for the evaluation and research of interactive IR and digital libraries, and provides a user-friendly graphical interface and facilitating services for log generation and analysis.
Abstract: Evaluation of digital libraries assesses their effectiveness, quality and overall impact. In this paper we propose to use the Daffodil system as an experimental framework for the evaluation and research of interactive IR and digital libraries. The system already provides a rich set of working services and available information sources. These services and sources can be used as a foundation for further research going beyond basic functionalities. Besides the services and sources, the system supports a logging scheme for comparison of user behavior. In addition, the system can easily be extended regarding both services and sources. Daffodil's highly flexible and extensible agent-based architecture allows for easy integration of additional components, and access to all existing services. Finally, the system provides a user-friendly graphical interface and facilitating services for log generation and analysis. The experimental framework can serve as a joint theoretical and practical platform for the evaluation of DLs, with the long-term goal of creating a community centered on interactive IR and DL evaluation.

Book ChapterDOI
13 Feb 2007
TL;DR: A framework for supporting pedagogy-driven personalization in eLearning applications has been developed that performs automatic creation of personalized learning experiences using reusable (audiovisual) learning objects, taking into account the learner profiles and a set of abstract training scenarios (pedagogical templates).
Abstract: One of the most important applications of Digital Libraries (DL) is learning. In order to enable the development of eLearning applications that easily exploit DL contents, it is crucial to bridge the interoperability gap between DLs and eLearning applications. For this purpose, a generic interoperability framework has been developed that could also be applied to other types of applications built on top of DLs, although this paper focuses on eLearning applications. In this context, a framework for supporting pedagogy-driven personalization in eLearning applications has been developed that performs automatic creation of personalized learning experiences using reusable (audiovisual) learning objects, taking into account learner profiles and a set of abstract training scenarios (pedagogical templates). From a technical point of view, all the framework components have been organized into a service-oriented Architecture that Supports Interoperability between Digital Libraries and ELearning Applications (ASIDE). A prototype of the ASIDE framework has been implemented.
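The core of pedagogy-driven personalization can be illustrated as template filling: a pedagogical template is an ordered list of abstract steps, and each step is bound to a reusable learning object that matches both the step's role and the learner's profile. The metadata fields below (role, level, media) are illustrative assumptions, not the ASIDE schema.

```python
# Hedged sketch: instantiate a pedagogical template against a learner profile
# by selecting matching learning objects from a repository.
def personalize(template, learner, objects):
    """Pick, for each template step, the first object matching role, level,
    and the learner's media preferences; steps with no match are skipped."""
    experience = []
    for step in template:
        candidates = [o for o in objects
                      if o["role"] == step
                      and o["level"] == learner["level"]
                      and o["media"] in learner["preferred_media"]]
        if candidates:
            experience.append(candidates[0]["id"])
    return experience

template = ["introduction", "demonstration", "assessment"]
learner = {"level": "beginner", "preferred_media": {"video", "text"}}
objects = [
    {"id": "LO1", "role": "introduction",  "level": "beginner", "media": "text"},
    {"id": "LO2", "role": "demonstration", "level": "advanced", "media": "video"},
    {"id": "LO3", "role": "demonstration", "level": "beginner", "media": "video"},
    {"id": "LO4", "role": "assessment",    "level": "beginner", "media": "quiz"},
]
path = personalize(template, learner, objects)  # LO4 filtered out: "quiz" not preferred
```

In a service-oriented setting like ASIDE, the repository query and profile lookup would each be separate services; the selection logic above is the piece that ties them together.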

Book ChapterDOI
13 Feb 2007
TL;DR: This paper discusses the system architecture that combines the searching and information filtering abilities of MinervaLight and shows the different facets of an approximate pub/sub system for subscriptions that is highly scalable and efficient, and notifies the subscribers about the most interesting publications in the P2P network of digital libraries.
Abstract: We present a new architecture for efficient search and approximate information filtering in a distributed Peer-to-Peer (P2P) environment of Digital Libraries. The MinervaLight search system uses P2P techniques over a structured overlay network to distribute and maintain a directory of peer statistics. Based on the same directory, the MAPS information filtering system provides an approximate publish/subscribe functionality by monitoring the most promising digital libraries for publishing appropriate documents regarding a continuous query. In this paper, we discuss our system architecture that combines searching and information filtering abilities. We show the system components of MinervaLight and explain the different facets of an approximate pub/sub system for subscriptions that is highly scalable, efficient, and notifies the subscribers about the most interesting publications in the P2P network of digital libraries. We also compare both approaches in terms of common properties and differences to give an overview of search and pub/sub using the same infrastructure.
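The "approximate" part of this pub/sub design can be sketched briefly: the shared directory holds per-peer term statistics, and a continuous query is placed only on the k peers most likely to publish matching documents. The scoring below (sum of per-term document frequencies normalized by collection size) is a simplified stand-in for the resource-selection formulas used in the actual systems; peer names and statistics are invented.

```python
# Sketch of MAPS-style publisher selection: rank peers by directory
# statistics and monitor only the top-k for a continuous query.
def score_peer(stats, query_terms):
    """Crude publishing likelihood: normalized document frequency per term."""
    df, size = stats["df"], stats["docs"]
    return sum(df.get(t, 0) / size for t in query_terms) if size else 0.0

def select_publishers(directory, query_terms, k=2):
    """Return the k peers with the highest score for the query."""
    ranked = sorted(directory,
                    key=lambda p: score_peer(directory[p], query_terms),
                    reverse=True)
    return ranked[:k]

directory = {
    "peerA": {"df": {"p2p": 40, "search": 10}, "docs": 100},   # score 0.50
    "peerB": {"df": {"p2p": 5,  "search": 5},  "docs": 100},   # score 0.10
    "peerC": {"df": {"p2p": 70, "search": 60}, "docs": 200},   # score 0.65
}
monitored = select_publishers(directory, ["p2p", "search"], k=2)
```

Because only the monitored peers are asked to notify the subscriber, recall is approximate by design; a real system would also fold in publishing-behavior predictions when ranking peers.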

Book ChapterDOI
13 Feb 2007
TL;DR: A new approach is proposed that enables the automatic creation of multiple test collections without human effort and takes advantage of the human relevance assessments contained in an already existing test collection and it introduces content-level annotations in that collection.
Abstract: This study addresses the lack of an adequate test collection that can be used to evaluate search systems that exploit annotations to increase the retrieval effectiveness of an information search tool. In particular, a new approach is proposed that enables the automatic creation of multiple test collections without human effort. This approach takes advantage of the human relevance assessments contained in an already existing test collection and it introduces content-level annotations in that collection.
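The construction idea can be sketched as follows: start from an existing collection with human relevance judgements (qrels) and synthesize content-level annotations on the judged documents, so annotation-aware search systems can be evaluated without new human effort. The way annotations are derived here (copying topic terms onto relevant documents) is purely illustrative, not the paper's actual procedure.

```python
# Hedged sketch: turn an existing test collection's relevance judgements
# into content-level annotations on the relevant documents.
def annotate_collection(qrels, topics):
    """Return {doc_id: [annotation strings]} derived from relevance judgements.

    qrels:  {topic_id: {doc_id: 0/1 relevance}}
    topics: {topic_id: topic title string}
    """
    annotations = {}
    for topic_id, judgements in qrels.items():
        topic_terms = topics[topic_id].split()
        for doc_id, relevant in judgements.items():
            if relevant:
                annotations.setdefault(doc_id, []).extend(
                    f"about:{t}" for t in topic_terms)
    return annotations

topics = {"T1": "metric indexing", "T2": "schema mapping"}
qrels = {"T1": {"d1": 1, "d2": 0}, "T2": {"d2": 1, "d3": 1}}
ann = annotate_collection(qrels, topics)
```

The existing judgements then double as ground truth for the annotation-enhanced collection: a system that exploits the annotations should rank the annotated documents higher for the corresponding topics.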

Book ChapterDOI
13 Feb 2007
TL;DR: The architecture of a largely distributed Digital Library that is based on the Peer-to-Peer computing paradigm is presented and a solution based on schema mappings and query reformulation is proposed to satisfy the three goals.
Abstract: We present the architecture of a largely distributed Digital Library that is based on the Peer-to-Peer computing paradigm. The three goals of the architecture are: (i) increased node autonomy, (ii) flexible location of data, and (iii) efficient query evaluation. To satisfy these goals we propose a solution based on schema mappings and query reformulation. We identify the problems involved in developing a system based on the proposed architecture and present ways of tackling them. A prototype implementation provides encouraging results.
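Query reformulation over schema mappings, the mechanism this architecture relies on, can be shown in miniature: a query posed against one peer's local schema is rewritten attribute-by-attribute before being forwarded to a neighboring peer. The mapping table and attribute names below are invented for illustration.

```python
# Sketch of schema-mapping-based query reformulation between two peers.
def reformulate(query, mapping):
    """Rewrite {attribute: value} conditions via an attribute-name mapping.
    Conditions on unmapped attributes are dropped, since no correspondence
    exists in the target peer's schema."""
    return {mapping[attr]: value
            for attr, value in query.items() if attr in mapping}

# Peer A's "creator"/"title" correspond to peer B's "author"/"name";
# peer B has no counterpart for "format".
mapping_a_to_b = {"creator": "author", "title": "name"}
local_query = {"creator": "Smith", "title": "P2P", "format": "pdf"}
remote_query = reformulate(local_query, mapping_a_to_b)
```

Chaining such reformulations along mapping paths is what lets a query reach peers with no direct mapping to the originator, at the cost of conditions silently lost where mappings are partial, one of the query-evaluation problems such architectures must address.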