Showing papers on "Human–computer information retrieval" published in 2018

Open Access

Journal Article•DOI•

Introduction to the special issue on bibliometric-enhanced information retrieval and natural language processing for digital libraries (BIRNDL)

Philipp Mayr¹, Ingo Frommholz², Guillaume Cabanac³, Muthu Kumar Chandrasekaran, Kokil Jaidka⁴, Min-Yen Kan, Dietmar Wolfram⁵•Institutions (5)

Leibniz Association¹, University of Bedfordshire², University of Toulouse³, University of Pennsylvania⁴, University of Wisconsin–Milwaukee⁵

01 Sep 2018-International Journal on Digital Libraries

TL;DR: This special issue on bibliometric-enhanced information retrieval and natural language processing for digital libraries (BIRNDL) was compiled after the first joint BIRNDL workshop that was held at the joint conference on digital libraries 2016 in Newark, New Jersey, USA.

Abstract: The large scale of scholarly publications poses a challenge for scholars in information seeking and sensemaking. Bibliometric, information retrieval (IR), text mining, and natural language processing techniques can help address this challenge, but have yet to be widely used in digital libraries (DL). This special issue on bibliometric-enhanced information retrieval and natural language processing for digital libraries (BIRNDL) was compiled after the first joint BIRNDL workshop that was held at the Joint Conference on Digital Libraries (JCDL 2016) in Newark, New Jersey, USA. It brought together IR and DL researchers and professionals to elaborate on new approaches in natural language processing, information retrieval, scientometric, and recommendation techniques that can advance the state of the art in scholarly document understanding, analysis, and retrieval at scale. This special issue includes 14 papers: four extended papers originating from the first BIRNDL workshop 2016 and the BIR workshop at ECIR 2016, four extended system reports of the CL-SciSumm Shared Task 2016, and six original research papers submitted via the open call for papers.

35 citations

Journal Article•DOI•

RETRIEVAL—An Online Performance Evaluation Tool for Information Retrieval Methods

George Ioannakis¹, Anestis Koutsoudis, Ioannis Pratikakis¹, Christodoulos Chamzas¹•Institutions (1)

Democritus University of Thrace¹

01 Jan 2018-IEEE Transactions on Multimedia

TL;DR: This work presents RETRIEVAL, a Web-based integrated information retrieval performance evaluation platform that offers a number of metrics that are popular within the scientific community so as to compose an efficient framework for implementing performance evaluation.

Abstract: Performance evaluation is one of the main research topics in information retrieval. Evaluation metrics are used to quantify various performance aspects of a retrieval method. These metrics help identify the optimum method for a specific retrieval challenge and also allow its parameters to be fine-tuned in order to achieve robust operation for a given set of requirements. In this work, we present RETRIEVAL, a Web-based integrated information retrieval performance evaluation platform. It offers a number of metrics that are popular within the scientific community, so as to compose an efficient framework for implementing performance evaluation. We discuss the functionality of RETRIEVAL by citing important aspects such as the data input approaches, the user-level performance metrics parameterization, the evaluation scenarios, the interactive plots, and the performance reports repository that offers both archiving and download functionalities.
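To make "evaluation metrics" concrete, the following is a minimal sketch of three widely used ranking metrics: precision at k, recall at k, and average precision. The abstract does not say which metrics RETRIEVAL offers or how it computes them, so the function names and the sample data here are illustrative assumptions, not the platform's API.

```python
from typing import Sequence, Set, Tuple


def precision_recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> Tuple[float, float]:
    """Precision and recall over the top-k results of a ranked list."""
    hits = sum(1 for doc in ranked[:k] if doc in relevant)
    precision = hits / k if k > 0 else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of the precision values measured at each rank holding a relevant item."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


# Hypothetical ranked output of a retrieval method and its ground truth.
ranked_results = ["d1", "d2", "d3", "d4"]
relevant_docs = {"d1", "d3"}
p_at_2, r_at_2 = precision_recall_at_k(ranked_results, relevant_docs, k=2)
ap = average_precision(ranked_results, relevant_docs)
```

Comparing such scores across methods or parameter settings is the kind of selection and fine-tuning the abstract describes.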

19 citations

Journal Article•DOI•

Searching for evidence or approval? A commentary on database search in systematic reviews and alternative information retrieval methodologies.

Aogán Delaney, Péter Tamás¹•Institutions (1)

Wageningen University and Research Centre¹

01 Mar 2018-Research Synthesis Methods

TL;DR: This commentary reminds readers of factors that call into question the appropriateness of default reliance on database searches particularly as systematic review is adapted for use in new and lower consensus fields.

Abstract: Despite recognition that database search alone is inadequate even within the health sciences, it appears that reviewers in fields that have adopted systematic review are choosing to rely primarily, or only, on database search for information retrieval. This commentary reminds readers of factors that call into question the appropriateness of default reliance on database searches particularly as systematic review is adapted for use in new and lower consensus fields. It then discusses alternative methods for information retrieval that require development, formalisation, and evaluation. Our goals are to encourage reviewers to reflect critically and transparently on their choice of information retrieval methods and to encourage investment in research on alternatives.

18 citations

Proceedings Article•DOI•

A Comparative User Study of Interactive Multilingual Search Interfaces

Chenjun Ling¹, Ben Steichen², Alexander G. Choulos³•Institutions (3)

Santa Clara University¹, California State Polytechnic University, Pomona², Purdue University³

01 Mar 2018

TL;DR: In this paper, a lab-based user study involving 25 participants interacting with a set of four different interactive multilingual search user interfaces was conducted to understand and support multilingual user abilities and preferences.

Abstract: While the number of polyglot Web users across the globe has increased dramatically, little human-centered research has been conducted to better understand and support multilingual user abilities and preferences. In particular, in the fields of cross-language and multilingual search, the majority of research has focused primarily on improving retrieval and translation accuracy, while paying comparably less attention to multilingual user interaction aspects. By contrast, this paper specifically focuses on multilingual search user interface preferences and behaviors, through a lab-based user study involving 25 participants interacting with a set of four different interactive multilingual search user interfaces. User preference results confirm that multilingual search users generally have strong preferences towards interfaces that provide clear language separation, and that the traditional approach of interleaving results, as typically used in prior research, is least preferred. In addition, an analysis of user interaction behaviors shows that multilingual users make significant use of each of their languages, and that there are several interaction behavior differences depending on interface and task type.

9 citations

Journal Article•DOI•

Beyond traditional collaborative search: understanding the effect of awareness on multi-level collaborative information retrieval

Nyi Nyi Htun¹, Martin Halvey, Lynne Baillie²•Institutions (2)

Glasgow Caledonian University¹, Heriot-Watt University²

01 Jan 2018-Information Processing and Management

TL;DR: Two separate user studies using a total of 5 different collaborative search interfaces and 3 information access scenarios found that being able to easily identify different team members and their actions is important for users in Multi-Level Collaborative Information Retrieval (MLCIR).

Abstract: Although there has been a great deal of research into Collaborative Information Retrieval (CIR) and Collaborative Information Seeking (CIS), the majority has assumed that team members have the same level of unrestricted access to underlying information. However, observations from different domains (e.g. healthcare, business, etc.) have suggested that collaboration sometimes involves people with differing levels of access to underlying information. This type of scenario has been referred to as Multi-Level Collaborative Information Retrieval (MLCIR). To the best of our knowledge, no studies have been conducted to investigate the effect of awareness, an existing CIR/CIS concept, on MLCIR. To address this gap in current knowledge, we conducted two separate user studies using a total of 5 different collaborative search interfaces and 3 information access scenarios. A number of Information Retrieval (IR), CIS and CIR evaluation metrics, as well as questionnaires, were used to compare the interfaces. Design interviews were also conducted after evaluations to obtain qualitative feedback from participants. Results suggested that query properties such as time spent on query, query popularity and query effectiveness could allow users to obtain information about the team's search performance and implicitly suggest better queries without disclosing sensitive data. In addition, having access to a history of intersecting viewed, relevant and bookmarked documents could provide a positive effect similar to that of query properties. It was also found that being able to easily identify different team members and their actions is important for users in MLCIR. Based on our findings, we provide important design recommendations to help develop new CIR and MLCIR interfaces.

7 citations

Book Chapter•DOI•

A Tutorial on Information Retrieval Using Query Expansion

Mohamed Yehia Dahab¹, Sara Alnofaie¹, Mahmoud Kamel¹•Institutions (1)

King Abdulaziz University¹

01 Jan 2018

TL;DR: This tutorial gives an overview of information retrieval models which are based on query expansion along with practical details and description on methods of implementation.

Abstract: Most successful information retrieval techniques have the ability to expand the original query with additional terms that best represent the actual user need. This tutorial gives an overview of information retrieval models which are based on query expansion, along with practical details and a description of methods of implementation. Toy examples with data are provided to help the reader grasp the main idea behind query expansion (QE) techniques such as Kullback-Leibler Divergence (KLD) and candidate expansion terms based on WordNet. The tutorial uses spectral analysis, which is one of the recent information retrieval techniques that consider term proximity.
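As a toy illustration of KLD-based query expansion, the sketch below scores candidate terms with a commonly used formulation: each term in the pseudo-relevant (feedback) documents contributes p_fb(t) * log(p_fb(t) / p_coll(t)), where p_fb and p_coll are its relative frequencies in the feedback set and in the whole collection. This is an assumed, generic formulation; the tutorial's exact variant, smoothing and data are not given in the abstract, and the function name and tiny corpus are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def kld_expansion_terms(
    feedback_docs: Sequence[Sequence[str]],
    collection_docs: Sequence[Sequence[str]],
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Rank feedback terms by their contribution to the KL divergence
    between the feedback-set and collection term distributions."""
    fb_counts = Counter(term for doc in feedback_docs for term in doc)
    coll_counts = Counter(term for doc in collection_docs for term in doc)
    fb_total = sum(fb_counts.values())
    coll_total = sum(coll_counts.values())
    if fb_total == 0 or coll_total == 0:
        return []

    scores = {}
    for term, count in fb_counts.items():
        p_fb = count / fb_total
        p_coll = coll_counts.get(term, 0) / coll_total
        if p_coll == 0.0:
            continue  # term unseen in the collection; no smoothing in this sketch
        scores[term] = p_fb * math.log(p_fb / p_coll)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Hypothetical tokenized collection; the first and third documents are
# treated as the pseudo-relevant feedback set for some initial query.
collection = [
    ["jaguar", "car", "speed"],
    ["jaguar", "cat", "jungle"],
    ["car", "engine", "speed"],
    ["stock", "market", "price"],
]
feedback = [collection[0], collection[2]]
expansion = kld_expansion_terms(feedback, collection, top_n=3)
```

Terms that are much more frequent in the feedback set than in the collection as a whole receive the highest scores and become candidates for appending to the original query.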

7 citations

Journal Article•DOI•

A proposal of a temporal semantics aware linked data information retrieval framework

Md-Mizanur Rahoman¹, Ryutaro Ichise²•Institutions (2)

Begum Rokeya University¹, National Institute of Informatics²

01 Jun 2018

TL;DR: This work proposes a keyword-based linked data information retrieval framework that can incorporate temporal features and give more concise results and the evaluation of the system performance indicates that it is promising.

Abstract: Temporal features, such as an explicit date and time or a time-specific event, employ concise semantics for any kind of information retrieval. Therefore, temporal features should be suitable for linked data information retrieval. However, we have found that most linked data information retrieval techniques pay little attention to the power of temporal feature inclusion. We propose a keyword-based linked data information retrieval framework that can incorporate temporal features and give more concise results. The evaluation of our system performance indicates that it is promising.

6 citations

Posted Content•

Data Requirements for Evaluation of Personalization of Information Retrieval - A Position Paper

Nicholas J. Belkin, Daniel Hienert, Philipp Mayr, Chirag Shah

07 Sep 2018-arXiv: Information Retrieval

TL;DR: A model of IR is presented demonstrating why some types of data concerning searcher and system behavior are important and are at least necessary, if not necessarily sufficient, for meaningful evaluation of personalization of IR.

Abstract: Two key, but usually ignored, issues for the evaluation of methods of personalization for information retrieval are: that such evaluation must be of a search session as a whole; and, that people, during the course of an information search session, engage in a variety of activities, intended to accomplish different goals or intentions. Taking serious account of these factors has major implications for not only evaluation methods and metrics, but also for the nature of the data that is necessary both for understanding and modeling information search, and for evaluation of personalized support for information retrieval (IR). In this position paper, we: present a model of IR demonstrating why these factors are important; identify some implications of accepting their validity; and, on the basis of a series of studies in interactive IR, identify some types of data concerning searcher and system behavior that we claim are, at least, necessary, if not necessarily sufficient, for meaningful evaluation of personalization of IR.

4 citations

Book Chapter•DOI•

A New Semantic Distance Measure for the VSM-Based Information Retrieval Systems

Aya M. Al-Zoghby¹•Institutions (1)

Mansoura University¹

01 Jan 2018

TL;DR: This study proposes an Arabic semantic-based search approach that is based on the Vector Space Model (VSM), which uses the Universal WordNet (UWN) ontology to build a rich index of concepts, Concept-Space (CS), which replaces the traditional index of terms, Term-Space (TS), and enhances the Semantic VSM capability.

Abstract: One of the main reasons for adopting Semantic Web technology in search systems is to enhance the performance of the retrieval process. A semantic-based search is characterized by finding the contents that are semantically associated with the concepts of the query rather than those that exactly match the query's keywords. There is a growing interest in searching Arabic content worldwide due to its importance for culture, religion, and economics. However, the Arabic language, across all of its linguistic levels, is morphologically and syntactically rich. This linguistic nature of Arabic makes effective search of its content a challenge. In this study, we propose an Arabic semantic-based search approach that is based on the Vector Space Model (VSM). VSM has proved its success, and many studies have focused on refining its old-style version. Our proposed approach uses the Universal WordNet (UWN) ontology to build a rich index of concepts, Concept-Space (CS), which replaces the traditional index of terms, Term-Space (TS), and enhances the Semantic VSM capability. As a consequence, we propose a new incidence indicator to calculate the Significance Level of a Concept (SLC) in the document. The new indicator is used to evaluate the performance of the retrieval process semantically, instead of the traditional syntactic retrieval that is based on the traditional incidence indicator, Term Frequency (TF). This new indicator has motivated us to develop a new formula to calculate the Semantic Weight of the Concept (SWC). The SWC is necessary for determining the Semantic Distance (SD) of two vectors. As a proof of concept, a prototype is applied to a full dump of the Arabic Wikipedia. Since documents are indexed by their concepts and, hence, classified semantically, we were able to search Arabic documents efficiently. The experimental results regarding Precision, Recall, and F-measure showed a noticeable improvement in performance.

3 citations

Book•DOI•

Emerging Ideas on Information Filtering and Retrieval

Cristian Lai, Alessandro Giuliani, Giovanni Semeraro

01 Jan 2018

TL;DR: Density is improved by combining a few documents in one line of the matrix to reduce the filter size and to address the problem of document removal in Matrix Bloom filters.

Abstract: Data leak prevention systems have become a must-have component of enterprise information security. To minimize the communication delay, these systems require quick mechanisms for massive document comparison. Bloom filters have been proven to be a fast tool for membership checking. Taking into account the specific needs of fast text comparison, this chapter proposes modifications to Matrix Bloom filters. The approach proposed in this chapter improves density in Matrix Bloom filters with the help of a special index that tracks documents uploaded into the system. Density is improved by combining a few documents in one line of the matrix to reduce the filter size and to address the problem of document removal. Special attention is paid to the negative impact of filter-to-filter comparison in matrix filters. A theoretical evaluation of the threshold for false positive results is provided. The experiment provided in the chapter outlines the advantages and applicability of the proposed approach.
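For readers unfamiliar with the underlying data structure, the following is a minimal sketch of a plain Bloom filter, the building block that Matrix Bloom filters extend. It does not reproduce the chapter's matrix layout, document index, or removal scheme. The class name, the default sizes, and the use of salted SHA-256 digests as hash functions are all assumptions made for illustration.

```python
import hashlib
from typing import Iterator


class BloomFilter:
    """Plain Bloom filter: no false negatives, tunable false-positive rate."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size = size_bits
        self.num_hashes = num_hashes
        # One byte per bit position, trading memory for readability.
        self.bits = bytearray(size_bits)

    def _positions(self, item: str) -> Iterator[int]:
        # Derive num_hashes positions by salting the item with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item: str) -> bool:
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(item))


# Hypothetical usage: index text fragments of a protected document.
protected = BloomFilter()
protected.add("quarterly revenue forecast")
seen = "quarterly revenue forecast" in protected
empty = BloomFilter()
```

Because bits are shared between items, an item cannot be deleted from a plain filter without risking false negatives for others, which is the document-removal problem the abstract refers to.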

2 citations

Proceedings Article•

Enterprise information retrieval: a survey

Hamid Turab Mirza¹•Institutions (1)

COMSATS Institute of Information Technology¹

30 Mar 2018

TL;DR: Effective enterprise search remains a challenge for researchers and commercial companies, although it is recognized that a solution would deliver enormous benefits.

Abstract: Efficient retrieval of relevant information is a critical success factor for many enterprises. Despite all the advances in web search technology, enterprise search still faces many challenges and problems. The boundaries of enterprise search are broad and user expectations are quite high; among the many challenges, one of the major problems is the difference in nature between web and enterprise search. Many solutions have been proposed and techniques devised to improve enterprise search, but effective enterprise search remains a challenge for researchers and commercial companies, although it is recognized that a solution would deliver enormous benefits.

Book Chapter•DOI•

Advice from the Oracle: Really Intelligent Information Retrieval

Michael J. Kurtz¹•Institutions (1)

Harvard University¹

02 Jan 2018-arXiv: Artificial Intelligence

TL;DR: This article will attempt to show some of the aspects of human intelligence, as related to information retrieval, by the device of a semi-imaginary Oracle.

Abstract: What is "intelligent" information retrieval? Essentially this is asking what intelligence is. In this article I will attempt to show some of the aspects of human intelligence, as related to information retrieval. I will do this by the device of a semi-imaginary Oracle. Every Observatory has an oracle, someone who is a distinguished scientist, has great administrative responsibilities, acts as mentor to a number of less senior people, and as trusted advisor to even the most accomplished scientists, and knows essentially everyone in the field. In an appendix I will present a brief summary of the Statistical Factor Space method for text indexing and retrieval, and indicate how it will be used in the Astrophysics Data System Abstract Service. 2018 Keywords: Personal Digital Assistant; Supervised Topic Models

Journal Article•DOI•

A learning framework for information block search based on probabilistic graphical models and Fisher Kernel

Tak-Lam Wong¹, Haoran Xie¹, Wai Lam², Fu Lee Wang³•Institutions (3)

University of Hong Kong¹, The Chinese University of Hong Kong², Caritas Institute of Higher Education³

01 Sep 2018-International Journal of Machine Learning and Cybernetics

TL;DR: This work has developed a novel learning framework for retrieving precise information blocks from Web pages given a query, which may contain some search terms and prior information such as the layout format of the data.

Abstract: Contrary to traditional Web information retrieval methods that can only return a ranked list of Web pages and only allow search terms in the query, we have developed a novel learning framework for retrieving precise information blocks from Web pages given a query, which may contain some search terms and prior information such as the layout format of the data. There are two challenging sub-tasks for this problem. One challenge is information block detection, where a Web page is automatically segmented into blocks. Another challenge is to find the information blocks relevant to the query. Existing page segmentation methods, which make use of only visual layout information or only content information, do not consider the query information, leading to solutions that conflict with the information need expressed by the query. Our framework aims at modeling the query and the block features to capture both keyword information and prior information via a probabilistic graphical model. Fisher Kernel, which can effectively incorporate the graphical model, is then employed to accomplish the two sub-tasks in a unified manner, optimizing the final goal of block retrieval performance. We have conducted experiments on benchmark datasets and real-world data. Comparisons with existing methods have been conducted to evaluate the effectiveness of our framework.

DOI•

Data Structures in Web Information Retrieval

Monika Henzinger

07 Mar 2018