
Showing papers on "Human–computer information retrieval published in 2016"


01 Jan 2016
TL;DR: Information Retrieval: Implementing and Evaluating Search Engines is a textbook on building and evaluating search engines.

96 citations


Journal ArticleDOI
TL;DR: A review of some of the most important contributions in this domain to understand the principles of SIR, a taxonomy to categorize these contributions, and an analysis of some of these contributions and tools with respect to several criteria are proposed.

92 citations


Proceedings ArticleDOI
08 May 2016
TL;DR: This track aims to provide a benchmark to evaluate large-scale shape retrieval based on the ShapeNet dataset, using ShapeNet Core55, which provides more than 50 thousand models over 55 common categories in total for training and evaluating several algorithms.
Abstract: With the advent of commodity 3D capturing devices and better 3D modeling tools, 3D shape content is becoming increasingly prevalent. Therefore, the need for shape retrieval algorithms that can handle large-scale shape repositories is increasingly important. This track aims to provide a benchmark to evaluate large-scale shape retrieval based on the ShapeNet dataset. We use ShapeNet Core55, which provides more than 50 thousand models over 55 common categories in total for training and evaluating several algorithms. Five participating teams submitted a variety of retrieval methods, which were evaluated on several standard information retrieval performance metrics. We find the submitted methods work reasonably well on the track benchmark, but we also see significant room for improvement by future algorithms. We release all the data, results, and evaluation code for the benefit of the community.

64 citations


Journal ArticleDOI
TL;DR: This paper proposes and displays an ontology-based object-attribute-value (O-A-V) information extraction system as a web model that acts as a user dictionary to refine the search keywords in the query for subsequent attempts to improve the standard information retrieval systems.
Abstract: In the internet era, search engines play a vital role in information retrieval from web pages. Search engines arrange the retrieved results using various ranking algorithms. Additionally, retrieval is based on statistical searching techniques or content-based information extraction methods. It is still difficult for the user to understand the abstract details of every web page unless the user opens it separately to view the web content. This key point provided the motivation to propose an ontology-based object-attribute-value (O-A-V) information extraction system as a web model that acts as a user dictionary to refine the search keywords in the query for subsequent attempts. This first model is evaluated using various natural language processing (NLP) queries given as English sentences. Additionally, image search engines, such as Google Images, use content-based image information extraction and retrieval of web pages against the user query. To minimize the semantic gap between the image retrieval results and the expected user results, the domain ontology is built using image descriptions. The second proposed model initially examines natural language user queries using an NLP parser algorithm that identifies the subject-predicate-object (S-P-O) of the query. S-P-O extraction is an idea extended from the ontology-based O-A-V web model. Using this S-P-O extraction, and considering the complexity of writing SPARQL Protocol and RDF Query Language (SPARQL) queries from the user's point of view, a SPARQL auto-generation module is proposed that automatically generates the SPARQL query. The query is then deployed on the ontology, and images are retrieved based on the auto-generated SPARQL query. With the proposed methodology, this paper seeks answers to the following two questions. First, how can domain ontology and semantics be combined to improve information retrieval and the user experience? Second, does this new unified framework improve on standard information retrieval systems? To answer these questions, a document retrieval system and an image retrieval system were built to test the proposed framework. Web document retrieval was tested against three keyword/bag-of-words models and a semantic ontology model. Image retrieval was tested on the IAPR TC-12 benchmark dataset. The precision, recall and accuracy results were then compared against standard information retrieval systems using TREC_EVAL. The results indicated improvements over the standard systems. A controlled experiment was performed with test subjects querying the retrieval system in the absence and presence of the proposed framework. The queries were measured using two metrics: time and click-count. Comparisons were made between retrieval performed with and without the proposed framework. The results were encouraging.

56 citations
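The SPARQL auto-generation step described in the abstract can be sketched as a simple template fill over an extracted S-P-O triple. The prefix, property names, and function below are illustrative assumptions, not the paper's actual ontology schema:

```python
def spo_to_sparql(subject: str, predicate: str, obj: str) -> str:
    """Illustrative sketch: turn an extracted subject-predicate-object
    triple into a SPARQL SELECT query over a hypothetical
    image-description ontology (the prefix and properties are made up)."""
    return (
        "PREFIX ex: <http://example.org/ontology#>\n"
        "SELECT ?image WHERE {\n"
        f"  ?image ex:depicts ex:{subject} .\n"
        f"  ex:{subject} ex:{predicate} ex:{obj} .\n"
        "}"
    )

# e.g., for the parsed query "dog chases ball"
query = spo_to_sparql("dog", "chases", "ball")
```

In the paper's pipeline, a query like this would then be run against the domain ontology to retrieve matching images.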


Book ChapterDOI
TL;DR: In this paper, the authors examine the moderating effects of whether a retrieval attempt results in success or failure and conclude that retrieval practice is beneficial even when the retrieval attempt is unsuccessful.
Abstract: Attempting to recall information from memory (i.e., retrieval practice) has been shown to enhance learning across a wide variety of materials, learners, and experimental conditions. We examine the moderating effects of what is arguably the most fundamental distinction to be made about retrieval: whether a retrieval attempt results in success or failure. After reviewing research on this topic, we conclude that retrieval practice is beneficial even when the retrieval attempt is unsuccessful. This finding appears to hold true in a variety of laboratory and real-world contexts and applies to learners across the lifespan. Based on these findings, we outline a two-stage model in which learning from retrieval involves (1) a retrieval attempt and then (2) processing the answer. We then turn to a second issue: Does retrieval success even matter for learning? Recent findings suggest that retrieval failure followed by feedback leads to the same amount of learning as retrieval success. In light of these findings, we propose that separate mechanisms are not needed to explain the effect of retrieval success and retrieval failure on learning. We then review existing theories of retrieval and comment on their compatibility with extant data, and end with theoretical conclusions for researchers as well as practical advice for learners and teachers.

56 citations


Journal ArticleDOI
01 Oct 2016
TL;DR: It is shown that basic schemes are weak, but some of them can be made arbitrarily safe by composing them with large anonymity systems, and the security of each scheme is proved using a flexible differentially private definition for private queries that can capture notions of imperfect privacy.
Abstract: Private Information Retrieval (PIR), despite being well studied, is computationally costly and arduous to scale. We explore lower-cost relaxations of information-theoretic PIR, based on dummy queries, sparse vectors, and compositions with an anonymity system. We prove the security of each scheme using a flexible differentially private definition for private queries that can capture notions of imperfect privacy. We show that basic schemes are weak, but some of them can be made arbitrarily safe by composing them with large anonymity systems.

46 citations
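The dummy-query relaxation mentioned in the abstract can be sketched in a few lines: the client hides its real query among decoys so the server cannot tell which one is genuine. The function and decoy pool below are illustrative, not the paper's formal construction:

```python
import random

def dummy_query_batch(real_query: str, dummy_pool: list,
                      k: int, rng: random.Random) -> list:
    """Sketch of a dummy-query scheme: submit the real query alongside
    k-1 decoys drawn from a pool, in shuffled order. The privacy this
    buys is imperfect, which is what the paper's differentially private
    definition is designed to quantify."""
    batch = rng.sample(dummy_pool, k - 1) + [real_query]
    rng.shuffle(batch)  # the server sees k queries with no marked "real" one
    return batch

rng = random.Random(0)  # seeded for reproducibility
batch = dummy_query_batch("secret topic", ["a", "b", "c", "d"], 3, rng)
```

Intuitively, a larger k (or composition with an anonymity system, as the paper proposes) shrinks the server's advantage in guessing the real query.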


Proceedings ArticleDOI
Hang Li1, Zhengdong Lu1
07 Jul 2016
TL;DR: This tutorial aims at summarizing and introducing the results of recent research on deep learning for information retrieval, in order to stimulate and foster more significant research and development work on the topic in the future.
Abstract: Recent years have seen significant progress in information retrieval and natural language processing, with deep learning technologies successfully applied to almost all of their major tasks. The key to the success of deep learning is its capability of accurately learning distributed representations (vector representations or structured arrangements of them) of natural language expressions such as sentences, and effectively utilizing these representations in the tasks. This tutorial aims at summarizing and introducing the results of recent research on deep learning for information retrieval, in order to stimulate and foster more significant research and development work on the topic in the future. The tutorial mainly consists of three parts. In the first part, we introduce the fundamental techniques of deep learning for natural language processing and information retrieval, such as word embedding, recurrent neural networks, and convolutional neural networks. In the second part, we explain how deep learning, particularly representation learning techniques, can be utilized in fundamental NLP and IR problems, including matching, translation, classification, and structured prediction. In the third part, we describe in detail how deep learning can be used in specific application tasks: search, question answering (from documents, databases, or knowledge bases), and image retrieval.

43 citations
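The distributed-representation matching the tutorial covers can be illustrated with a toy averaged-embedding ranker. The vectors below are made up for the sketch; real systems learn them (e.g., with word2vec) over large corpora:

```python
import math

# Toy 2-d word embeddings; real embeddings are learned, not hand-set.
EMB = {"deep": [1.0, 0.0], "learning": [0.8, 0.2],
       "cooking": [0.0, 1.0], "recipes": [0.1, 0.9]}

def embed(text: str) -> list:
    """Average the vectors of known words: a common, simple baseline
    for turning a sentence into a single vector."""
    vecs = [EMB[w] for w in text.lower().split() if w in EMB]
    return [sum(c) / len(vecs) for c in zip(*vecs)]

def cosine(u, v) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

docs = ["deep learning", "cooking recipes"]
q = embed("learning deep")
ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
```

Matching query and document vectors this way is the simplest instance of the "matching" problem the tutorial's second part discusses.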


Journal ArticleDOI
TL;DR: To enable the integration of multiple data sources while performing efficient retrieval of web data, an intelligent web search framework is proposed.

39 citations


Journal ArticleDOI
09 Dec 2016, PLOS ONE
TL;DR: This strategy has been shown to be feasible and can provide evidence to doctors’ clinical questions and has the potential to be incorporated into an interventional study to determine the impact of an online evidence retrieval system.
Abstract: Background: Physicians are often encouraged to locate answers for their clinical queries via an evidence-based literature search approach. The methods used are often not clearly specified. Inappropriate search strategies, time constraints and contradictory information complicate evidence retrieval. Aims: Our study aimed to develop a search strategy to answer clinical queries among physicians in a primary care setting. Methods: Six clinical questions on different medical conditions seen in primary care were formulated. A series of experimental searches to answer each question was conducted on 3 commonly advocated medical databases. We compared search results from a PICO (patients, intervention, comparison, outcome) framework for questions using different combinations of PICO elements. We also compared outcomes from searches using text words, Medical Subject Headings (MeSH), or a combination of both. All searches were documented using screenshots and saved search strategies. Results: Answers to all 6 questions using the PICO framework were found. A higher number of systematic reviews was obtained using a 2-element PICO search compared to a 4-element search. A more optimal choice is a combination of both text words and MeSH terms. Despite searching using the Systematic Review filter, many non-systematic or narrative reviews were found in PubMed. There was poor overlap between the outcomes of searches using different databases. The duration of search and screening for the 6 questions ranged from 1 to 4 hours. Conclusion: This strategy has been shown to be feasible and can provide evidence for doctors' clinical questions. It has the potential to be incorporated into an interventional study to determine the impact of an online evidence retrieval system.

37 citations
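The PICO element combinations the study compares can be sketched as a boolean query builder. The syntax is a generic PubMed-style illustration, not the authors' documented search strings:

```python
def pico_query(p: str = "", i: str = "", c: str = "", o: str = "") -> str:
    """Sketch: combine whichever PICO elements are supplied into an
    AND-joined boolean query. The study found that 2-element searches
    often retrieved more systematic reviews than 4-element ones."""
    parts = [f'("{term}")' for term in (p, i, c, o) if term]
    return " AND ".join(parts)

# A 2-element (P + I) search, the variant the study found more productive:
q2 = pico_query(p="type 2 diabetes", i="metformin")
```

In practice each element would also be OR-expanded with synonyms and MeSH terms, which the study found to be the more optimal choice.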


Journal ArticleDOI
TL;DR: A survey of the literature on indexing and retrieval of mathematical knowledge, with pointers to 77 papers and tentative taxonomies of both retrieval problems and recurring techniques is presented.
Abstract: We present a survey of the literature on indexing and retrieval of mathematical knowledge, with pointers to 77 papers and tentative taxonomies of both retrieval problems and recurring techniques.

33 citations


Proceedings ArticleDOI
10 May 2016
TL;DR: A novel tool called F-search is presented that emphasizes the core strengths of LIRE, the Java library for visual information retrieval: lightness, speed and accuracy.
Abstract: With an annual growth rate of 16.2% in photos taken per year, researchers predict an almost unbelievable 4.9 trillion stored images in 2017. Nearly 80% of the photos taken in 2017 will be taken with mobile phones. To cope with this immense amount of visual data in a fast and accurate way, visual information retrieval systems are needed for various domains and applications. LIRE, short for Lucene Image Retrieval, is a lightweight and easy-to-use Java library for visual information retrieval. It allows developers and researchers to integrate common content-based image retrieval approaches in their applications and research projects. LIRE supports global and local image features and can cope with millions of images using approximate search and by distributing indexes on the cloud. In this demo we present a novel tool called F-search that emphasizes the core strengths of LIRE: lightness, speed and accuracy.

Proceedings ArticleDOI
07 Jul 2016
TL;DR: The "Search as Learning" (SAL) workshop is focused on an area within the information retrieval field that is only beginning to emerge: supporting users in their learning whilst interacting with information content.
Abstract: The "Search as Learning" (SAL) workshop is focused on an area within the information retrieval field that is only beginning to emerge: supporting users in their learning whilst interacting with information content.

Journal ArticleDOI
TL;DR: In this paper, some of the most important areas of information retrieval, i.e. Cross-Lingual Information Retrieval (CLIR), Multilingual Information Retrieval (MLIR), and machine translation approaches and techniques, are introduced.

Journal ArticleDOI
TL;DR: This paper presents a novel Content-Based Video Retrieval approach in order to cope with the semantic gap challenge by means of latent topics and reveals that the proposed ranking function is able to provide a competitive advantage within the content-based retrieval field.

Journal ArticleDOI
TL;DR: A novel discriminative semantic subspace analysis (DSSA) method is proposed, which can directly learn a semantic subspace from similar and dissimilar pairwise constraints without using any explicit class label information.
Abstract: Content-based image retrieval (CBIR) has attracted much attention during the past decades for its potential practical applications to image database management. A variety of relevance feedback (RF) schemes have been designed to bridge the gap between low-level visual features and high-level semantic concepts for an image retrieval task. In the process of RF, it would be impractical or too expensive to provide explicit class label information for each image. Instead, similar or dissimilar pairwise constraints between two images can be acquired more easily. However, most of the conventional RF approaches can only deal with training images with explicit class label information. In this paper, we propose a novel discriminative semantic subspace analysis (DSSA) method, which can directly learn a semantic subspace from similar and dissimilar pairwise constraints without using any explicit class label information. In particular, DSSA can effectively integrate the local geometry of labeled similar images, the discriminative information between labeled similar and dissimilar images, and the local geometry of labeled and unlabeled images together to learn a reliable subspace. Compared with the popular distance metric analysis approaches, our method can also learn a distance metric but perform more effectively when dealing with high-dimensional images. Extensive experiments on both the synthetic data sets and a real-world image database demonstrate the effectiveness of the proposed scheme in improving the performance of the CBIR.

Proceedings Article
01 Jan 2016
TL;DR: Experimental results show that document distance measures derived from unsupervised word embeddings contribute to significant ranking improvements when combined with traditional document retrieval approaches.
Abstract: This article summarizes the approach developed for the TREC 2016 Clinical Decision Support Track. In order to address the daunting challenge of retrieving biomedical articles for answering clinical questions, an information retrieval methodology was developed that combines pseudo-relevance feedback, semantic query expansion and document similarity measures based on unsupervised word embeddings. The individual relevance metrics were combined through a supervised learning-to-rank model based on gradient boosting to maximize the normalized discounted cumulative gain (nDCG). Experimental results show that document distance measures derived from unsupervised word embeddings contribute to significant ranking improvements when combined with traditional document retrieval approaches.
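The nDCG metric that the learning-to-rank model maximizes can be computed as follows. This sketch uses the linear-gain variant; the track's official evaluation may use the exponential-gain form:

```python
import math

def dcg(relevances: list) -> float:
    """Discounted cumulative gain over a ranked list of graded
    relevances; gains are discounted by log2 of the (1-based) rank."""
    return sum(rel / math.log2(rank + 2)  # rank is 0-based here
               for rank, rel in enumerate(relevances))

def ndcg(relevances: list) -> float:
    """DCG normalized by the ideal (descending-relevance) ordering,
    so a perfect ranking scores 1.0."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# Graded relevance of the top 4 retrieved documents, in ranked order:
score = ndcg([3, 2, 0, 1])
```

Swapping the misplaced documents at ranks 3 and 4 into descending order would raise the score to exactly 1.0.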

Journal ArticleDOI
TL;DR: A set of comprehensive empirical studies to explore the effects of multiple query evidences on large-scale social image search and a novel quantitative metric is proposed and applied to assess the influences of different visual queries based on their complexity levels.
Abstract: System performance assessment and comparison are fundamental for large-scale image search engine development. This article documents a set of comprehensive empirical studies exploring the effects of multiple query evidences on large-scale social image search. The search performance based on social tags, different kinds of visual features and their combinations is systematically studied and analyzed. To quantify visual query complexity, a novel quantitative metric is proposed and applied to assess the influence of different visual queries based on their complexity levels. We also study the effects of automatic text query expansion with social tags, using a pseudo relevance feedback method, on retrieval performance. Our analysis of the experimental results shows a few key research findings: (1) social tag-based retrieval methods can achieve much better results than content-based retrieval methods; (2) a combination of textual and visual features can significantly and consistently improve search performance; (3) the complexity of image queries has a strong correlation with the quality of retrieval results: more complex queries lead to poorer search effectiveness; and (4) query expansion based on social tags frequently causes search topic drift and consequently leads to performance degradation.

Proceedings ArticleDOI
08 Feb 2016
TL;DR: This work shows that it is possible to get judgements of effort from the assessors and shows that given documents of the same relevance grade, effort needed to find the portion of the document relevant to the query is a significant factor in determining user satisfaction as well as user preference between these documents.
Abstract: Document relevance has been the primary focus in the design, optimization and evaluation of retrieval systems. Traditional test collections are constructed by asking judges the relevance grade for a document with respect to an input query. Recent work of Yilmaz et al. found evidence that effort is another important factor in determining document utility, suggesting that more thought should be given to incorporating effort into information retrieval. However, that work did not ask judges to directly assess the level of effort required to consume a document or analyse how effort judgements relate to traditional relevance judgements. In this work, focusing on three aspects associated with effort, we show that it is possible to get judgements of effort from the assessors. We further show that, given documents of the same relevance grade, the effort needed to find the portion of the document relevant to the query is a significant factor in determining user satisfaction as well as user preference between these documents. Our results suggest that if the end goal is to build retrieval systems that optimize user satisfaction, effort should be included as an additional factor to relevance in building and evaluating retrieval systems. We further show that new retrieval features are needed if the goal is to build retrieval systems that jointly optimize relevance and effort, and we propose a set of such features. Finally, we focus on the evaluation of retrieval systems and show that incorporating effort into retrieval evaluation could lead to significant differences regarding the performance of retrieval systems.

Proceedings ArticleDOI
07 Jul 2016
TL;DR: A new Web-based collection for focused retrieval is presented and the documents most highly ranked for each query by a highly effective learning-to-rank method were judged for relevance using crowdsourcing.
Abstract: Focused retrieval (a.k.a. passage retrieval) is important in its own right and as an intermediate step in question answering systems. We present a new Web-based collection for focused retrieval. The document corpus is Category A of the ClueWeb12 collection. Forty-nine queries from the educational domain were created. The 100 documents most highly ranked for each query by a highly effective learning-to-rank method were judged for relevance using crowdsourcing. All sentences in the relevant documents were judged for relevance.

Journal ArticleDOI
TL;DR: A new information-retrieval algorithm based on formal concept analysis is proposed that deals with disjunctive and conjunctive queries and exploits the theoretical basis provided by the FCA to design an efficient and flexible approach for information retrieval.
Abstract: With the exponential increase in the quantity of information circulating on the Internet, an evolution of information-retrieval systems becomes paramount. Indeed, current approaches to information systems design remain unable to meet the needs of users, either in performance (precision and recall) or in response time. In this paper, we propose a new information-retrieval algorithm based on formal concept analysis (FCA). The proposed algorithm deals with disjunctive and conjunctive queries. In fact, information retrieval is a direct application of FCA, which makes adapting this theory to the field an easy and intuitive task. In this context, we exploited the theoretical basis provided by FCA to design an efficient and flexible approach to information retrieval.
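A conjunctive or disjunctive query over a document-term incidence relation, the core operation underlying FCA-based retrieval, reduces to intersection and union over posting sets. A minimal sketch with a toy index (not the paper's algorithm):

```python
# Toy incidence relation: document -> set of index terms.
INDEX = {"d1": {"ir", "fca"}, "d2": {"ir", "web"}, "d3": {"fca"}}

def conjunctive(terms: set) -> set:
    """Documents containing ALL query terms; in FCA terms, the extent
    of the concept generated by the query's term set."""
    return {d for d, ts in INDEX.items() if terms <= ts}

def disjunctive(terms: set) -> set:
    """Documents containing ANY query term: the union of postings."""
    return {d for d, ts in INDEX.items() if terms & ts}

both = conjunctive({"ir", "fca"})
any_ = disjunctive({"ir", "fca"})
```

The paper's contribution is organizing such queries over the concept lattice rather than scanning documents; the set semantics are the same.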

Posted Content
TL;DR: Recall and precision are used to evaluate the efficacy of information retrieval systems; response time and the relevancy of the results are significant factors in user satisfaction.
Abstract: Information retrieval system evaluation revolves around the notion of relevant and non-relevant documents. Performance indicators such as precision and recall are used to determine how far the system satisfies the user's requirements. The effectiveness of information retrieval systems is essentially measured by comparing performance, functionality and systematic approach on a common set of queries and documents. Significance tests are used in functional, performance (precision and recall), collection and interface evaluation. We must focus on user satisfaction, which is the key parameter of performance evaluation. It identifies the collection of relevant documents within the retrieved set in a specific time interval. Recall and precision are used to evaluate the efficacy of information retrieval systems. Response time and the relevancy of the results are significant factors in user satisfaction. The search engines Yahoo and Google are compared based on precision and recall.
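The set-based precision and recall definitions the paper relies on are straightforward to compute:

```python
def precision_recall(retrieved: set, relevant: set):
    """Standard set-based definitions:
    precision = |retrieved ∩ relevant| / |retrieved|
    recall    = |retrieved ∩ relevant| / |relevant|"""
    hits = len(retrieved & relevant)
    return hits / len(retrieved), hits / len(relevant)

# 4 documents retrieved, 3 relevant overall, 2 of them retrieved:
p, r = precision_recall({"d1", "d2", "d3", "d4"}, {"d1", "d2", "d5"})
```

Comparing two engines on the same query set, as the paper does for Yahoo and Google, amounts to averaging these per-query values.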

Journal ArticleDOI
01 Jul 2016
TL;DR: A machine‐learning‐based method to dynamically evaluate and predict search performance several time‐steps ahead at each given time point of the search process during an exploratory search task is proposed.
Abstract: Most information retrieval (IR) systems consider relevance, usefulness, and quality of information objects (documents, queries) for evaluation, prediction, and recommendation, often ignoring the underlying search process of information seeking. This may leave out opportunities for making recommendations that analyze the search process and/or recommend an alternative search process instead of objects. To overcome this limitation, we investigated whether by analyzing a searcher's current processes we could forecast his likelihood of achieving a certain level of success with respect to search performance in the future. We propose a machine-learning-based method to dynamically evaluate and predict search performance several time-steps ahead at each given time point of the search process during an exploratory search task. Our prediction method uses a collection of features extracted from expression of information need and coverage of information. For testing, we used log data collected from 4 user studies that included 216 users (96 individuals and 60 pairs). Our results show 80-90% accuracy in prediction depending on the number of time-steps ahead. In effect, the work reported here provides a framework for evaluating search processes during exploratory search tasks and predicting search performance. Importantly, the proposed approach is based on user processes and is independent of any IR system.

Book
01 Jun 2016
TL;DR: In this article, the authors provide a comprehensive and up-to-date introduction to dynamic information retrieval modeling, the statistical modeling of IR systems that can adapt to change, learn with minimal computational footprint, and be responsive and adaptive.
Abstract: Dynamic aspects of Information Retrieval (IR), including changes found in data, users and systems, are increasingly being utilized in search engines and information filtering systems. Existing IR techniques are limited in their ability to optimize over changes, learn with minimal computational footprint and be responsive and adaptive. The objective of this tutorial is to provide a comprehensive and up-to-date introduction to Dynamic Information Retrieval Modeling, the statistical modeling of IR systems that can adapt to change. It will cover techniques ranging from classic relevance feedback to the latest applications of partially observable Markov decision processes (POMDPs) and a handful of useful algorithms and tools for solving IR problems incorporating dynamics.
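Classic relevance feedback, the tutorial's starting point, is often formulated as the Rocchio update. The sketch below uses standard textbook weights, not the book's notation:

```python
def rocchio(query: list, relevant: list, nonrelevant: list,
            alpha: float = 1.0, beta: float = 0.75, gamma: float = 0.15) -> list:
    """Rocchio update: move the query vector toward the centroid of
    relevant document vectors and away from the non-relevant centroid.
    alpha/beta/gamma are conventional default weights."""
    def centroid(vecs):
        if not vecs:
            return [0.0] * len(query)
        return [sum(c) / len(vecs) for c in zip(*vecs)]
    rel_c, non_c = centroid(relevant), centroid(nonrelevant)
    return [alpha * q + beta * r - gamma * n
            for q, r, n in zip(query, rel_c, non_c)]

# Two documents judged relevant, none judged non-relevant:
updated = rocchio([1.0, 0.0], [[0.0, 1.0], [0.0, 0.5]], [])
```

The dynamic-IR techniques the book covers (e.g., POMDPs) generalize this one-shot update into sequential decision making over a whole session.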

Journal ArticleDOI
TL;DR: A new technique to refine information retrieval searches to better represent the user's information need is presented; it enhances retrieval performance by using different query expansion techniques and applying linear combinations between them.
Abstract: Biomedical literature retrieval is becoming increasingly complex, and there is a fundamental need for advanced information retrieval systems. Information Retrieval (IR) programs scour unstructured materials such as text documents in large reserves of data that are usually stored on computers. IR is related to the representation, storage, and organization of information items, as well as to access. One of the main problems in IR is to determine which documents are relevant to the user's needs and which are not. Under the current regime, users cannot construct queries precisely enough to retrieve particular pieces of data from large reserves of data, and basic information retrieval systems produce low-quality search results. In this paper we present a new technique to refine information retrieval searches to better represent the user's information need, enhancing retrieval performance by using different query expansion techniques and applying linear combinations between them, where each combination linearly merges two expansion results at a time. Query expansion expands the search query, for example by finding synonyms and reweighting original terms, providing significantly more focused, particularized search results than basic search queries do. Retrieval performance is measured by variants of MAP (Mean Average Precision). According to our experimental results, the combination of the best query expansion results enhances the retrieved documents and outperforms our baseline by 21.06%, and even outperforms a previous study by 7.12%. We propose several query expansion techniques and their (linear) combinations to make user queries more cognizable to search engines and to produce higher-quality search results.
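The pairwise linear combination of two expansion results described above can be sketched as weighted score fusion. The weight alpha and the exact form of the combination are assumptions for illustration, not the paper's formula:

```python
def linear_combine(scores_a: dict, scores_b: dict, alpha: float) -> dict:
    """Sketch of linearly combining the per-document scores of two
    query-expansion runs: score = alpha * a + (1 - alpha) * b.
    Documents missing from a run contribute a score of 0."""
    docs = set(scores_a) | set(scores_b)
    return {d: alpha * scores_a.get(d, 0.0) + (1 - alpha) * scores_b.get(d, 0.0)
            for d in docs}

# Fuse a synonym-expansion run with a term-reweighting run, equal weights:
fused = linear_combine({"d1": 1.0, "d2": 0.4}, {"d2": 0.9}, alpha=0.5)
```

Ranking by the fused scores gives the combined run that would then be scored with MAP against the individual runs.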

Proceedings ArticleDOI
15 Jun 2016
TL;DR: LIvRE supports image-based queries, which are efficiently matched with the extracted frames of the indexed videos, and consists of three main system components (pre-processing, indexing and retrieval), as well as a scalable and responsive HTML5 user interface accessible from a web browser.
Abstract: The fast growth of video data requires robust, efficient, and scalable systems to allow for indexing and retrieval. These systems must be accessible from lightweight, portable and usable interfaces to help users in management and search of video content. This demo paper presents LIvRE, an extension of an existing open source tool for image retrieval to support video indexing. LIvRE consists of three main system components (pre-processing, indexing and retrieval), as well as a scalable and responsive HTML5 user interface accessible from a web browser. LIvRE supports image-based queries, which are efficiently matched with the extracted frames of the indexed videos.

Proceedings ArticleDOI
01 Sep 2016
TL;DR: A review of information retrieval strategies in web crawling is presented, classifying them into four categories, namely focused, distributed, incremental and hidden web crawlers, and comparing them on the basis of user-customized parameters.
Abstract: In today's scenario, the World Wide Web (WWW) is flooded with a huge amount of information. Due to the growing popularity of the internet, finding meaningful information among billions of information resources on the WWW is a challenging task. Information retrieval (IR) provides documents to end users that satisfy their information need. A search engine is used to extract valuable information from the internet. The web crawler is the principal part of a search engine; it is an automatic script or program which can browse the WWW in an automatic manner, a process known as web crawling. In this paper, a review of information retrieval strategies in web crawling is presented, classified into four categories, namely focused, distributed, incremental and hidden web crawlers. Finally, a comparative analysis of the various IR strategies is performed on the basis of user-customized parameters.
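A focused crawler, the first of the four categories, can be sketched as a breadth-first traversal that only expands pages an (assumed) topic classifier accepts. The link graph below is a toy stand-in for the Web; a real crawler fetches and parses pages over HTTP:

```python
from collections import deque

# Toy link graph and a toy "classifier" output (pages judged on-topic).
LINKS = {"seed": ["a", "b"], "a": ["c"], "b": [], "c": []}
TOPIC = {"seed", "a", "c"}

def focused_crawl(seed: str, limit: int) -> list:
    """Breadth-first crawl that follows outlinks only from pages deemed
    relevant to the topic: the defining behavior of a focused crawler."""
    frontier, visited, order = deque([seed]), set(), []
    while frontier and len(order) < limit:
        page = frontier.popleft()
        if page in visited:
            continue
        visited.add(page)
        order.append(page)
        if page in TOPIC:  # expand only on-topic pages
            frontier.extend(LINKS.get(page, []))
    return order

crawled = focused_crawl("seed", 10)
```

Page "b" is visited (it was linked from an on-topic page) but its outlinks would never be followed, which is how a focused crawler prunes the frontier.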

01 Jan 2016
TL;DR: Music information retrieval (MIR) is a multidisciplinary research endeavor that strives to develop innovative content-based searching schemes, novel interfaces, and evolving networked delivery mechanisms in an effort to make the world's vast store of music accessible to all.
Abstract: Music information retrieval (MIR) is “a multidisciplinary research endeavor that strives to develop innovative content‐based searching schemes, novel interfaces, and evolving networked delivery mechanisms in an effort to make the world's vast store of music accessible to all.” MIR was born from computational musicology in the 1960s and has since grown to have links with music cognition and audio engineering, a dedicated annual conference (ISMIR) and an annual evaluation campaign (MIREX). MIR combines machine learning with expert human knowledge to use digital music data – images of music scores, “symbolic” data such as MIDI files, audio, and metadata about musical items – for information retrieval, classification and estimation, or sequence labeling. This chapter gives a brief history of MIR, introduces classical MIR tasks from optical music recognition to music recommendation systems, and outlines some of the key questions and directions for future developments in MIR.

Proceedings ArticleDOI
07 Jul 2016
TL;DR: The Lattes Expertise Retrieval (LExR) test collection for research on academic expertise retrieval has been designed to provide a large-scale benchmark for two complementary expertise retrieval tasks, namely, expert profiling and expert finding.
Abstract: Expertise retrieval has been the subject of intense research over the past decade, particularly with the public availability of benchmark test collections for expertise retrieval in enterprises. Another domain which has seen comparatively less research on expertise retrieval is academic search. In this paper, we describe the Lattes Expertise Retrieval (LExR) test collection for research on academic expertise retrieval. LExR has been designed to provide a large-scale benchmark for two complementary expertise retrieval tasks, namely, expert profiling and expert finding. Unlike currently available test collections, which fully support only one of these tasks, LExR provides graded relevance judgments performed by expert judges separately for each task. In addition, LExR is both cross-organization and cross-area, encompassing candidate experts from all areas of knowledge working in research institutions all over Brazil. As a result, it constitutes a valuable resource for fostering new research directions on expertise retrieval in an academic setting.
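A common baseline for the expert finding task that collections like LExR support is a document-centric model: score each candidate by aggregating the query relevance of the documents associated with them. The toy sketch below illustrates that general approach; the relevance scores and author names are invented for illustration and are not part of LExR:

```python
def expert_scores(doc_relevance, authorship):
    """Document-centric expert finding: a candidate's score is the
    sum of the relevance scores of the documents they authored."""
    scores = {}
    for doc, authors in authorship.items():
        rel = doc_relevance.get(doc, 0.0)
        for author in authors:
            scores[author] = scores.get(author, 0.0) + rel
    return scores

# Hypothetical per-document relevance to a query, and author lists.
doc_relevance = {"d1": 0.9, "d2": 0.4, "d3": 0.7}
authorship = {"d1": ["ana"], "d2": ["ana", "bruno"], "d3": ["bruno"]}
ranking = sorted(expert_scores(doc_relevance, authorship).items(),
                 key=lambda kv: kv[1], reverse=True)
```

The complementary expert profiling task inverts the direction: given a candidate, rank the topics their documents are most relevant to.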

Proceedings ArticleDOI
11 Apr 2016
TL;DR: VizioMetrix is a platform that extracts visual information from the scientific literature and makes it available for use in new information retrieval applications and for studies that look at patterns of visual information across millions of papers.
Abstract: We present VizioMetrix, a platform that extracts visual information from the scientific literature and makes it available for use in new information retrieval applications and for studies that look at patterns of visual information across millions of papers. New ideas are conveyed visually in the scientific literature through figures --- diagrams, photos, visualizations, tables --- but these visual elements remain ensconced in the surrounding paper and are difficult to use directly for information discovery tasks or longitudinal analytics. Very few applications in information retrieval, academic search, or bibliometrics make direct use of the figures, and none attempt to recognize and exploit the type of figure, which can be used to augment interactions with a large corpus of scholarly literature. The VizioMetrix platform processes a corpus of documents, classifies the figures, organizes the results into a cloud-hosted database, and drives three distinct applications to support bibliometric analysis and information retrieval. The first application supports information retrieval tasks by allowing rapid browsing of classified figures. The second application supports longitudinal analysis of visual patterns in the literature and facilitates data mining of these figures. The third application supports crowdsourced tagging of figures to improve classification, augment search, and facilitate new kinds of analyses. Our initial corpus is the entirety of PubMed Central (PMC) and will be released to the public alongside this paper; we welcome other researchers to make use of these resources.

Journal ArticleDOI
TL;DR: Experimental results indicate that CR can significantly improve the retrieval performance with minimum effort and can provide a notably convenient user experience.