scispace - formally typeset
Search or ask a question

Showing papers on "Ranking (information retrieval) published in 2017"


Posted Content
TL;DR: A simple change to common loss functions used for multi-modal embeddings, inspired by hard negative mining, the use of hard negatives in structured prediction, and ranking loss functions, is introduced, which yields significant gains in retrieval performance.
Abstract: We present a new technique for learning visual-semantic embeddings for cross-modal retrieval. Inspired by hard negative mining, the use of hard negatives in structured prediction, and ranking loss functions, we introduce a simple change to common loss functions used for multi-modal embeddings. That, combined with fine-tuning and use of augmented data, yields significant gains in retrieval performance. We showcase our approach, VSE++, on MS-COCO and Flickr30K datasets, using ablation studies and comparisons with existing methods. On MS-COCO our approach outperforms state-of-the-art methods by 8.8% in caption retrieval and 11.3% in image retrieval (at R@1).

623 citations


Proceedings ArticleDOI
07 Aug 2017
TL;DR: K-NRM uses a translation matrix that models word-level similarities via word embeddings, a new kernel-pooling technique that uses kernels to extract multi-level soft match features, and a learning-to-rank layer that combines those features into the final ranking score.
Abstract: This paper proposes K-NRM, a kernel based neural model for document ranking. Given a query and a set of documents, K-NRM uses a translation matrix that models word-level similarities via word embeddings, a new kernel-pooling technique that uses kernels to extract multi-level soft match features, and a learning-to-rank layer that combines those features into the final ranking score. The whole model is trained end-to-end. The ranking layer learns desired feature patterns from the pairwise ranking loss. The kernels transfer the feature patterns into soft-match targets at each similarity level and enforce them on the translation matrix. The word embeddings are tuned accordingly so that they can produce the desired soft matches. Experiments on a commercial search engine's query log demonstrate the improvements of K-NRM over prior feature-based and neural-based states-of-the-art, and explain the source of K-NRM's advantage: Its kernel-guided embedding encodes a similarity metric tailored for matching query words to document words, and provides effective multi-level soft matches.

572 citations


Proceedings ArticleDOI
03 Apr 2017
TL;DR: This work proposes a novel document ranking model composed of two separate deep neural networks, one that matches the query and the document using a local representation, and another that Matching with distributed representations complements matching with traditional local representations.
Abstract: Models such as latent semantic analysis and those based on neural embeddings learn distributed representations of text, and match the query against the document in the latent semantic space. In traditional information retrieval models, on the other hand, terms have discrete or local representations, and the relevance of a document is determined by the exact matches of query terms in the body text. We hypothesize that matching with distributed representations complements matching with traditional local representations, and that a combination of the two is favourable. We propose a novel document ranking model composed of two separate deep neural networks, one that matches the query and the document using a local representation, and another that matches the query and the document using learned distributed representations. The two networks are jointly trained as part of a single neural network. We show that this combination or 'duet' performs significantly better than either neural network individually on a Web page ranking task, and significantly outperforms traditional baselines and other recently proposed models based on neural networks.

489 citations


Proceedings ArticleDOI
TL;DR: A unified framework takes advantage of both schools of thinking in information retrieval modelling and shows that the generative model learns to fit the relevance distribution over documents via the signals from the discriminative model to achieve a better estimation for document ranking.
Abstract: This paper provides a unified account of two schools of thinking in information retrieval modelling: the generative retrieval focusing on predicting relevant documents given a query, and the discriminative retrieval focusing on predicting relevancy given a query-document pair. We propose a game theoretical minimax game to iteratively optimise both models. On one hand, the discriminative model, aiming to mine signals from labelled and unlabelled data, provides guidance to train the generative model towards fitting the underlying relevance distribution over documents given the query. On the other hand, the generative model, acting as an attacker to the current discriminative model, generates difficult examples for the discriminative model in an adversarial way by minimising its discrimination objective. With the competition between these two models, we show that the unified framework takes advantage of both schools of thinking: (i) the generative model learns to fit the relevance distribution over documents via the signals from the discriminative model, and (ii) the discriminative model is able to exploit the unlabelled data selected by the generative model to achieve a better estimation for document ranking. Our experimental results have demonstrated significant performance gains as much as 23.96% on Precision@5 and 15.50% on MAP over strong baselines in a variety of applications including web search, item recommendation, and question answering.

416 citations


Proceedings ArticleDOI
07 Aug 2017
TL;DR: In this paper, a game theoretical minimax game is proposed to iteratively optimise both generative and discriminative models for document ranking, and the generative model is trained to fit the relevance distribution over documents via the signals from the discriminator.
Abstract: This paper provides a unified account of two schools of thinking in information retrieval modelling: the generative retrieval focusing on predicting relevant documents given a query, and the discriminative retrieval focusing on predicting relevancy given a query-document pair. We propose a game theoretical minimax game to iteratively optimise both models. On one hand, the discriminative model, aiming to mine signals from labelled and unlabelled data, provides guidance to train the generative model towards fitting the underlying relevance distribution over documents given the query. On the other hand, the generative model, acting as an attacker to the current discriminative model, generates difficult examples for the discriminative model in an adversarial way by minimising its discrimination objective. With the competition between these two models, we show that the unified framework takes advantage of both schools of thinking: (i) the generative model learns to fit the relevance distribution over documents via the signals from the discriminative model, and (ii) the discriminative model is able to exploit the unlabelled data selected by the generative model to achieve a better estimation for document ranking. Our experimental results have demonstrated significant performance gains as much as 23.96% on Precision@5 and 15.50% on MAP over strong baselines in a variety of applications including web search, item recommendation, and question answering.

413 citations


Proceedings ArticleDOI
02 Feb 2017
TL;DR: A counterfactual inference framework is presented that provides the theoretical basis for unbiased LTR via Empirical Risk Minimization despite biased data, and a Propensity-Weighted Ranking SVM is derived for discriminative learning from implicit feedback, where click models take the role of the propensity estimator.
Abstract: Implicit feedback (e.g., clicks, dwell times, etc.) is an abundant source of data in human-interactive systems. While implicit feedback has many advantages (e.g., it is inexpensive to collect, user centric, and timely), its inherent biases are a key obstacle to its effective use. For example, position bias in search rankings strongly influences how many clicks a result receives, so that directly using click data as a training signal in Learning-to-Rank (LTR) methods yields sub-optimal results. To overcome this bias problem, we present a counterfactual inference framework that provides the theoretical basis for unbiased LTR via Empirical Risk Minimization despite biased data. Using this framework, we derive a Propensity-Weighted Ranking SVM for discriminative learning from implicit feedback, where click models take the role of the propensity estimator. In contrast to most conventional approaches to de-biasing the data using click models, this allows training of ranking functions even in settings where queries do not repeat. Beyond the theoretical support, we show empirically that the proposed learning method is highly effective in dealing with biases, that it is robust to noise and propensity model misspecification, and that it scales efficiently. We also demonstrate the real-world applicability of our approach on an operational search engine, where it substantially improves retrieval performance.

388 citations


Proceedings ArticleDOI
07 Aug 2017
TL;DR: Anserini provides wrappers and extensions on top of core Lucene libraries that allow researchers to use more intuitive APIs to accomplish common research tasks, and aims to provide the best of both worlds to better align information retrieval practice and research.
Abstract: Software toolkits play an essential role in information retrieval research. Most open-source toolkits developed by academics are designed to facilitate the evaluation of retrieval models over standard test collections. Efforts are generally directed toward better ranking and less attention is usually given to scalability and other operational considerations. On the other hand, Lucene has become the de facto platform in industry for building search applications (outside a small number of companies that deploy custom infrastructure). Compared to academic IR toolkits, Lucene can handle heterogeneous web collections at scale, but lacks systematic support for evaluation over standard test collections. This paper introduces Anserini, a new information retrieval toolkit that aims to provide the best of both worlds, to better align information retrieval practice and research. Anserini provides wrappers and extensions on top of core Lucene libraries that allow researchers to use more intuitive APIs to accomplish common research tasks. Our initial efforts have focused on three functionalities: scalable, multi-threaded inverted indexing to handle modern web-scale collections, streamlined IR evaluation for ad hoc retrieval on standard test collections, and an extensible architecture for multi-stage ranking. Anserini ships with support for many TREC test collections, providing a convenient way to replicate competitive baselines right out of the box. Experiments verify that our system is both efficient and effective, providing a solid foundation to support future research.

358 citations


Journal ArticleDOI
TL;DR: This paper develops a novel deep multimodal distance metric learning (Deep-MDML) method, which adopts a new ranking model to use multi-modal features, including click features and visual features in DML.
Abstract: How do we retrieve images accurately? Also, how do we rank a group of images precisely and efficiently for specific queries? These problems are critical for researchers and engineers to generate a novel image searching engine. First, it is important to obtain an appropriate description that effectively represent the images. In this paper, multimodal features are considered for describing images. The images unique properties are reflected by visual features, which are correlated to each other. However, semantic gaps always exist between images visual features and semantics. Therefore, we utilize click feature to reduce the semantic gap. The second key issue is learning an appropriate distance metric to combine these multimodal features. This paper develops a novel deep multimodal distance metric learning (Deep-MDML) method. A structured ranking model is adopted to utilize both visual and click features in distance metric learning (DML). Specifically, images and their related ranking results are first collected to form the training set. Multimodal features, including click and visual features, are collected with these images. Next, a group of autoencoders is applied to obtain initially a distance metric in different visual spaces, and an MDML method is used to assign optimal weights for different modalities. Next, we conduct alternating optimization to train the ranking model, which is used for the ranking of new queries with click features. Compared with existing image ranking methods, the proposed method adopts a new ranking model to use multimodal features, including click features and visual features in DML. We operated experiments to analyze the proposed Deep-MDML in two benchmark data sets, and the results validate the effects of the method.

354 citations


Proceedings ArticleDOI
03 Apr 2017
TL;DR: Explicit Semantic Ranking is introduced, a new ranking technique that leverages knowledge graph embedding that represents queries and documents in the entity space and ranks them based on their semantic connections from their knowledgegraph embedding.
Abstract: This paper introduces Explicit Semantic Ranking (ESR), a new ranking technique that leverages knowledge graph embedding. Analysis of the query log from our academic search engine, SemanticScholar.org, reveals that a major error source is its inability to understand the meaning of research concepts in queries. To addresses this challenge, ESR represents queries and documents in the entity space and ranks them based on their semantic connections from their knowledge graph embedding. Experiments demonstrate ESR's ability in improving Semantic Scholar's online production system, especially on hard queries where word-based ranking fails.

341 citations


Proceedings ArticleDOI
07 Aug 2017
TL;DR: This paper proposes to train a neural ranking model using weak supervision, where labels are obtained automatically without human annotators or any external resources, and suggests that supervised neural ranking models can greatly benefit from pre-training on large amounts of weakly labeled data that can be easily obtained from unsupervised IR models.
Abstract: Despite the impressive improvements achieved by unsupervised deep neural networks in computer vision and NLP tasks, such improvements have not yet been observed in ranking for information retrieval. The reason may be the complexity of the ranking problem, as it is not obvious how to learn from queries and documents when no supervised signal is available. Hence, in this paper, we propose to train a neural ranking model using weak supervision, where labels are obtained automatically without human annotators or any external resources (e.g., click data). To this aim, we use the output of an unsupervised ranking model, such as BM25, as a weak supervision signal. We further train a set of simple yet effective ranking models based on feed-forward neural networks. We study their effectiveness under various learning scenarios (point-wise and pair-wise models) and using different input representations (i.e., from encoding query-document pairs into dense/sparse vectors to using word embedding representation). We train our networks using tens of millions of training instances and evaluate it on two standard collections: a homogeneous news collection (Robust) and a heterogeneous large-scale web collection (ClueWeb). Our experiments indicate that employing proper objective functions and letting the networks to learn the input representation based on weakly supervised data leads to impressive performance, with over 13% and 35% MAP improvements over the BM25 model on the Robust and the ClueWeb collections. Our findings also suggest that supervised neural ranking models can greatly benefit from pre-training on large amounts of weakly labeled data that can be easily obtained from unsupervised IR models.

320 citations


Proceedings ArticleDOI
TL;DR: The Fair Top-K Ranking (FTR) algorithm as discussed by the authors is the first algorithm grounded in statistical tests that can mitigate biases in the representation of an underrepresented group along a ranked list.
Abstract: In this work, we define and solve the Fair Top-k Ranking problem, in which we want to determine a subset of k candidates from a large pool of n >> k candidates, maximizing utility (i.e., select the "best" candidates) subject to group fairness criteria. Our ranked group fairness definition extends group fairness using the standard notion of protected groups and is based on ensuring that the proportion of protected candidates in every prefix of the top-k ranking remains statistically above or indistinguishable from a given minimum. Utility is operationalized in two ways: (i) every candidate included in the top-$k$ should be more qualified than every candidate not included; and (ii) for every pair of candidates in the top-k, the more qualified candidate should be ranked above. An efficient algorithm is presented for producing the Fair Top-k Ranking, and tested experimentally on existing datasets as well as new datasets released with this paper, showing that our approach yields small distortions with respect to rankings that maximize utility without considering fairness criteria. To the best of our knowledge, this is the first algorithm grounded in statistical tests that can mitigate biases in the representation of an under-represented group along a ranked list.

Journal ArticleDOI
TL;DR: An end-to-end dual-path convolutional network to learn the image and text representations based on an unsupervised assumption that each image/text group can be viewed as a class, which allows the system to directly learn from the data and fully utilize the supervision.
Abstract: Matching images and sentences demands a fine understanding of both modalities In this paper, we propose a new system to discriminatively embed the image and text to a shared visual-textual space In this field, most existing works apply the ranking loss to pull the positive image / text pairs close and push the negative pairs apart from each other However, directly deploying the ranking loss is hard for network learning, since it starts from the two heterogeneous features to build inter-modal relationship To address this problem, we propose the instance loss which explicitly considers the intra-modal data distribution It is based on an unsupervised assumption that each image / text group can be viewed as a class So the network can learn the fine granularity from every image/text group The experiment shows that the instance loss offers better weight initialization for the ranking loss, so that more discriminative embeddings can be learned Besides, existing works usually apply the off-the-shelf features, ie, word2vec and fixed visual feature So in a minor contribution, this paper constructs an end-to-end dual-path convolutional network to learn the image and text representations End-to-end learning allows the system to directly learn from the data and fully utilize the supervision On two generic retrieval datasets (Flickr30k and MSCOCO), experiments demonstrate that our method yields competitive accuracy compared to state-of-the-art methods Moreover, in language based person retrieval, we improve the state of the art by a large margin The code has been made publicly available

Posted Content
TL;DR: In this paper, the authors developed a methodology for measuring personalization in Web search results and applied their methodology to 200 users on Google Web Search and 100 users on Bing, finding that, on average, 11.7% of results showed differences due to personalization on Google, while 15.8 percent of results were personalized on Bing.
Abstract: Web search is an integral part of our daily lives. Recently, there has been a trend of personalization in Web search, where different users receive different results for the same search query. The increasing level of personalization is leading to concerns about Filter Bubble effects, where certain users are simply unable to access information that the search engines' algorithm decides is irrelevant. Despite these concerns, there has been little quantification of the extent of personalization in Web search today, or the user attributes that cause it. In light of this situation, we make three contributions. First, we develop a methodology for measuring personalization in Web search results. While conceptually simple, there are numerous details that our methodology must handle in order to accurately attribute differences in search results to personalization. Second, we apply our methodology to 200 users on Google Web Search and 100 users on Bing. We find that, on average, 11.7% of results show differences due to personalization on Google, while 15.8% of results are personalized on Bing, but that this varies widely by search query and by result ranking. Third, we investigate the user features used to personalize on Google Web Search and Bing. Surprisingly, we only find measurable personalization as a result of searching with a logged in account and the IP address of the searching user. Our results are a first step towards understanding the extent and effects of personalization on Web search engines today.

Proceedings ArticleDOI
06 Nov 2017
TL;DR: Experiments on both benchmark LETOR dataset and a large scale clickthrough data show that DeepRank can significantly outperform learning to ranking methods, and existing deep learning methods.
Abstract: This paper concerns a deep learning approach to relevance ranking in information retrieval (IR). Existing deep IR models such as DSSM and CDSSM directly apply neural networks to generate ranking scores, without explicit understandings of the relevance. According to the human judgement process, a relevance label is generated by the following three steps: 1) relevant locations are detected; 2) local relevances are determined; 3) local relevances are aggregated to output the relevance label. In this paper we propose a new deep learning architecture, namely DeepRank, to simulate the above human judgment process. Firstly, a detection strategy is designed to extract the relevant contexts. Then, a measure network is applied to determine the local relevances by utilizing a convolutional neural network (CNN) or two-dimensional gated recurrent units (2D-GRU). Finally, an aggregation network with sequential integration and term gating mechanism is used to produce a global relevance score. DeepRank well captures important IR characteristics, including exact/semantic matching signals, proximity heuristics, query term importance, and diverse relevance requirement. Experiments on both benchmark LETOR dataset and a large scale clickthrough data show that DeepRank can significantly outperform learning to ranking methods, and existing deep learning methods.

Posted Content
TL;DR: Li et al. as discussed by the authors used the output of an unsupervised ranking model, such as BM25, as a weak supervision signal to train a neural ranking model using weak supervision, where labels are obtained automatically without human annotators or any external resources (e.g., click data).
Abstract: Despite the impressive improvements achieved by unsupervised deep neural networks in computer vision and NLP tasks, such improvements have not yet been observed in ranking for information retrieval. The reason may be the complexity of the ranking problem, as it is not obvious how to learn from queries and documents when no supervised signal is available. Hence, in this paper, we propose to train a neural ranking model using weak supervision, where labels are obtained automatically without human annotators or any external resources (e.g., click data). To this aim, we use the output of an unsupervised ranking model, such as BM25, as a weak supervision signal. We further train a set of simple yet effective ranking models based on feed-forward neural networks. We study their effectiveness under various learning scenarios (point-wise and pair-wise models) and using different input representations (i.e., from encoding query-document pairs into dense/sparse vectors to using word embedding representation). We train our networks using tens of millions of training instances and evaluate it on two standard collections: a homogeneous news collection(Robust) and a heterogeneous large-scale web collection (ClueWeb). Our experiments indicate that employing proper objective functions and letting the networks to learn the input representation based on weakly supervised data leads to impressive performance, with over 13% and 35% MAP improvements over the BM25 model on the Robust and the ClueWeb collections. Our findings also suggest that supervised neural ranking models can greatly benefit from pre-training on large amounts of weakly labeled data that can be easily obtained from unsupervised IR models.

Proceedings ArticleDOI
21 Aug 2017
TL;DR: This work presents S3, a new repair synthesis engine that leverages programming-by-examples methodology to synthesize high-quality bug repairs and compares S3’s repair effectiveness with state-of-the-art synthesis engines Angelix, Enumerative, and CVC4.
Abstract: A notable class of techniques for automatic program repair is known as semantics-based. Such techniques, e.g., Angelix, infer semantic specifications via symbolic execution, and then use program synthesis to construct new code that satisfies those inferred specifications. However, the obtained specifications are naturally incomplete, leaving the synthesis engine with a difficult task of synthesizing a general solution from a sparse space of many possible solutions that are consistent with the provided specifications but that do not necessarily generalize. We present S3, a new repair synthesis engine that leverages programming-by-examples methodology to synthesize high-quality bug repairs. The novelty in S3 that allows it to tackle the sparse search space to create more general repairs is three-fold: (1) A systematic way to customize and constrain the syntactic search space via a domain-specific language, (2) An efficient enumeration- based search strategy over the constrained search space, and (3) A number of ranking features based on measures of the syntactic and semantic distances between candidate solutions and the original buggy program. We compare S3’s repair effectiveness with state-of-the-art synthesis engines Angelix, Enumerative, and CVC4. S3 can successfully and correctly fix at least three times more bugs than the best baseline on datasets of 52 bugs in small programs, and 100 bugs in real-world large programs.

Posted Content
Ledell Wu, Adam Fisch1, Sumit Chopra1, Keith Adams1, Antoine Bordes1, Jason Weston1 
TL;DR: Empirical results on a number of tasks show that StarSpace is highly competitive with existing methods, whilst also being generally applicable to new cases where those methods are not.
Abstract: We present StarSpace, a general-purpose neural embedding model that can solve a wide variety of problems: labeling tasks such as text classification, ranking tasks such as information retrieval/web search, collaborative filtering-based or content-based recommendation, embedding of multi-relational graphs, and learning word, sentence or document level embeddings. In each case the model works by embedding those entities comprised of discrete features and comparing them against each other -- learning similarities dependent on the task. Empirical results on a number of tasks show that StarSpace is highly competitive with existing methods, whilst also being generally applicable to new cases where those methods are not.

Posted Content
TL;DR: Categorize and evaluate those algorithms proposed during the period of 2003 to 2016 for content-based image retrieval and conclude with several promising directions for future research.
Abstract: The explosive increase and ubiquitous accessibility of visual data on the Web have led to the prosperity of research activity in image search or retrieval. With the ignorance of visual content as a ranking clue, methods with text search techniques for visual retrieval may suffer inconsistency between the text words and visual content. Content-based image retrieval (CBIR), which makes use of the representation of visual content to identify relevant images, has attracted sustained attention in recent two decades. Such a problem is challenging due to the intention gap and the semantic gap problems. Numerous techniques have been developed for content-based image retrieval in the last decade. The purpose of this paper is to categorize and evaluate those algorithms proposed during the period of 2003 to 2016. We conclude with several promising directions for future research.

Journal ArticleDOI
TL;DR: The conclusions are that the current YouTube algorithm favors the presence of reliable videos in upper rank positions in diabetes-related searches and how videos from reliable channels, such as hospitals and health organizations, are evolving in the ranking is analyzed.
Abstract: Health consumers are increasingly using the Internet to search for health information. The existence of overloaded, inaccurate, obsolete, or simply incorrect health information available on the Internet is a serious obstacle for finding relevant and good-quality data that actually helps patients. Search engines of multimedia Internet platforms are thought to help users to find relevant information according to their search. But, is the information recovered by those search engines from quality sources? Is the health information uploaded from reliable sources, such as hospitals and health organizations, easily available to patients? The availability of videos is directly related to the ranking position in YouTube search. The higher the ranking of the information is, the more accessible it is. The aim of this study is to analyze the ranking evolution of diabetes health videos on YouTube in order to discover how videos from reliable channels, such as hospitals and health organizations, are evolving in the ranking. The analysis was done by tracking the ranking of 2372 videos on a daily basis during a 30-day period using 20 diabetes-related queries. Our conclusions are that the current YouTube algorithm favors the presence of reliable videos in upper rank positions in diabetes-related searches.

Journal ArticleDOI
TL;DR: The proposed ranking index is intelligible and intrinsically superior to the original ranking index in seeking compromised solutions and showed that if the number of alternatives exceeds two or if the relative importance of the two separations should be considered, the proposedranking index would be a better choice.

Proceedings ArticleDOI
01 Oct 2017
TL;DR: A novel global ranking algorithm, applied to a Markov network built upon the 3D map, which takes account of not only visual similarities between individual 2D-3D matches, but also their global compatibilities among all matching pairs found in the scene.
Abstract: Given an image of a street scene in a city, this paper develops a new method that can quickly and precisely pinpoint at which location (as well as viewing direction) the image was taken, against a pre-stored large-scale 3D point-cloud map of the city. We adopt the recently developed 2D-3D direct feature matching framework for this task [23,31,32,42–44]. This is a challenging task especially for large-scale problems. As the map size grows bigger, many 3D points in the wider geographical area can be visually very similar–or even identical–causing severe ambiguities in 2D-3D feature matching. The key is to quickly and unambiguously find the correct matches between a query image and the large 3D map. Existing methods solve this problem mainly via comparing individual features’ visual similarities in a local and per feature manner, thus only local solutions can be found, inadequate for large-scale applications. In this paper, we introduce a global method which harnesses global contextual information exhibited both within the query image and among all the 3D points in the map. This is achieved by a novel global ranking algorithm, applied to a Markov network built upon the 3D map, which takes account of not only visual similarities between individual 2D-3D matches, but also their global compatibilities (as measured by co-visibility) among all matching pairs found in the scene. Tests on standard benchmark datasets show that our method achieved both higher precision and comparable recall, compared with the state-of-the-art.

Journal ArticleDOI
TL;DR: A linear programming method is developed to solve the MCDM problems with probabilistic linguistic information, and a case study about the evaluation of hospitals is conducted to illustrate the proposed method.


Posted Content
TL;DR: This tutorial introduces basic concepts and intuitions behind neural IR models, and places them in the context of traditional retrieval models, by introducing fundamental concepts of IR and different neural and non-neural approaches to learning vector representations of text.
Abstract: Neural ranking models for information retrieval (IR) use shallow or deep neural networks to rank search results in response to a query. Traditional learning to rank models employ machine learning techniques over hand-crafted IR features. By contrast, neural models learn representations of language from raw text that can bridge the gap between query and document vocabulary. Unlike classical IR models, these new machine learning based approaches are data-hungry, requiring large scale training data before they can be deployed. This tutorial introduces basic concepts and intuitions behind neural IR models, and places them in the context of traditional retrieval models. We begin by introducing fundamental concepts of IR and different neural and non-neural approaches to learning vector representations of text. We then review shallow neural IR methods that employ pre-trained neural term embeddings without learning the IR task end-to-end. We introduce deep neural networks next, discussing popular deep architectures. Finally, we review the current DNN models for information retrieval. We conclude with a discussion on potential future directions for neural IR.

Proceedings ArticleDOI
01 Sep 2017
TL;DR: An argument search framework for studying how people query for arguments, how to mine arguments from the web, or how to rank them is developed and a prototype search engine is built that relies on an initial, freely accessible index of nearly 300k arguments crawled from reliable web resources.
Abstract: Computational argumentation is expected to play a critical role in the future of web search. To make this happen, many search-related questions must be revisited, such as how people query for arguments, how to mine arguments from the web, or how to rank them. In this paper, we develop an argument search framework for studying these and further questions. The framework allows for the composition of approaches to acquiring, mining, assessing, indexing, querying, retrieving, ranking, and presenting arguments while relying on standard infrastructure and interfaces. Based on the framework, we build a prototype search engine, called args, that relies on an initial, freely accessible index of nearly 300k arguments crawled from reliable web resources. The framework and the argument search engine are intended as an environment for collaborative research on computational argumentation and its practical evaluation.

Proceedings ArticleDOI
21 Jul 2017
TL;DR: A novel memory-augmented Attribute Manipulation Network (AMNet) which can manipulate image representation at the attribute level and achieve remarkably good performance compared with well-designed baselines in terms of effectiveness of attribute manipulation and search accuracy.
Abstract: We introduce a new fashion search protocol where attribute manipulation is allowed within the interaction between users and search engines, e.g. manipulating the color attribute of the clothing from red to blue. It is particularly useful for image-based search when the query image cannot perfectly match users expectation of the desired product. To build such a search engine, we propose a novel memory-augmented Attribute Manipulation Network (AMNet) which can manipulate image representation at the attribute level. Given a query image and some attributes that need to modify, AMNet can manipulate the intermediate representation encoding the unwanted attributes and change them to the desired ones through following four novel components: (1) a dual-path CNN architecture for discriminative deep attribute representation learning, (2) a memory block with an internal memory and a neural controller for prototype attribute representation learning and hosting, (3) an attribute manipulation network to modify the representation of the query image with the prototype feature retrieved from the memory block, (4) a loss layer which jointly optimizes the attribute classification loss and a triplet ranking loss over triplet images for facilitating precise attribute manipulation and image retrieving. Extensive experiments conducted on two large-scale fashion search datasets, i.e. DARN and DeepFashion, have demonstrated that AMNet is able to achieve remarkably good performance compared with well-designed baselines in terms of effectiveness of attribute manipulation and search accuracy.

Proceedings ArticleDOI
Ke Yan1, Yonghong Tian1, Yaowei Wang1, Wei Zeng1, Tiejun Huang1 
01 Oct 2017
TL;DR: This paper model the relationship of vehicle images as multiple grains, and proposes two approaches to alleviate the precise vehicle search problem by exploiting multi-grain ranking constraints, which achieve the state-of-the-art performance on both datasets.
Abstract: Precise search of visually-similar vehicles poses a great challenge in computer vision, which needs to find exactly the same vehicle among a massive vehicles with visually similar appearances for a given query image. In this paper, we model the relationship of vehicle images as multiple grains. Following this, we propose two approaches to alleviate the precise vehicle search problem by exploiting multi-grain ranking constraints. One is Generalized Pairwise Ranking, which generalizes the conventional pairwise from considering only binary similar/dissimilar relations to multiple relations. The other is Multi-Grain based List Ranking, which introduces permutation probability to score a permutation of a multi-grain list, and further optimizes the ranking by the likelihood loss function. We implement the two approaches with multi-attribute classification in a multi-task deep learning framework. To further facilitate the research on precise vehicle search, we also contribute two high-quality and well-annotated vehicle datasets, named VD1 and VD2, which are collected from two different cities with diverse annotated attributes. As two of the largest publicly available precise vehicle search datasets, they contain 1,097,649 and 807,260 vehicle images respectively. Experimental results show that our approaches achieve the state-of-the-art performance on both datasets.

Journal ArticleDOI
TL;DR: A support‐vector‐machine‐based single‐model global quality assessment (SVMQA) method that outperformed the existing best single‐ model QA methods both in ranking provided protein models and in selecting the best model from the pool.
Abstract: Motivation The accurate ranking of predicted structural models and selecting the best model from a given candidate pool remain as open problems in the field of structural bioinformatics. The quality assessment (QA) methods used to address these problems can be grouped into two categories: consensus methods and single-model methods. Consensus methods in general perform better and attain higher correlation between predicted and true quality measures. However, these methods frequently fail to generate proper quality scores for native-like structures which are distinct from the rest of the pool. Conversely, single-model methods do not suffer from this drawback and are better suited for real-life applications where many models from various sources may not be readily available. Results In this study, we developed a support-vector-machine-based single-model global quality assessment (SVMQA) method. For a given protein model, the SVMQA method predicts TM-score and GDT_TS score based on a feature vector containing statistical potential energy terms and consistency-based terms between the actual structural features (extracted from the three-dimensional coordinates) and predicted values (from primary sequence). We trained SVMQA using CASP8, CASP9 and CASP10 targets and determined the machine parameters by 10-fold cross-validation. We evaluated the performance of our SVMQA method on various benchmarking datasets. Results show that SVMQA outperformed the existing best single-model QA methods both in ranking provided protein models and in selecting the best model from the pool. According to the CASP12 assessment, SVMQA was the best method in selecting good-quality models from decoys in terms of GDTloss. Availability and implementation SVMQA method can be freely downloaded from http://lee.kias.re.kr/SVMQA/SVMQA_eval.tar.gz. Contact jlee@kias.re.kr. Supplementary information Supplementary data are available at Bioinformatics online.

Proceedings ArticleDOI
03 Apr 2017
TL;DR: Zhang et al. as discussed by the authors proposed a Geo-Temporal sequential embedding rank (Geo-Teaser) model for POI recommendation based on the success of the word2vec framework.
Abstract: Point-of-interest (POI) recommendation is an important application for location-based social networks (LBSNs), which learns the user preference and mobility pattern from check-in sequences to recommend POIs. Previous studies show that modeling the sequential pattern of user check-ins is necessary for POI recommendation. Markov chain model, recurrent neural network, and the word2vec framework are used to model check-in sequences in previous work. However, all previous sequential models ignore the fact that check-in sequences on different days naturally exhibit the various temporal characteristics, for instance, "work" on weekday and "entertainment" on weekend. In this paper, we take this challenge and propose a Geo-Temporal sequential embedding rank (Geo-Teaser) model for POI recommendation. Inspired by the success of the word2vec framework to model the sequential contexts, we propose a temporal POI embedding model to learn POI representations under some particular temporal state. The temporal POI embedding model captures the contextual check-in information in sequences and the various temporal characteristics on different days as well. Furthermore, We propose a new way to incorporate the geographical influence into the pairwise preference ranking method through discriminating the unvisited POIs according to geographical information. Then we develop a geographically hierarchical pairwise preference ranking model. Finally, we propose a unified framework to recommend POIs combining these two models. To verify the effectiveness of our proposed method, we conduct experiments on two real-life datasets. Experimental results show that the Geo-Teaser model outperforms state-of-the-art models. Compared with the best baseline competitor, the Geo-Teaser model improves at least 20% on both datasets for all metrics.

Posted Content
TL;DR: This paper investigates two-branch neural networks for learning the similarity between image-sentence matching and region-phrase matching, and proposes two network structures that produce different output representations.
Abstract: Image-language matching tasks have recently attracted a lot of attention in the computer vision field. These tasks include image-sentence matching, i.e., given an image query, retrieving relevant sentences and vice versa, and region-phrase matching or visual grounding, i.e., matching a phrase to relevant regions. This paper investigates two-branch neural networks for learning the similarity between these two data modalities. We propose two network structures that produce different output representations. The first one, referred to as an embedding network, learns an explicit shared latent embedding space with a maximum-margin ranking loss and novel neighborhood constraints. Compared to standard triplet sampling, we perform improved neighborhood sampling that takes neighborhood information into consideration while constructing mini-batches. The second network structure, referred to as a similarity network, fuses the two branches via element-wise product and is trained with regression loss to directly predict a similarity score. Extensive experiments show that our networks achieve high accuracies for phrase localization on the Flickr30K Entities dataset and for bi-directional image-sentence retrieval on Flickr30K and MSCOCO datasets.