Proceedings Article

Robust classification of rare queries using web knowledge

Abstract: 
We propose a methodology for building a practical robust query classification system that can identify thousands of query classes with reasonable accuracy, while dealing in real-time with the query volume of a commercial web search engine. We use a blind feedback technique: given a query, we determine its topic by classifying the web search results retrieved by the query. Motivated by the needs of search advertising, we primarily focus on rare queries, which are the hardest from the point of view of machine learning, yet in aggregation account for a considerable fraction of search engine traffic. Empirical evaluation confirms that our methodology yields a considerably higher classification accuracy than previously reported. We believe that the proposed methodology will lead to better matching of online ads to rare queries and overall to a better user experience.
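
The blind feedback idea in the abstract (classify the retrieved pages, then infer the query's topic from them) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy index, the keyword-based page classifier, the majority vote used for aggregation, and all function names are hypothetical and are not taken from the paper, which does not specify these details here.

```python
from collections import Counter
from typing import Callable, List, Optional

# Hypothetical toy data: a stand-in "search index" and keyword sets per class.
TOY_INDEX = {
    "page1": "cheap flights and hotel deals for travel to rome",
    "page2": "rome travel guide hotels flights museums",
    "page3": "roman empire history of ancient rome",
}
CLASS_KEYWORDS = {
    "travel": {"flights", "hotel", "hotels", "travel"},
    "history": {"empire", "ancient", "history"},
}


def toy_search(query: str, k: int = 10) -> List[str]:
    """Return up to k page texts that share at least one term with the query."""
    terms = set(query.lower().split())
    hits = [text for text in TOY_INDEX.values() if terms & set(text.split())]
    return hits[:k]


def toy_page_classifier(text: str) -> str:
    """Label a page with the class whose keyword set it overlaps most."""
    words = set(text.split())
    return max(CLASS_KEYWORDS, key=lambda c: len(CLASS_KEYWORDS[c] & words))


def classify_query(
    query: str,
    search: Callable[[str, int], List[str]],
    classify_page: Callable[[str], str],
    k: int = 10,
) -> Optional[str]:
    """Blind feedback: classify each retrieved page, then take a majority vote.

    Majority voting is an assumed aggregation rule chosen for simplicity.
    Returns None when the search retrieves nothing.
    """
    votes = Counter(classify_page(page) for page in search(query, k))
    if not votes:
        return None
    return votes.most_common(1)[0][0]


label = classify_query("rome", toy_search, toy_page_classifier)
```

The point of the design is that a rare query carries too little text to classify directly, whereas the pages it retrieves supply far more evidence. In this toy setup, two of the three pages retrieved for "rome" are labeled as travel, so the vote assigns that class to the query.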


Citations
Journal Article

Web page classification: Features and algorithms

TL;DR: This survey reviews work in Web page classification, notes the importance of Web-specific features and algorithms, describes state-of-the-art practices, and tracks the underlying assumptions behind the use of information from neighboring pages.
Proceedings Article

Learning query intent from regularized click graphs

TL;DR: This work aims to drastically increase the amount of training data through semi-supervised learning with click graphs, inferring the class memberships of unlabeled queries from those of labeled ones according to their proximity in a click graph.
Proceedings Article

Understanding user's query intent with Wikipedia

TL;DR: This paper proposes a general methodology for query intent classification that achieves much better coverage when classifying queries in an intent domain, even though the number of seed intent examples is very small, and that can be easily applied to various intent domains.
Journal Article

Mining Query Logs: Turning Search Usage Data into Knowledge

TL;DR: This survey introduces the discipline of query mining by showing its foundations and by analyzing the basic algorithms and techniques used to extract useful knowledge from query logs, a (potentially) infinite source of information.
References
Book

Pattern classification and scene analysis

TL;DR: This book provides a unified, comprehensive and up-to-date treatment of both statistical and descriptive methods for pattern recognition, including Bayesian decision theory, supervised and unsupervised learning, nonparametric techniques, discriminant analysis, clustering, preprocessing of pictorial data, spatial filtering, shape description techniques, perspective transformations, projective invariants, linguistic procedures, and artificial intelligence techniques for scene analysis.
Proceedings Article

Okapi at TREC

TL;DR: Much of the work involved investigating plausible methods of applying Okapi-style weighting to phrases; query expansion used terms from the top documents retrieved by a pilot search on topic terms.
Journal Article

Improving Retrieval Performance by Relevance Feedback

TL;DR: Relevance feedback is an automatic process, introduced over 20 years ago, designed to produce improved query formulations following an initial retrieval operation. Evaluation data are included to demonstrate the effectiveness of the various methods.