Journal Issue

Patterns of query reformulation during Web searching

TL;DR
Results show that Reformulation and Assistance account for approximately 45% of all query reformulations; furthermore, the results demonstrate that the first- and second-order models provide the best predictability, between 28 and 40% overall and higher than 70% for some patterns.
Abstract
Query reformulation is a key user behavior during Web search. Our research goal is to develop predictive models of query reformulation during Web searching. This article reports results from a study in which we automatically classified the query-reformulation patterns for 964,780 Web searching sessions, composed of 1,523,072 queries, to predict the next query reformulation. We employed an n-gram modeling approach to describe the probability of users transitioning from one query-reformulation state to another to predict their next state. We developed first-, second-, third-, and fourth-order models and evaluated each model for accuracy of prediction, coverage of the dataset, and complexity of the possible pattern set. The results show that Reformulation and Assistance account for approximately 45% of all query reformulations; furthermore, the results demonstrate that the first- and second-order models provide the best predictability, between 28 and 40% overall and higher than 70% for some patterns. Implications are that the n-gram approach can be used for improving searching systems and searching assistance. © 2009 Wiley Periodicals, Inc.
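The n-gram idea in the abstract can be illustrated with a minimal sketch: count how often each history of reformulation states is followed by each next state, then predict the most frequent successor. This is not the authors' implementation. The function names `train_ngram` and `predict_next`, the toy sessions, and every state label other than Reformulation and Assistance (the only two named in the abstract) are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_ngram(sessions, order):
    """Count transitions from each `order`-length history to the next state.

    `sessions` is a list of state sequences; `order` is 1 for a first-order
    model, 2 for a second-order model, and so on.
    """
    counts = defaultdict(Counter)
    for session in sessions:
        for i in range(order, len(session)):
            history = tuple(session[i - order:i])
            counts[history][session[i]] += 1
    return counts


def predict_next(counts, history):
    """Return (most frequent next state, its relative frequency).

    Returns None when the history never occurred in training, which is
    the coverage problem that grows with the model order.
    """
    options = counts.get(tuple(history))
    if not options:
        return None
    state, n = options.most_common(1)[0]
    return state, n / sum(options.values())


# Toy sessions invented for illustration; they are NOT data from the study.
sessions = [
    ["New", "Reformulation", "Assistance", "Reformulation"],
    ["New", "Specialization", "Reformulation", "Assistance"],
    ["New", "Reformulation", "Assistance"],
    ["New", "Assistance", "Reformulation"],
]

first_order = train_ngram(sessions, order=1)
second_order = train_ngram(sessions, order=2)

print(predict_next(first_order, ["New"]))
print(predict_next(second_order, ["New", "Reformulation"]))
print(predict_next(second_order, ["Assistance", "New"]))  # unseen history
```

Higher-order models condition on longer histories, so the number of possible patterns grows and more histories go unseen. The sketch surfaces that trade-off between accuracy, coverage, and complexity by returning `None` for an unseen history.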


Citations

Defining a session on Web search engines

TL;DR: In this article, the authors explore three alternative methods for detection of session boundaries and show that defining sessions by query reformulation along with Internet Protocol address and cookie provides the best measure, resulting in an 82% increase in the count of sessions.
Proceedings Article

Generating Clarifying Questions for Information Retrieval

TL;DR: A taxonomy of clarification for open-domain search queries is identified by analyzing large-scale query reformulation data sampled from Bing search logs, and supervised and reinforcement learning models for generating clarifying questions learned from weak supervision data are proposed.
Proceedings Article

Learning user reformulation behavior for query auto-completion

TL;DR: The feasibility of exploiting the context to learn user reformulation behavior for boosting prediction performance is investigated and a supervised approach to query auto-completion is proposed, where three kinds of reformulation-related features are considered, including term-level, query-level and session-level features.
Journal Article

Natural language processing for aviation safety reports

TL;DR: The different NLP techniques designed and used in collaboration between the CLLE-ERSS research laboratory and the CFH/Safety Data company to manage and analyse aviation incident reports are described.
Proceedings Article

How do users respond to voice input errors?: lexical and phonetic query reformulation in voice search

TL;DR: A clearer picture is provided on how to further improve current voice search systems by evaluating the impacts of typical voice input errors on users' search progress and the effectiveness of different reformulation strategies on handling these errors.
References
Journal Article

Analysis of a very large web search engine query log

TL;DR: It is shown that web users type in short queries, mostly look at the first 10 results only, and seldom modify the query, suggesting that traditional information retrieval techniques may not work well for answering web search requests.
Journal Article

ASK for information retrieval: Part I. Background and theory

TL;DR: The results of the design study indicate that at least some of the premises of the project are reasonable, and that an ASK‐based information retrieval system is at least feasible.
Journal Article

How are we searching the world wide web?: a comparison of nine search engine transaction logs

TL;DR: In this paper, the authors report results from research that examines characteristics and changes in Web searching from nine studies of five Web search engines based in the US and Europe. They find that users are viewing fewer result pages, that searchers on US-based Web search engines use more query operators, that there are statistically significant differences in the use of Boolean operators and result pages viewed, and that one cannot necessarily apply results from studies of one particular Web search engine to another Web search engine.
Proceedings Article

Syskill & Webert: Identifying interesting web sites

TL;DR: The naive Bayesian classifier offers several advantages over other learning algorithms on this task, and an initial portion of a web page is sufficient for making predictions on its interestingness, substantially reducing the amount of network transmission required to make predictions.
Book

A Guide to Chi-Squared Testing

TL;DR: The Chi-Squared test of Pearson as discussed by the authors was used for a composite hypothesis and the Chi-squared test for an exponential family of distributions was used to test whether a given composite hypothesis is a composite or not.