Proceedings Article

Learning to extract cross-session search tasks

TLDR
A semi-supervised clustering model is proposed based on the latent structural SVM framework, and a set of effective automatic annotation rules is proposed as weak supervision to relieve the burden of manual annotation in identifying cross-session search tasks.
Abstract
Search tasks, comprising a series of search queries serving the same information need, have recently been recognized as an accurate atomic unit for modeling user search intent. Most prior research in this area has focused on short-term search tasks within a single search session, and depends heavily on human annotations for supervised classification model learning. In this work, we target the identification of long-term, or cross-session, search tasks (transcending session boundaries) by investigating inter-query dependencies learned from users' searching behaviors. A semi-supervised clustering model is proposed based on the latent structural SVM framework, and a set of effective automatic annotation rules is proposed as weak supervision to relieve the burden of manual annotation. Experimental results based on a large-scale search log collected from Bing.com confirm the effectiveness of the proposed model in identifying cross-session search tasks and the utility of the introduced weak supervision signals. Our learned model enables a more comprehensive understanding of users' search behaviors via search logs and facilitates the development of dedicated search-engine support for long-term tasks.


Citations
Proceedings Article

Win-win search: dual-agent stochastic game in session search

TL;DR: This work mathematically models the dynamics of session search, including decision states, query changes, clicks, and rewards, as a dual-agent stochastic game: a cooperative game between the user and the search engine.
Proceedings Article

User Modeling for a Personal Assistant

TL;DR: A user modeling system that serves as the foundation of a personal assistant that identifies coherent contexts that correspond to tasks, interests, and habits, and an algorithm for identifying contexts that is 8 to 30 times faster than previous algorithms are presented.
Proceedings Article

Context Attentive Document Ranking and Query Suggestion

TL;DR: A two-level hierarchical recurrent neural network is introduced to learn search context representation of individual queries, search tasks, and corresponding dependency structure by jointly optimizing two companion retrieval tasks: document ranking and query suggestion.
Proceedings Article

Identifying and labeling search tasks via query-based hawkes processes

TL;DR: A probabilistic method for identifying and labeling search tasks based on two intuitive observations: queries that users issue temporally close together in many query sequences are likely to belong to the same search task, and different users with the same information needs tend to submit topically coherent search queries.
Proceedings Article

Mining, Ranking and Recommending Entity Aspects

TL;DR: This paper proposes an approach that mines, clusters, and ranks entity aspects from query logs, along with two approaches based on semantic relatedness and aspect transitions within user sessions, and finds that a combined approach gives the best performance.
References
Book

The Nature of Statistical Learning Theory

TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
Journal Article

Data clustering: a review

TL;DR: An overview of pattern clustering methods from a statistical pattern recognition perspective is presented, with a goal of providing useful advice and references to fundamental concepts accessible to the broad community of clustering practitioners.
Proceedings Article

Constrained K-means Clustering with Background Knowledge

TL;DR: This paper demonstrates how the popular k-means clustering algorithm can be profitably modified to make use of information about the problem domain that is available in addition to the data instances themselves.
Journal Article

Analysis of a very large web search engine query log

TL;DR: It is shown that web users type in short queries, mostly look at the first 10 results only, and seldom modify the query, suggesting that traditional information retrieval techniques may not work well for answering web search requests.
Book

Information Processing
