Learning to extract cross-session search tasks

doi:10.1145/2488388.2488507

Proceedings Article

Hongning Wang, +5 more

- pp 1353-1364

TLDR

The paper proposes a semi-supervised clustering model based on the latent structural SVM framework, together with a set of automatic annotation rules that serve as weak supervision and relieve the burden of manual annotation in identifying cross-session search tasks.

Abstract:

Search tasks, comprising a series of search queries serving the same information need, have recently been recognized as an accurate atomic unit for modeling user search intent. Most prior research in this area has focused on short-term search tasks within a single search session, and depends heavily on human annotations for supervised classification model learning. In this work, we target the identification of long-term, or cross-session, search tasks (transcending session boundaries) by investigating inter-query dependencies learned from users' searching behaviors. A semi-supervised clustering model is proposed based on the latent structural SVM framework, and a set of effective automatic annotation rules is proposed as weak supervision to relieve the burden of manual annotation. Experimental results based on a large-scale search log collected from Bing.com confirm the effectiveness of the proposed model in identifying cross-session search tasks and the utility of the introduced weak supervision signals. Our learned model enables a more comprehensive understanding of users' search behaviors via search logs and facilitates the development of dedicated search-engine support for long-term tasks.
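To make the idea of grouping queries across session boundaries concrete, the sketch below links each query to its most similar earlier query and starts a new task when no earlier query is similar enough. It is only an illustration under stated assumptions, not the paper's latent structural SVM model: the `Query` type, the `jaccard` word-overlap similarity (a stand-in for learned inter-query dependencies), the `extract_tasks` function, and the `threshold` value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Query:
    text: str
    session_id: int  # kept only to show that tasks may span sessions


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; a hypothetical stand-in for a learned score."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)


def extract_tasks(queries, threshold=0.1):
    """Assign a task id to each query by linking it to its best earlier match."""
    task_of = []  # task_of[i] is the task id of queries[i]
    next_task = 0
    for i, q in enumerate(queries):
        best_j, best_sim = -1, 0.0
        for j in range(i):
            sim = jaccard(q.text, queries[j].text)
            if sim > best_sim:
                best_j, best_sim = j, sim
        if best_j >= 0 and best_sim >= threshold:
            # Join the task of the most similar earlier query.
            task_of.append(task_of[best_j])
        else:
            # No sufficiently similar earlier query: open a new task.
            task_of.append(next_task)
            next_task += 1
    return task_of


# Toy log: three of the four queries concern one trip, spread over sessions.
log = [
    Query("cheap flights to rome", session_id=1),
    Query("weather boston", session_id=1),
    Query("rome hotels near colosseum", session_id=2),
    Query("flights to rome in may", session_id=3),
]
tasks = extract_tasks(log)
print(tasks)
```

In this toy log the three Rome-related queries receive the same task id even though they come from different sessions, while the unrelated query forms its own task. A learned model would replace the fixed similarity and threshold with weights estimated from data.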

Citations

Win-win search: dual-agent stochastic game in session search

User Modeling for a Personal Assistant

Context Attentive Document Ranking and Query Suggestion

Identifying and labeling search tasks via query-based hawkes processes

Mining, Ranking and Recommending Entity Aspects

Related Papers (5)

Beyond the session timeout: automatic hierarchical segmentation of search topics in query logs

Modeling and analysis of cross-session search tasks

Identifying task-based sessions in search engine query logs

Multitasking during web search sessions

Combining evidence for automatic web session identification