
Showing papers by "Zhen Liao" published in 2009


Proceedings Article
29 Mar 2009
TL;DR: A novel "sequential query prediction" approach that tries to grasp a user's search intent based on his/her past query sequence and its resemblance to historical query sequence models mined from massive search engine logs is proposed.
Abstract: Web query recommendation has long been considered a key feature of search engines. Building a good Web query recommendation system, however, is very difficult due to the fundamental challenge of predicting users' search intent, especially given the limited user context information. In this paper, we propose a novel "sequential query prediction" approach that tries to grasp a user's search intent based on his/her past query sequence and its resemblance to historical query sequence models mined from massive search engine logs. Different query sequence models were examined, including the naive variable-length N-gram model, the Variable Memory Markov (VMM) model, and our proposed Mixture Variable Memory Markov (MVMM) model. Extensive experiments were conducted to benchmark our sequence prediction algorithms against two conventional pairwise approaches on large-scale search logs extracted from a commercial search engine. Results show that the sequence-wise approaches significantly outperform the conventional pairwise ones in terms of prediction accuracy. In particular, our MVMM approach consistently leads the pack, making it an effective and practical approach to Web query recommendation.

132 citations
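
The abstract above names a naive variable-length N-gram model as one of the sequence models examined. As a rough illustration only, the following minimal sketch predicts the next query by matching the longest suffix of the user's query history seen in past sessions and backing off to shorter suffixes. The function names (`train_ngram`, `predict_next`), the back-off rule, and the toy sessions are assumptions made for this example; they are not taken from the paper, and the VMM and MVMM models are not shown.

```python
from collections import Counter, defaultdict


def train_ngram(sessions, max_order=3):
    """Count next-query frequencies for every context of 1..max_order queries."""
    counts = defaultdict(Counter)
    for session in sessions:
        for i in range(1, len(session)):
            for n in range(1, max_order + 1):
                if i - n < 0:
                    break  # not enough preceding queries for this order
                context = tuple(session[i - n:i])
                counts[context][session[i]] += 1
    return counts


def predict_next(counts, history, max_order=3, k=3):
    """Use the longest matching suffix of the history; back off to shorter ones."""
    for n in range(min(max_order, len(history)), 0, -1):
        context = tuple(history[-n:])
        if context in counts:
            return [query for query, _ in counts[context].most_common(k)]
    return []  # no context of any length was seen in training


# Toy query sessions (hypothetical data, for illustration only).
sessions = [
    ["python", "python list", "python list sort"],
    ["python", "python list", "python list append"],
    ["java", "java list", "java list sort"],
    ["python", "python list", "python list sort"],
]
model = train_ngram(sessions)
print(predict_next(model, ["python", "python list"]))
```

Longer contexts are tried first because they carry more of the user's recent intent; the shorter contexts only serve as a fallback when the longer sequence never occurred in the logs.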


Proceedings Article
07 Sep 2009
TL;DR: A hybrid probabilistic approach is proposed that combines a language model and a statistical machine translation model for tag recommendation, and experimental results validate the effectiveness of this method.
Abstract: Social tagging is a typical Web 2.0 application that lets users share knowledge and organize massive web resources. Choosing appropriate words as tags can be time-consuming for users, so a tag recommendation system is needed to accelerate this procedure. In this paper we formulate tag recommendation as a probabilistic ranking process. Specifically, we propose a hybrid probabilistic approach that combines a language model and a statistical machine translation model. Experimental results validate the effectiveness of our method.

3 citations
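
The abstract above does not say how the language model and the translation model are combined. One common choice, assumed here purely for illustration, is linear interpolation: score(tag | doc) = λ · P_lm(tag | doc) + (1 − λ) · Σ_w P_t(tag | w) · P(w | doc). In the sketch below, the function name `rank_tags`, the weight `lam`, and the toy translation table are all hypothetical.

```python
from collections import Counter


def rank_tags(doc_words, candidate_tags, translation, lam=0.5):
    """Rank candidate tags by an interpolated language/translation model score.

    translation[w][t] is an assumed word-to-tag translation probability P_t(t | w).
    """
    counts = Counter(doc_words)
    total = sum(counts.values())
    p_word = {w: c / total for w, c in counts.items()}  # maximum-likelihood P(w | doc)

    scores = {}
    for tag in candidate_tags:
        lm = p_word.get(tag, 0.0)  # language model: the tag appears as a document word
        tm = sum(translation.get(w, {}).get(tag, 0.0) * p for w, p in p_word.items())
        scores[tag] = lam * lm + (1 - lam) * tm
    ranking = sorted(scores, key=scores.get, reverse=True)
    return ranking, scores


# Toy document and translation table (hypothetical values).
doc_words = ["python", "code", "python", "tutorial"]
translation = {
    "python": {"programming": 0.6, "python": 0.4},
    "code": {"programming": 0.7},
    "tutorial": {"howto": 0.8},
}
ranking, scores = rank_tags(doc_words, ["python", "programming", "howto"], translation)
print(ranking)
```

The translation term lets a tag such as "programming" receive a nonzero score even though the word never occurs in the document, which a plain language model over document words cannot do.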


Proceedings Article
Zhen Liao, Ya Lou Huang, Mao Qiang Xie, Jie Liu, Yang Wang, Min Lu
12 Jul 2009
TL;DR: A new approach is proposed that uses a “Query Influence Weighting” algorithm to compute query importance and incorporates that importance into the loss function to guide model construction; experimental results show that the approach outperforms conventional Ranking SVM and other baselines.
Abstract: Ranking continues to play an important role in document retrieval and has attracted considerable attention. Existing ranking methods compute the loss for each query independently, ignoring the fact that minimizing the loss of one query may increase that of another when the two are contradictory. In principle, errors on important queries should be penalized more heavily. In this paper we propose a new approach, “Query Influence Weighting”, which computes the importance of each query and incorporates that importance into the loss function to guide model construction. We build a ranking model on top of a state-of-the-art method, Ranking SVM. Experimental results on two public datasets show that the “Query Influence Weighting” approach outperforms conventional Ranking SVM and other baselines. We further analyze the consistency of query influence between the training and test datasets and validate the effectiveness of our approach.
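
To make the idea of folding query importance into the loss concrete, here is a minimal sketch of a pairwise hinge loss, of the kind used by Ranking SVM, in which each query's contribution is scaled by a per-query weight. The abstract does not describe how the “Query Influence Weighting” algorithm computes those weights, so they are treated as given inputs here; the function name, the data layout, and the toy numbers are assumptions for illustration.

```python
def weighted_pairwise_hinge_loss(w, queries, query_weights, reg=0.0):
    """Pairwise hinge loss where each query's loss is scaled by its weight.

    queries maps a query id to a list of (feature_vector, relevance_label) pairs.
    query_weights maps a query id to its (externally supplied) importance.
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    loss = 0.5 * reg * dot(w, w)  # L2 regularization term
    for qid, docs in queries.items():
        q_loss = 0.0
        for xi, yi in docs:
            for xj, yj in docs:
                if yi > yj:  # document i should be ranked above document j
                    margin = dot(w, xi) - dot(w, xj)
                    q_loss += max(0.0, 1.0 - margin)
        loss += query_weights[qid] * q_loss
    return loss


# Toy data: the model ranks q1 correctly but gets q2 wrong (hypothetical values).
w = [1.0, 0.0]
queries = {
    "q1": [([2.0, 0.0], 1), ([0.0, 1.0], 0)],
    "q2": [([0.0, 1.0], 1), ([1.0, 0.0], 0)],
}
uniform = weighted_pairwise_hinge_loss(w, queries, {"q1": 1.0, "q2": 1.0})
weighted = weighted_pairwise_hinge_loss(w, queries, {"q1": 1.0, "q2": 3.0})
print(uniform, weighted)
```

With uniform weights this reduces to the usual unweighted pairwise hinge loss; raising the weight of `q2` makes its ranking error dominate the objective, which is the effect the abstract describes for important queries.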

Proceedings Article
12 Jul 2009
TL;DR: A novel query-level active ranking framework that aims to employ different ranking models for different queries is proposed; experimental results show that it can greatly reduce the labeling cost without decreasing ranking accuracy.
Abstract: Learning to rank is becoming increasingly popular in the machine learning and information retrieval fields. However, as with many other supervised approaches, one of its main problems is the lack of labeled data. Recently, there have been attempts to address the challenges of active sampling for learning to rank, but none of these methods takes into consideration the differences between queries. In this paper, we propose a novel query-level active ranking framework that aims to employ different ranking models for different queries. We use Rank SVM as the base ranker, implement a query-level active ranking algorithm, and apply it to document retrieval. Experimental results on a real-world data set show that our approach can greatly reduce the labeling cost without decreasing ranking accuracy.