Author

Mohammad Aliannejadi

Bio: Mohammad Aliannejadi is an academic researcher at the University of Amsterdam. The author has contributed to research on topics including Information needs and Relevance (information retrieval). The author has an h-index of 17 and has co-authored 80 publications receiving 908 citations. Previous affiliations of Mohammad Aliannejadi include Amirkabir University of Technology and the University of Zanjan.

Papers published on a yearly basis

Papers
Proceedings ArticleDOI
18 Jul 2019
TL;DR: This paper proposes a retrieval framework consisting of three components: question retrieval, question selection, and document retrieval; the question selection model takes into account the original query and previous question-answer interactions when selecting the next question.
Abstract: Users often fail to formulate their complex information needs in a single query. As a consequence, they may need to scan multiple result pages or reformulate their queries, which may be a frustrating experience. Alternatively, systems can improve user satisfaction by proactively asking questions of the users to clarify their information needs. Asking clarifying questions is especially important in conversational systems since they can only return a limited number of (often only one) result(s). In this paper, we formulate the task of asking clarifying questions in open-domain information-seeking conversational systems. To this end, we propose an offline evaluation methodology for the task and collect a dataset, called Qulac, through crowdsourcing. Our dataset is built on top of the TREC Web Track 2009-2012 data and consists of over 10K question-answer pairs for 198 TREC topics with 762 facets. Our experiments on an oracle model demonstrate that asking only one good question leads to over 170% retrieval performance improvement in terms of P@1, which clearly demonstrates the potential impact of the task. We further propose a retrieval framework consisting of three components: question retrieval, question selection, and document retrieval. In particular, our question selection model takes into account the original query and previous question-answer interactions while selecting the next question. Our model significantly outperforms competitive baselines. To foster research in this area, we have made Qulac publicly available.

193 citations
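As a rough illustration of the question selection component described above, the sketch below ranks candidate clarifying questions against the original query plus the question-answer history using TF-IDF similarity; this is a hypothetical stand-in, not the paper's actual selection model.

```python
# Hypothetical sketch of question selection in a Qulac-style pipeline:
# score candidate clarifying questions against the query plus the
# question-answer history. The paper's real model is more elaborate;
# this only illustrates the interface.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def select_next_question(query, qa_history, candidate_questions):
    """Return candidate questions ranked by similarity to the
    conversation context (query + previous question-answer turns)."""
    context = " ".join([query] + [q + " " + a for q, a in qa_history])
    matrix = TfidfVectorizer().fit_transform([context] + candidate_questions)
    scores = cosine_similarity(matrix[0], matrix[1:]).ravel()
    return sorted(zip(candidate_questions, scores), key=lambda x: -x[1])

ranked = select_next_question(
    "dinosaur",
    [("Are you interested in dinosaur fossils?", "no")],
    ["Would you like to see dinosaur movies?",
     "Do you want information on dinosaur museums?"],
)
print(ranked[0][0])  # highest-scoring clarifying question
```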

Journal ArticleDOI
01 Nov 2014
TL;DR: Data mining classification techniques including Decision Tree, Artificial Neural Networks, K-Nearest Neighbors, and Support Vector Machine are employed to improve churn prediction, and a hybrid methodology that considerably improves the value of several evaluation metrics is proposed.
Abstract: We have employed Decision Tree, Artificial Neural Networks, K-Nearest Neighbors, and Support Vector Machine to improve churn prediction. Using data from an Iranian mobile company, these techniques were evaluated and compared to each other. We propose a hybrid methodology that considerably improves the value of several evaluation metrics. Results show that recall and precision above 95% are readily achievable. A new methodology for extracting influential features is also introduced and evaluated. To survive in today's telecommunication business, it is imperative to identify customers who are likely to move to a competitor; customer churn prediction has therefore become an essential issue in the telecommunication business. In such a competitive business, a reliable churn predictor is regarded as priceless. This paper employs the data mining classification techniques Decision Tree, Artificial Neural Networks, K-Nearest Neighbors, and Support Vector Machine and compares their performance. Using the data of an Iranian mobile company, we not only evaluated these techniques against one another but also drew a parallel between several prominent data mining software packages. After analyzing the techniques' behavior and their particular strengths, we propose a hybrid methodology that considerably improves the value of several of the evaluation metrics. The results of the proposed methodology show that recall and precision above 95% are readily achievable. In addition, a new methodology for extracting influential features from the dataset is introduced and evaluated.

138 citations
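For orientation, here is a minimal scikit-learn sketch of the four-classifier comparison the paper describes, run on synthetic placeholder data; the Iranian operator's dataset and the paper's hybrid methodology are not reproduced.

```python
# Sketch of the four classifiers compared in the paper, using scikit-learn
# on placeholder data; the real dataset and the paper's hybrid methodology
# are not reproduced here.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, n_features=20,
                           weights=[0.85, 0.15],  # churners as minority class
                           random_state=0)

models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "ANN": MLPClassifier(max_iter=500, random_state=0),
    "KNN": KNeighborsClassifier(),
    "SVM": SVC(),
}
for name, model in models.items():
    recall = cross_val_score(model, X, y, cv=5, scoring="recall").mean()
    precision = cross_val_score(model, X, y, cv=5, scoring="precision").mean()
    print(f"{name}: recall={recall:.3f} precision={precision:.3f}")
```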

Proceedings ArticleDOI
TL;DR: This paper formulates the task of asking clarifying questions in open-domain information-seeking conversational systems, proposes an offline evaluation methodology for the task, and collects a dataset, called Qulac, through crowdsourcing; the proposed model significantly outperforms competitive baselines.
Abstract: Users often fail to formulate their complex information needs in a single query. As a consequence, they may need to scan multiple result pages or reformulate their queries, which may be a frustrating experience. Alternatively, systems can improve user satisfaction by proactively asking questions of the users to clarify their information needs. Asking clarifying questions is especially important in conversational systems since they can only return a limited number of (often only one) result(s). In this paper, we formulate the task of asking clarifying questions in open-domain information-seeking conversational systems. To this end, we propose an offline evaluation methodology for the task and collect a dataset, called Qulac, through crowdsourcing. Our dataset is built on top of the TREC Web Track 2009-2012 data and consists of over 10K question-answer pairs for 198 TREC topics with 762 facets. Our experiments on an oracle model demonstrate that asking only one good question leads to over 170% retrieval performance improvement in terms of P@1, which clearly demonstrates the potential impact of the task. We further propose a retrieval framework consisting of three components: question retrieval, question selection, and document retrieval. In particular, our question selection model takes into account the original query and previous question-answer interactions while selecting the next question. Our model significantly outperforms competitive baselines. To foster research in this area, we have made Qulac publicly available.

109 citations

Journal ArticleDOI
TL;DR: In this paper, a probabilistic model is proposed to find the mapping between user-annotated tags and locations' taste keywords, and the computed scores are integrated using learning to rank techniques.
Abstract: Personalized recommendation of Points of Interest (POIs) plays a key role in satisfying users on Location-Based Social Networks (LBSNs). In this article, we propose a probabilistic model to find the mapping between user-annotated tags and locations’ taste keywords. Furthermore, we introduce a dataset on locations’ contextual appropriateness and demonstrate its usefulness in predicting the contextual relevance of locations. We investigate four approaches to use our proposed mapping for addressing the data sparsity problem: one model to reduce the dimensionality of location taste keywords and three models to predict user tags for a new location. Moreover, we present different scores calculated from multiple LBSNs and show how we incorporate new information from the mapping into a POI recommendation approach. Then, the computed scores are integrated using learning to rank techniques. The experiments on two TREC datasets show the effectiveness of our approach, beating state-of-the-art methods.

80 citations
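A minimal sketch of the tag-to-taste-keyword mapping idea, estimating P(keyword | tag) from co-occurrence counts with additive smoothing; the paper's full probabilistic model and its learning-to-rank integration are not reproduced, and the function below is purely illustrative.

```python
# Hypothetical sketch of the tag-to-taste-keyword mapping as a smoothed
# co-occurrence estimate P(keyword | tag); the paper's probabilistic model
# and learning-to-rank integration are more elaborate.
from collections import Counter, defaultdict

def estimate_mapping(observations, alpha=0.1):
    """observations: (user_tag, taste_keyword) pairs from locations where
    both appear. Returns P(keyword | tag) with additive smoothing."""
    observations = list(observations)
    pair_counts = Counter(observations)
    tag_counts = Counter(tag for tag, _ in observations)
    vocab = {kw for _, kw in observations}
    mapping = defaultdict(dict)
    for (tag, kw), c in pair_counts.items():
        mapping[tag][kw] = (c + alpha) / (tag_counts[tag] + alpha * len(vocab))
    return mapping

mapping = estimate_mapping([("cozy", "coffee"), ("cozy", "tea"),
                            ("cozy", "coffee"), ("lively", "cocktails")])
print(mapping["cozy"])  # keyword distribution for the tag "cozy"
```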

Posted Content
TL;DR: This document presents a detailed description of the challenge on clarifying questions for dialogue systems (ClariQ) and provides a common evaluation framework to evaluate mixed-initiative conversations.
Abstract: This document presents a detailed description of the challenge on clarifying questions for dialogue systems (ClariQ). The challenge is organized as part of the Conversational AI challenge series (ConvAI3) at the Search Oriented Conversational AI (SCAI) EMNLP workshop in 2020. The main aim of conversational systems is to return an appropriate answer in response to user requests. However, some user requests may be ambiguous. In IR settings, such a situation is handled mainly through diversification of the search result page; it is much more challenging in dialogue settings with limited bandwidth. Therefore, in this challenge, we provide a common evaluation framework for evaluating mixed-initiative conversations. Participants are asked to rank clarifying questions in information-seeking conversations. The challenge is organized in two stages: in Stage 1, we evaluate the submissions in an offline setting on single-turn conversations. Top participants of Stage 1 get the chance to have their models tested by human annotators.

57 citations
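A small sketch of what a Stage-1-style offline evaluation could look like: given each topic's ranked clarifying questions and a set of relevant question IDs, compute Mean Reciprocal Rank. The actual ClariQ evaluation uses its own data format and metrics; the function below only illustrates the idea.

```python
# Illustrative offline evaluation: Mean Reciprocal Rank of the first
# relevant clarifying question per topic. ClariQ's own evaluation setup
# and data format differ; this is a sketch of the general idea.
def mean_reciprocal_rank(ranked_questions_by_topic, relevant_by_topic):
    total, n = 0.0, 0
    for topic, ranking in ranked_questions_by_topic.items():
        relevant = relevant_by_topic.get(topic, set())
        rr = 0.0
        for rank, qid in enumerate(ranking, start=1):
            if qid in relevant:
                rr = 1.0 / rank
                break
        total += rr
        n += 1
    return total / n if n else 0.0

print(mean_reciprocal_rank({"t1": ["q3", "q7", "q1"]}, {"t1": {"q7"}}))  # 0.5
```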


Cited by

Proceedings ArticleDOI
Hamed Zamani, Susan T. Dumais, Nick Craswell, Paul N. Bennett, Gord Lueck
20 Apr 2020
TL;DR: A taxonomy of clarification for open-domain search queries is identified by analyzing large-scale query reformulation data sampled from Bing search logs, and supervised and reinforcement learning models for generating clarifying questions learned from weak supervision data are proposed.
Abstract: Search queries are often short, and the underlying user intent may be ambiguous. This makes it challenging for search engines to predict possible intents, only one of which may pertain to the current user. To address this issue, search engines often diversify the result list and present documents relevant to multiple intents of the query. An alternative approach is to ask the user a question to clarify her information need. Asking clarifying questions is particularly important for scenarios with “limited bandwidth” interfaces, such as speech-only and small-screen devices. In addition, our user studies and large-scale online experiments show that asking clarifying questions is also useful in web search. Although some recent studies have pointed out the importance of asking clarifying questions, generating them for open-domain search tasks remains unstudied and is the focus of this paper. The lack of training data for this task, even within major search engines, makes it challenging. To mitigate this issue, we first identify a taxonomy of clarification for open-domain search queries by analyzing large-scale query reformulation data sampled from Bing search logs. This taxonomy leads us to a set of question templates and a simple yet effective slot filling algorithm. We further use this model as a source of weak supervision to automatically generate clarifying questions for training. Furthermore, we propose supervised and reinforcement learning models for generating clarifying questions learned from weak supervision data. We also investigate methods for generating candidate answers for each clarifying question, so users can select from a set of pre-defined answers. Human evaluation of the clarifying questions and candidate answers for hundreds of search queries demonstrates the effectiveness of the proposed solutions.

132 citations
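To make the template-plus-slot-filling idea concrete, here is an illustrative generator; the templates and aspects below are invented for the example, whereas the paper derives its templates from the clarification taxonomy and mines aspects from Bing's reformulation logs.

```python
# Illustrative template + slot-filling question generator. These templates
# and the aspect lists are invented; the paper's templates come from its
# clarification taxonomy, with aspects mined from query-reformulation logs.
TEMPLATES = [
    "What do you want to know about {query}?",
    "Which aspect of {query} are you interested in: {aspects}?",
    "Do you mean {query} as in {aspects}?",
]

def generate_clarifying_question(query, aspects):
    """Fill the most specific template the available slots allow."""
    if aspects:
        return TEMPLATES[1].format(query=query, aspects=", ".join(aspects))
    return TEMPLATES[0].format(query=query)

print(generate_clarifying_question("jaguar", ["the car", "the animal"]))
```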

Proceedings ArticleDOI
17 Oct 2018
TL;DR: The results demonstrate the importance of sparsity in neural IR models and show that dense representations can be pruned effectively, giving new insights about essential semantic features and their distributions.
Abstract: The availability of massive data and computing power allowing for effective data driven neural approaches is having a major impact on machine learning and information retrieval research, but these models have a basic problem with efficiency. Current neural ranking models are implemented as multistage rankers: for efficiency reasons, the neural model only re-ranks the top ranked documents retrieved by a first-stage efficient ranker in response to a given query. Neural ranking models learn dense representations causing essentially every query term to match every document term, making it highly inefficient or intractable to rank the whole collection. The reliance on a first stage ranker creates a dual problem: First, the interaction and combination effects are not well understood. Second, the first stage ranker serves as a "gate-keeper" or filter, effectively blocking the potential of neural models to uncover new relevant documents. In this work, we propose a standalone neural ranking model (SNRM) by introducing a sparsity property to learn a latent sparse representation for each query and document. This representation captures the semantic relationship between the query and documents, but is also sparse enough to enable constructing an inverted index for the whole collection. We parameterize the sparsity of the model to yield a retrieval model as efficient as conventional term based models. Our model gains in efficiency without loss of effectiveness: it not only outperforms the existing term matching baselines, but also performs similarly to the recent re-ranking based neural models with dense representations. Our model can also take advantage of pseudo-relevance feedback for further improvements. More generally, our results demonstrate the importance of sparsity in neural IR models and show that dense representations can be pruned effectively, giving new insights about essential semantic features and their distributions.

129 citations
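A toy numpy sketch of SNRM's core mechanics, with a random projection plus thresholded ReLU standing in for the trained sparse encoder: documents are indexed by their nonzero latent dimensions, and retrieval scores are dot products accumulated over posting lists.

```python
# Toy sketch of SNRM-style retrieval. A random projection with a high
# ReLU threshold stands in for the learned sparse encoder (the real model
# trains the encoder with a sparsity objective).
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(0)
W = rng.normal(size=(300, 1000))  # dense input dim -> latent dim

def encode(dense_vec, threshold=20.0):
    """Nonnegative, mostly-sparse latent representation; the threshold
    plays the role of the learned sparsity constraint."""
    return np.maximum(dense_vec @ W - threshold, 0.0)

# Build an inverted index over nonzero latent dimensions.
docs = {f"d{i}": rng.normal(size=300) for i in range(5)}
index = defaultdict(list)  # latent dim -> [(doc_id, weight)] posting list
for doc_id, vec in docs.items():
    latent = encode(vec)
    for dim in np.nonzero(latent)[0]:
        index[dim].append((doc_id, latent[dim]))

def retrieve(query_vec):
    q = encode(query_vec)
    scores = defaultdict(float)
    for dim in np.nonzero(q)[0]:
        for doc_id, w in index.get(dim, []):
            scores[doc_id] += q[dim] * w  # dot product via posting lists
    return sorted(scores.items(), key=lambda s: -s[1])

print(retrieve(rng.normal(size=300))[:3])  # top-scoring documents
```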

Journal ArticleDOI
TL;DR: The performance of state-of-the-art techniques for dealing with class imbalance in churn prediction is compared, and a recently developed expected maximum profit criterion is used as one of the main performance measures to offer more insight from a cost-benefit perspective.

120 citations
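As a hedged illustration of one rebalancing technique such comparisons typically include, the sketch below contrasts a classifier trained on imbalanced data with one trained after SMOTE oversampling (via the imbalanced-learn package); the expected maximum profit criterion itself is not implemented here.

```python
# Sketch of one class-rebalancing technique commonly compared in churn
# prediction: SMOTE oversampling. The expected maximum profit (EMP)
# criterion from the paper is not implemented in this sketch.
from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, weights=[0.95, 0.05],
                           random_state=0)  # heavily imbalanced classes
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

base = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_tr, y_tr)
balanced = LogisticRegression(max_iter=1000).fit(X_bal, y_bal)

print("recall, imbalanced:", recall_score(y_te, base.predict(X_te)))
print("recall, SMOTE:     ", recall_score(y_te, balanced.predict(X_te)))
```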