
Showing papers by "Eugene Agichtein" published in 2016


Proceedings ArticleDOI
07 Jul 2016
TL;DR: This work introduces Text2KB, a system that enriches question answering over a knowledge base with external text data, revisiting different phases of the KBQA process to demonstrate that text resources improve question interpretation, candidate generation, and ranking.
Abstract: One of the major challenges for automated question answering over Knowledge Bases (KBQA) is translating a natural language question into Knowledge Base (KB) entities and predicates. Previous systems have used a limited amount of training data to learn a lexicon that is later used for question answering. This approach does not make use of other potentially relevant text data, outside the KB, which could supplement the available information. We introduce a new system, Text2KB, that enriches question answering over a knowledge base by using external text data. Specifically, we revisit different phases of the KBQA process and demonstrate that text resources improve question interpretation, candidate generation, and ranking. Building on a state-of-the-art traditional KBQA system, Text2KB utilizes web search results, community question answering data, and general text document collections to detect question topic entities, map question phrases to KB predicates, and enrich the features of the candidates derived from the KB. Text2KB significantly improves performance over the baseline KBQA method, as measured on the popular WebQuestions dataset. The results and insights developed in this work can guide future efforts on combining textual and structured KB data for question answering.
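A minimal sketch (in Python, not the authors' code) of the idea behind Text2KB's text-based candidate features: score each KB-derived answer candidate by how strongly it is supported by text retrieved for the question from external sources such as web search snippets or CQA archives. The function name and the two features below are illustrative assumptions.

```python
import re
from collections import Counter
from typing import Iterable

def text_match_features(candidate: str, external_texts: Iterable[str]) -> dict:
    """Very rough lexical-support features for one answer candidate."""
    cand_lower = candidate.lower()
    cand_tokens = set(re.findall(r"\w+", cand_lower))
    mentions, token_overlap = 0, 0
    for text in external_texts:
        text_lower = text.lower()
        if cand_lower in text_lower:
            mentions += 1                                    # exact mention of the candidate
        tokens = Counter(re.findall(r"\w+", text_lower))
        token_overlap += sum(tokens[t] for t in cand_tokens)  # partial lexical support
    return {"mentions": mentions, "token_overlap": token_overlap}

# Hypothetical usage: snippets retrieved for "What is the capital of Hawaii?"
snippets = ["Honolulu is the capital of the U.S. state of Hawaii.",
            "Hawaii's capital and largest city is Honolulu."]
features = text_match_features("Honolulu", snippets)   # appended to KB-derived features
```

Features of this kind would be concatenated with the KB-derived features of each candidate before ranking.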

70 citations


Proceedings Article
21 Sep 2016
TL;DR: This paper presents CRQA, a crowd-powered, near real-time automatic question answering system for complex informational tasks that incorporates a crowdsourcing module for augmenting and validating candidate answers.
Abstract: Modern search engines have made dramatic progress in answering questions about facts, such as those that might be retrieved or directly inferred from a knowledge base. However, many other real user questions are more complex, such as requests for opinions, explanations, instructions, or advice for a particular situation, and are still largely beyond the competence of computer systems. As conversational agents become more popular, QA systems are increasingly expected to handle such complex questions, and to do so in (nearly) real time, as the searcher is unlikely to wait longer than a minute or two for an answer. One way to overcome some of the challenges in complex question answering is crowdsourcing. We explore two ways crowdsourcing can assist a question answering system that operates in (near) real time: by providing answer validation, which could be used to filter or re-rank the candidate answers, and by creating the answer candidates directly. In this paper we present CRQA, a crowd-powered, near real-time automatic question answering system for complex informational tasks that incorporates a crowdsourcing module for augmenting and validating the candidate answers. The crowd input, obtained in real time, is integrated into CRQA via a learning-to-rank model to select the final system answer. Our large-scale experiments, performed on a live stream of real user questions, show that even within a one-minute time limit, CRQA can produce answers of high quality. The returned answers are judged to be significantly better than those of the automatic system alone, and are often even preferred to answers posted days later on the original community question answering site. Our findings can be useful for developing hybrid human-computer systems for automatic question answering and conversational agents.
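The abstract says crowd input is integrated via a learning-to-rank model but does not specify the model. The sketch below stands in for it with a simple pointwise linear scorer over hypothetical pre-trained weights, to show how crowd ratings and crowd-authored candidates could re-rank the automatic candidates; it is an illustration under stated assumptions, not the paper's implementation.

```python
from statistics import mean

# Hypothetical weights, assumed to be learned offline on labeled questions.
W_SYSTEM, W_CROWD, W_CROWD_AUTHORED = 0.6, 0.3, 0.1

def select_final_answer(candidates):
    """candidates: list of dicts with keys 'answer', 'system_score',
    'crowd_ratings' (list of worker ratings), and 'crowd_authored' (bool)."""
    def score(c):
        crowd = mean(c["crowd_ratings"]) if c["crowd_ratings"] else 0.0
        return (W_SYSTEM * c["system_score"]
                + W_CROWD * crowd
                + W_CROWD_AUTHORED * float(c["crowd_authored"]))
    return max(candidates, key=score)["answer"]   # final CRQA answer

# Example: a crowd-written candidate with strong ratings can beat the system's top pick.
candidates = [
    {"answer": "Try a vinegar soak first.", "system_score": 0.9,
     "crowd_ratings": [2, 2], "crowd_authored": False},
    {"answer": "Blot it, then wash cold with oxygen bleach.", "system_score": 0.5,
     "crowd_ratings": [4, 4, 3], "crowd_authored": True},
]
print(select_final_answer(candidates))
```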

23 citations


Proceedings ArticleDOI
Dan Pelleg, Oleg Rokhlenko, Idan Szpektor, Eugene Agichtein, Ido Guy
27 Feb 2016
TL;DR: The results show that automated quality filtering indeed improves user engagement, usually aligning with, and often outperforming, crowd-based quality judgments.
Abstract: Social media gives voice to the people, but also opens the door to low-quality contributions, which degrade the experience for the majority of users. To address the latter issue, the prevailing solution is to rely on the 'wisdom of the crowds' to promote good content (e.g., via votes or 'like' buttons), or to downgrade bad content. Unfortunately, such crowd feedback may be sparse, subjective, and slow to accumulate. In this paper, we investigate the effects on users of automatically filtering question-answering content, using a combination of syntactic, semantic, and social signals. Using this filtering, a large-scale experiment with real users was performed to measure the resulting engagement and satisfaction. To our knowledge, this experiment represents the first reported large-scale user study of automatically curating social media content in real time. Our results show that automated quality filtering indeed improves user engagement, usually aligning with, and often outperforming, crowd-based quality judgments.
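The abstract names the signal families (syntactic, semantic, social) but not the concrete features or classifier. A minimal sketch, assuming a supervised quality classifier over illustrative, made-up features and training values, could look like this with scikit-learn:

```python
from sklearn.linear_model import LogisticRegression

# Each row: [answer_length, spelling_error_rate, question_answer_similarity,
#            answerer_reputation] -- hypothetical syntactic / semantic / social features.
X_train = [[120, 0.01, 0.80, 0.9],
           [  8, 0.20, 0.05, 0.1],
           [ 60, 0.02, 0.55, 0.4],
           [  5, 0.30, 0.10, 0.0]]
y_train = [1, 0, 1, 0]                      # 1 = show the answer, 0 = filter it out

clf = LogisticRegression().fit(X_train, y_train)

def keep_answer(features, threshold=0.5):
    """Return True if the predicted quality clears the display threshold."""
    return clf.predict_proba([features])[0][1] >= threshold

print(keep_answer([90, 0.02, 0.70, 0.6]))   # likely True for a well-formed answer
```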

12 citations


Proceedings Article
01 Jan 2016
TL;DR: The LiveQA track, now in its second year, focuses on real-time question answering for real user questions, and this year introduced a pilot task aimed at identifying the question intent.
Abstract: The LiveQA track, now in its second year, is focused on real-time question answering for real user questions. During the test period, real user questions are drawn from those newly submitted on a popular community question answering site, Yahoo Answers (YA), that have not yet been answered. These questions are sent to the participating systems, which provide an answer in real time. Returned answers are judged by the NIST assessors on a 4-level Likert scale. The most challenging aspects of this task are that the questions can be on any one of many popular topics, are informally stated, and are often complex and at least partly subjective. Furthermore, the participating systems must return an answer in under 60 seconds, which places additional, and realistic, constraints on the kind of processing a system can do. In addition to the main real-time question answering task, this year we introduced a pilot task aimed at identifying the question intent. Since human questions submitted on forums and CQA sites are verbose and contain many redundant or unnecessary terms, participants were challenged to identify the significant parts of the question: systems mark its main theme by specifying a list of spans that capture its main intent. This automatic "summary" of the question was evaluated by measuring its ROUGE- and METEOR-based similarity to a succinct rephrase of the question, manually provided by NIST assessors.
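To make the span-based evaluation concrete, here is a rough sketch of the kind of ROUGE-style unigram-recall overlap it relies on, computed between the system's selected spans and the assessor's rephrase. This is a plain-Python illustration (the track also uses METEOR, omitted here) and makes no claim to match the official scorer.

```python
import re
from collections import Counter

def rouge1_recall(system_spans, reference_rephrase):
    """Unigram recall of the reference against the concatenated system spans."""
    sys_tokens = Counter(re.findall(r"\w+", " ".join(system_spans).lower()))
    ref_tokens = Counter(re.findall(r"\w+", reference_rephrase.lower()))
    overlap = sum(min(cnt, sys_tokens[tok]) for tok, cnt in ref_tokens.items())
    return overlap / max(sum(ref_tokens.values()), 1)

# Hypothetical example: spans extracted from a verbose question vs. the assessor's summary.
score = rouge1_recall(["best way to remove a coffee stain", "from a wool coat"],
                      "how to remove coffee stains from wool")
print(round(score, 3))
```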

7 citations


Proceedings ArticleDOI
07 Jul 2016
TL;DR: The aim of this workshop is to bring together researchers in diverse areas working on this problem, including those from NLP, IR, social media and recommender systems communities, to conduct a more focused and open discussion.
Abstract: Web search engines have made great progress at answering factoid queries. However, they are not well tailored to more complex questions, especially those that require explanation and/or description. The WebQA workshop series aims at exploring diverse approaches to answering questions on the Web. This year, particular emphasis will be given to Community Question Answering (CQA), where comments posted by users engaged in forum communities can be used to answer new questions. Questions posted on the Web can be short and ambiguous (similarly to Web queries submitted to a search engine). These issues make the WebQA task more challenging than traditional QA, and finding the most effective approaches for it remains an open problem. The aim of this workshop is to bring together researchers in diverse areas working on this problem, including the NLP, IR, social media, and recommender systems communities. The workshop is specifically designed for the SIGIR audience; however, in contrast to the more formal main conference, its format is intended to foster a more focused and open discussion, encouraging the presentation of work in progress and late-breaking initial results in Web Question Answering. Both academic and industrial participation will be solicited, including keynotes and invited speakers.

6 citations


Proceedings ArticleDOI
01 Jun 2016
TL;DR: This work explores two ways crowdsourcing can assist a question answering system that operates in (near) real time: by providing answer validation, which could be used to filter or re-rank the candidate answers, and by creating the answer candidates directly.
Abstract: Modern search engines have made dramatic progress in answering many users' questions about facts, such as those that might be retrieved or directly inferred from a knowledge base. However, many other questions that real users ask are more complex, such as asking for opinions or advice for a particular situation, and are still largely beyond the competence of computer systems. As conversational agents become more popular, QA systems are increasingly expected to handle such complex questions, and to do so in (nearly) real time, as the searcher is unlikely to wait longer than a minute or two for an answer. One way to overcome some of the challenges in complex question answering is crowdsourcing. We explore two ways crowdsourcing can assist a question answering system that operates in (near) real time: by providing answer validation, which could be used to filter or re-rank the candidate answers, and by creating the answer candidates directly. Specifically, we focus on understanding the effects of time restrictions in the near real-time QA setting. Our experiments show that even within a one-minute time limit, crowd workers can produce reliable ratings for up to three answer candidates, and can generate answers that are better than those of an average automated system from the LiveQA 2015 shared task. Our findings can be useful for developing hybrid human-computer systems for automatic question answering and conversational agents.
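Since the central constraint studied here is the time budget, a small sketch of how a real-time pipeline might handle it: use whatever crowd ratings have arrived by the deadline, and fall back to the system's own top candidate otherwise. The data structures and the 60-second budget below are illustrative assumptions, not details taken from the paper.

```python
TIME_BUDGET_S = 60.0   # assumed per-question answer deadline

def pick_answer(candidates, crowd_ratings, elapsed_s):
    """candidates: list of (answer, system_score), best system score first.
    crowd_ratings: dict mapping answer -> list of worker ratings received so far."""
    if elapsed_s >= TIME_BUDGET_S or not crowd_ratings:
        return candidates[0][0]                      # deadline hit: trust the system ranking
    def crowd_score(item):
        answer, system_score = item
        ratings = crowd_ratings.get(answer, [])
        avg = sum(ratings) / len(ratings) if ratings else 0.0
        return (avg, system_score)                   # break rating ties by system score
    return max(candidates, key=crowd_score)[0]
```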

6 citations


Proceedings Article
01 Jan 2016
TL;DR: The two QA systems developed to participate in the TREC LiveQA 2016 shared task demonstrate the effectiveness of the introduced crowdsourcing module, which allowed the authors to achieve an improvement of ∼20% in average answer score over the fully automatic Emory-QA system.
Abstract: This paper describes the two QA systems we developed to participate in the TREC LiveQA 2016 shared task. The first run represents an improvement of our fully automatic real-time QA system from LiveQA 2015, Emory-QA. The second run, Emory-CRQA, which stands for Crowd-powered Real-time Question Answering, incorporates human feedback, in real time, to improve answer candidate generation and ranking. The base Emory-QA system uses the title and the body of a question to query Yahoo! Answers, Answers.com, WikiHow, and general web search, and retrieves a set of candidate answers along with their topics and contexts. This information is used to represent each candidate by a set of features, rank the candidates with a trained LambdaMART model, and return the top-ranked candidate as the answer to the question. The second run, Emory-CRQA, integrates a crowdsourcing module, which provides the system with additional answer candidates and quality ratings obtained in near real time (under one minute) from a crowd of workers. When Emory-CRQA receives a question, it is forwarded to the crowd, who can start working on an answer in parallel with the automatic pipeline. When the automatic pipeline is done generating and ranking candidates, a subset of them is immediately sent to the same workers who have been working on answering the question. Workers then rate the quality of all human- or system-generated candidate answers. The resulting ratings, as well as the original system scores, are used as features for the final re-ranking module, which returns the highest-scoring answer. The official run results indicate promising improvements for both runs compared to the best performing system from LiveQA 2015. Additionally, they demonstrate the effectiveness of the introduced crowdsourcing module, which allowed us to achieve an improvement of ∼20% in average answer score over the fully automatic Emory-QA system.
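The abstract mentions a trained LambdaMART model for ranking candidates but not a specific implementation. One common way to train such a ranker is LightGBM's lambdarank objective; the sketch below shows that setup on made-up toy data and is only an illustration of the technique, not the authors' feature set or training pipeline.

```python
import numpy as np
import lightgbm as lgb

# Toy features for 6 candidates belonging to 2 questions (3 candidates each):
# [retrieval_score, source_reliability, length_in_words] -- all values invented.
X = np.array([[0.9, 1.0, 120], [0.4, 0.5,  30], [0.1, 0.2,  10],
              [0.8, 0.7,  80], [0.6, 1.0, 200], [0.2, 0.3,  15]])
y = np.array([3, 1, 0, 2, 3, 0])         # graded relevance labels per candidate
group = [3, 3]                            # number of candidates per question, in order

ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=50, min_child_samples=1)
ranker.fit(X, y, group=group)

# At answer time, score the candidates of a new question and return the best one.
scores = ranker.predict(np.array([[0.7, 0.9, 90], [0.3, 0.4, 20]]))
best_candidate_index = int(np.argmax(scores))
```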

2 citations