Showing papers by "Katsumi Tanaka" published in 2015


Proceedings ArticleDOI
01 Jul 2015
TL;DR: This paper proposes an approach that transforms word contexts across time based on their neural network representations and experimentally demonstrates the effectiveness of the method on the New York Times Annotated Corpus.
Abstract: In the current fast-paced world, people tend to possess limited knowledge about things from the past. For example, some young users may not know that the Walkman once played a role similar to the one the iPod plays today. In this paper, we approach the temporal correspondence problem in which, given an input term (e.g., iPod) and a target time (e.g., the 1980s), the task is to find the counterpart of the query that existed in the target time. We propose an approach that transforms word contexts across time based on their neural network representations. We then experimentally demonstrate the effectiveness of our method on the New York Times Annotated Corpus.

39 citations
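
The temporal correspondence task in the abstract above can be illustrated with a small sketch. It assumes (this is not stated in the abstract) that the "transformation across time" is a linear map fitted by least squares on a few anchor words whose meaning is taken to be stable, and that the counterpart is the nearest past word by cosine similarity. The 2-D vectors, the anchor list, and all function names are hypothetical toy values, not data or code from the paper.

```python
import math

# Hypothetical 2-D "embeddings"; real models use hundreds of dimensions.
PRESENT = {"music": (1.0, 0.0), "phone": (0.0, 1.0),
           "portable": (1.0, 1.0), "ipod": (2.0, 1.0)}
PAST = {"music": (0.0, 1.0), "phone": (-1.0, 0.0), "portable": (-1.0, 1.0),
        "walkman": (-1.0, 2.1), "radio": (-2.0, 0.5), "typewriter": (1.0, -0.5)}
# Assumed anchor words: terms presumed to mean the same in both periods.
ANCHORS = ["music", "phone", "portable"]


def fit_linear_map(pairs):
    """Least-squares 2x2 matrix M such that M x is close to y for each (x, y)."""
    # Normal equations: M = B A^-1 with A = sum(x x^T) and B = sum(y x^T).
    a11 = a12 = a22 = 0.0
    b11 = b12 = b21 = b22 = 0.0
    for (x1, x2), (y1, y2) in pairs:
        a11 += x1 * x1
        a12 += x1 * x2
        a22 += x2 * x2
        b11 += y1 * x1
        b12 += y1 * x2
        b21 += y2 * x1
        b22 += y2 * x2
    det = a11 * a22 - a12 * a12
    if det == 0:
        raise ValueError("anchor vectors do not span the space")
    # Inverse of the symmetric matrix A.
    i11, i12, i22 = a22 / det, -a12 / det, a11 / det
    return ((b11 * i11 + b12 * i12, b11 * i12 + b12 * i22),
            (b21 * i11 + b22 * i12, b21 * i12 + b22 * i22))


def apply_map(m, v):
    """Multiply the 2x2 matrix m by the 2-D vector v."""
    return (m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1])


def cosine(u, v):
    """Cosine similarity between two non-zero 2-D vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    return dot / (math.hypot(*u) * math.hypot(*v))


def find_counterpart(term):
    """Project a present-day term into the past space and return its nearest past word."""
    m = fit_linear_map([(PRESENT[w], PAST[w]) for w in ANCHORS])
    projected = apply_map(m, PRESENT[term])
    return max(PAST, key=lambda w: cosine(projected, PAST[w]))


print(find_counterpart("ipod"))
```

In this toy setup the past space is simply the present space rotated by 90 degrees, so the fitted map recovers that rotation exactly; with real embeddings the fit would only be approximate and the choice of anchors would matter.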


Journal ArticleDOI
TL;DR: This paper proposes estimating the focus time of a document, defined as the time period to which the document's content refers and considered a complementary dimension to the document's creation time.
Abstract: Statistical approach for estimating the focus time of text documents. Classification framework for categorizing documents into temporal and atemporal. Bi-Temporal Document Representation using document focus time and creation time. Time is an important aspect of text documents. While some documents are atemporal, many have strong temporal characteristics and contain content related to time. Such documents can be mapped to their corresponding time periods. In this paper, we propose estimating the focus time of documents, which is defined as the time period to which a document's content refers and which is considered a complementary dimension to the document's creation time. We propose several estimators of focus time that utilize statistical knowledge from external resources such as news article collections. The advantage of our approach is that document focus time can be estimated even for documents that contain no temporal expressions or only a few of them. We evaluate the effectiveness of our methods on diverse datasets of documents about historical events related to 5 countries. Our approach achieves an average error of less than 21 years on collections of Wikipedia pages, extracts from history-related books, and web pages, while using a total time frame of 113 years. We also demonstrate an example classification method to distinguish temporal from atemporal documents.

12 citations
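
The idea of estimating focus time without explicit temporal expressions can be sketched as follows. This is only one plausible reading of "statistical knowledge from external resources", not the paper's estimators: each word votes for years in proportion to how often it co-occurs with them in a reference collection, and the year with the largest total vote is returned. The co-occurrence counts, the three candidate years, and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical counts of how often each word appears near a year mention
# in some reference news collection (toy values, not real statistics).
WORD_YEAR_COUNTS = {
    "armistice": {1918: 40, 1945: 5, 1969: 1},
    "trenches": {1918: 30, 1945: 6},
    "apollo": {1945: 1, 1969: 50},
    "moon": {1918: 5, 1945: 5, 1969: 30},
    "the": {1918: 10, 1945: 10, 1969: 10},  # atemporal word: uniform votes
}


def year_scores(text):
    """Sum, over the document's known words, the share of each word's counts per year."""
    scores = defaultdict(float)
    for token in text.lower().split():
        counts = WORD_YEAR_COUNTS.get(token)
        if not counts:
            continue  # words without statistics carry no temporal signal here
        total = sum(counts.values())
        for year, count in counts.items():
            scores[year] += count / total
    return dict(scores)


def estimate_focus_year(text):
    """Return the highest-scoring year, or None if no word has statistics."""
    scores = year_scores(text)
    if not scores:
        return None
    return max(scores, key=scores.get)


print(estimate_focus_year("The armistice ended fighting in the trenches"))
```

Neither example sentence needs to contain a date for the estimate to work, which mirrors the advantage claimed in the abstract. A document whose words all lack statistics yields `None`, which could serve as a crude signal that the text is atemporal.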


Proceedings ArticleDOI
06 Dec 2015
TL;DR: This paper attempts to predict purchasing actions of Twitter users; the results suggest the feasibility of prediction when using relatively simple features collected from users' tweeting histories as well as from their social contexts.
Abstract: Social Network Services (SNS) have recently become commonly used for collecting and inferring user-related information. As SNS users leave multiple traces of their activities and thoughts, it has become possible to predict their future actions, e.g., a trip to a far-away destination or a purchase of a certain product. In this paper, we approach the latter problem and attempt to predict purchasing actions of Twitter users. In particular, we focus on the task of purchase prediction for two product categories: digital cameras and personal computers. The results suggest the feasibility of prediction when using relatively simple features collected from users' tweeting histories as well as from their social contexts.

7 citations


Proceedings ArticleDOI
30 Jul 2015
TL;DR: A method for finding foods that yield good results for a user based on the user's situation, and for presenting the recommended foods together with evidence that they yield good results, so that the user is convinced to choose them.
Abstract: We propose a method for finding foods that yield good results for a user based on the user's situation, and for presenting the recommended foods together with evidence that they yield good results, so that the user is convinced to choose them. We focus on the following problems: (1) existing recommendation systems only focus on “who chooses what”, but people are not always able to select the option that yields a good result; (2) even if the proposed method finds suitable foods, it is not sufficient to simply recommend them, because users cannot understand why the foods are good. To address these problems, we construct a model by analyzing food-related tweets on Twitter. In our experiments, we demonstrate that the proposed method successfully avoided recommending a food that yields a bad result, and that providing evidence in addition to the recommended foods is beneficial for decision making about food.

7 citations


Book ChapterDOI
09 Dec 2015
TL;DR: The behaviors of both askers and answerers were found to change over the time of their active participation: the askers tended to expand the range of categories for which they asked questions while the answerers tended to contract the range to which they answered questions.
Abstract: Macro and micro analyses of why and when users stop asking and/or answering questions on a community question answering (cQA) site were done for ten years’ worth of questions and answers posted on Yahoo! Chiebukuro (Japanese Yahoo! Answers), the biggest cQA site in Japan. The macro analysis focused on how long participants were active in the QA community from the viewpoints of several user characteristics. In turn, the micro analysis focused on how the participants’ behaviors change. The behaviors of both askers and answerers were found to change over the time of their active participation: the askers tended to expand the range of categories for which they asked questions while the answerers tended to contract the range of categories for which they answered questions.

5 citations


Proceedings ArticleDOI
06 Dec 2015
TL;DR: This work addresses the problem of entity identification on a microblog with special attention to indirect reference cases in which entities are not referred to by their names by developing features that are particularly important for certain types of indirect references and modeling dependency among referred entities by a Conditional Random Field (CRF) model.
Abstract: We address the problem of entity identification on a microblog with special attention to indirect reference cases in which entities are not referred to by their names. Most studies focus on identifying entities referred to by their full/partial name or abbreviation, while microblogs contain many indirectly mentioned entities, which are difficult to identify in such short text. We therefore tackled indirect reference cases by developing features that are particularly important for certain types of indirect references and by modeling dependency among referred entities with a Conditional Random Field (CRF) model. In addition, we model non-sequential order dependency while keeping the inference tractable by dynamically building dependency among entities. The experimental results suggest that our features were effective for indirect references, and that our CRF model with adaptive dependency was robust even when there were multiple mentions in a microblog and achieved the same high performance as the fully connected CRF model.

4 citations


Proceedings ArticleDOI
22 Oct 2015
TL;DR: This study investigates how explicit search roles assigned to group members affect their search performance and behavior in collaborative information seeking (CIS), and analyzes the existing Gatherer and Surveyor roles and their effects on search performance and query formulation behaviors.
Abstract: We investigate how explicit search roles assigned to group members affect their search performance and behavior in collaborative information seeking (CIS). Although several roles have been proposed in CIS, how these roles affect the search performances and behaviors of the members has not yet been explored. We focus on the existing Gatherer and Surveyor roles and analyze their effects on search performances and query formulation behaviors. The goal of our study is to understand the relationships between the roles and search behaviors and get insights into developing algorithms such as query suggestions or document rankings adaptive to the roles and behaviors. We conducted a user study with 20 participants in 10 pairs, where each pair of Gatherer and Surveyor were asked to perform a recall-oriented collaborative search task. We first analyzed the search performance of the two roles in terms of recall and diversity. We also analyzed how their queries were affected by their preceding queries or webpages that were visited through a questionnaire and log analysis. Finally, we discussed what algorithms would be required to support role-based CIS.

3 citations


Proceedings ArticleDOI
11 Dec 2015
TL;DR: This work targets sentential queries and proposes a method for improving their retrieval performance, called query rewriting, which acquires paraphrases from the noisy Web and uses them to avoid returning no answers.
Abstract: The effectiveness of retrieval decreases as query length increases. We target sentential queries and propose a method for improving their retrieval performance, called query rewriting. Briefly, given a sentential query, our method acquires paraphrases from the noisy Web and uses them to avoid returning no answers. In particular, since a relation can be represented either intensionally (referred to as paraphrase templates) or extensionally (referred to as coordinate tuples), the mutual reinforcement between them is taken into account. The experimental results show that for declarative sentences, the average precision of our method is 68.1%, compared to 44.2% for the baseline. In addition, the relative recall of our method is 95.9%, nearly 3 times that of the baseline. For questions, the average precision of our method is 46.9%, compared to 39.9% for the baseline. We also show the effectiveness of query rewriting in two applications.

1 citation


Book ChapterDOI
20 Apr 2015
TL;DR: This work proposes a method to acquire paraphrases from the Web in accordance with a given sentence, aiming at finding sentence-level paraphrases from the noisy Web instead of domain-specific corpora.
Abstract: We propose a method to acquire paraphrases from the Web in accordance with a given sentence. For example, consider an input sentence “Lemon is a high vitamin c fruit”. Its paraphrases are expressions or sentences that convey the same meaning but are different syntactically, such as “Lemons are rich in vitamin c”, or “Lemons contain a lot of vitamin c”. We aim at finding sentence-level paraphrases from the noisy Web, instead of domain-specific corpora. By observing search results of paraphrases, users are able to estimate the likelihood of the sentence as a fact. We evaluate the proposed method on five distinct semantic relations. Experiments show that our average precision is 60.5%, compared to an average precision of 44.15% for the TE/ASE method. In addition, we acquire 3 more paraphrases per input than the TE/ASE method.

Proceedings ArticleDOI
06 Dec 2015
TL;DR: This paper proposes a novel approach to search for documents that explain query topics and are easy to understand for average users by measuring the comprehensibility and the relevance of documents based on the concept of Query Domain Graph constructed from Wikipedia articles related to the query.
Abstract: Comprehensibility is an important quality aspect of documents. Incomprehensible documents are of little utility to readers even if they are relevant. However, for many difficult queries such as technical ones, the topically relevant documents tend to be characterized by poor comprehensibility. This makes it difficult for users to satisfy their information needs when searching for documents about difficult topics. In this paper, we propose a novel approach to search for documents that explain query topics and are easy to understand for average users. In particular, we measure the comprehensibility and the relevance of documents based on the concept of Query Domain Graph constructed from Wikipedia articles related to the query. For estimating document comprehensibility, we use the frequency and density of difficult terms within documents as well as a graph-based document representation. We then propose retrieval techniques that balance relevance and comprehensibility based on the concept of difficult word substitution, in which difficult words are replaced by sets of easy and related words.
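
The trade-off between relevance and comprehensibility described in the abstract above can be sketched in a few lines. The sketch makes several simplifying assumptions that are not from the paper: the set of difficult terms is hard-coded (the paper derives its notion of difficulty from a Query Domain Graph built from Wikipedia), comprehensibility is one minus the density of difficult terms, relevance is the fraction of query terms present, and the two are mixed linearly with a hypothetical weight `alpha`. Difficult word substitution is not modeled.

```python
# Hypothetical stand-in for a difficulty lexicon.
DIFFICULT_TERMS = {"decomposition", "hermitian", "operator", "spectrum"}

# Toy document collection; identifiers and texts are illustrative only.
DOCS = {
    "expert": "eigenvalue decomposition of the hermitian operator yields the spectrum",
    "plain": "an eigenvalue tells you how much a matrix stretches a vector",
    "off": "a recipe for lemon cake",
}


def tokenize(text):
    """Lowercase whitespace tokenization, enough for this toy example."""
    return text.lower().split()


def comprehensibility(text):
    """One minus the density of difficult terms (1.0 means no difficult terms)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    difficult = sum(1 for token in tokens if token in DIFFICULT_TERMS)
    return 1.0 - difficult / len(tokens)


def relevance(query, text):
    """Fraction of distinct query terms that occur in the text."""
    query_terms = set(tokenize(query))
    if not query_terms:
        return 0.0
    return len(query_terms & set(tokenize(text))) / len(query_terms)


def rank(query, docs, alpha=0.5):
    """Order document ids by alpha * relevance + (1 - alpha) * comprehensibility."""
    def score(doc_id):
        text = docs[doc_id]
        return alpha * relevance(query, text) + (1 - alpha) * comprehensibility(text)
    return sorted(docs, key=score, reverse=True)


print(rank("eigenvalue", DOCS))
```

With `alpha=0.5`, the plainly worded document outranks the jargon-heavy one even though both match the query, while the off-topic document falls last despite being easy to read. Setting `alpha=1.0` reduces the ranking to relevance alone.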

Proceedings ArticleDOI
11 Dec 2015
TL;DR: The authors' page revisiting methods were implemented within a Web browser, so that users can find previously browsed pages while browsing and searching, and they outperformed conventional baseline methods in terms of page revisiting.
Abstract: A recent study on information refinding reported that 44% of Web page visits and 33% of Web queries involved revisiting previously browsed pages. We propose methods for finding previously browsed pages regarded as coordinate pages of currently browsed pages. Intuitively, the notion of coordinate pages means that both of them belong to an identical class. To find the coordinate pages for given pages, we use a user's browsing and search behavior, such as her query log and tab usage, as well as link navigation. Our page revisiting methods were implemented within a Web browser, so that users can find those previously browsed pages while browsing and searching. We conducted experiments in which our methods outperformed conventional baseline methods in terms of page revisiting.

Proceedings ArticleDOI
17 Oct 2015
TL;DR: The workshop seeks to identify some of the problems and challenges facing the development of such tools and interfaces and to foster new ideas and findings that can shape or influence future research directions and developments.
Abstract: Held for the first time in conjunction with the ACM International Conference on Information and Knowledge Management (CIKM), NWSearch 2015 aims to bring together researchers, developers and practitioners who are interested in pushing the search boundary on the Web and exploring more novel forms of searches, interfaces, task formulations, and result organizations and presentations. In particular, the workshop seeks to identify some of the problems and challenges facing the development of such tools and interfaces and to foster new ideas and findings that can shape or influence future research directions and developments. The workshop organizers solicited contributions spanning a wide spectrum, from human-computer interaction at one extreme to system production and development at the other.