Author

Koji Eguchi

Other affiliations: Hitotsubashi University, National Institute of Informatics, Graduate University for Advanced Studies

Bio: Koji Eguchi is an academic researcher from Kobe University. The author has contributed to research on topics including topic models and inference. The author has an h-index of 15 and has co-authored 90 publications receiving 889 citations. Previous affiliations of Koji Eguchi include Hitotsubashi University and the National Institute of Informatics.

Papers published on a yearly basis

[Chart: number of papers per year; years shown run from 1997 to 2022]

Papers

Overview of CLIR Task at the Sixth NTCIR Workshop

Kazuaki Kishida¹, Kuang-hua Chen², Sukhoon Lee³, Kazuko Kuriyama, Noriko Kando⁴, Hsin-Hsi Chen², Sung-Hyon Myaeng⁵, Koji Eguchi⁴

Surugadai University¹, National Taiwan University², Chungnam National University³, National Institute of Informatics⁴, Information and Communications University⁵

01 Jan 2002

TL;DR: This paper describes the system of the NTCIR-6 CLIR task and its test collection (document sets, topic sets, and method for relevance judgments), and reviews the CLIR techniques used by participants and the search performance of runs submitted for evaluation.

Abstract: The purpose of this paper is to overview research efforts at the NTCIR-6 CLIR task, which is a project of large-scale retrieval experiments on cross-lingual information retrieval (CLIR) of Chinese, Japanese, Korean, and English. The project has three sub-tasks, multi-lingual IR (MLIR), bilingual IR (BLIR), and single language IR (SLIR), in which many research groups from ten countries or regions are participating. This paper describes the system of the NTCIR-6 CLIR task and its test collection (document sets, topic sets, and method for relevance judgments), and reviews CLIR techniques used by participants and search performance of runs submitted for evaluation.

134 citations

Proceedings Article

Sentiment Retrieval using Generative Models

Koji Eguchi¹, Victor Lavrenko²

National Institute of Informatics¹, University of Massachusetts Amherst²

22 Jul 2006

TL;DR: This paper proposes several sentiment information retrieval models in the framework of probabilistic language models, assuming that a user both inputs query terms expressing a certain topic and also specifies a sentiment polarity of interest in some manner.

Abstract: Ranking documents or sentences according to both topic and sentiment relevance should serve a critical function in helping users when topics and sentiment polarities of the targeted text are not explicitly given, as is often the case on the web. In this paper, we propose several sentiment information retrieval models in the framework of probabilistic language models, assuming that a user both inputs query terms expressing a certain topic and also specifies a sentiment polarity of interest in some manner. We combine sentiment relevance models and topic relevance models with model parameters estimated from training data, considering the topic dependence of the sentiment. Our experiments prove that our models are effective.

127 citations

Overview of the Web Retrieval Task at the Third NTCIR Workshop

Koji Eguchi¹, Keizo Oyama¹, Emi Ishida¹, Noriko Kando¹, Kazuko Kuriyama

National Institute of Informatics¹

01 Dec 2003

TL;DR: The Web Retrieval Task of the Third NTCIR Workshop assessed the retrieval effectiveness of Web search engine systems using a common data set and built a re-usable test collection suitable for evaluating such systems.

Abstract: This paper gives an overview of the Web Retrieval Task that was conducted from 2001 to 2002 at the Third NTCIR Workshop. In the Web Retrieval Task, we attempted to assess the retrieval effectiveness of Web search engine systems using a common data set, and built a re-usable test collection suitable for evaluating Web search engine systems. With these objectives, we constructed 100-gigabyte and 10-gigabyte document data that were mainly gathered from the ‘.jp’ domain. Participants were allowed to access those data only within the ‘Open Laboratory’ located at the National Institute of Informatics. Relevance judgments were performed on the retrieved documents, which were written in Japanese or English, by considering the relationship between the pages referenced by hyperlinks. Some evaluation measures were also applied to individual system results submitted by the participants.

55 citations

Proceedings Article

Interactive clustering of text collections according to a user-specified criterion

Ron Bekkerman¹, Hema Raghavan¹, James Allan¹, Koji Eguchi²

University of Massachusetts Amherst¹, National Institute of Informatics²

06 Jan 2007

TL;DR: This work proposes an interactive scheme for clustering document collections, based on any criterion of the user's preference, and demonstrates excellent results on clustering by sentiment, substantially outperforming an SVM trained on a large amount of labeled data.

Abstract: Document clustering is traditionally tackled from the perspective of grouping documents that are topically similar. However, many other criteria for clustering documents can be considered: for example, documents' genre or the author's mood. We propose an interactive scheme for clustering document collections, based on any criterion of the user's preference. The user holds an active position in the clustering process: first, she chooses the types of features suitable to the underlying task, leading to a task-specific document representation. She can then provide examples of features-- if such examples are emerging, e.g., when clustering by the author's sentiment, words like 'perfect', 'mediocre', 'awful' are intuitively good features. The algorithm proceeds iteratively, and the user can fix errors made by the clustering system at the end of each iteration. Such an interactive clustering method demonstrates excellent results on clustering by sentiment, substantially outperforming an SVM trained on a large amount of labeled data. Even if features are not provided because they are not intuitively obvious to the user--e.g., what would be good features for clustering by genre using part-of-speech trigrams?--our multi-modal clustering method performs significantly better than k-means and Latent Dirichlet Allocation (LDA).

47 citations

Proceedings Article

Symmetric Correspondence Topic Models for Multilingual Text Analysis

Kosuke Fukumasu¹, Koji Eguchi¹, Eric P. Xing²

Kobe University¹, Carnegie Mellon University²

03 Dec 2012

TL;DR: A new topic model, Symmetric Correspondence LDA (SymCorrLDA), extends CorrLDA with a hidden variable that controls the pivot language, and is shown to be more effective than some other existing multilingual topic models.

Abstract: Topic modeling is a widely used approach to analyzing large text collections. A small number of multilingual topic models have recently been explored to discover latent topics among parallel or comparable documents, such as in Wikipedia. Other topic models that were originally proposed for structured data are also applicable to multilingual documents. Correspondence Latent Dirichlet Allocation (CorrLDA) is one such model; however, it requires a pivot language to be specified in advance. We propose a new topic model, Symmetric Correspondence LDA (SymCorrLDA), that incorporates a hidden variable to control a pivot language, in an extension of CorrLDA. We experimented with two multilingual comparable datasets extracted from Wikipedia and demonstrate that SymCorrLDA is more effective than some other existing multilingual topic models.

37 citations

Cited by

Pattern Recognition and Machine Learning

Christopher M. Bishop¹

Microsoft¹

01 Jan 2006

TL;DR: This text covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.

Abstract: Probability Distributions; Linear Models for Regression; Linear Models for Classification; Neural Networks; Kernel Methods; Sparse Kernel Machines; Graphical Models; Mixture Models and EM; Approximate Inference; Sampling Methods; Continuous Latent Variables; Sequential Data; Combining Models.

10,141 citations

Book

Opinion Mining and Sentiment Analysis

Bo Pang¹, Lillian Lee²

Yahoo!¹, Cornell University²

08 Jul 2008

TL;DR: This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems and focuses on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis.

Abstract: An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people now can, and do, actively use information technologies to seek out and understand the opinions of others. The sudden eruption of activity in the area of opinion mining and sentiment analysis, which deals with the computational treatment of opinion, sentiment, and subjectivity in text, has thus occurred at least in part as a direct response to the surge of interest in new systems that deal directly with opinions as a first-class object. This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems. Our focus is on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis. We include material on summarization of evaluative text and on broader issues regarding privacy, manipulation, and economic impact that the development of opinion-oriented information-access services gives rise to. To facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is also provided.

7,452 citations

Book

Sentiment Analysis and Opinion Mining

Bing Liu¹

University of Illinois at Chicago¹

01 May 2012

TL;DR: Sentiment analysis and opinion mining is the field of study that analyzes people's opinions, sentiments, evaluations, attitudes, and emotions from written language. It is one of the most active research areas in natural language processing and is also widely studied in data mining, Web mining, and text mining.

Abstract: Sentiment analysis and opinion mining is the field of study that analyzes people's opinions, sentiments, evaluations, attitudes, and emotions from written language. It is one of the most active research areas in natural language processing and is also widely studied in data mining, Web mining, and text mining. In fact, this research has spread outside of computer science to the management sciences and social sciences due to its importance to business and society as a whole. The growing importance of sentiment analysis coincides with the growth of social media such as reviews, forum discussions, blogs, micro-blogs, Twitter, and social networks. For the first time in human history, we now have a huge volume of opinionated data recorded in digital form for analysis. Sentiment analysis systems are being applied in almost every business and social domain because opinions are central to almost all human activities and are key influencers of our behaviors. Our beliefs and perceptions of reality, and the choices we make, are largely conditioned on how others see and evaluate the world. For this reason, when we need to make a decision we often seek out the opinions of others. This is true not only for individuals but also for organizations. This book is a comprehensive introductory and survey text. It covers all important topics and the latest developments in the field with over 400 references. It is suitable for students, researchers and practitioners who are interested in social media analysis in general and sentiment analysis in particular. Lecturers can readily use it in class for courses on natural language processing, social media analysis, text mining, and data mining. Lecture slides are also available online.

4,515 citations

Journal Article

Big Data: A Survey

Min Chen¹, Shiwen Mao², Yunhao Liu³

Huazhong University of Science and Technology¹, Auburn University², Tsinghua University³

01 Apr 2014 - Mobile Networks and Applications

TL;DR: This survey reviews the background and state of the art of big data, related technologies, the four phases of the big data value chain, and representative applications including enterprise management, Internet of Things, online social networks, medical applications, collective intelligence, and smart grid.

Abstract: In this paper, we review the background and state-of-the-art of big data. We first introduce the general background of big data and review related technologies, such as cloud computing, Internet of Things, data centers, and Hadoop. We then focus on the four phases of the value chain of big data, i.e., data generation, data acquisition, data storage, and data analysis. For each phase, we introduce the general background, discuss the technical challenges, and review the latest advances. We finally examine several representative applications of big data, including enterprise management, Internet of Things, online social networks, medical applications, collective intelligence, and smart grid. These discussions aim to provide a comprehensive overview and big-picture to readers of this exciting area. This survey is concluded with a discussion of open problems and future directions.

2,303 citations

Sentiment Analysis and Subjectivity

Bing Liu¹

University of Illinois at Chicago¹

01 Jan 2010

TL;DR: This chapter focuses on opinion expressions that convey people's positive or negative sentiments, where opinions are subjective expressions that describe people's sentiments, appraisals or feelings toward entities, events and their properties.

Abstract: Textual information in the world can be broadly categorized into two main types: facts and opinions. Facts are objective expressions about entities, events and their properties. Opinions are usually subjective expressions that describe people’s sentiments, appraisals or feelings toward entities, events and their properties. The concept of opinion is very broad. In this chapter, we only focus on opinion expressions that convey people’s positive or negative sentiments. Much of the existing research on textual information processing has been focused on mining and retrieval of factual information, e.g., information retrieval, Web search, text classification, text clustering and many other text mining and natural language processing tasks. Little work had been done on the processing of opinions until only recently. Yet, opinions are so important that whenever we need to make a decision we want to hear others’ opinions. This is not only true for individuals but also true for organizations. One of the main reasons for the lack of study on opinions is the fact that there was little opinionated text available before the World Wide Web. Before the Web, when an individual needed to make a decision, he/she typically asked for opinions from friends and families. When an organization wanted to find the opinions or sentiments of the general public about its products and services, it conducted opinion polls, surveys, and focus groups. However, with the Web, especially with the explosive growth of the user-generated content on the Web in the past few years, the world has been transformed. The Web has dramatically changed the way that people express their views and opinions. They can now post reviews of products at merchant sites and express their views on almost anything in Internet forums, discussion groups, and blogs, which are collectively called the user-generated content. This online word-of-mouth behavior represents new and measurable sources of information with many practical applications. Now if one wants to buy a product, he/she is no longer limited to asking his/her friends and families because there are many product reviews on the Web which give opinions of existing users of the product. For a company, it may no longer be necessary to conduct surveys, organize focus groups or employ external consultants in order to find consumer opinions about its products and those of its competitors because the user-generated content on the Web can already give them such information.

1,575 citations
