Sung-Hyon Myaeng

Other affiliations: Chungnam National University, Information and Communications University

Bio: Sung-Hyon Myaeng is an academic researcher from KAIST. The author has contributed to research in topics including Web query classification and information extraction. The author has an h-index of 24 and has co-authored 175 publications receiving 3076 citations. Previous affiliations of Sung-Hyon Myaeng include Chungnam National University and Information and Communications University.

Papers published on a yearly basis (chart, 1998 to 2023)

Papers

Journal Article•DOI•

Some Effective Techniques for Naive Bayes Text Classification

Sang-Bum Kim¹, Kyoung-Soo Han¹, Hae-Chang Rim¹, Sung-Hyon Myaeng²•Institutions (2)

Korea University¹, Information and Communications University²

01 Nov 2006-IEEE Transactions on Knowledge and Data Engineering

TL;DR: This paper proposes two empirical heuristics, per-document text normalization and a feature weighting method, which perform very well on standard benchmark collections, competing with state-of-the-art text classifiers based on a highly complex learning method such as SVM.

Abstract: While naive Bayes is quite effective in various data mining tasks, it shows a disappointing result in the automatic text classification problem. Based on the observation of naive Bayes for the natural language text, we found a serious problem in the parameter estimation process, which causes poor results in text classification domain. In this paper, we propose two empirical heuristics: per-document text normalization and feature weighting method. While these are somewhat ad hoc methods, our proposed naive Bayes text classifier performs very well in the standard benchmark collections, competing with state-of-the-art text classifiers based on a highly complex learning method such as SVM.

430 citations

Proceedings Article•DOI•

A practical hypertext categorization method using links and incrementally available class information

Hyo-Jung Oh¹, Sung-Hyon Myaeng², Mann-Ho Lee²•Institutions (2)

Electronics and Telecommunications Research Institute¹, Chungnam National University²

01 Jul 2000

TL;DR: This paper proposes a practical method for enhancing both the speed and the quality of hypertext categorization using hyperlinks, and achieves up to 18.5% of improvement in effectiveness while reducing the processing time dramatically.

Abstract: As the WWW grows at an increasing speed, classifiers targeted at hypertext are in high demand. While document categorization is quite mature, the issue of utilizing hypertext structure and hyperlinks has been relatively unexplored. In this paper, we propose a practical method for enhancing both the speed and the quality of hypertext categorization using hyperlinks. In comparison against a recently proposed technique that appears to be the only one of the kind, we obtained up to 18.5% of improvement in effectiveness while reducing the processing time dramatically. We attempt to explain through experiments what factors contribute to the improvement.

183 citations

Patent•

Query and document topic category transition analysis system and method and query expansion-based information retrieval system and method

Sung-Hyon Myaeng¹, Yuchul Jung¹, Kyung-min Kim¹•Institutions (1)

KAIST¹

17 Feb 2010

TL;DR: This patent describes a query expansion-based information retrieval method using query/document topic category transition analysis, in which a query input from a user is expanded using a topic category transition analysis result and the corresponding information or documents are retrieved using the expanded query.

Abstract: An information retrieval system and method, and more particularly, a query/document topic category transition analysis system and method in which a query topic category of a query input from a user as an information retrieval keyword and a document topic category of a document which a user regards as relevant and selects from information retrieval results are classified to analyze transition between the query topic category and the document topic category, and a query expansion-based information retrieval system and method using query/document topic category transition analysis in which a query input from a user is expanded using a topic category transition analysis result, and corresponding information or documents are retrieved using the expanded query are provided. The query expansion-based information retrieval method using query/document topic category transition analysis, includes: in a state in which a topic category transition map is generated as a result of analyzing topic category transition between a user query and a relevant document, and corresponding documents are generated as pseudo documents according to each topic category for the user query and the relevant document, determining a corresponding query topic category based on query/document text information for an input query input from a user; allocating a relevant document topic category for the classified query topic category based on the topic category transition map; ranking representative keywords for the query topic category and the relevant document topic category based on the pseudo documents; expanding the input query using the ranked representative keywords; and retrieving corresponding documents using the expanded query.

172 citations

Journal Article•DOI•

Automatic Extraction of Cause-Effect Information from Newspaper Text Without Knowledge-based Inferencing

Christopher S. G. Khoo¹, Jaklin Kornfilt², Robert N Oddy², Sung-Hyon Myaeng•Institutions (2)

Nanyang Technological University¹, Syracuse University²

01 Dec 1998-Literary and Linguistic Computing

TL;DR: This study investigated how effectively cause-effect information can be extracted from newspaper text using a simple computational method (i.e. without knowledge-based inferencing and without full parsing of sentences).

Abstract: This study investigated how effectively cause-effect information can be extracted from newspaper text using a simple computational method (i.e. without knowledge-based inferencing and without full parsing of sentences). An automatic method was developed for identifying and extracting cause-effect information in Wall Street Journal text using linguistic clues and pattern matching. The set of linguistic patterns used for identifying causal relationships was based on a thorough review of the literature and on an analysis of sample sentences from the Wall Street Journal. The cause-effect information extracted using the method was compared with that identified by two human judges. The program successfully extracted ∼68% of the causal relationships identified by both judges (the intersection of the two sets of causal relationships identified by the judges). Of the instances that the computer program identified as causal relationships, ∼25% were identified by both judges, and 64% were identified by at least one of the judges. Problems encountered are discussed.

146 citations

Proceedings Article•DOI•

Text genre classification with genre-revealing and subject-revealing features

Yong-Bae Lee¹, Sung-Hyon Myaeng¹•Institutions (1)

Chungnam National University¹

11 Aug 2002

TL;DR: The experimental results show that the proposed method outperforms a direct application of a statistical learner often used for subject classification, and it is conjectured that this dual feature set approach can be generalized to improve the performance of subject classification as well.

Abstract: Subject or propositional content has been the focus of most classification research. Genre or style, on the other hand, is a different and important property of text, and automatic text genre classification is becoming important for classification and retrieval purposes as well as for some natural language processing research. In this paper, we present a method for automatic genre classification that is based on statistically selected features obtained from both subject-classified and genre-classified training data. The experimental results show that the proposed method outperforms a direct application of a statistical learner often used for subject classification. We also observe that the deviation formula and discrimination formula using document frequency ratios also work as expected. We conjecture that this dual feature set approach can be generalized to improve the performance of subject classification as well.

136 citations


Cited by

Data Mining - Concepts and Techniques.

Petra Perner

01 Jan 2002

9,314 citations

Journal Article•

Data Mining Practical Machine Learning Tools and Techniques

อนิรุธ สืบสิงห์

01 Jan 2014-Journal of management science

9,185 citations

Journal Article•DOI•

Machine learning in automated text categorization

Fabrizio Sebastiani

01 Mar 2002-ACM Computing Surveys

TL;DR: This survey discusses the main approaches to text categorization that fall within the machine learning paradigm and discusses in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.

Abstract: The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last 10 years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting in the manual definition of a classifier by domain experts) are a very good effectiveness, considerable savings in terms of expert labor power, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.

7,539 citations

Book•

Opinion Mining and Sentiment Analysis

Bo Pang¹, Lillian Lee²•Institutions (2)

Yahoo!¹, Cornell University²

08 Jul 2008

TL;DR: This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems and focuses on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis.

Abstract: An important part of our information-gathering behavior has always been to find out what other people think. With the growing availability and popularity of opinion-rich resources such as online review sites and personal blogs, new opportunities and challenges arise as people now can, and do, actively use information technologies to seek out and understand the opinions of others. The sudden eruption of activity in the area of opinion mining and sentiment analysis, which deals with the computational treatment of opinion, sentiment, and subjectivity in text, has thus occurred at least in part as a direct response to the surge of interest in new systems that deal directly with opinions as a first-class object. This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems. Our focus is on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis. We include material on summarization of evaluative text and on broader issues regarding privacy, manipulation, and economic impact that the development of opinion-oriented information-access services gives rise to. To facilitate future work, a discussion of available resources, benchmark datasets, and evaluation campaigns is also provided.

7,452 citations

Remembering. A Study in Experimental and Social Psychology, Cambridge (University Press) 1964.

F. C. Bartlett

01 Jan 1964

TL;DR: In this paper, the notion of a collective unconscious was introduced as a theory of remembering in social psychology, and a study of remembering as a study in Social Psychology was carried out.

Abstract: Part I. Experimental Studies: 2. Experiment in psychology 3. Experiments on perceiving III Experiments on imaging 4-8. Experiments on remembering: (a) The method of description (b) The method of repeated reproduction (c) The method of picture writing (d) The method of serial reproduction (e) The method of serial reproduction picture material 9. Perceiving, recognizing, remembering 10. A theory of remembering 11. Images and their functions 12. Meaning Part II. Remembering as a Study in Social Psychology: 13. Social psychology 14. Social psychology and the matter of recall 15. Social psychology and the manner of recall 16. Conventionalism 17. The notion of a collective unconscious 18. The basis of social recall 19. A summary and some conclusions.

5,690 citations
