scispace - formally typeset
Author

Aytuğ Onan

Bio: Aytuğ Onan is an academic researcher from Izmir Kâtip Çelebi University. The author has contributed to research in topics: Artificial intelligence & Computer science. The author has an h-index of 16, co-authored 52 publications receiving 1227 citations. Previous affiliations of Aytuğ Onan include Ege University & Celal Bayar University.

Papers published on a yearly basis

Papers
Journal ArticleDOI
TL;DR: The empirical analysis indicates that the utilization of keyword-based representation of text documents in conjunction with ensemble learning can enhance the predictive performance and scalability of text classification schemes, which is of practical importance in the application fields of text classification.
Abstract: Text classification is a domain with a high-dimensional feature space, and extracting keywords as features can be extremely useful in text classification. This study presents an empirical analysis of five statistical keyword extraction methods and a comprehensive analysis of classifier and keyword extraction ensembles; for the ACM collection, a classification accuracy of 93.80% is obtained with a Bagging ensemble of Random Forest. Automatic keyword extraction is an important research direction in text mining, natural language processing and information retrieval. Keyword extraction enables us to represent text documents in a condensed way. The compact representation of documents can be helpful in several applications, such as automatic indexing, automatic summarization, automatic classification, clustering and filtering. For instance, text classification is a domain that faces the challenge of a high-dimensional feature space. Hence, extracting the most important/relevant words about the content of the document and using these keywords as the features can be extremely useful. In this regard, this study examines the predictive performance of five statistical keyword extraction methods (most-frequent-measure-based keyword extraction, term frequency-inverse sentence frequency based keyword extraction, co-occurrence statistical information based keyword extraction, eccentricity-based keyword extraction and the TextRank algorithm) on classification algorithms and ensemble methods for scientific text document classification (categorization). In this study, a comprehensive comparison of base learning algorithms (Naive Bayes, support vector machines, logistic regression and Random Forest) with five widely utilized ensemble methods (AdaBoost, Bagging, Dagging, Random Subspace and Majority Voting) is conducted. To the best of our knowledge, this is the first empirical analysis that evaluates the effectiveness of statistical keyword extraction methods in conjunction with ensemble learning algorithms.
The classification schemes are compared in terms of classification accuracy, F-measure and area-under-curve values. To validate the empirical analysis, a two-way ANOVA test is employed. The experimental analysis indicates that the Bagging ensemble of Random Forest with the most-frequent-based keyword extraction method yields promising results for text classification. For the ACM document collection, the highest average predictive performance (93.80%) is obtained with the most-frequent-based keyword extraction method and the Bagging ensemble of the Random Forest algorithm. In general, Bagging and Random Subspace ensembles of Random Forest yield promising results. The empirical analysis indicates that the utilization of keyword-based representation of text documents in conjunction with ensemble learning can enhance the predictive performance and scalability of text classification schemes, which is of practical importance in the application fields of text classification.
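The pipeline described above (keyword features feeding a Bagging ensemble whose base learner is Random Forest) can be sketched with scikit-learn. The toy corpus, labels, and top-k keyword cutoff below are illustrative stand-ins, not the paper's ACM data or exact settings.

```python
# Sketch: most-frequent keyword extraction + Bagging ensemble of Random Forest.
# Corpus, labels, and parameters are toy placeholders, not the paper's setup.
from collections import Counter

from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "support vector machines for text classification",
    "random forest ensembles for document categorization",
    "convolutional networks for image recognition",
    "deep learning models for image segmentation",
]
labels = [0, 0, 1, 1]  # 0 = text mining, 1 = computer vision

# "Most frequent" keyword extraction: keep the top-k terms of the corpus.
counts = Counter(" ".join(docs).split())
keywords = [word for word, _ in counts.most_common(10)]

# Represent each document only by its extracted keywords.
vec = CountVectorizer(vocabulary=keywords)
X = vec.fit_transform(docs)

# Bagging ensemble whose base learner is itself a Random Forest.
model = BaggingClassifier(
    RandomForestClassifier(n_estimators=10, random_state=0),
    n_estimators=5,
    random_state=0,
)
model.fit(X, labels)
train_acc = model.score(X, labels)
```

Restricting the vectorizer's vocabulary to the extracted keywords is what gives the condensed document representation the abstract emphasizes.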

445 citations

Journal ArticleDOI
TL;DR: An ensemble approach for feature selection is presented, which aggregates the several individual feature lists obtained by the different feature selection methods so that a more robust and efficient feature subset can be obtained.
Abstract: Sentiment analysis is an important research direction of natural language processing, text mining and web mining which aims to extract subjective information from source materials. The main challenge encountered in machine learning method-based sentiment classification is the abundant amount of data available. This amount makes it difficult to train the learning algorithms in a feasible time and degrades the classification accuracy of the built model. Hence, feature selection becomes an essential task in developing robust and efficient classification models whilst reducing the training time. In text mining applications, individual filter-based feature selection methods have been widely utilized owing to their simplicity and relatively high performance. This paper presents an ensemble approach for feature selection, which aggregates the individual feature lists obtained by different feature selection methods so that a more robust and efficient feature subset can be obtained. In order to aggregate the individual feature lists, a genetic algorithm has been utilized. Experimental evaluations indicated that the proposed aggregation model is an efficient method and it outperforms individual filter-based feature selection methods on sentiment classification.
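A minimal sketch of the aggregation idea, assuming scikit-learn filter scorers and a toy genetic algorithm far simpler than the paper's: two filter methods score the features, and the GA evolves aggregation weights whose combined top-k feature subset maximizes cross-validated accuracy. All parameters (population size, generations, k) are illustrative.

```python
# Toy GA over aggregation weights for two filter-based feature scorers.
# Everything here is a simplified stand-in for the paper's method.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import chi2, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=200, n_features=30, n_informative=5,
                           random_state=0)
X = X - X.min()  # chi2 requires non-negative features

# Individual filter-based feature scores, normalized per method.
scores = np.vstack([chi2(X, y)[0], mutual_info_classif(X, y, random_state=0)])
scores = scores / scores.max(axis=1, keepdims=True)

def fitness(w, k=5):
    combined = w @ scores              # weighted aggregation of the two lists
    top = np.argsort(combined)[-k:]    # keep the top-k combined features
    return cross_val_score(GaussianNB(), X[:, top], y, cv=3).mean()

# Tiny GA: population of weight vectors, truncation selection + mutation.
pop = rng.random((10, 2))
for _ in range(5):                     # a handful of generations
    fit = np.array([fitness(w) for w in pop])
    parents = pop[np.argsort(fit)[-4:]]                 # keep the best four
    children = parents[rng.integers(0, 4, 10)] + rng.normal(0, 0.1, (10, 2))
    pop = np.clip(children, 0, None)                    # weights stay >= 0
best = pop[np.argmax([fitness(w) for w in pop])]
```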

274 citations

Journal ArticleDOI
TL;DR: Experimental analysis of classification tasks, including sentiment analysis, software defect prediction, credit risk modeling, spam filtering, and semantic mapping, suggests that the proposed ensemble method can predict better than conventional ensemble learning methods such as AdaBoost, bagging, random subspace, and majority voting.
Abstract: Typically performed by supervised machine learning algorithms, sentiment analysis is highly useful for extracting subjective information from text documents online. Most approaches that use ensemble learning paradigms toward sentiment analysis involve feature engineering in order to enhance the predictive performance. In response, we sought to develop a paradigm of a multiobjective, optimization-based weighted voting scheme to assign appropriate weight values to classifiers and each output class based on the predictive performance of classification algorithms, all to enhance the predictive performance of sentiment classification. The proposed ensemble method is based on static classifier selection involving majority voting error and forward search, as well as a multiobjective differential evolution algorithm. Based on the static classifier selection scheme, our proposed ensemble method incorporates Bayesian logistic regression, naive Bayes, linear discriminant analysis, logistic regression, and support vector machines as base learners, whose performance in terms of precision and recall values determines weight adjustment. Our experimental analysis of classification tasks, including sentiment analysis, software defect prediction, credit risk modeling, spam filtering, and semantic mapping, suggests that the proposed classification scheme can predict better than conventional ensemble learning methods such as AdaBoost, bagging, random subspace, and majority voting. Of all datasets examined, the laptop dataset showed the best classification accuracy (98.86%).
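A simplified sketch of performance-weighted soft voting: each base learner's class probabilities are scaled by its validation F-measure (combining the precision and recall the abstract mentions) before averaging. The paper tunes weights with a multiobjective differential evolution algorithm; using the F-measure directly is a deliberate simplification, and the data here is synthetic.

```python
# Performance-weighted soft voting with three of the paper's base learners.
# Weights come straight from validation F-measure, a simplification of the
# paper's multiobjective differential evolution optimization.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=300, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

learners = [GaussianNB(), LinearDiscriminantAnalysis(),
            LogisticRegression(max_iter=1000)]
weights = []
for clf in learners:
    clf.fit(X_tr, y_tr)
    # F-measure summarizes the precision/recall the paper bases weights on.
    weights.append(f1_score(y_val, clf.predict(X_val)))

# Weighted soft vote: sum of probability outputs scaled by learner weight.
proba = sum(w * clf.predict_proba(X_val) for w, clf in zip(weights, learners))
ensemble_pred = proba.argmax(axis=1)
```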

272 citations

Journal ArticleDOI
TL;DR: The empirical results indicate that the proposed deep learning architecture outperforms the conventional deep learning methods on sentiment analysis on product reviews obtained from Twitter.
Abstract: Sentiment analysis is one of the major tasks of natural language processing, in which attitudes, thoughts, opinions, or judgments toward a particular subject are extracted. The web is an unstructured and rich source of information containing many text documents with opinions and reviews. The recognition of sentiment can be helpful for individual decision makers, business organizations, and governments. In this article, we present a deep learning-based approach to sentiment analysis on product reviews obtained from Twitter. The presented architecture combines TF-IDF weighted GloVe word embeddings with a CNN-LSTM architecture. The CNN-LSTM architecture consists of five layers, that is, a weighted embedding layer, a convolution layer (where 1-gram, 2-gram, and 3-gram convolutions have been employed), a max-pooling layer, followed by an LSTM layer and a dense layer. In the empirical analysis, the predictive performance of different word embedding schemes (i.e., word2vec, fastText, GloVe, LDA2vec, and doc2vec) with several weighting functions (i.e., inverse document frequency, TF-IDF, and the smoothed inverse document frequency function) has been evaluated in conjunction with conventional deep neural network architectures. The empirical results indicate that the proposed deep learning architecture outperforms the conventional deep learning methods.

197 citations

Journal ArticleDOI
TL;DR: An ensemble classification scheme is presented, which integrates a Random Subspace ensemble of Random Forest with four types of features (features used in authorship attribution, character n-grams, part-of-speech n-grams and the frequency of the most discriminative words), and the highest average predictive performance obtained by the proposed scheme is 94.43%.
Abstract: Text genre classification is the process of identifying functional characteristics of text documents. The immense quantity of text documents available on the web can be properly filtered, organised...
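One feature/classifier pairing from the TL;DR can be sketched as follows: character n-gram features with a Random Subspace ensemble of Random Forest, where Random Subspace is emulated via scikit-learn's BaggingClassifier sampling features rather than instances. The toy documents and genre labels are illustrative.

```python
# Character n-gram features + Random Subspace ensemble of Random Forest.
# Random Subspace = sample feature subsets, keep all training instances.
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "once upon a time in a distant land",
    "the quarterly revenue grew by ten percent",
    "she walked slowly through the ancient forest",
    "the committee approved the annual budget report",
]
labels = ["fiction", "news", "fiction", "news"]

# Character n-grams (2-3 grams here) are one of the four feature types.
vec = TfidfVectorizer(analyzer="char", ngram_range=(2, 3))
X = vec.fit_transform(docs)

subspace = BaggingClassifier(
    RandomForestClassifier(n_estimators=10, random_state=0),
    n_estimators=5,
    max_features=0.5,   # each member sees half the features...
    bootstrap=False,
    max_samples=1.0,    # ...but all instances -> Random Subspace behavior
    random_state=0,
)
subspace.fit(X, labels)
```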

193 citations


Cited by


Journal ArticleDOI
01 Feb 2017-Catena
TL;DR: Analysis of results indicates that landslide models using machine learning ensemble frameworks are promising methods which can be used as alternatives to individual base classifiers for landslide susceptibility assessment of other prone areas.
Abstract: The main objective of this study is to evaluate and compare the performance of landslide models using a machine learning ensemble technique for landslide susceptibility assessment. This technique combines ensemble methods (AdaBoost, Bagging, Dagging, MultiBoost, Rotation Forest, and Random SubSpace) with the base classifier of Multiple Perceptron Neural Networks (MLP Neural Nets). Ensemble techniques have been widely applied in other fields; however, their application is still rare in the assessment of landslide problems. Meanwhile, MLP Neural Nets, known as an artificial neural network, has been applied widely and efficiently to landslide problems. In the present study, landslide models of part of the Himalayan area (India) have been constructed and validated. For the evaluation and comparison of these models, receiver operating characteristic curve and Chi-square test methods have been applied. Overall, all landslide models performed well in landslide susceptibility assessment, but the performance of the MultiBoost model is the highest (AUC = 0.886), followed by the Dagging model (AUC = 0.885), the Rotation Forest model (AUC = 0.882), the Bagging and Random SubSpace models (AUC = 0.881), and the AdaBoost model (AUC = 0.876). Moreover, the machine learning ensemble models significantly improved the performance of the base classifier of MLP Neural Nets (AUC = 0.874). Analysis of the results indicates that landslide models using machine learning ensemble frameworks are promising methods which can be used as alternatives to individual base classifiers for landslide susceptibility assessment of other prone areas.
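One ensemble/base-classifier pairing from the abstract (Bagging over MLP neural networks, evaluated by AUC) can be sketched with scikit-learn; the synthetic data stands in for real landslide conditioning factors and inventories.

```python
# Bagging ensemble of MLP neural networks scored by AUC, one of the
# ensemble/base-classifier pairings in the abstract. Data is synthetic.
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

mlp_bag = BaggingClassifier(
    MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
    n_estimators=5,
    random_state=0,
)
mlp_bag.fit(X_tr, y_tr)
# AUC on the held-out split, the criterion used to rank models above.
auc = roc_auc_score(y_te, mlp_bag.predict_proba(X_te)[:, 1])
```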

436 citations

Journal ArticleDOI
TL;DR: Results indicate that the proposed Bagging-LMT model can be used for sustainable management of flood-prone areas and outperformed all state-of-the-art benchmark soft computing models.
Abstract: A new artificial intelligence (AI) model, called Bagging-LMT, a combination of a bagging ensemble and the Logistic Model Tree (LMT), is introduced for mapping flood susceptibility. A spatial database was generated for the Haraz watershed, northern Iran, that included a flood inventory map and eleven flood conditioning factors selected based on the Information Gain Ratio (IGR). The model was evaluated using precision, sensitivity, specificity, accuracy, Root Mean Square Error, Mean Absolute Error, Kappa and area under the receiver operating characteristic curve criteria. The model was also compared with four state-of-the-art benchmark soft computing models: LMT, logistic regression, Bayesian logistic regression, and random forest. Results revealed that the proposed model outperformed all of these models, indicating that it can be used for sustainable management of flood-prone areas.
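The evaluation criteria listed above can be computed directly from a set of predictions. The toy labels and probabilities below are illustrative, and the Bagging-LMT model itself is omitted since LMT has no widely used Python implementation.

```python
# Computing the paper's evaluation criteria from toy predictions:
# precision, sensitivity, specificity, accuracy, RMSE, MAE, Kappa, AUC.
import numpy as np
from sklearn.metrics import (accuracy_score, cohen_kappa_score,
                             confusion_matrix, mean_absolute_error,
                             mean_squared_error, roc_auc_score)

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])          # toy flood labels
y_prob = np.array([0.2, 0.4, 0.9, 0.7, 0.6, 0.3, 0.8, 0.6])
y_pred = (y_prob >= 0.5).astype(int)                 # 0.5 decision threshold

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)    # true positive rate
specificity = tn / (tn + fp)    # true negative rate
precision = tp / (tp + fp)
accuracy = accuracy_score(y_true, y_pred)
rmse = mean_squared_error(y_true, y_prob) ** 0.5
mae = mean_absolute_error(y_true, y_prob)
kappa = cohen_kappa_score(y_true, y_pred)
auc = roc_auc_score(y_true, y_prob)
```

RMSE and MAE are computed against the predicted probabilities rather than the hard labels, which is the usual convention for susceptibility maps.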

372 citations