Author

Konstantinos Chandrinos

Bio: Konstantinos Chandrinos is an academic researcher. The author has contributed to research in topics: Naive Bayes classifier & The Internet. The author has an h-index of 7 and has co-authored 8 publications receiving 1,641 citations.

Papers

Posted Content

An Evaluation of Naive Bayesian Anti-Spam Filtering

Ion Androutsopoulos, John Koutsias, Konstantinos Chandrinos, George Paliouras, Constantine D. Spyropoulos¹•Institutions (1)

National Centre of Scientific Research "Demokritos"¹

07 Jun 2000-arXiv: Computation and Language

TL;DR: The authors conclude that additional safety nets are needed for the Naive Bayesian anti-spam filter to be viable in practice.

Abstract: It has recently been argued that a Naive Bayesian classifier can be used to filter unsolicited bulk e-mail (“spam”). We conduct a thorough evaluation of this proposal on a corpus that we make publicly available, contributing towards standard benchmarks. At the same time we investigate the effect of attribute-set size, training-corpus size, lemmatization, and stop-lists on the filter’s performance, issues that had not been previously explored. After introducing appropriate cost-sensitive evaluation measures, we reach the conclusion that additional safety nets are needed for the Naive Bayesian anti-spam filter to be viable in practice.

641 citations

Posted Content

An Experimental Comparison of Naive Bayesian and Keyword-Based Anti-Spam Filtering with Personal E-mail Messages

Ion Androutsopoulos, John Koutsias, Konstantinos Chandrinos, Constantine D. Spyropoulos

22 Aug 2000-arXiv: Computation and Language

TL;DR: In this article, a Naive Bayesian classifier is trained automatically to detect spam messages, and a large collection of personal e-mail messages is made publicly available in "encrypted" form, contributing towards standard benchmarks.

Abstract: The growing problem of unsolicited bulk e-mail, also known as "spam", has generated a need for reliable anti-spam e-mail filters. Filters of this type have so far been based mostly on manually constructed keyword patterns. An alternative approach has recently been proposed, whereby a Naive Bayesian classifier is trained automatically to detect spam messages. We test this approach on a large collection of personal e-mail messages, which we make publicly available in "encrypted" form contributing towards standard benchmarks. We introduce appropriate cost-sensitive measures, investigating at the same time the effect of attribute-set size, training-corpus size, lemmatization, and stop lists, issues that have not been explored in previous experiments. Finally, the Naive Bayesian filter is compared, in terms of performance, to a filter that uses keyword patterns, and which is part of a widely used e-mail reader.

464 citations

Proceedings Article

An experimental comparison of naive Bayesian and keyword-based anti-spam filtering with personal e-mail messages

Ion Androutsopoulos, John Koutsias, Konstantinos Chandrinos, Constantine D. Spyropoulos

01 Jul 2000

TL;DR: This work introduces appropriate cost-sensitive measures, and investigates at the same time the effect of attribute-set size, training-corpus size, lemmatization, and stop lists, issues that have not been explored in previous experiments.

Abstract: The growing problem of unsolicited bulk e-mail, also known as “spam”, has generated a need for reliable anti-spam e-mail filters. Filters of this type have so far been based mostly on manually constructed keyword patterns. An alternative approach has recently been proposed, whereby a Naive Bayesian classifier is trained automatically to detect spam messages. We test this approach on a large collection of personal e-mail messages, which we make publicly available in “encrypted” form contributing towards standard benchmarks. We introduce appropriate cost-sensitive measures, investigating at the same time the effect of attribute-set size, training-corpus size, lemmatization, and stop lists, issues that have not been explored in previous experiments. Finally, the Naive Bayesian filter is compared, in terms of performance, to a filter that uses keyword patterns, and which is part of a widely used e-mail reader.

448 citations

Book Chapter

Trafficopter: A Distributed Collection System for Traffic Information

Alexandros Moukas¹, Konstantinos Chandrinos, Pattie Maes¹•Institutions (1)

Massachusetts Institute of Technology¹

04 Jul 1998

TL;DR: The paper describes Trafficopter, a multi-agent system that collects and propagates traffic information in an urban setting using distributed methods; its tools are a traffic simulator and PDAs/WindowsCE-based terminals equipped with GPS and wireless transceivers.

Abstract: We describe Trafficopter, a multi-agent system that collects and propagates traffic information in an urban setting using distributed methods. Agents in the vehicles themselves collect and propagate traffic-related information in a decentralized, self-organizing fashion with no single point of failure. The ideas in this system are the use of the vehicles/agents themselves as a way of collecting traffic data and the way those data are distributed to the interested vehicles. The tools used are a traffic simulator and a set of PDAs/WindowsCE-based terminals equipped with GPS and wireless transceivers. The simulator is used to investigate the validity of traffic control and information propagation algorithms in a distributed environment, and the WindowsCE terminals are used to apply the above ideas in the real world.

42 citations

Proceedings Article

ELS: a word-level method for entity-level sentiment analysis

Nikos Engonopoulos¹, Angeliki Lazaridou¹, Georgios Paliouras, Konstantinos Chandrinos•Institutions (1)

National and Kapodistrian University of Athens¹

25 May 2011

TL;DR: ELS, a new method for entity-level sentiment classification using sequence modeling by Conditional Random Fields, performs better than the common bag-of-words approaches, especially when the authors target the local sentiment in small parts of a larger document.

Abstract: We introduce ELS, a new method for entity-level sentiment classification using sequence modeling by Conditional Random Fields (CRF). The CRF is trained to identify the sentiment of each word in a document, which is then used to determine the sentiment for the entity, based on where it appears in the text. Due to its sequential nature, the CRF classifier performs better than the common bag-of-words approaches, especially when we target the local sentiment in small parts of a larger document. Identifying the sentiment about a specific entity, mentioned in a blog post or a larger product review, is a special case of such local sentiment classification. Furthermore, the proposed approach performs well even in short pieces of text, where bag-of-words approaches usually fail, due to the sparseness of the resulting feature vector. We have implemented and tested the proposed method on a publicly available benchmark corpus of short product reviews in English. The results that we present in this paper improve significantly upon published results on the same data, thus confirming our intuition about the approach.

30 citations

Cited by

Journal Article

Machine learning

Thomas G. Dietterich¹•Institutions (1)

Oregon State University¹

01 Dec 1996-ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Journal Article

Machine learning in automated text categorization

Fabrizio Sebastiani

01 Mar 2002-ACM Computing Surveys

TL;DR: This survey discusses the main approaches to text categorization that fall within the machine learning paradigm and discusses in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.

Abstract: The automated categorization (or classification) of texts into predefined categories has witnessed a booming interest in the last 10 years, due to the increased availability of documents in digital form and the ensuing need to organize them. In the research community the dominant approach to this problem is based on machine learning techniques: a general inductive process automatically builds a classifier by learning, from a set of preclassified documents, the characteristics of the categories. The advantages of this approach over the knowledge engineering approach (consisting in the manual definition of a classifier by domain experts) are a very good effectiveness, considerable savings in terms of expert labor power, and straightforward portability to different domains. This survey discusses the main approaches to text categorization that fall within the machine learning paradigm. We will discuss in detail issues pertaining to three different problems, namely, document representation, classifier construction, and classifier evaluation.

7,539 citations

Book

The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data

Ronen Feldman¹, James Sanger•Institutions (1)

Hebrew University of Jerusalem¹

01 Dec 2006

TL;DR: Providing an in-depth examination of core text mining and link detection algorithms and operations, this text examines advanced pre-processing techniques, knowledge representation considerations, and visualization approaches.

Abstract: 1. Introduction to text mining 2. Core text mining operations 3. Text mining preprocessing techniques 4. Categorization 5. Clustering 6. Information extraction 7. Probabilistic models for Information extraction 8. Preprocessing applications using probabilistic and hybrid approaches 9. Presentation-layer considerations for browsing and query refinement 10. Visualization approaches 11. Link analysis 12. Text mining applications Appendix Bibliography.

1,628 citations

Book

Bayesian Reasoning and Machine Learning

David Barber¹•Institutions (1)

University College London¹

12 Mar 2012

TL;DR: Comprehensive and coherent, this hands-on text develops everything from basic reasoning to advanced techniques within the framework of graphical models, and helps students develop analytical and problem-solving skills that equip them for the real world.

Abstract: Machine learning methods extract value from vast data sets quickly and with modest resources. They are established tools in a wide range of industrial applications, including search engines, DNA sequencing, stock market analysis, and robot locomotion, and their use is spreading rapidly. People who know the methods have their choice of rewarding jobs. This hands-on text opens these opportunities to computer science students with modest mathematical backgrounds. It is designed for final-year undergraduates and master's students with limited background in linear algebra and calculus. Comprehensive and coherent, it develops everything from basic reasoning to advanced techniques within the framework of graphical models. Students learn more than a menu of techniques; they develop analytical and problem-solving skills that equip them for the real world. Numerous examples and exercises, both computer based and theoretical, are included in every chapter. Resources for students and instructors, including a MATLAB toolbox, are available online.

1,474 citations

Journal Article

Multiagent Systems: A Survey from a Machine Learning Perspective

Peter Stone¹, Manuela Veloso²•Institutions (2)

AT&T Labs¹, Carnegie Mellon University²

01 Jun 2000-Autonomous Robots

TL;DR: This survey of MAS is intended to serve as an introduction to the field and as an organizational framework, and highlights how multiagent systems can be and have been used to build complex systems.

Abstract: Distributed Artificial Intelligence (DAI) has existed as a subfield of AI for less than two decades. DAI is concerned with systems that consist of multiple independent entities that interact in a domain. Traditionally, DAI has been divided into two sub-disciplines: Distributed Problem Solving (DPS) focuses on the information management aspects of systems with several components working together towards a common goal; Multiagent Systems (MAS) deals with behavior management in collections of several independent entities, or agents. This survey of MAS is intended to serve as an introduction to the field and as an organizational framework. A series of general multiagent scenarios are presented. For each scenario, the issues that arise are described along with a sampling of the techniques that exist to deal with them. The presented techniques are not exhaustive, but they highlight how multiagent systems can be and have been used to build complex systems. When options exist, the techniques presented are biased towards machine learning approaches. Additional opportunities for applying machine learning to MAS are highlighted and robotic soccer is presented as an appropriate test bed for MAS. This survey does not focus exclusively on robotic systems. However, we believe that much of the prior research in non-robotic MAS is relevant to robotic MAS, and we explicitly discuss several robotic MAS, including all of those presented in this issue.

1,073 citations
