Showing papers by "Gülşen Eryiğit" published in 2005


Journal Article
TL;DR: Experimental evaluation confirms that MaltParser can achieve robust, efficient and accurate parsing for a wide range of languages without language-specific enhancements and with rather limited amounts of training data.
Abstract: Parsing unrestricted text is useful for many language technology applications but requires parsing methods that are both robust and efficient. MaltParser is a language-independent system for data-driven dependency parsing that can be used to induce a parser for a new language from a treebank sample in a simple yet flexible manner. Experimental evaluation confirms that MaltParser can achieve robust, efficient and accurate parsing for a wide range of languages without language-specific enhancements and with rather limited amounts of training data.

801 citations


Book Chapter
TL;DR: A new EA approach (MIA) which benefits from the EDA-like approach it employs for re-initializing populations after a change as well as using different change handling mechanisms together is proposed.
Abstract: There is a growing interest in applying evolutionary algorithms to dynamic environments. Different types of changes in the environment benefit from different types of mechanisms to handle the change. In this study, the mechanisms used in the literature are categorized into four groups. A new EA approach (MIA) is proposed, which benefits both from the EDA-like approach it employs for re-initializing populations after a change and from using different change handling mechanisms together. Experiments are conducted using the 0/1 single knapsack problem to compare MIA with other algorithms and to explore its performance. The results are promising and encourage further study. Research is under way to extend MIA to other problem domains.

23 citations


Proceedings Article
25 Jun 2005
TL;DR: This study explores the applicability of a generational EA that uses a penalty-based constraint handling technique and a gene locus based, asymmetric, adaptive mutation scheme for the knapsack problem.
Abstract: Knapsack problems are among the most common problems in the literature tackled with evolutionary algorithms (EA). Their major advantage lies in the fact that they are relatively simple to implement while they allow generalizations for a wide range of real world problems. The multi-dimensional knapsack problem (MKP), which belongs to the class of NP-complete combinatorial optimization problems, is one of the variations of the knapsack problem. The MKP has a wide range of real world applications such as cargo loading, selecting projects to fund, budget management, cutting stock, etc. The MKP has been studied quite extensively in the EA community. Due to the constrained nature of the problem, constraint handling techniques gain great importance in the performance of the proposed EA approaches. In this study, the applicability of a generational EA that uses a penalty-based constraint handling technique and a gene locus based, asymmetric, adaptive mutation scheme is explored for the MKP. The effects of the parameters of the explored approach are determined through tests. Further experiments are performed using large MKP instances from commonly used benchmarks available through the Internet. Comparison tables are given for the performance of the explored approach and other well-performing EAs found in the literature for the MKP. Results show that performance improves greatly when compared with other penalty-based techniques, but the explored approach is still not the best performer overall. However, unlike the explored technique, the EAs using the other constraint handling techniques require a great amount of extra computational effort and need heuristic information specific to the optimization problem. Based on these observations, and the fact that the performance difference between the explored scheme and the better performers is not large, research on improving the explored approach is still in progress.

13 citations
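
The penalty-based constraint handling mentioned in the MKP abstract above can be illustrated with a minimal sketch. It is not the paper's scheme: the abstract does not give the penalty function, so the linear penalty, the function name `mkp_penalized_fitness`, and the `penalty_weight` parameter are all assumptions made for illustration, and the adaptive mutation scheme is not shown.

```python
from typing import Sequence


def mkp_penalized_fitness(
    x: Sequence[int],
    profits: Sequence[float],
    weights: Sequence[Sequence[float]],
    capacities: Sequence[float],
    penalty_weight: float,
) -> float:
    """Score a 0/1 selection vector for the multi-dimensional knapsack problem.

    Infeasible solutions are kept but penalized in proportion to how far
    they exceed the capacities (a generic linear penalty, assumed here).
    """
    # Total profit of the selected items.
    total_profit = sum(p * xi for p, xi in zip(profits, x))

    # Sum of the capacity overshoot across all constraint dimensions.
    violation = 0.0
    for row, capacity in zip(weights, capacities):
        load = sum(w * xi for w, xi in zip(row, x))
        violation += max(0.0, load - capacity)

    return total_profit - penalty_weight * violation


# Illustrative instance: 3 items, 2 capacity constraints (values are made up).
profits = [10, 7, 4]
weights = [[3, 2, 1], [2, 2, 2]]
capacities = [4, 4]

feasible = mkp_penalized_fitness([1, 0, 1], profits, weights, capacities, 10.0)
infeasible = mkp_penalized_fitness([1, 1, 1], profits, weights, capacities, 10.0)
```

With a sufficiently large `penalty_weight`, the infeasible selection scores below the feasible one, which lets an EA search through infeasible regions without needing a repair heuristic.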


01 Jan 2005
TL;DR: Results show that SVM has significantly better performance for no-cost and high-cost cases, but NB performs best when the cost is extremely high.
Abstract: This paper presents a comparison of support vector machines (SVM), memory-based learning (MBL) and Naive Bayes (NB) techniques for the classification of legitimate and spam mails. Although there are a number of method-comparative studies regarding spam mail filtering, most of the studies are tested on separate data sets. In order to evaluate the effectiveness of SVM, MBL and NB methods, we have used a common publicly available corpus (LINGSPAM). As the MBL and NB methods were previously tested with this corpus, the best parameters obtained there are used in the experiments with few changes. For SVMs, on the other hand, extensive experiments are carried out to find the best attribute dimensions. Results show that SVM has significantly better performance for no-cost and high-cost cases, but NB performs best when the cost is extremely high.

4 citations
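
The cost cases in the spam filtering abstract above can be made concrete with a small sketch. The abstract does not define its cost measure; the formula below is cost-weighted accuracy, one common formulation in the spam filtering literature, in which each legitimate message counts as `cost` messages. The function name, the parameter names, and the counts are illustrative assumptions, not figures from the paper.

```python
def weighted_accuracy(
    legit_correct: int,
    legit_total: int,
    spam_correct: int,
    spam_total: int,
    cost: float = 1.0,
) -> float:
    """Accuracy in which each legitimate message is weighted `cost` times.

    cost = 1 reduces to plain accuracy (the "no-cost" case); larger values
    make blocking a legitimate message increasingly expensive.
    """
    weighted_hits = cost * legit_correct + spam_correct
    weighted_total = cost * legit_total + spam_total
    return weighted_hits / weighted_total


# Made-up counts: 98 of 100 legitimate and 45 of 50 spam messages classified correctly.
plain = weighted_accuracy(98, 100, 45, 50, cost=1.0)
high_cost = weighted_accuracy(98, 100, 45, 50, cost=9.0)
```

As `cost` grows, the score is dominated by how rarely legitimate mail is misclassified, which is one way the ranking of classifiers can change between cost settings.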


01 Jan 2005
TL;DR: Results show that SVM has significantly better performance for no-cost and high-cost cases, but NB performs best when the cost is extremely high.
Abstract: The aim of this paper is to present a comparative evaluation of support vector machines (SVM), memory-based learning (MBL) and Naive Bayes (NB) methods for separating spam e-mails from legitimate e-mails. Although many studies compare the methods used for spam filtering, most of them are not comparable because they use different data sets. In this study, SVM, MBL and NB are compared on LINGSPAM, a common, publicly available corpus. Because the MBL and NB methods were tested on this data set in earlier studies, the best parameters obtained in those experiments are used with minor changes. For SVM, however, a large number of experiments were carried out to obtain its best results. The study proposes scenarios for how an e-mail should be handled once it has been identified as spam, and evaluates the three classification methods by taking into account the cost of the errors that may arise under each scenario when the classifiers make mistakes. The results show that SVM performs better than the other methods in scenarios where the error cost is zero or high. When the error cost is very high, however, NB gives the best result.