Showing papers in "International Journal of Computer Applications in 2017"
TL;DR: Results show that data encoded with the Sum Coding and Backward Difference Coding techniques gives the highest accuracy compared with data pre-processed by the other techniques.
Abstract: In classification analysis, the dependent variable is frequently influenced not only by ratio-scale variables but also by qualitative (nominal-scale) variables. Machine learning algorithms accept only numerical inputs, so these categorical variables must be encoded into numerical values using encoding techniques. This paper presents a comparative study of seven categorical variable encoding techniques for classification using Artificial Neural Networks on a categorical dataset. The Car Evaluation dataset provided by UCI is used for training. Results show that data encoded with the Sum Coding and Backward Difference Coding techniques gives the highest accuracy compared with data pre-processed by the other techniques.
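As an illustration of one of the techniques named above, the following is a minimal sketch of sum (deviation) coding in Python. It is not the paper's implementation: the function name `sum_coding`, the choice of the last level as the reference level, and the example level names are assumptions made for this sketch.

```python
def sum_coding(levels):
    """Map each of k category levels to a (k - 1)-dimensional sum-coded vector.

    Assumption for this sketch: the LAST level is the reference level and is
    encoded as -1 in every column; each other level gets a single 1.
    """
    k = len(levels)
    table = {}
    for i, level in enumerate(levels):
        if i < k - 1:
            # Non-reference level: 1 in its own column, 0 elsewhere.
            table[level] = [1 if j == i else 0 for j in range(k - 1)]
        else:
            # Reference level: -1 in every column, so each column sums to 0.
            table[level] = [-1] * (k - 1)
    return table


# Hypothetical three-level attribute used only for illustration.
codes = sum_coding(["low", "med", "high"])
encoded = [codes[value] for value in ["med", "high", "low"]]
```

Each column sums to zero across the levels, which is what distinguishes sum coding from plain dummy (one-hot minus one) coding.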
TL;DR: This paper discusses different crossover operators that help in solving problems that involve a large population size, such as the travelling salesman problem.
Abstract: Genetic algorithms are population-based search and optimization techniques that mimic the process of natural evolution. They are a very effective way of quickly finding a reasonable solution to a complex problem. The performance of a genetic algorithm depends mainly on the type of genetic operators used, which include crossover and mutation operators. Different crossover and mutation operators exist for solving problems that involve a large population size. An example of such a problem is the travelling salesman problem, which has a large set of solutions. In this paper we discuss different crossover operators that help in solving the problem.
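The abstract does not name the operators it covers, so the sketch below shows one widely used permutation crossover for the travelling salesman problem, order crossover (OX), purely as an example of what such an operator does. The function name and the fixed cut points are assumptions made for illustration; a real genetic algorithm would normally choose the cut points at random.

```python
def order_crossover(parent1, parent2, start, end):
    """Order crossover (OX) for permutation-encoded tours.

    The slice parent1[start:end] is copied to the child unchanged; the
    remaining positions are filled with the other cities in the order they
    appear in parent2, starting just after the copied slice and wrapping.
    """
    size = len(parent1)
    child = [None] * size
    child[start:end] = parent1[start:end]
    kept = set(parent1[start:end])

    # Cities of parent2, read from position `end` with wrap-around,
    # skipping those already copied from parent1.
    fill = [city for city in parent2[end:] + parent2[:end] if city not in kept]

    # Empty child positions, visited in the same wrap-around order.
    positions = list(range(end, size)) + list(range(0, start))
    for pos, city in zip(positions, fill):
        child[pos] = city
    return child


p1 = [1, 2, 3, 4, 5, 6, 7, 8]
p2 = [2, 4, 6, 8, 7, 5, 3, 1]
child = order_crossover(p1, p2, 2, 5)
```

The child is always a valid tour (every city appears exactly once), which is why permutation-aware operators are preferred over plain one-point crossover for this problem.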
TL;DR: Various algorithms of the decision tree (ID3, C4.5, CART), their features, advantages, and disadvantages are discussed.
Abstract: Computer and network technology has developed greatly and continues to develop at pace. As a result, the amount of data in the information industry grows day by day. This large amount of data can be analyzed to extract useful knowledge. The hidden patterns in the data are analyzed and then categorized into useful knowledge. This process is known as data mining. Among the various data mining techniques, the decision tree is a popular one. A decision tree uses a divide-and-conquer technique as its basic learning strategy. A decision tree is a flowchart-like structure in which each internal node represents a "test" on an attribute, each branch represents an outcome of the test, and each leaf node represents a class label. This paper discusses various decision tree algorithms (ID3, C4.5, CART), their features, advantages, and disadvantages.
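To make the "test on an attribute" idea concrete, here is a small sketch of the entropy-based information gain that ID3 uses to choose which attribute to test at a node. The function names and the toy data are assumptions for illustration only and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute_index):
    """Reduction in entropy obtained by splitting on one attribute."""
    total = len(labels)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute_index], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Toy data (hypothetical): attribute 0 separates the classes, attribute 1 does not.
rows = [("sunny", "hot"), ("sunny", "cool"), ("rain", "hot"), ("rain", "cool")]
labels = ["no", "no", "yes", "yes"]
gain_outlook = information_gain(rows, labels, 0)
gain_temperature = information_gain(rows, labels, 1)
```

ID3 would split on attribute 0 here because it has the larger gain. C4.5 replaces this criterion with gain ratio and CART typically uses the Gini index, so this sketch covers only the ID3 case.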
TL;DR: A detailed overview of the process of fruit classification and grading has been presented and some extraction methods like Speeded Up Robust Features (SURF), Histogram of Oriented Gradient (HOG) and Local Binary Pattern (LBP) are discussed with the common features of fruits like color, size, shape and texture.
Abstract: One of the important quality features of fruits is their appearance. Appearance influences not only their market value and the preferences and choices of the consumer, but also, to a certain extent, their internal quality. Color, texture, size, shape, and visual flaws are generally examined to assess the external quality of food. Manual external quality control of fruit is time-consuming and labor-intensive. Thus, for automatic external quality control of food and agricultural products, computer vision systems have been widely used in the food industry and, through intensive work over decades, have proved to be a scientific and powerful tool. The use of machine and computer vision technology in external quality inspection of fruit has been reported in studies based on spatial and/or spectral image processing and analysis. A detailed overview of the process of fruit classification and grading is presented in this paper, with a detailed examination of each step. Some feature extraction methods, such as Speeded Up Robust Features (SURF), Histogram of Oriented Gradient (HOG) and Local Binary Pattern (LBP), are discussed together with the common features of fruits such as color, size, shape and texture. Machine learning algorithms such as K-nearest neighbor (KNN), Support Vector Machine (SVM), Artificial Neural Networks (ANN) and Convolutional Neural Networks (CNN) are also discussed. The process, advantages, disadvantages and challenges of fruit classification and grading are discussed, which can give direction to researchers. General Terms: Machine Vision, Fruit Classification, Grading.
TL;DR: This survey paper compares and details the various types of recommender systems, popular recommendation algorithms, and their uses.
Abstract: Recommendation systems have become extremely common in recent years. They help customers discover information and make choices where they lack the knowledge needed to judge a specific item. They can be used in many different ways to support customers with effective information sorting. A recommendation system is a set of software tools and techniques that provides suggestions based on the customer's taste, helping them discover new and appropriate items by filtering personalized information, based on the user's preferences, from a large volume of information. Users' tastes and preferences must be modelled accurately in order to provide the most relevant suggestions. This survey paper compares and details the various types of recommender systems, popular recommendation algorithms, and their uses.
TL;DR: An automated irrigation system is proposed which monitors and maintains the desired soil moisture content via automatic watering, using an ATMEGA328P microcontroller on the Arduino Uno platform.
Abstract: Automation of farm activities can transform the agricultural domain from manual and static to intelligent and dynamic, leading to higher production with less human supervision. This paper proposes an automated irrigation system which monitors and maintains the desired soil moisture content via automatic watering. An ATMEGA328P microcontroller on the Arduino Uno platform is used to implement the control unit. The setup uses soil moisture sensors which measure the moisture level in the soil. This value enables the system to use an appropriate quantity of water, which avoids over- or under-irrigation. IoT is used to keep farmers updated about the status of the sprinklers. Information from the sensors is regularly updated on a webpage using a GSM-GPRS SIM900A modem, through which a farmer can check whether the water sprinklers are ON or OFF at any given time. The sensor readings are also transmitted to a ThingSpeak channel to generate graphs for analysis.
TL;DR: To analyze the performance of SVM, two pre-classified datasets of tweets are used, and three measures are used for comparative analysis: Precision, Recall and F-Measure.
Abstract: The community's views and feedback have always proved to be the most essential and valuable resource for companies and organizations. With social media being the emerging trend among everyone, it paves the way for unprecedented analysis and evaluation of various aspects for which organizations earlier had to rely on unconventional, time-consuming and error-prone methods. This technique of analysis falls directly under the domain of "sentiment analysis". Sentiment analysis encompasses the vast field of effective classification of user-generated text under defined polarities. There are several tools and algorithms available to perform sentiment detection and analysis: supervised machine learning algorithms, which perform classification on the target corpus after being trained with training data; lexical techniques, which perform classification on the basis of a dictionary-based annotated corpus; and hybrid tools, which combine machine learning and lexicon-based algorithms. In this paper we use a Support Vector Machine (SVM) for sentiment analysis in Weka. SVM is one of the most widely used supervised machine learning algorithms for textual polarity detection. To analyze the performance of SVM, two pre-classified datasets of tweets are used, and three measures are used for comparative analysis: Precision, Recall and F-Measure. Results are shown in the form of tables and graphs.
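The three evaluation measures named above can be computed directly from true and predicted labels. The sketch below uses the standard definitions with the balanced F-measure (F1); the function name, the label strings, and the toy predictions are assumptions for illustration and do not reproduce the paper's data or Weka's output.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall and F1 for one class treated as 'positive'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == positive and t != positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if p != positive and t == positive)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical tweet labels: 2 true positives, 1 false positive, 1 false negative.
y_true = ["pos", "pos", "neg", "neg", "pos"]
y_pred = ["pos", "neg", "pos", "neg", "pos"]
precision, recall, f1 = precision_recall_f1(y_true, y_pred, "pos")
```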
TL;DR: An improvement to the k-means clustering algorithm is proposed which defines the number of clusters automatically and assigns un-clustered points to the required cluster; it is expected to improve accuracy and reduce clustering time when assigning members to clusters to predict cancer.
Abstract: Clustering is a technique used to analyze data efficiently and generate the required information. To cluster a dataset, a technique named k-means is applied, which is based on central point selection and calculation of Euclidean distance. In k-means, the dataset is loaded, central points are selected from it, and points are assigned to clusters on the basis of their Euclidean distance to those central points. The main disadvantage of k-means is accuracy, as the user needs to define the number of clusters. Because the number of clusters is user-defined, some points of the dataset remain un-clustered. In this work, an improvement to the k-means clustering algorithm is proposed which defines the number of clusters automatically and assigns un-clustered points to the required cluster. The proposed improvement is expected to improve accuracy and reduce clustering time when assigning members to clusters to predict cancer.
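For reference, the baseline that the abstract sets out to improve can be sketched as follows. This is plain k-means (Lloyd's iteration) with user-supplied starting centroids, not the proposed improvement, whose details the abstract does not give; the function name and the toy points are assumptions for illustration.

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, centroids, iterations=10):
    """Standard k-means: assign each point to its nearest centroid, then recompute centroids."""
    labels = []
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid by Euclidean distance.
        labels = [
            min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid unchanged
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids


points = [(0, 0), (0, 1), (10, 10), (10, 11)]
labels, centroids = kmeans(points, [(0, 0), (10, 10)])
```

Note that the number of clusters is fixed by the length of the starting centroid list, which is exactly the user-defined parameter the abstract identifies as a weakness.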
TL;DR: This paper explores the various domains where blockchain has had an impact and where future implementations may be expected, and brings together the key developments so far in putting blockchain into practice.
Abstract: Blockchain is being termed the fifth disruptive innovation in computing. In the simplest words, it is a distributed ledger of records that is immutable and verifiable. Since its advent in 2008, blockchain as a concept has been used in various ways. The largest impact or application is seen in the multitude of cryptocurrencies that have sprung up. However, with time, it has become clear that blockchain as a technology is likely to have an impact much wider than just the cryptocurrency domain and much deeper than simple distributed ledger storage. This detailed survey intends to bring together all the key developments so far in putting blockchain into practice. While the most common adoption of blockchain is in the finance and banking domain, experiments are being conducted by many big players in various other domains. This paper explores the various domains where blockchain has had an impact and where future implementations may be expected.
TL;DR: The main objective of this paper is to review the various machine learning approaches for diagnosing Myocardial Infarction and differentiating Arrhythmias (heart beat variation), Hypertrophy (increased thickness of the heart muscle) and Enlargement of the Heart.
Abstract: An electrocardiogram (ECG) is a recording of P, QRS and T waves that shows the electrical activity of the heart. Feature extraction and segmentation in ECG play a significant role in diagnosing most cardiac diseases. The main objective of this paper is to review the various machine learning approaches for diagnosing Myocardial Infarction (heart attack) and differentiating Arrhythmias (heart beat variation), Hypertrophy (increased thickness of the heart muscle) and Enlargement of the Heart. Further, we present various machine learning approaches and compare the different methods and results used to analyze the ECG. The existing methods are compared and contrasted based on qualitative and quantitative parameters, viz., purpose of the work, algorithms adopted and results obtained.
TL;DR: This paper reviews research on sentiment analysis on Twitter, describing the methodologies adopted and models applied, along with a generalized Python-based approach.
Abstract: Twitter is a platform widely used by people to express their opinions and display sentiments on different occasions. Sentiment analysis is an approach to analyzing data and retrieving the sentiment it embodies. Twitter sentiment analysis is an application of sentiment analysis to data from Twitter (tweets), in order to extract the sentiments conveyed by the user. In the past decades, research in this field has grown consistently. The reason behind this is the challenging format of tweets, which makes processing difficult. Tweets are very short, which generates a whole new dimension of problems such as the use of slang, abbreviations, etc. In this paper, we review research on sentiment analysis on Twitter, describing the methodologies adopted and models applied, along with a generalized Python-based approach.
TL;DR: By exploiting the strength of deep learning in modeling different types of data, deep recommender systems can better understand users' demands and further improve the quality of recommendation.
Abstract: Advances in technology have accelerated and opened up the availability of many alternatives to choose from in every domain. In the era of big data, evaluating the features of a large amount of information in order to make a choice is a tedious and time-consuming task. One solution to ease this overload problem is to build a recommender system that can process a large amount of data and support users' decision making. In this paper, traditional recommendation techniques, deep learning approaches for recommender systems, and a survey of deep learning techniques in recommender systems are presented. A variety of techniques have been proposed to perform recommendation, including content-based, collaborative and hybrid recommenders. Because traditional recommendation methods are limited in obtaining accurate results, a deep learning approach is introduced for both collaborative and content-based approaches, enabling the model to learn different features of users and items automatically and so improve the accuracy of recommendation. Even though deep learning has had a great impact in various areas, its application to recommender systems has not been fully exploited. By exploiting the strength of deep learning in modeling different types of data, deep recommender systems can better understand users' demands and further improve the quality of recommendation.
TL;DR: In this article, a short review of the recent advances made in the field of recommendation using various variants of deep learning technology is presented, covering collaborative, content-based, and hybrid systems.
Abstract: With the exponential increase in the amount of digital information over the internet, online shops, online music, video and image libraries, search engines and recommendation systems have become the most convenient ways to find relevant information within a short time. In recent times, advances in deep learning have gained significant attention in the fields of speech recognition, image processing and natural language processing. Meanwhile, several recent studies have shown the utility of deep learning in the area of recommendation systems and information retrieval as well. In this short review, we cover the recent advances made in the field of recommendation using various variants of deep learning technology. We organize the review in three parts: collaborative systems, content-based systems and hybrid systems. The review also discusses the contribution of deep-learning-integrated recommendation systems to several application domains. It concludes by discussing the impact of deep learning on recommendation systems in various domains and whether deep learning has shown any significant improvement over conventional recommendation systems. Finally, we provide future directions of research which are possible based on the current state of use of deep learning in recommendation systems.
TL;DR: This survey finds that combining supervised machine learning algorithms with a boosting process increases prediction efficiency, and that there is wide scope for further research in this area.
Abstract: Data mining is one of the core research areas in the field of computer science. The knowledge discovery process helps data mining to extract hidden information from datasets, and there is wide scope for machine learning algorithms. Supervised machine learning algorithms in particular have gained extensive importance in data mining research. Boosting regularly helps supervised machine learning algorithms to raise their predictive or classification accuracy. This survey article selects two well-known supervised machine learning algorithms, decision trees and support vector machines, and presents the recent research works carried out on them. Recent improvements to the AdaBoost algorithm (a boosting process) are also presented. This survey finds that combining supervised machine learning algorithms with a boosting process increases prediction efficiency, and that there is wide scope for further research in this area.
TL;DR: A combination of the contrast limited adaptive histogram equalization (CLAHE) method and wavelet-based fusion techniques is used, and it is found that adaptive fusion efficiently improves the visual content of medical images under all kinds of capturing environments.
Abstract: Medical image processing is a challenging field of research, since the captured images suffer from noise and poor contrast. The efficiency of medical image processing depends on the quality of the captured medical images. Major causes of low-contrast medical images are the age of the capturing equipment, poor illumination conditions and the inexperience of medical staff. Thus, contrast enhancement methods are used to improve the contrast of medical images before they are used. In this paper a combination of the contrast limited adaptive histogram equalization (CLAHE) method and wavelet-based fusion techniques is used to design an efficient medical image enhancement method. The method is capable of adapting the fusion rules for the best enhancement results. First, CLAHE image enhancement is used to improve the contrast of the medical images. Then, in the second stage, adaptive image fusion based on the 2D discrete wavelet transform is used to fuse the original and CLAHE output images. SNR and entropy are calculated and used as parameters for testing performance. It is found that adaptive fusion efficiently improves the visual content of medical images under all kinds of capturing environments.
TL;DR: A system is proposed which detects fraud in credit card transaction processing using a decision tree combined with Luhn's algorithm and Hunt's algorithm.
Abstract: Online shopping and banking have increased with the growth of the internet and the use of credit cards. Along with this, the number of credit card frauds has also increased. Many modern techniques based on artificial intelligence and data warehousing have evolved for detecting fraudulent credit card transactions. We propose a system which detects fraud in credit card transaction processing using a decision tree combined with Luhn's algorithm and Hunt's algorithm. Luhn's algorithm is used to validate the card number. An address matching rule checks whether the billing address and shipping address match. This check does not guarantee whether a transaction is fraudulent or genuine, but if the two addresses match, the transaction can be classified as genuine with high probability; otherwise, the transaction is labelled as suspect. A customer usually carries out similar types of transactions in terms of amount, which can be visualized as part of a cluster. Since a fraudster's transactions are likely to deviate from the customer's usual pattern, they can be detected as exceptions to the cluster, a process known as outlier detection. General Terms: Credit card fraud, Online Transaction, Electronic Commerce.
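The card-number validation step mentioned above relies on the Luhn checksum, which can be sketched in a few lines. The function name and the handling of spaces are assumptions for illustration; the sketch covers only the checksum and not the decision tree or address-matching parts of the proposed system.

```python
def luhn_valid(card_number: str) -> bool:
    """Return True if the digits of card_number pass the Luhn checksum."""
    digits = [int(ch) for ch in card_number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same as adding the two digits of the product
        total += d
    return total % 10 == 0


ok = luhn_valid("4111 1111 1111 1111")   # commonly used test number
bad = luhn_valid("4111 1111 1111 1112")  # last digit altered
```

A passing checksum only shows that the number is well-formed; it says nothing about whether the card exists or whether the transaction is genuine.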
TL;DR: This paper surveys the techniques available for face detection and indicates that a hybrid approach with discrete wavelet transformation produces better results.
Abstract: Face recognition is used to ensure authentication through feature verification. Techniques are defined to identify faces under different conditions. This paper surveys the techniques available for face detection. Recognition is possible when features are extracted from the presented face images. For this purpose, feature extraction mechanisms such as discrete wavelet transformation (DWT), SIFT, linear discriminant analysis (LDA) and principal component analysis (PCA) are commonly used. The analysis indicates that a hybrid approach with discrete wavelet transformation produces better results. A comparative study of the literature is also presented.
TL;DR: This paper focuses on improving the KNN classifier in an existing intrusion detection approach that combines K-MEANS clustering and KNN classification, in order to improve IDS performance.
Abstract: With the tremendous growth of network-based services and information shared on networks, the risk of network attacks and intrusions is also increasing, so network security is becoming more significant than before. An Intrusion Detection System (IDS) is one of the solutions for detecting attacks and anomalies in the network. The ever-rising number of new intrusion or attack types makes detection difficult, so data mining techniques have been widely applied in network intrusion detection systems to extract useful knowledge from large volumes of network data. Many clustering and classification algorithms are used in IDS, so improving these algorithms will improve IDS performance. This paper focuses on improving the KNN classifier in an existing intrusion detection approach that combines K-MEANS clustering and KNN classification.
TL;DR: This research study presents a wrapper approach for intrusion detection with superior overall performance, which performs better than other leading state-of-the-art models such as KNN, Boosted DT, Hidden NB and Markov chain.
Abstract: Increasing internet usage and connectivity demand a network intrusion detection system that combats malicious network attacks. Data mining is therefore a popular technique used by intrusion detection systems to prevent network attacks and classify network events as either normal or attack. Our research study presents a wrapper approach for intrusion detection. In this framework, a feature selection technique eliminates irrelevant features to reduce time complexity and build a better model that predicts with greater accuracy, and a Bayesian network works as the base classifier to predict the types of attack. Our experiment shows that the proposed framework exhibits superior overall performance, with an accuracy of 98.2653, an error rate of 1.73 and a low false positive rate of 0.007. Our model performed better than other leading state-of-the-art models such as KNN, Boosted DT, Hidden NB and Markov chain. NSL-KDD is used as the benchmark data set, with Weka library functions in the experimental setup. General Terms: Pattern Recognition, Intrusion Detection System, Data Mining.
TL;DR: This research describes a low-cost and flexible security system based on Arduino, with the necessary interfaces to enable Internet access and power control through a Global System for Mobile Communication (GSM) module and a Bluetooth module (HC-05).
Abstract: This research describes a low-cost and flexible security system based on Arduino, with the necessary interfaces to enable Internet access and power control through a Global System for Mobile Communication (GSM) module and a Bluetooth module (HC-05). The work involves real-life interactions along with embedded software solutions. In this project a password is set for access to all sensors, using an LCD display and keypad. A motion sensor, gas module, reed sensor and laser sensor are used to detect theft and unwanted occurrences. A web server control panel interface and Android voice control are both provided to control all the lights of the organization for the purpose of saving power. The proposed system requires minimal human intervention. This research helps ensure the safety of the organization from unwanted occurrences and theft. The main contribution of this paper is that the system not only helps to ensure the security of an organization but is also energy-efficient and time-saving.
TL;DR: Various algorithms for securing information in cloud computing services are examined.
Abstract: Cloud computing provides services over the web with powerful, resizable resources. Cloud computing facilities give advantages to the end user in terms of cost and ease of use. Cloud computing services require security during the transfer of important data and critical applications to shared and public cloud environments. To store information in the cloud, clients need to hand their information over to a third party who will manage and store it. It is therefore imperative for any organization to secure that information. Information is said to be secure if confidentiality, availability and security are ensured. Numerous algorithms have been used to secure information. In this paper, various algorithms for securing information in cloud computing are examined.
TL;DR: This paper analyzes movie reviews using various techniques such as Naïve Bayes, K-Nearest Neighbour and Random Forest to find the sentiment of a person with respect to a given source of content.
Abstract: Sentiment analysis is the analysis of emotions and opinions from any form of text. It is also termed opinion mining. Sentiment analysis of data is very useful for expressing the opinion of the masses, a group, or an individual. This technique is used to find the sentiment of a person with respect to a given source of content. Social media and other online platforms contain a huge amount of data in the form of tweets, blogs, status updates, posts, etc. In this paper, we analyze movie reviews using various techniques such as Naïve Bayes, K-Nearest Neighbour and Random Forest.
TL;DR: Novel methods to retrieve relevant images from large image databases are presented, and the proposed methods are shown to give better performance than other systems.
Abstract: Rapid and effective searching for relevant images in large image databases has become an area of wide interest in many applications. Current image retrieval systems are based on text-based approaches. Such systems face many challenges: they cannot retrieve images that are context-sensitive, manually annotating every image requires a great deal of effort, and differences in human perception when describing images result in inaccuracies during the retrieval process. Content-based image retrieval (CBIR) offers an effective way to retrieve images based on automatically derived image features. It retrieves relevant images using image features such as texture, color or shape. This paper presents two novel methods to retrieve relevant images from large image databases. The first proposed method improves retrieval performance by identifying the most efficient gray-level co-occurrence matrix (GLCM) texture features and combining them with the appropriate Discrete Wavelet Transform (DWT) decomposition band. The second proposed method increases system performance by combining color and texture features into one feature vector, which increases retrieval accuracy. The proposed methods have shown promising and faster retrieval on a WANG image database containing 1000 color images. Retrieval performance has been compared with that of existing systems discussed in the literature. The proposed methods give better performance than other systems.
TL;DR: A detailed review of the field of Optical Character Recognition is presented, and various techniques that have been proposed to realize the core of character recognition in an optical character recognition system are identified.
Abstract: At present, there is growing demand for software systems that recognize characters when information is scanned from paper documents into a computer system. This paper presents a detailed review of the field of Optical Character Recognition. Various techniques that have been proposed to realize the core of character recognition in an optical character recognition system are identified. OCR (Optical Character Recognition) translates images of typewritten or handwritten characters into an electronically editable format and preserves font properties. Different techniques for preprocessing and segmentation are surveyed and discussed in this paper.
TL;DR: This paper provides a brief introduction to the concept of fiber communication and various modulation schemes, and the developments made in this area are described in the related work section.
Abstract: With advances in network and internet technology, the needs of users also increase, including requirements for high bandwidth and high data transmission rates. To fulfill this need, the concept of fiber optics was developed. Fiber optic communication is optical communication which combines two communication methodologies and can be used for both wired and wireless communication systems. This form of communication has been in use for many years, but it still requires advancement and development to make it more refined. The conventional systems designed for RoF technology have various drawbacks, such as a limited number of users, unwanted frequencies in the signals, and limited system quality. This paper provides a brief introduction to the concept of fiber communication and various modulation schemes; the developments made in this area are described in the related work section.
TL;DR: This research synthesizes work on binary classification, discussing various approaches to it, and notes that sockpuppet detection is based on binary classification.
Abstract: In the field of information extraction and retrieval, binary classification is the process of classifying a given document or account into one of two predefined classes. Sockpuppet detection is based on binary classification, in which given accounts are labelled as either sockpuppet or non-sockpuppet. Sockpuppets have become a significant issue, in which a person adopts a fake identity for some specific purpose or malicious use. Text categorization is also performed with binary classification. This research synthesizes work on binary classification, discussing various approaches to it.
TL;DR: This paper presents an overview of the different artifacts in EEG brain signal recordings for BCI and the methodologies for removing them from the signal, focusing on novel trends in BCI research.
Abstract: A Brain Computer Interface (BCI) is often directed at mapping, assisting, or repairing human cognitive or sensory-motor functions. Electroencephalography (EEG) is a non-invasive method of acquiring the brain's electrical activity. The recorded EEG signal is contaminated by noise from physiologic and extra-physiologic artifacts. Several techniques are intended to process the recorded EEG signal during the BCI preprocessing stage to achieve better results at the learning stage. This paper presents an overview of the different artifacts in EEG brain signal recordings for BCI and the methodologies for removing them from the signal, focusing on novel trends in BCI research.
TL;DR: This article considers the novel integration of machine learning and optimization for the complex and dynamic context of robot learning, and presents an effective framework for learning and solving the global optimization problem within that context.
Abstract: Machine learning is currently identified as one of the major parts of research in robotics. Moreover, the advanced concept of combining machine learning with optimization has been reported to be effective for developing learning systems. This article considers the novel integration of machine learning and optimization for the complex and dynamic context of robot learning. Further, the proposed case study presents an effective framework for learning and solving the global optimization problem within the context of robotics and learning.
TL;DR: This paper presents an empirical study of the most widely used feature selection methods for text categorization.
Abstract: Feature selection is one of the well-known solutions to the high-dimensionality problem of text categorization. In text categorization, the selection of good features (terms) plays a very important role. Feature selection is a strategy that can be used to improve categorization accuracy, effectiveness and computational efficiency. This paper presents an empirical study of the most widely used feature selection methods, viz. Term Frequency-Inverse Document Frequency (tf·idf), Information Gain (IG), Mutual Information (MI), Chi-Square (χ²), Ambiguity Measure (AM), Term Strength (TS), Term Frequency-Relevance Frequency (tf·rf) and Symbolic Feature Selection (SFS), with five different classifiers (Naïve Bayes, K-Nearest Neighbor, Centroid Based Classifier, Support Vector Machine and Symbolic Classifier). Experiments are carried out on standard benchmark datasets: Reuters-21578, 20-Newsgroups and the 4 University dataset.
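As an example of the first weighting scheme in the list above, the sketch below computes tf·idf weights for tokenized documents. It uses one common variant (term frequency normalized by document length, idf = ln(N / df)); the paper may use a different variant, and the function name and toy documents are assumptions for illustration.

```python
from collections import Counter
from math import log


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length, idf = ln(N / df).
    """
    n = len(documents)
    df = Counter()
    for doc in documents:
        df.update(set(doc))  # count each term once per document

    weights = []
    for doc in documents:
        tf = Counter(doc)
        weights.append(
            {term: (count / len(doc)) * log(n / df[term]) for term, count in tf.items()}
        )
    return weights


# Hypothetical tokenized documents.
docs = [["grain", "wheat", "grain"], ["oil", "wheat"]]
weights = tf_idf(docs)
```

A term that occurs in every document ("wheat" here) gets weight zero, so ranking terms by tf·idf and keeping the top-scoring ones is a simple form of feature selection.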