Showing papers in "International Journal of Computer Applications in 2014"


Journal ArticleDOI
TL;DR: An overview of the different routing strategies used in wireless sensor networks is given and the comparison of these different routing protocols based on metrics such as mobility support, stability, issues and latency is shown.
Abstract: This paper presents energy-efficient routing protocols in wireless sensor networks (WSNs). A WSN is a collection of sensor nodes, each with a limited processor and a limited memory unit embedded in it. Reliable routing of packets from a sensor node to its base station is the most important task for these networks. Routing protocols designed for other networks cannot be used here because the nodes are battery powered. This paper gives an overview of the different routing strategies used in wireless sensor networks and a brief working model of energy-efficient routing protocols in WSNs. It also compares these routing protocols on metrics such as mobility support, stability, issues and latency.

579 citations


Journal Article
TL;DR: This research is working on the development of a hybrid model using LEACH based energy efficient and K-means based quick clustering algorithms to produce a new cluster scheme for WSNs with dynamic selection of the number of the clusters automatically.
Abstract: Wireless sensor networks (WSNs) consist of hundreds of thousands of small and cost-effective sensor nodes. Sensor nodes are used to sense environmental or physiological parameters such as temperature and pressure. For connectivity, the sensor nodes use wireless transceivers to send and receive inter-node signals. Because they connect to each other wirelessly, sensor nodes use a routing process to carry packets from source to destination. These sensor nodes run on batteries and have a limited battery life. Clustering is the process of creating virtual sub-groups of sensor nodes, which helps the nodes lower routing computations and reduce the size of routing data. There is wide scope for research on energy-efficient clustering algorithms for WSNs. LEACH, PEGASIS and HEED are popular energy-efficient clustering protocols for WSNs. In this research, we are working on the development of a hybrid model using LEACH-based energy-efficient and K-means-based quick clustering algorithms to produce a new cluster scheme for WSNs with automatic, dynamic selection of the number of clusters. In the proposed method, an optimum "k" value is found by the Elbow method and clustering is done by the k-means algorithm; the routing protocol LEACH, a traditional energy-efficient protocol, then handles sending data from the cluster heads to the base station. The simulation results show that, after a certain portion of the run, the marginal gain drops dramatically and produces an angle in the graph. The correct "k", i.e. the number of clusters, is chosen at this point, hence the "elbow criterion".

476 citations
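The elbow criterion described in the abstract above can be sketched in a few lines. This is a minimal illustration only, not the paper's hybrid LEACH/K-means scheme: the function names (`kmeans`, `elbow_k`, `make_blobs`), the farthest-point initialisation, the toy data and the use of the largest second difference of the within-cluster sum of squares (WCSS) as the "angle" detector are all assumptions made for this example.

```python
import math

def kmeans(points, k, iters=50):
    """Lloyd's algorithm with a deterministic farthest-point initialisation (assumed here)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next centre: the point farthest from its nearest existing centre.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            idx = min(range(k), key=lambda i: math.dist(p, centers[i]))
            clusters[idx].append(p)
        # Recompute each centre as the mean of its cluster; keep it if the cluster is empty.
        new_centers = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centers[i]
            for i, cl in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    wcss = sum(min(math.dist(p, c) for c in centers) ** 2 for p in points)
    return centers, wcss

def elbow_k(points, k_max):
    """Pick k where the WCSS curve bends most sharply (largest second difference)."""
    wcss = [kmeans(points, k)[1] for k in range(1, k_max + 1)]
    bends = {k: wcss[k - 2] - 2 * wcss[k - 1] + wcss[k] for k in range(2, k_max)}
    return max(bends, key=bends.get)

def make_blobs():
    """Hypothetical toy data: three tight groups of four 'sensor nodes' each."""
    offsets = [(-0.5, -0.5), (-0.5, 0.5), (0.5, -0.5), (0.5, 0.5)]
    centres = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
    return [(cx + dx, cy + dy) for cx, cy in centres for dx, dy in offsets]

if __name__ == "__main__":
    print("chosen k:", elbow_k(make_blobs(), 6))
```

On the toy data the WCSS falls steeply up to three clusters and barely changes afterwards, so the bend is detected at k = 3. Real deployments would likely need a more robust bend detector and randomised restarts.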


Journal ArticleDOI
TL;DR: This survey summarizes the security threats and privacy concerns of IoT.
Abstract: This paper introduces the Internet of Things (IoT), which offers capabilities to identify and connect worldwide physical objects into a unified system. As a part of IoT, serious concerns are raised over access to personal information pertaining to device and individual privacy. This survey summarizes the security threats and privacy concerns of IoT.

286 citations


Journal ArticleDOI
TL;DR: The modified J48 classifier is used to increase the accuracy rate of the data mining procedure and Experimental results showed a significant improvement over the existing J-48 algorithm.
Abstract: This research work deals with an efficient data mining procedure for predicting diabetes from the medical records of patients. Diabetes is a very common disease these days in all populations and in all age groups. Diabetes contributes to heart disease and increases the risk of developing kidney disease, nerve damage, blood vessel damage and blindness. Mining diabetes data efficiently is therefore a critical issue. The Pima Indians Diabetes Data Set, which collects information on patients with and without diabetes, is used in this paper. The modified J48 classifier is used to increase the accuracy rate of the data mining procedure. The data mining tool WEKA has been used as an API of MATLAB for generating the J-48 classifiers. Experimental results showed a significant improvement over the existing J-48 algorithm. Keywords: Decision Tree, MATLAB, Data Mining, Diabetes, WEKA.

236 citations


Journal Article
TL;DR: The meaning of quadratic assignment problem, solving techniques and a survey of some developments and researches are discussed.
Abstract: The quadratic assignment problem (QAP) is a very challenging and interesting problem that can model many real-life problems. In this paper, we briefly discuss the meaning of the quadratic assignment problem and its solving techniques, and we survey some developments and research.

180 citations


Journal ArticleDOI
TL;DR: A general survey of multicast routing protocols in Mobile Ad hoc Networks (MANETs) is given; these protocols play an important role in MANETs by providing group communication.
Abstract: Multicasting offers many benefits in a network. It reduces the communication cost for applications that send the same data to many recipients, compared with sending the data via multiple unicasts. This paper gives a general survey of multicast routing protocols in Mobile Ad hoc Networks (MANETs). The multicast routing protocols are divided into two categories: multicast routing based on application independence and multicast routing based on application dependence. Multicast routing protocols play an important role in MANETs by providing group communication. Multicasting is one of the major communication technologies, primarily designed for bandwidth conservation, and is an efficient way of transferring data to a group of receivers in wireless mesh networks.

173 citations


Journal ArticleDOI
TL;DR: An efficient and effective hybrid clustering method, named BDE-DBSCAN, that combines Binary Differential Evolution and DBSCAN algorithm to simultaneously quickly and automatically specify appropriate parameter values for Eps and MinPts is presented.
Abstract: Over the last several years, DBSCAN (Density-Based Spatial Clustering of Applications with Noise) has been widely applied in many areas of science due to its simplicity, robustness against noise (outliers) and ability to discover clusters of arbitrary shapes. However, the DBSCAN algorithm requires two initial input parameters, namely Eps (the radius of the cluster) and MinPts (the minimum number of data objects required inside the cluster), both of which have a significant influence on the clustering results. Hence, DBSCAN is sensitive to its input parameters, and it is hard to determine them a priori. This paper presents an efficient and effective hybrid clustering method, named BDE-DBSCAN, that combines Binary Differential Evolution and the DBSCAN algorithm to quickly and automatically specify appropriate values for Eps and MinPts simultaneously. Since the Eps parameter can greatly degrade the efficiency of the DBSCAN algorithm, an analytical way of estimating Eps is also employed in combination with the Tournament Selection (TS) method. Experimental results indicate that the proposed method is precise in determining appropriate input parameters for the DBSCAN algorithm.

119 citations
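The sensitivity of DBSCAN to Eps and MinPts, which motivates the abstract above, is easy to see on a toy example. The following is a minimal sketch of plain DBSCAN only, with a brute-force neighbour search; it does not implement BDE-DBSCAN or any parameter tuning. The function name `dbscan`, the `NOISE` marker and the sample points are assumptions made for this illustration.

```python
import math

NOISE = -1  # label used here for points that belong to no cluster

def dbscan(points, eps, min_pts):
    """Plain DBSCAN: returns one cluster label (or NOISE) per point."""
    labels = [None] * len(points)

    def neighbours(i):
        # Brute-force Eps-neighbourhood, including the point itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:  # core point: keep growing the cluster
                queue.extend(nb)
    return labels

# Hypothetical data: two 2x2 squares and one isolated point in between.
POINTS = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (5, 5)]

if __name__ == "__main__":
    for eps in (0.5, 1.5, 8.0):
        print(eps, dbscan(POINTS, eps, min_pts=3))
```

With MinPts fixed at 3, an Eps of 0.5 marks every point as noise, 1.5 recovers the two squares plus one outlier, and 8.0 merges everything into a single cluster, which is the kind of behaviour that makes a priori parameter choice difficult.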


Journal ArticleDOI
TL;DR: Comparing simulation results of different windows, this paper found the Blackman window to have the best performance among them, which is expected from the theory, and found the same expected result on a speech signal.
Abstract: Window techniques are widely used in the emerging fields of medical image processing, computer vision, pattern recognition and other digital signal processing applications. A window function is a mathematical function that is zero-valued outside of some chosen interval. When another function is multiplied by a window function, the product is also zero-valued outside the interval. In this paper, the performance of the Hamming, Hanning and Blackman windows is compared, mainly in terms of magnitude response, phase response, equivalent noise bandwidth, sidelobe transition width, and response in the time and frequency domains, using MATLAB simulation. To observe the responses, FIR filters of low-pass, high-pass, band-pass and band-stop type were designed and evaluated against each of the parameters stated above. The results agree with what the theory predicts. Comparing the simulation results of the different windows, this paper found the Blackman window to have the best performance among them, which is also expected from the theory. The windows were also applied to a speech signal using MATLAB simulation, with the same expected result.

115 citations
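One of the comparison metrics named in the abstract above, equivalent noise bandwidth (ENBW), can be computed directly from the window coefficients. The sketch below uses the standard textbook definitions of the symmetric Hann (Hanning), Hamming and Blackman windows; it is written in Python rather than the paper's MATLAB, the function names are chosen for this example, and it does not reproduce the paper's filter designs or results.

```python
import math

def hann(n_pts):
    """Symmetric Hann (Hanning) window."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / (n_pts - 1)) for n in range(n_pts)]

def hamming(n_pts):
    """Symmetric Hamming window with the classic 0.54 / 0.46 coefficients."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_pts - 1)) for n in range(n_pts)]

def blackman(n_pts):
    """Symmetric Blackman window with the classic 0.42 / 0.5 / 0.08 coefficients."""
    return [
        0.42
        - 0.5 * math.cos(2 * math.pi * n / (n_pts - 1))
        + 0.08 * math.cos(4 * math.pi * n / (n_pts - 1))
        for n in range(n_pts)
    ]

def enbw_bins(window):
    """Equivalent noise bandwidth in DFT bins: N * sum(w^2) / (sum(w))^2."""
    n = len(window)
    return n * sum(w * w for w in window) / sum(window) ** 2

if __name__ == "__main__":
    for name, fn in (("Hamming", hamming), ("Hann", hann), ("Blackman", blackman)):
        print(f"{name:9s} ENBW = {enbw_bins(fn(1024)):.3f} bins")
```

ENBW is only one axis of the comparison: the Blackman window has the widest noise bandwidth of the three, which is the price paid for its lower sidelobes.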


Journal ArticleDOI
TL;DR: The survey results show that graph-based representation is an appropriate way of representing text documents and improves analysis results over the traditional model for different text applications.
Abstract: A common and standard approach to modelling text documents is bag-of-words. This model is suitable for capturing word frequency; however, structural and semantic information is ignored. Graphs are mathematical constructs that can model relationships and structural information effectively. A text can appropriately be represented as a graph, with feature terms as vertices and significant relations between the feature terms as edges. Text representation using a graph model supports computations for various operations, such as term weighting and ranking, which are helpful in many information retrieval applications. This paper presents a systematic survey of existing work on graph-based representation of text and also focuses on graph-based analysis of text documents for different operations in information retrieval. In this process, a taxonomy of graph-based representation and analysis of text documents is derived, and the results of different methods of graph-based text representation and analysis are discussed. The survey results show that graph-based representation is an appropriate way of representing text documents and improves analysis results over the traditional model for different text applications.

104 citations
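The contrast drawn in the abstract above, between bag-of-words counts and a graph whose vertices are terms and whose edges are relations between terms, can be illustrated with a small co-occurrence graph. This is only one of many possible graph constructions and is not taken from the survey; the sliding-window rule, the function names and the use of weighted degree as a term weight are assumptions made for this sketch.

```python
from collections import Counter

def bag_of_words(tokens):
    """Term frequencies only: word order and adjacency are discarded."""
    return Counter(tokens)

def cooccurrence_graph(tokens, window=2):
    """Undirected weighted graph: an edge links two distinct terms that appear
    within `window` consecutive tokens; the weight counts how often they do."""
    edges = Counter()
    for i, term in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != term:
                edges[frozenset((term, other))] += 1
    return edges

def weighted_degree(edges):
    """A simple graph-based term weight: the sum of incident edge weights."""
    degree = Counter()
    for pair, weight in edges.items():
        for term in pair:
            degree[term] += weight
    return degree

if __name__ == "__main__":
    tokens = "graph model text graph model".split()
    print(bag_of_words(tokens))
    print(cooccurrence_graph(tokens))
    print(weighted_degree(cooccurrence_graph(tokens)))
```

The bag-of-words view records only that "graph" and "model" each occur twice, whereas the graph also records that the two terms are adjacent twice, which is the structural information the abstract says the traditional model ignores.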


Journal ArticleDOI
TL;DR: This paper focuses on public key cryptographic algorithms based on homomorphic encryption schemes for preserving security, investigating homomorphic algorithms using asymmetric key systems such as RSA, ElGamal and Paillier, as well as homomorphic encryption schemes such as Brakerski-Gentry-Vaikuntanathan (BGV), Enhanced Homomorphic Cryptosystem (EHC) and the algebra homomorphic encryption scheme based on updated ElGamal (AHEE).
Abstract: Homomorphic encryption is an encryption scheme that allows operations to be performed on encrypted data. Homomorphic encryption can be applied in any system by using various public key algorithms. When data is transferred to a public area, there are many encryption algorithms to secure the operations on and the storage of the data. However, to process data located on a remote server while preserving privacy, homomorphic encryption is useful because it allows operations on the ciphertext that give the same results, after decryption, as working directly on the raw data. In this paper, the main focus is on public key cryptographic algorithms based on homomorphic encryption schemes for preserving security. A case study on various principles and properties of homomorphic encryption is given, and then various homomorphic algorithms using asymmetric key systems such as RSA, ElGamal and Paillier are investigated, as well as homomorphic encryption schemes such as Brakerski-Gentry-Vaikuntanathan (BGV), Enhanced Homomorphic Cryptosystem (EHC), the algebra homomorphic encryption scheme based on updated ElGamal (AHEE) and the non-interactive exponential homomorphic encryption scheme (NEHE).

103 citations
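The property described in the abstract above, that operating on ciphertexts yields the same result as operating on the raw data, can be checked with two of the schemes it names. The sketch below shows the multiplicative homomorphism of unpadded textbook RSA and the additive homomorphism of Paillier (with the common simplification g = n + 1). The tiny primes, the fixed randomisers and all function names are assumptions chosen for readability; the example is in no way secure and is not taken from the paper.

```python
import math

# Toy parameters (assumed for illustration only; far too small to be secure).
P, Q = 61, 53
N = P * Q

# Textbook RSA (no padding): E(m1) * E(m2) mod N decrypts to m1 * m2 mod N.
PHI = (P - 1) * (Q - 1)
E = 17
D = pow(E, -1, PHI)  # modular inverse; needs Python 3.8+

def rsa_encrypt(m):
    return pow(m, E, N)

def rsa_decrypt(c):
    return pow(c, D, N)

# Paillier with g = N + 1: E(m1) * E(m2) mod N^2 decrypts to m1 + m2 mod N.
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)
MU = pow(LAM, -1, N)

def paillier_encrypt(m, r):
    """r is the randomiser; it must be coprime to N (fixed here for repeatability)."""
    return (pow(N + 1, m, N2) * pow(r, N, N2)) % N2

def paillier_decrypt(c):
    return ((pow(c, LAM, N2) - 1) // N * MU) % N

if __name__ == "__main__":
    product_ct = (rsa_encrypt(7) * rsa_encrypt(6)) % N
    print("RSA: 7 * 6 computed on ciphertexts ->", rsa_decrypt(product_ct))
    sum_ct = (paillier_encrypt(15, 17) * paillier_encrypt(27, 23)) % N2
    print("Paillier: 15 + 27 computed on ciphertexts ->", paillier_decrypt(sum_ct))
```

Both schemes here are only partially homomorphic, each supporting a single operation; schemes such as BGV that support both addition and multiplication are considerably more involved and are not sketched.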


Journal ArticleDOI
TL;DR: This survey paper discusses successful techniques and methods for improving the effectiveness of information retrieval in text mining, and the types of situations where each technology may be useful in helping users.
Abstract: In recent years the growth of digital data has been increasing, and knowledge discovery and data mining have attracted great attention with the emerging need to turn such data into useful information and knowledge. The use of information and knowledge extracted from large amounts of data benefits many applications, such as market analysis and business management. In many applications, databases store information in text form, so text mining is one of the most recent areas of research. Extracting the information a user requires is a challenging issue. Text mining is an important step in the knowledge discovery process. Text mining extracts hidden information from unstructured to semi-structured data. Text mining is the discovery, by computer, of new, previously unknown information by automatically extracting information from different written resources. This survey paper tries to cover the text mining techniques and methods that address these challenges. In this survey paper we discuss successful techniques and methods for improving the effectiveness of information retrieval in text mining. The types of situations where each technology may be useful in helping users are also discussed.

Journal ArticleDOI
TL;DR: This paper presents automated detection of tumors in brain MRI using machine learning algorithms; the work is divided into three parts: preprocessing steps are applied to brain MRI images, texture features are extracted using the Gray Level Co-occurrence Matrix (GLCM), and classification is done using a machine learning algorithm.
Abstract: Automated defect detection in medical imaging has become an emergent field in several medical diagnostic applications. Automated detection of tumors in Magnetic Resonance Imaging (MRI) is crucial, as it provides information about abnormal tissues that is necessary for planning treatment. The conventional method for defect detection in magnetic resonance brain images is human inspection. This method is impractical for large amounts of data, so automated tumor detection methods are developed, as they save radiologists' time. MRI brain tumor detection is a complicated task due to the complexity and variance of tumors. In this paper, tumors are detected in brain MRI using machine learning algorithms. The proposed work is divided into three parts: preprocessing steps are applied to brain MRI images, texture features are extracted using the Gray Level Co-occurrence Matrix (GLCM), and then classification is done using a machine learning algorithm.
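The texture-feature step named in the abstract above can be sketched on a tiny image. The example below builds a normalised, non-symmetric Gray Level Co-occurrence Matrix for one pixel offset and derives three common texture features from it. The 4x4 image, the four grey levels, the single horizontal offset and the function names are assumptions made for illustration; the paper's preprocessing, its exact feature set and its classifier are not reproduced.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalised co-occurrence matrix: p[i][j] is the fraction of pixel pairs
    where a pixel of level i has a neighbour of level j at offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for y in range(rows):
        for x in range(cols):
            ny, nx = y + dy, x + dx
            if 0 <= ny < rows and 0 <= nx < cols:
                counts[image[y][x]][image[ny][nx]] += 1
                total += 1
    return [[c / total for c in row] for row in counts]

def contrast(p):
    """Large when neighbouring pixels differ strongly in grey level."""
    return sum(p[i][j] * (i - j) ** 2 for i in range(len(p)) for j in range(len(p)))

def energy(p):
    """Angular second moment: large for uniform or highly regular textures."""
    return sum(v * v for row in p for v in row)

def homogeneity(p):
    """Large when most mass lies near the matrix diagonal."""
    return sum(p[i][j] / (1 + (i - j) ** 2) for i in range(len(p)) for j in range(len(p)))

# Hypothetical 4x4 image quantised to four grey levels.
IMAGE = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

if __name__ == "__main__":
    p = glcm(IMAGE, levels=4)
    print("contrast   :", contrast(p))
    print("energy     :", energy(p))
    print("homogeneity:", homogeneity(p))
```

In a pipeline like the one the abstract describes, such features would typically be computed for several offsets and concatenated into the feature vector passed to the classifier.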

Journal ArticleDOI
TL;DR: A framework of a complete image stitching system based on feature based approaches will be introduced and the current challenges of image stitching will be discussed.
Abstract: Image stitching (mosaicing) is considered an active research area in computer vision and computer graphics. Image stitching is concerned with combining two or more images of the same scene into one high-resolution image, which is called a panoramic image. Image stitching techniques can be categorized into two general approaches: direct and feature-based techniques. Direct techniques compare all the pixel intensities of the images with each other, whereas feature-based techniques aim to determine a relationship between the images through distinct features extracted from the processed images. The latter approach has the advantage of being more robust against scene movement, being faster, and being able to automatically discover the overlapping relationships among an unordered set of images. The purpose of this paper is to present a survey of feature-based image stitching. The main components of image stitching will be described. A framework of a complete image stitching system based on feature-based approaches will be introduced. Finally, the current challenges of image stitching will be discussed. Keywords: stitching/mosaicing, panoramic image, features based detection, SIFT, SURF, image blending.

Journal ArticleDOI
TL;DR: This paper presents a review of some major work in area of flat and data centric routing technique and hierarchical routing technique for WSNs and compares the characteristics and performance issues of different routing protocols.
Abstract: A wireless sensor network is a self-configured network composed of a large number of sensors. Because the sensors in a wireless sensor network are battery powered and it is difficult to replace or recharge their batteries, energy-efficient routing is the major concern in the field for enhancing the lifetime of the network. Consequently, a number of routing techniques have been proposed for wireless sensor networks to achieve a longer lifetime and low energy consumption. These are mainly sorted into three categories: flat and data-centric routing, hierarchical routing, and location-based routing. This paper presents a review of some major work in the area of flat and data-centric routing techniques and hierarchical routing techniques for WSNs. This article also compares the characteristics and performance issues of different routing protocols. General Terms: Routing Techniques in Wireless Sensor Networks

Journal ArticleDOI
TL;DR: This research paper proposes the ANN-Bayesian Net-GR technique, an ensemble of an Artificial Neural Network (ANN) and a Bayesian Net with Gain Ratio (GR) feature selection; the proposed ensemble model produces the highest accuracy compared to the others.
Abstract: Information security is an extremely critical issue for every organization that needs to protect information from unauthorized access. An intrusion detection system plays an important role in protecting data or information from malicious behaviour. Basically, an intrusion detection system is a classifier that can classify data as normal or as an attack. In this research paper, we propose the ANN-Bayesian Net-GR technique, an ensemble of an Artificial Neural Network (ANN) and a Bayesian Net with Gain Ratio (GR) feature selection. We applied various individual classification techniques and their ensemble model to the KDD99 and NSL-KDD data sets to check the robustness of the model. Because of irrelevant features in the data sets, we also applied the Gain Ratio feature selection technique to the best model. Finally, our proposed model produces the highest accuracy compared to the others.
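The Gain Ratio criterion used for feature selection in the abstract above is information gain divided by the split information of the feature. The sketch below computes it for categorical features using the standard definition; the function names and the four-row toy data are assumptions made for this example, and the ANN and Bayesian Net ensemble itself is not sketched.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def gain_ratio(feature, labels):
    """Information gain of `feature` about `labels`, divided by split information."""
    n = len(labels)
    groups = {}
    for f, y in zip(feature, labels):
        groups.setdefault(f, []).append(y)
    # Expected label entropy after splitting on the feature.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature)  # penalises features with many distinct values
    return info_gain / split_info if split_info > 0 else 0.0

if __name__ == "__main__":
    labels = ["attack", "attack", "normal", "normal"]
    print(gain_ratio(["a", "a", "b", "b"], labels))  # informative two-valued feature
    print(gain_ratio(["a", "b", "a", "b"], labels))  # uninformative feature
    print(gain_ratio(["a", "b", "c", "d"], labels))  # unique value per row
```

The third feature separates the classes perfectly but takes a different value on every row, so its gain ratio is halved relative to the first; this penalty on many-valued features is the usual reason for preferring gain ratio over raw information gain.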

Journal ArticleDOI
TL;DR: The fundamentals of data mining steps like preprocessing the data, feature selection (to select the relevant features and removing the irrelevant and redundant features), classification and evaluation of different classifier models using WEKA tool are given.
Abstract: The basic principle of data mining is to analyze data from different perspectives, classify it and summarize it. Data mining has become very popular in almost every application. Although we have large amounts of data, we do not have useful information in every field. There are many data mining tools and software packages that help extract useful information. This paper gives the fundamentals of data mining steps such as preprocessing the data (removing noisy data, replacing missing values, etc.), feature selection (selecting the relevant features and removing the irrelevant and redundant ones), classification, and evaluation of different classifier models using the WEKA tool. The WEKA tool is not limited to one type of application; it can be used in various applications. It includes various algorithms for feature selection, classification and clustering. Keywords: feature selection, classification, clustering, evaluation of classifier models, evaluation of cluster models.

Journal ArticleDOI
TL;DR: This paper proposes an appearance feature-based approach that processes data using the Histogram of Oriented Gradients (HOG), a very efficient feature descriptor for handwritten digits that is stable under illumination variation because it is a gradient-based descriptor.
Abstract: Automatic Handwritten Digits Recognition (HDR) is the process of interpreting handwritten digits by machines. There are several approaches to handwritten digit recognition. In this paper we propose an appearance feature-based approach that processes data using the Histogram of Oriented Gradients (HOG). HOG is a very efficient feature descriptor for handwritten digits that is stable under illumination variation because it is a gradient-based descriptor. Moreover, a linear SVM has been employed as the classifier, which gives better responses than polynomial, RBF and sigmoid kernels. We analyzed our model on the MNIST dataset and achieved a 97.25% accuracy rate, which is comparable with the state of the art. General Terms: Image Processing, Computer Vision, Artificial Intelligence

Journal ArticleDOI
TL;DR: This paper presents a decision tree based data mining technique for early detection of breast cancer and discusses various data mining approaches that have been utilized for breast cancer diagnosis, and summarizes breast cancer in general.
Abstract: is a big issue all around the world. It is a disease which is fatal in many cases, has affected the lives of many and will continue to affect the lives of many more. Breast cancer represents the second primary cause of cancer deaths in women today and has become the most common cancer among women in both the developed and the developing world in recent years. 40,000 women die from this disease in a year, which is one woman dying every 13 minutes, every day. Breast cancer that is detected early is far easier to cure. This paper presents a decision tree based data mining technique for early detection of breast cancer. Breast cancer diagnosis differentiates benign breast tumors (which lack the ability to invade neighboring tissue) from malignant ones (which have the ability to invade neighboring tissue). This paper also discusses various data mining approaches that have been utilized for breast cancer diagnosis, and summarizes breast cancer in general (types, risk factors, symptoms and treatment).

Journal ArticleDOI
TL;DR: Though few images were used, experimentation shows that K-Means segments images significantly better in L*a*b* color space than in RGB feature space.
Abstract: Whether K-Means reasonably divides the data into k groups is an important question that arises when one works on image segmentation. Which color space should one choose, and how can one ascertain that the chosen k is valid? The purpose of this study was to explore the answers to these questions. We perform K-Means on a number of 2-cluster, 3-cluster and k-cluster color images (k>3) in RGB and L*a*b* feature space. Ground truth (GT) images have been used for validation. Silhouette analysis supports the peaks for a given k-cluster image. Model accuracy in RGB space falls between 30% and 55%, while in L*a*b* color space it ranges from 30% to 65%. Though few images were used, experimentation shows that K-Means segments images significantly better in L*a*b* color space than in RGB feature space. Keywords: evaluation, L*a*b* Color Space, Precision Recall

Journal ArticleDOI
TL;DR: The genetic algorithm and simulated annealing enable searching for a low-cost WMN configuration with constraints and determine the number of used gateways to minimize WMN network costs while satisfying quality of service.
Abstract: Mesh clients, mesh routers and gateways are the components of a Wireless Mesh Network (WMN). In a WMN, gateways connect to the Internet using wireline links and supply Internet access services for users. Because of the limited wireless channel bit rate, multiple gateways are needed, which take time and budget to set up. WMN is a highly developed technology that offers end users wireless broadband access. It offers a high degree of flexibility compared with conventional networks; however, this attribute comes at the expense of a more complex construction. The planning and optimization of WMNs is therefore a challenge. This paper addresses that challenge using a genetic algorithm and simulated annealing. The genetic algorithm and simulated annealing enable searching for a low-cost WMN configuration under constraints and determine the number of gateways used. Experimental results demonstrated the performance of the genetic algorithm and simulated annealing in minimizing WMN network costs while satisfying quality of service. The proposed models are shown to significantly outperform the existing solutions.

Journal ArticleDOI
TL;DR: This review paper provides an overview of existing mobile ad-hoc routing protocols, grouped by their proactive or reactive nature, by presenting their characteristics, functionality, benefits and limitations, and then compares them to analyze their performance.
Abstract: A mobile ad-hoc network is characterized as a network without any physical connections. In this network there is no fixed topology, due to the mobility of nodes, interference, multipath propagation and path loss. Many routing protocols have been developed to cope with these characteristics. The purpose of this paper is to review existing mobile ad-hoc routing protocols, grouped by their proactive or reactive nature. This review paper provides an overview of these protocols by presenting their characteristics, functionality, benefits and limitations, and then compares them to analyze their performance. The objective of this review paper is to provide an analysis that supports improvement of these existing protocols.

Journal ArticleDOI
TL;DR: A comprehensive evaluation on structures, techniques and different algorithms in this field is done and a new categorization of techniques in NNS is presented and variety of these techniques has made them suitable for different applications such as pattern recognition.
Abstract: Nowadays, the need for techniques, approaches and algorithms to search data has increased due to improvements in computer science and the increasing amount of information. This ever-increasing volume of information has led to time and computation complexity. Recently, different methods have been proposed to solve such problems. Among them, nearest neighbor search (NNS) is one of the best techniques for this purpose and has been the focus of many researchers. Different techniques are used for nearest neighbor search. In addition to resolving some of these complexities, the variety of these techniques has made them suitable for different applications such as pattern recognition, searching in multimedia data, information retrieval, databases, data mining and computational geometry, to name but a few. In this paper, taking a new view of this problem, a comprehensive evaluation of structures, techniques and different algorithms in this field is carried out, and a new categorization of techniques in NNS is presented. This categorization consists of seven groups: Weighted, Reductional, Additive, Reverse, Continuous, Principal Axis and Other techniques, which are studied, evaluated and compared in this paper. The complexity of the structures and techniques used, and of their algorithms, is discussed as well.

Journal ArticleDOI
TL;DR: Analysis and applications of the most important techniques in data mining are introduced: multilayer perceptron neural network (MLPNN), tree augmented Naive Bayes (TAN), known as Bayesian networks, nominal regression or logistic regression (LR), and Ross Quinlan's new decision tree model (C5.0).
Abstract: Bank marketing campaigns are dependent on customers' huge electronic data. These data sources are too large for a human analyst to extract interesting information that will help in the decision-making process. Data mining models substantially help the performance of these campaigns. This paper introduces analysis and applications of the most important techniques in data mining: multilayer perceptron neural network (MLPNN), tree augmented Naive Bayes (TAN), known as Bayesian networks, nominal regression or logistic regression (LR), and Ross Quinlan's new decision tree model (C5.0). The objective is to examine the performance of the MLPNN, TAN, LR and C5.0 techniques on real-world data of bank deposit subscription. The purpose is to increase campaign effectiveness by identifying the main characteristics that affect success (the deposit being subscribed by the client) based on MLPNN, TAN, LR and C5.0. The experimental results demonstrate, with higher accuracies, the success of these models in predicting the best campaign contact with clients for subscribing to a deposit. Performance is measured by three statistical measures: classification accuracy, sensitivity and specificity.
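The three measures named at the end of the abstract above are all derived from the binary confusion matrix. The sketch below states them explicitly; the function names and the eight-sample label vectors are hypothetical and are not data or results from the paper.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, tn, fn) for a binary problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn

def accuracy(tp, fp, tn, fn):
    """Fraction of all cases classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)

def sensitivity(tp, fp, tn, fn):
    """True positive rate: share of actual positives that were detected."""
    return tp / (tp + fn)

def specificity(tp, fp, tn, fn):
    """True negative rate: share of actual negatives that were rejected."""
    return tn / (tn + fp)

if __name__ == "__main__":
    # Hypothetical labels: 1 = client subscribed to the deposit, 0 = did not.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
    counts = confusion_counts(y_true, y_pred)
    print(counts, accuracy(*counts), sensitivity(*counts), specificity(*counts))
```

Reporting sensitivity and specificity alongside accuracy matters when the classes are imbalanced, since a model that always predicts the majority class can still score a high accuracy.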

Journal ArticleDOI
TL;DR: A social popularity factor is incorporated into the SVD++ factorization method as implicit feedback to improve the accuracy and scalability of recommendations.
Abstract: Recommender systems have attracted a lot of attention in the past decade. Due to their great business value, recommender systems have also been successfully deployed in business, such as product recommendation at flipkart and HomeShop18, music recommendation at Last.fm and Pandora, and movie recommendation at Flixstreet, MovieLens and Jinni. In the past few years, the incredible growth of Web 2.0 web sites and applications has posed new challenges for traditional recommender systems. Traditional recommender systems always ignore social interaction among users. But in real life, when we ask our friends or look for opinions and reviews to get recommendations for mobiles, music, movies, electronic gadgets, restaurants, books, games or software apps, we are actually using social information for recommendations. In this paper a social popularity factor is incorporated into the SVD++ factorization method as implicit feedback to improve the accuracy and scalability of recommendations.

Journal ArticleDOI
TL;DR: Focus is on outlining the Stylometric features that allow distinguishing between authors and on listing the diverse techniques used to classify an author's texts.
Abstract: The objective of this paper is to provide a review of the different studies done on authorship analysis. The focus is on outlining the stylometric features that allow distinguishing between authors and on listing the diverse techniques used to classify an author's texts.

Journal ArticleDOI
TL;DR: This study provides the researchers with a single platform to analyze the conventional and contemporary nature inspired algorithms in terms of required input parameters, their key evolutionary strategies and application areas to overcome the problem of ‘curse of dimensionality’.
Abstract: Nature-inspired algorithms have gained immense popularity in recent years to tackle hard real world (NP hard and NP complete) problems and solve complex optimization functions whose actual solution doesn’t exist. The paper presents a comprehensive review of 12 nature inspired algorithms. This study provides the researchers with a single platform to analyze the conventional and contemporary nature inspired algorithms in terms of required input parameters, their key evolutionary strategies and application areas. A list of automated toolboxes available for directly evaluating these nature inspired algorithms over numerical optimization problems indicates the need for unified toolbox for all nature inspired algorithms. It also elucidates the users with the minimum and maximum dimensions over which these algorithms have already been evaluated on benchmark test functions. Hence this study would aid the research community to know what all algorithms could be examined for large scale global optimization to overcome the problem of ‘curse of dimensionality’.

Journal ArticleDOI
TL;DR: A set of attributes are first defined for a group of students majoring in Computer Science in some undergraduate colleges in Kolkata and it was found that the best results were obtained with the decision tree class of algorithms.
Abstract: (Authors: Anal Acharya, Department of Computer Science, St Xavier’s College, Kolkata, India; Devadatta Sinha, Department of Computer Science and Engineering, University of Calcutta, Kolkata, India.) In recent years Educational Data Mining (EDM) has emerged as a new field of research due to the development of several statistical approaches to explore data in an educational context. One application of EDM is early prediction of student results. This is necessary in higher education for identifying the “weak” students so that some form of remediation may be organized for them. In this paper a set of attributes is first defined for a group of students majoring in Computer Science in some undergraduate colleges in Kolkata. Since the number of attributes is reasonably high, feature selection algorithms are applied to the data set to reduce the number of features. Five classes of Machine Learning Algorithms (MLAs) are then applied to this data set, and the best results were obtained with the decision tree class of algorithms. The prediction results obtained with this model were also found to be comparable with those of previously developed models.

Journal ArticleDOI
TL;DR: A brief review of a variety of Data Mining techniques that have been applied to model data from or about the agricultural domain and the application of data mining techniques and predictive modeling application in the agriculture field.
Abstract: As in many other sectors, the amount of agricultural data is increasing on a daily basis. However, the application of data mining methods and techniques to discover new insights or knowledge is a relatively novel research area. In this paper we provide a brief review of a variety of data mining techniques that have been applied to model data from or about the agricultural domain. The techniques applied to agricultural data include k-means, biclustering, k-nearest neighbor, Neural Networks (NN), Support Vector Machine (SVM), Naive Bayes classifier, and fuzzy c-means. As can be seen, the appropriateness of a data mining technique is to a certain extent determined by the type of agricultural data or the problem being addressed. This survey summarizes the application of data mining techniques and predictive modeling in the agriculture field.

Journal ArticleDOI
TL;DR: This paper presents amalgam KNN and ANFIS algorithm, which combines the features of adaptive neural network and Fuzzy Inference System, and aims to provide higher classification accuracy than the existing approaches.
Abstract: Diabetes mellitus, or simply diabetes, is a disease caused by an increased level of blood glucose. The traditional methods available for diagnosing diabetes are based on physical and chemical tests. These methods can produce errors due to different uncertainties. A number of data mining algorithms were designed to overcome these uncertainties. Among these algorithms, amalgam KNN and ANFIS provide higher classification accuracy than the existing approaches. The main data mining algorithms discussed in this paper are the EM algorithm, KNN algorithm, K-means algorithm, amalgam KNN algorithm, and ANFIS algorithm. The EM algorithm is the expectation-maximization algorithm used for sampling, to determine and maximize the expectation in successive iteration cycles. The KNN algorithm is used for classifying objects and predicting labels based on the closest training examples in the feature space. The K-means algorithm follows partitioning methods based on some input parameters on data sets of n objects. Amalgam KNN combines the features of both KNN and K-means with some additional processing. ANFIS is the Adaptive Neuro Fuzzy Inference System, which combines the features of an adaptive neural network and a Fuzzy Inference System. The data set chosen for classification and experimental simulation is the Pima Indian Diabetic Set from the University of California, Irvine (UCI) Repository of Machine Learning databases. Keywords: Data mining, Diabetes, EM algorithm, KNN algorithm, K-means algorithm, amalgam KNN algorithm, ANFIS algorithm
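
The KNN step described in this abstract (predicting a label from the closest training examples in the feature space) can be illustrated with a minimal sketch. It assumes Euclidean distance and majority voting, which the abstract does not specify. The function name `knn_predict` and the toy points are hypothetical; they are not taken from the paper or from the Pima data set, and this is plain KNN, not the amalgam KNN variant.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest
    training points (Euclidean distance). Illustrative sketch only."""
    if not 1 <= k <= len(train_X):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each training label with its distance to the query point.
    scored = [(math.dist(x, query), label) for x, label in zip(train_X, train_y)]
    # Sort by distance only, then keep the k closest neighbours.
    nearest = sorted(scored, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbours' labels.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated groups labelled 0 and 1.
toy_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
toy_y = [0, 0, 0, 1, 1, 1]

print(knn_predict(toy_X, toy_y, (1.1, 1.0), k=3))
print(knn_predict(toy_X, toy_y, (5.1, 5.0), k=3))
```

In practice the features would normally be scaled before computing distances, since attributes measured on larger numeric ranges would otherwise dominate the vote.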

Journal ArticleDOI
TL;DR: Comparison study of various text summarization methods based on different types of application and taxonomy of summarization systems and statistical and linguistic approaches for summarization are given.
Abstract: Text summarization is one application of natural language processing and is becoming more popular for information condensation. It is the process of reducing the size of an original document and producing a summary that retains the document's important information. This paper gives a comparative study of various text summarization methods based on different types of application. It discusses in detail the two main categories of text summarization methods: extractive and abstractive. The paper also presents a taxonomy of summarization systems and the statistical and linguistic approaches to summarization.