
Showing papers in "Procedia Computer Science in 2015"


Journal ArticleDOI
TL;DR: This paper presents the k-means clustering algorithm, an unsupervised algorithm used to segment the area of interest from the background, and subtractive clustering, a data clustering method that generates centroids based on the potential values of the data points.
Abstract: Image segmentation is the classification of an image into different groups. Much research has been done in the area of image segmentation using clustering. Among the many available methods, one of the most popular is the k-means clustering algorithm, an unsupervised algorithm used to segment the area of interest from the background. Before applying k-means, partial stretching enhancement is first applied to the image to improve its quality. Subtractive clustering is a data clustering method that generates centroids based on the potential values of the data points, so it is used here to generate the initial centers, which are then used by the k-means algorithm to segment the image. Finally, a median filter is applied to the segmented image to remove any unwanted regions.
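
For illustration, a minimal sketch of this pipeline in Python (assuming NumPy, SciPy and scikit-learn; the subtractive-clustering radii and the synthetic test image are illustrative choices, not taken from the paper):

import numpy as np
from scipy.ndimage import median_filter
from sklearn.cluster import KMeans

def subtractive_centers(x, k, ra=0.5):
    # Subtractive clustering: each point's potential is a sum of Gaussian
    # contributions from all other points; pick the highest-potential point
    # as a center, then suppress potentials near it and repeat.
    rb = 1.5 * ra
    alpha, beta = 4.0 / ra**2, 4.0 / rb**2
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    p = np.exp(-alpha * d2).sum(1)
    centers = []
    for _ in range(k):
        c = p.argmax()
        centers.append(x[c])
        p -= p[c] * np.exp(-beta * ((x - x[c]) ** 2).sum(-1))
    return np.array(centers)

# Synthetic three-region grayscale image standing in for a real input.
rng = np.random.default_rng(0)
img = np.vstack([np.full((20, 40), v) for v in (40.0, 128.0, 220.0)])
img += rng.normal(0, 8, img.shape)

pix = img.reshape(-1, 1) / 255.0
sample = pix[rng.choice(len(pix), 500, replace=False)]        # keep the O(n^2) step small
init = subtractive_centers(sample, k=3)                       # initial centers for k-means
labels = KMeans(n_clusters=3, init=init, n_init=1).fit_predict(pix)
segmented = median_filter(labels.reshape(img.shape), size=3)  # clean up stray pixels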

709 citations


Journal ArticleDOI
TL;DR: This research considered the key components of systems thinking both individually and holistically, proposed a new definition of systems thinking that integrates them as a system, and tested the definition for fidelity against a System Test and against three widely accepted system archetypes.
Abstract: This paper proposes a definition of systems thinking for use in a wide variety of disciplines, with particular emphasis on the development and assessment of systems thinking educational efforts. The definition was derived from a review of the systems thinking literature combined with the application of systems thinking to itself. Many different definitions of systems thinking can be found throughout the systems community, but key components of a singular definition can be distilled from the literature. This researcher considered these components both individually and holistically, then proposed a new definition of systems thinking that integrates these components as a system. The definition was tested for fidelity against a System Test and against three widely accepted system archetypes. Systems thinking is widely believed to be critical in handling the complexity facing the world in the coming decades; however, it still resides in the educational margins. In order for this important skill to receive mainstream educational attention, a complete definition is required. Such a definition has not yet been established. This research is an attempt to rectify this deficiency by providing such a definition.

625 citations


Journal ArticleDOI
TL;DR: An overview of the data mining techniques that have been used to predict students' performance is provided, along with how prediction algorithms can be used to identify the most important attributes in students' data.
Abstract: Predicting students' performance becomes more challenging due to the large volume of data in educational databases. Currently in Malaysia, the lack of an existing system to analyze and monitor student progress and performance is not being addressed. There are two main reasons why this is happening. First, the study of existing prediction methods is still insufficient to identify the most suitable methods for predicting the performance of students in Malaysian institutions. Second, there is a lack of investigation into the factors affecting students' achievement in particular courses within the Malaysian context. Therefore, a systematic literature review on predicting student performance using data mining techniques is proposed to help improve students' achievement. The main objective of this paper is to provide an overview of the data mining techniques that have been used to predict students' performance. This paper also focuses on how prediction algorithms can be used to identify the most important attributes in students' data. Educational data mining techniques could improve students' achievement and success more effectively and efficiently, bringing benefits and impacts to students, educators and academic institutions.

558 citations


Journal ArticleDOI
TL;DR: This paper reviews various data mining techniques for anomaly detection to provide a better understanding of the existing techniques, which may help interested researchers work further in this direction.
Abstract: In the present world, huge amounts of data are stored and transferred from one location to another. The data, when transferred or stored, is exposed to attack. Although various techniques and applications are available to protect data, loopholes exist. Thus, data mining techniques have emerged to analyze data and to determine various kinds of attack, making data less vulnerable. Anomaly detection uses these data mining techniques to detect surprising behaviour hidden within data, which increases the chance of an intrusion or attack being identified. Various hybrid approaches have also been developed in order to detect known and unknown attacks more accurately. This paper reviews various data mining techniques for anomaly detection to provide a better understanding of the existing techniques, which may help interested researchers work further in this direction.

474 citations


Journal ArticleDOI
TL;DR: The simulation results show that NDVI is highly useful in detecting the surface features of the visible area, which is extremely beneficial for policy makers in decision making, and that vegetation analysis can be helpful in predicting natural disasters.
Abstract: This article presents an enhanced change detection method for the analysis of satellite imagery based on the Normalized Difference Vegetation Index (NDVI). NDVI employs multi-spectral remote sensing data to find the vegetation index, land cover classification, vegetation, water bodies, open areas, scrub areas, hilly areas, agricultural areas, and thick and thin forest with a few band combinations of the remotely sensed data. Land resources are easily interpreted by computing their NDVI for land cover classification. Remote sensing data from a Landsat TM image, along with NDVI and DEM data layers, have been used to perform multi-source classification. The change detection method used was NDVI differencing. The NDVI method is applied according to characteristics such as vegetation at different NDVI threshold values: 0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4 and 0.5. The simulation results show that NDVI is highly useful in detecting the surface features of the visible area, which is extremely beneficial for policy makers in decision making. The vegetation analysis can be helpful in predicting natural disasters, providing humanitarian aid, assessing damage and, furthermore, devising new protection strategies. From the empirical study, the forest or shrub land and barren land cover types decreased by about 6% and 23% respectively from 2001 to 2006, while agricultural land, built-up and water areas increased by about 19%, 4% and 7% respectively. Curvature, plan curvature, profile curvature and wetness index areas are also estimated.
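
As a sketch of the core computation (assuming the two bands are available as NumPy arrays; the random rasters here merely stand in for real Landsat bands):

import numpy as np

def ndvi(nir, red, eps=1e-9):
    # NDVI = (NIR - Red) / (NIR + Red), ranging over [-1, 1].
    nir, red = nir.astype(float), red.astype(float)
    return (nir - red) / (nir + red + eps)

rng = np.random.default_rng(1)
red_t1, nir_t1 = rng.uniform(0.05, 0.4, (64, 64)), rng.uniform(0.2, 0.8, (64, 64))
red_t2, nir_t2 = rng.uniform(0.05, 0.4, (64, 64)), rng.uniform(0.2, 0.8, (64, 64))

diff = ndvi(nir_t2, red_t2) - ndvi(nir_t1, red_t1)         # NDVI differencing
for t in (0.1, 0.15, 0.2, 0.25, 0.3, 0.35, 0.4, 0.5):      # thresholds used in the paper
    print(f"threshold {t}: {100 * (np.abs(diff) > t).mean():.1f}% of pixels changed")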

367 citations


Journal ArticleDOI
TL;DR: In this article, the authors study the challenges inherent to the real-time processing of massive data flows from the IoT and provide a detailed analysis of traces gathered from a well-known healthcare sport-oriented application in order to illustrate their conclusions from a big data perspective.
Abstract: The Internet of Things (IoT) generates massive streams of data which call for ever more efficient real-time processing. Designing and implementing a big data service for the real-time processing of such data requires extensive knowledge of both input load and data distribution in order to provide a service which can cope with the workload. In this context, we study in this paper the challenges inherent to the real-time processing of massive data flows from the IoT. We provide a detailed analysis of traces gathered from a well-known healthcare sport-oriented application in order to illustrate our conclusions from a big data perspective.

364 citations


Journal ArticleDOI
TL;DR: A Multi-Level Smart City architecture is proposed based on semantic web technologies and Dempster-Shafer uncertainty theory and described and explained in terms of its functionality and some real-time context-aware scenarios.
Abstract: Wireless sensor networks have increasingly become contributors of very large amounts of data. The recent deployment of wireless sensor networks in Smart City infrastructures has led to very large amounts of data being generated each day across a variety of domains, with applications including environmental monitoring, healthcare monitoring and transport monitoring. To take advantage of the increasing amounts of data, there is a need for new methods and techniques for effective data management and analysis, to generate information that can assist in managing the utilization of resources intelligently and dynamically. Through this research, a Multi-Level Smart City architecture is proposed based on semantic web technologies and Dempster-Shafer uncertainty theory. The proposed architecture is described and explained in terms of its functionality and some real-time context-aware scenarios.

332 citations


Journal ArticleDOI
TL;DR: A comparative study of the basic block-based image segmentation techniques is presented, which shows how these techniques have to be combined with domain knowledge in order to effectively solve an image segmentation problem for a problem domain.
Abstract: Due to the advent of computer technology, image-processing techniques have become increasingly important in a wide variety of applications. Image segmentation is a classic subject in the field of image processing and is also a hotspot and focus of image-processing techniques. Several general-purpose algorithms and techniques have been developed for image segmentation. Since there is no general solution to the image segmentation problem, these techniques often have to be combined with domain knowledge in order to effectively solve an image segmentation problem for a problem domain. This paper presents a comparative study of the basic block-based image segmentation techniques.

318 citations


Journal ArticleDOI
TL;DR: The purpose of this study is to develop a systematic review of literature on the real cases that applied AHP to evaluate how the criteria are being defined and measured.
Abstract: The Analytic Hierarchy Process (AHP) is widely used by decision makers and researchers. The definition of criteria and the calculation of their weights are central to this method for assessing alternatives. However, few studies focus on them. The purpose of this study is to develop a systematic review of the literature on real cases that applied AHP, to evaluate how the criteria are being defined and measured. The 33 selected cases mainly used the literature to build the criteria and AHP or Fuzzy AHP to calculate their weights, while other techniques were used to evaluate the alternatives.
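
For readers unfamiliar with the weight calculation the review centres on, a small sketch of Saaty's eigenvector method (the 3x3 comparison matrix is hypothetical; real studies elicit it from experts):

import numpy as np

def ahp_weights(A):
    # Weights are the principal right eigenvector of the pairwise comparison
    # matrix, normalized to sum to 1; CR checks judgment consistency.
    vals, vecs = np.linalg.eig(A)
    i = vals.real.argmax()
    w = np.abs(vecs[:, i].real)
    w /= w.sum()
    n = A.shape[0]
    ci = (vals.real.max() - n) / (n - 1)           # consistency index
    ri = {3: 0.58, 4: 0.90, 5: 1.12}[n]            # Saaty's random index
    return w, ci / ri                              # (weights, consistency ratio)

A = np.array([[1.0, 3.0, 5.0],                     # hypothetical criteria: cost,
              [1/3, 1.0, 2.0],                     # quality, delivery time
              [1/5, 1/2, 1.0]])
w, cr = ahp_weights(A)
print("weights:", w.round(3), "| CR:", round(cr, 3), "(CR < 0.1 is acceptable)")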

276 citations


Journal ArticleDOI
TL;DR: This paper presents the 5Vs characteristics of big data and the techniques and technologies used to handle it, including a wide variety of scalable database tools.
Abstract: Big data refers to collections of massive and complex data sets, encompassing huge quantities of data, data management capabilities, social media analytics and real-time data. Big data analytics is the process of examining such large amounts of heterogeneous digital data, with volumes measured in terabytes or petabytes. This paper presents the 5Vs characteristics of big data and the techniques and technologies used to handle it. The challenges include capturing, analysis, storage, searching, sharing, visualization, transferring and privacy violations. Big data can neither be worked upon using traditional SQL queries nor stored in a relational database management system (RDBMS). However, a wide variety of scalable database tools and techniques has evolved. Hadoop, an open-source distributed data processing platform, is one of the most prominent and well-known solutions, and NoSQL provides non-relational databases such as MongoDB.

253 citations


Journal ArticleDOI
TL;DR: Experimental results show the efficiency of the YCbCr color space for the segmentation and detection of skin color in color images.
Abstract: This paper presents a comparative study of human skin color detection in the HSV and YCbCr color spaces. Skin color detection is the process of separating skin and non-skin pixels. It is difficult to develop a uniform method for the segmentation or detection of human skin because skin tone varies drastically for people from one region to another. The literature survey shows that a variety of color spaces are applied for skin color detection. The RGB color space is not preferred for color-based detection and color analysis because it mixes color (chrominance) and intensity (luminance) information and has non-uniform characteristics. Luminance- and hue-based approaches discriminate color and intensity information even under uneven illumination conditions. Experimental results show the efficiency of the YCbCr color space for the segmentation and detection of skin color in color images.
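
A minimal sketch of the YCbCr rule (conversion per ITU-R BT.601; the Cb/Cr bounds are commonly cited values from the skin-detection literature, not necessarily the exact thresholds of this paper):

import numpy as np

def skin_mask_ycbcr(rgb):
    # rgb: (h, w, 3) uint8 image. Threshold only the chrominance channels,
    # so the rule is largely insensitive to illumination (luminance).
    r, g, b = (rgb[..., i].astype(float) for i in range(3))
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return (77 <= cb) & (cb <= 127) & (133 <= cr) & (cr <= 173)

rng = np.random.default_rng(0)
frame = rng.integers(0, 256, (4, 4, 3), dtype=np.uint8)  # stand-in for a real photo
print(skin_mask_ycbcr(frame))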

Journal ArticleDOI
TL;DR: This paper trains various data mining techniques used in credit card fraud detection and evaluates each methodology based on certain design criteria, identifying the bagging classifier based on decision trees as the best classifier for constructing the fraud detection model.
Abstract: Credit card fraud is increasing considerably with the development of modern technology and the global superhighways of communication. Credit card fraud costs consumers and financial companies billions of dollars annually, and fraudsters continuously try to find new rules and tactics to commit illegal actions. Thus, fraud detection systems have become essential for banks and financial institutions to minimize their losses. However, there is a lack of published literature on credit card fraud detection techniques, due to the unavailability of credit card transaction datasets for researchers. The most commonly used fraud detection methods are Naive Bayes (NB), Support Vector Machines (SVM) and K-Nearest Neighbor (KNN) algorithms. These techniques can be used alone or in combination using ensemble or meta-learning techniques to build classifiers. Amongst all existing methods, ensemble learning methods are identified as popular and common, not only because of their quite straightforward implementation, but also due to their exceptional predictive performance on practical problems. In this paper we train various data mining techniques used in credit card fraud detection and evaluate each methodology based on certain design criteria. After several trials and comparisons, we introduce the bagging classifier based on decision trees as the best classifier for constructing the fraud detection model. The performance evaluation is performed on a real-life credit card transaction dataset to demonstrate the benefit of the bagging ensemble algorithm.
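
A brief sketch of the winning approach (bagged decision trees via scikit-learn; since the real transaction data are unavailable, a synthetic imbalanced dataset stands in):

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Synthetic, heavily imbalanced stand-in for credit card transactions
# (about 2% "fraud"); the real dataset is not publicly available.
X, y = make_classification(n_samples=5000, n_features=20, weights=[0.98],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# BaggingClassifier's default base learner is a decision tree: each of the
# 100 trees is trained on a bootstrap sample, and predictions are voted.
clf = BaggingClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te), digits=3))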

Journal ArticleDOI
TL;DR: A computer-aided method for the detection of melanoma skin cancer using image processing tools is presented, which analyses skin lesion images using novel image processing techniques to determine the presence of skin cancer.
Abstract: In recent times, skin cancer has been seen as one of the most hazardous forms of cancer found in humans. Skin cancer comes in various types, such as melanoma, basal cell carcinoma and squamous cell carcinoma, among which melanoma is the most unpredictable. Detecting melanoma cancer at an early stage can help to cure it. Computer vision can play an important role in medical image diagnosis, as many existing systems have proved. In this paper, we present a computer-aided method for the detection of melanoma skin cancer using image processing tools. The input to the system is a skin lesion image, which the system analyses using novel image processing techniques to conclude whether skin cancer is present. The lesion image analysis tool checks the standard melanoma parameters of Asymmetry, Border, Colour and Diameter (ABCD) through texture, size and shape analysis in the image segmentation and feature extraction stages. The extracted feature parameters are used to classify the image as normal skin or a melanoma cancer lesion.

Journal ArticleDOI
TL;DR: Insight is given into how to uncover additional value from the data generated by healthcare and government, and how Hadoop plays an effective role in performing meaningful real-time analysis on huge volumes of data.
Abstract: This paper gives an insight into how we can uncover additional value from the data generated by healthcare and government. Large amounts of heterogeneous data are generated by these agencies, but without proper data analytics methods these data are useless. Big data analytics using Hadoop plays an effective role in performing meaningful real-time analysis on huge volumes of data and is able to predict emergency situations before they happen. The paper also describes big data use cases in healthcare and government.

Journal ArticleDOI
TL;DR: This paper focuses on identifying slow learners among students and displaying the results with a predictive data mining model built using classification-based algorithms; a knowledge flow model covering all five classifiers is also shown.
Abstract: The educational data mining field concentrates on prediction more often than on generating exact results for future use. In order to keep a check on the changes occurring in curriculum patterns, regular analysis of educational databases is a must. This paper focuses on identifying slow learners among students and displaying the results with a predictive data mining model built using classification-based algorithms. A real-world dataset from a high school is taken, and filtration of the desired potential variables is done using WEKA, an open-source tool. The dataset of student academic records is tested with various classification algorithms, namely Multilayer Perceptron, Naive Bayes, SMO, J48 and REPTree, in WEKA. As a result, statistics are generated for all the classification algorithms, and the five classifiers are compared in order to assess their accuracy and to find the best-performing classification algorithm among them. A knowledge flow model covering all five classifiers is also shown. This paper showcases the importance of prediction- and classification-based data mining algorithms in the field of education and also presents some promising future lines.

Journal ArticleDOI
TL;DR: This review paper consolidates the reviewed papers in line with the disciplines, models, tasks and methods involved in data mining for healthcare, in terms of methods, algorithms and results.
Abstract: Knowledge discovery in databases (KDD) is concerned with the development of methods and techniques for making use of data. One of the most important steps of KDD is data mining: the process of pattern discovery and extraction from huge amounts of data. Together, data mining and the healthcare industry have produced reliable early detection systems and various other healthcare-related systems from clinical and diagnostic data. With regard to this development, we have reviewed the various papers in this field in terms of methods, algorithms and results. This review consolidates the papers reviewed in line with the disciplines, models, tasks and methods involved. Results and evaluation methods are discussed for selected papers, and a summary of the findings is presented to conclude the paper.

Journal ArticleDOI
TL;DR: A novel technique of forecasting by segregating a time series dataset into linear and nonlinear components through DWT is suggested, which achieves the best forecasting accuracy for each series.
Abstract: Recently, the Discrete Wavelet Transform (DWT) has led to a tremendous surge in many domains of science and engineering. In this study, we present the advantage of DWT in improving time series forecasting precision. This article suggests a novel technique of forecasting by segregating a time series dataset into linear and nonlinear components through DWT. First, DWT is used to decompose the in-sample training dataset of the time series into linear (detailed) and nonlinear (approximate) parts. Then, the Autoregressive Integrated Moving Average (ARIMA) and Artificial Neural Network (ANN) models are used to separately recognize and predict the reconstructed detailed and approximate components, respectively. In this manner, the proposed approach tactically utilizes the unique strengths of DWT, ARIMA and ANN to improve forecasting accuracy. Our hybrid method is tested on four real-world time series, and its forecasting results are compared with those of ARIMA, ANN and Zhang's hybrid models. The results clearly show that the proposed method achieves the best forecasting accuracy for each series.
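
The decomposition step can be sketched with PyWavelets (the db4 wavelet and the synthetic series are illustrative, not the paper's exact choices); ARIMA and ANN models would then be fitted to the two reconstructed components:

import numpy as np
import pywt

rng = np.random.default_rng(0)
t = np.arange(256)
series = 0.05 * t + np.sin(2 * np.pi * t / 24) + rng.normal(0, 0.3, t.size)

# One-level DWT splits the series into approximate (cA) and detailed (cD)
# coefficients; each band is reconstructed back to the original length.
cA, cD = pywt.dwt(series, "db4")
approx = pywt.idwt(cA, np.zeros_like(cD), "db4")[: series.size]
detail = pywt.idwt(np.zeros_like(cA), cD, "db4")[: series.size]

# In the hybrid scheme, ARIMA models the (linear) detailed component and an
# ANN models the (nonlinear) approximate one; the two forecasts are summed.
print("components sum back to the series:", np.allclose(series, approx + detail))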

Journal ArticleDOI
TL;DR: This paper outlines the merits and limitations of clustering schemes in WSNs and proposes a taxonomy of cluster-based routing methods; routing protocols are broadly classified into three categories: flat routing, hierarchical or cluster-based routing, and location-based routing.
Abstract: The latest advancements in micro-electro-mechanical systems (MEMS) and wireless communication technology open the way for growth in the applications of wireless sensor networks (WSNs). A wireless sensor network is comprised of a huge number of small and cheap devices known as sensor nodes. The sensor nodes communicate through various wireless strategies, and these communication strategies are administered by routing protocols. The performance of sensor networks largely depends on the routing protocols, which are application-dependent. Keeping this in mind, we have carried out an extensive survey of WSN routing protocols. Based on network structure, routing protocols in WSNs can be broadly classified into three categories: flat routing, hierarchical or cluster-based routing, and location-based routing. Owing to certain advantages, clustering is flourishing as an active branch of routing technology. In this paper, we report a comprehensive survey of cluster-based routing protocols in wireless sensor networks. We outline the merits and limitations of the clustering schemes in WSNs and propose a taxonomy of cluster-based routing methods. Finally, we summarize and conclude the paper with some future directions.

Journal ArticleDOI
TL;DR: This paper attempts to present a comprehensive survey of MF models such as SVD that address the challenges of CF algorithms, which can serve as a roadmap for research and practice in this area.
Abstract: Recommendation Systems (RSs) are becoming the tools of choice for selecting the online information relevant to a given user. Collaborative Filtering (CF) is the most popular approach to building recommendation systems and has been successfully employed in many applications. Collaborative filtering algorithms are a much-explored technique in the fields of data mining and information retrieval. In CF, past user behavior is analyzed in order to establish connections between users and items and to recommend an item to a user based on the opinions of other users: customers who had similar likings in the past will have similar likings in the future. In the past decades, due to the rapid growth of Internet usage, a vast amount of data has been generated, and this has become a challenge for CF algorithms, which face issues with the sparsity of the rating matrix and the growing nature of data. These challenges are well taken care of by Matrix Factorization (MF). In this paper we discuss different matrix factorization models, such as Singular Value Decomposition (SVD), Principal Component Analysis (PCA) and Probabilistic Matrix Factorization (PMF). This paper attempts to present a comprehensive survey of MF models such as SVD to address the challenges of CF algorithms, and it can serve as a roadmap for research and practice in this area.
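
As a toy illustration of SVD-based factorization for CF (the 4x4 rating matrix is hypothetical; production systems typically learn regularized factors by SGD or ALS rather than imputing and running exact SVD):

import numpy as np

# Hypothetical user x item ratings, 0 = unrated.
R = np.array([[5, 4, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)

mask = R > 0
mu = R[mask].mean()
filled = np.where(mask, R, mu) - mu          # naive imputation, then centering

# Rank-2 truncated SVD places users and items in a shared latent space.
U, s, Vt = np.linalg.svd(filled, full_matrices=False)
k = 2
pred = mu + (U[:, :k] * s[:k]) @ Vt[:k]      # estimated ratings, incl. unrated cells

print(np.where(mask, R, pred.round(2)))      # unrated cells filled with predictions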

Journal ArticleDOI
TL;DR: Results obtained from experiments with a particle physics data set show MPI/OpenMP outperforms Spark by more than one order of magnitude in terms of processing speed and provides more consistent performance; however, Spark shows better data management infrastructure and the possibility of dealing with other aspects such as node failure and data replication.
Abstract: One of the biggest challenges of the current big data landscape is our inability to process vast amounts of information in a reasonable time. In this work, we explore and compare two distributed computing frameworks implemented on commodity cluster architectures: MPI/OpenMP on Beowulf, which is high-performance oriented and exploits multi-machine/multi-core infrastructures, and Apache Spark on Hadoop, which targets iterative algorithms through in-memory computing. We use the Google Cloud Platform service to create virtual machine clusters, run the frameworks, and evaluate two supervised machine learning algorithms: KNN and Pegasos SVM. Results obtained from experiments with a particle physics data set show MPI/OpenMP outperforms Spark by more than one order of magnitude in terms of processing speed and provides more consistent performance. However, Spark shows better data management infrastructure and the possibility of dealing with other aspects such as node failure and data replication.
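
For reference, the Pegasos update rule the experiments rely on, in a single-machine sketch (the optional projection step is omitted; the distributed MPI/OpenMP and Spark versions parallelize this training loop over data partitions):

import numpy as np

def pegasos(X, y, lam=0.01, epochs=20, seed=0):
    # Stochastic sub-gradient descent on the SVM objective
    # (lam/2)||w||^2 + mean(hinge loss), with step size 1/(lam * t).
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * (X[i] @ w)
            w *= 1 - eta * lam                 # shrink toward zero
            if margin < 1:                     # margin violated:
                w += eta * y[i] * X[i]         # step along the example
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 5))
y = np.sign(X @ rng.normal(size=5))            # synthetic linearly separable labels
w = pegasos(X, y)
print("training accuracy:", (np.sign(X @ w) == y).mean())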

Journal ArticleDOI
TL;DR: Recommendations and practices for use in the future smart grid and Internet of Things are provided, and the different applications of smart sensor networks in the domain of the smart power grid are explored.
Abstract: Smart sensor networks provide numerous opportunities for smart grid applications, including power monitoring, demand-side energy management, coordination of distributed storage, and integration of renewable energy generators. Because of their low cost and ease of deployment, smart sensor networks are likely to be used on a large scale in future smart power grids. The result is a huge volume and variety of data sets. Processing and analyzing these data reveals deeper insights that can help experts improve the operation of the power grid to achieve better performance. The technology to collect massive amounts of data is available today, but managing the data efficiently and extracting the most useful information out of it remains a challenge. This paper discusses and provides recommendations and practices for use in the future smart grid and Internet of Things. We explore the different applications of smart sensor networks in the domain of the smart power grid. We also discuss the techniques used to manage the big data generated by sensors and meters for application processing.

Journal ArticleDOI
TL;DR: This paper compares machine learning classifiers (J48 decision tree, k-nearest neighbors, random forest and support vector machines) used to classify patients with diabetes mellitus, in terms of accuracy, sensitivity and specificity.
Abstract: Diabetes is one of the common and growing diseases in several countries, all of which are working to prevent it at an early stage by predicting the symptoms of diabetes using several methods. The main aim of this study is to compare the performance of the algorithms used to predict diabetes with data mining techniques. In this paper we compare machine learning classifiers (J48 decision tree, k-nearest neighbors, random forest and support vector machines) for classifying patients with diabetes mellitus. These approaches have been tested with data samples downloaded from the UCI machine learning repository. The performance of the algorithms has been measured in both cases, i.e., the dataset with noisy data (before pre-processing) and the dataset without noisy data (after pre-processing), and compared in terms of accuracy, sensitivity and specificity.
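
The comparison itself reduces to confusion-matrix bookkeeping; a sketch with scikit-learn (a synthetic stand-in replaces the UCI download so the snippet runs offline):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the UCI diabetes data: 8 features, binary outcome.
X, y = make_classification(n_samples=768, n_features=8, n_informative=5,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

models = {"decision tree (J48-like)": DecisionTreeClassifier(random_state=0),
          "k-NN": KNeighborsClassifier(),
          "random forest": RandomForestClassifier(random_state=0),
          "SVM": SVC()}
for name, m in models.items():
    tn, fp, fn, tp = confusion_matrix(y_te, m.fit(X_tr, y_tr).predict(X_te)).ravel()
    print(f"{name}: accuracy={(tp + tn) / (tp + tn + fp + fn):.3f} "
          f"sensitivity={tp / (tp + fn):.3f} specificity={tn / (tn + fp):.3f}")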

Journal ArticleDOI
TL;DR: An integrated static and dynamic analysis method for analysing and classifying an unknown executable file is proposed, in which known malware and benign programs are used as training data; the integrated method gives better accuracy.
Abstract: The number of malware is increasing rapidly despite the common use of anti-malware software. Detection of malware continues to be a challenge, as attackers devise new techniques to evade detection methods. Most anti-virus software uses signature-based detection, which is inefficient in the present scenario due to the rapid increase in the number and variants of malware. The signature is a unique identification for a binary file, created by analyzing the binary file using static analysis methods. Dynamic analysis uses the behavior and actions of an executable during execution to identify whether it is malware or not. Both methods have their own advantages and disadvantages. This paper proposes an integrated static and dynamic analysis method to analyse and classify an unknown executable file. The method uses machine learning, with known malware and benign programs as training data. The feature vector is selected by analyzing the binary code as well as the dynamic behavior. The proposed method utilizes the benefits of both static and dynamic analysis, thus improving the efficiency and the classification result. Our experimental results show an accuracy of 95.8% using static analysis, 97.1% using dynamic analysis and 98.7% using the integrated method. Compared with the standalone dynamic and static methods, our integrated method gives better accuracy.

Journal ArticleDOI
TL;DR: The translation, cultural adaptation and a contribution to the validation of the European Portuguese version of the SUS are presented; the resulting version is equivalent to the original in terms of semantics and content.
Abstract: The System Usability Scale (SUS) is a widely used self-administered instrument for evaluating the usability of a wide range of products and user interfaces. The principal value of the SUS is that it provides a single reference score for participants' view of the usability of a product or service. This paper presents the translation, cultural adaptation and a contribution to the validation of the European Portuguese version of the SUS. The work comprised two phases: scale translation and scale validation. The first phase resulted in a European Portuguese version equivalent to the original in terms of semantics and content. The second phase involved the assessment of the validity and reliability of the scale. The instrument has construct validity, as it presents a high and significant correlation with two other usability metrics, the Post-Study System Usability Questionnaire (PSSUQ) (r = 0.70) and a general usability question (r = 0.48). The reliability results show less than satisfactory ICC values (ICC = 0.36); however, the percentage of agreement is satisfactory (76.67%). Further studies are needed to investigate the reliability of the Portuguese version.
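
Since the scoring rule carries over unchanged to the translated scale, here is the standard SUS computation for reference (a minimal sketch; the ten example answers are made up):

def sus_score(responses):
    # Standard SUS scoring: odd-numbered items contribute (score - 1),
    # even-numbered items contribute (5 - score); the sum of contributions
    # is multiplied by 2.5 to give a 0-100 usability score.
    assert len(responses) == 10 and all(1 <= r <= 5 for r in responses)
    total = sum(r - 1 if i % 2 == 0 else 5 - r     # i = 0 is item 1 (odd)
                for i, r in enumerate(responses))
    return total * 2.5

print(sus_score([4, 2, 5, 1, 4, 2, 5, 1, 4, 2]))   # -> 85.0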

Journal ArticleDOI
TL;DR: A review of recently published research on different variants of artificial neural networks for short-term load forecasting, with a thorough discussion of hybrid networks, which combine neural networks with stochastic learning techniques such as genetic algorithms and particle swarm optimization and have been successfully applied to short-term load forecasting (STLF).
Abstract: Electrical short-term load forecasting has emerged as one of the most essential fields of research for the efficient and reliable operation of power systems in the last few decades. It plays a very significant role in the fields of scheduling, contingency analysis, load flow analysis, and the planning and maintenance of power systems. This paper presents a review of recently published research on different variants of artificial neural networks in the field of short-term load forecasting. In particular, hybrid networks, which combine a neural network with stochastic learning techniques such as genetic algorithms (GA) and particle swarm optimization (PSO) and have been successfully applied to short-term load forecasting (STLF), are discussed thoroughly.

Journal ArticleDOI
TL;DR: The simulation study shows that the proposed approach reduces the dimension of the input features by identifying the most informative gene subset and improves classification accuracy when compared to other approaches.
Abstract: DNA microarray technology can monitor the expression levels of thousands of genes simultaneously during important biological processes and across collections of related samples. Knowledge gained through microarray data analysis is increasingly important, as it is useful for the phenotype classification of diseases. This paper presents an effective method for gene classification using a Support Vector Machine (SVM). The SVM is a supervised learning algorithm capable of solving complex classification problems. Mutual information (MI) between the genes and the class label is used to identify the informative genes. The selected genes are used to train the SVM classifier, and the testing ability is evaluated using the Leave-One-Out Cross-Validation (LOOCV) method. The performance of the proposed approach is evaluated on two cancer microarray datasets. The simulation study shows that the proposed approach reduces the dimension of the input features by identifying the most informative gene subset and improves classification accuracy when compared to other approaches.
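
A compact sketch of the described pipeline (scikit-learn stands in for the authors' implementation, and random data stand in for the cancer microarrays; as described, gene selection happens once, outside the LOOCV loop):

import numpy as np
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import LeaveOneOut, cross_val_score
from sklearn.svm import SVC

# Synthetic microarray stand-in: 60 samples x 500 genes, 5 informative.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, 60)
X = rng.normal(size=(60, 500))
X[:, :5] += 1.5 * y[:, None]                     # make the first 5 genes informative

mi = mutual_info_classif(X, y, random_state=0)   # MI between each gene and the label
top = np.argsort(mi)[::-1][:20]                  # keep the 20 most informative genes

acc = cross_val_score(SVC(kernel="linear"), X[:, top], y, cv=LeaveOneOut()).mean()
print(f"LOOCV accuracy on the selected gene subset: {acc:.3f}")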

Journal ArticleDOI
TL;DR: A web-based tool that helps farmers identify fruit disease by uploading a fruit image to the system is proposed; experimental evaluation shows it is effective and 82% accurate in identifying pomegranate disease.
Abstract: Crops are being affected by uneven climatic conditions, leading to decreased agricultural yield. This affects the global agricultural economy. Moreover, the situation becomes even worse when crops are infected by a disease. The growing population also burdens farmers to increase yield. This is where modern agricultural techniques and systems are needed to detect and prevent crops from being affected by different diseases. In this paper, we propose a web-based tool that helps farmers identify fruit disease by uploading a fruit image to the system. The system has an already-trained dataset of images for the pomegranate fruit. The input image given by the user undergoes several processing steps to detect the severity of disease by comparison with the trained dataset images. First the image is resized, then its features are extracted on parameters such as color, morphology and CCV, and clustering is done using the k-means algorithm. Next, an SVM is used to classify the image as infected or non-infected. An intent search technique is also provided, which is very useful for finding the user's intention. Of the three features extracted, we got the best results using morphology. Experimental evaluation shows the proposed approach is effective and 82% accurate in identifying pomegranate disease.

Journal ArticleDOI
TL;DR: This paper highlights data-related security challenges in cloud-based environments and presents solutions to overcome them.
Abstract: The Cloud Computing trend is rapidly increasing and has technological connections with Grid Computing, Utility Computing and Distributed Computing. Cloud service providers such as Amazon, IBM, Google and Microsoft Azure enable users to develop applications in the cloud environment and to access them from anywhere. Cloud data are stored and accessed on a remote server with the help of services provided by cloud service providers. Providing security is a major concern, as the data are transmitted to the remote server over a channel (the Internet). Before implementing cloud computing in an organization, its security challenges need to be addressed first. In this paper, we highlight data-related security challenges in the cloud-based environment and solutions to overcome them.

Journal ArticleDOI
TL;DR: A novel approach for automatic segmentation and classification of skin lesions is proposed and SVM and k-NN classifiers are used along with their fusion for the classification using the extracted features.
Abstract: In this paper, a novel approach for the automatic segmentation and classification of skin lesions is proposed. Initially, skin images are filtered to remove unwanted hairs and noise, and then the segmentation process is carried out to extract lesion areas. For segmentation, a region growing method is applied with automatic initialization of seed points. The segmentation performance is measured with different well-known measures, and the results are appreciable. Subsequently, the extracted lesion areas are represented by color and texture features. SVM and k-NN classifiers are used, along with their fusion, for classification using the extracted features. The performance of the system is tested on our own dataset of 726 samples from 141 images consisting of 5 different classes of diseases. The results are very promising, with F-measures of 46.71% and 34% using the SVM and k-NN classifiers respectively, and an F-measure of 61% for the fusion of SVM and k-NN.
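
The region growing step might look like the following sketch (BFS over 4-neighbours with an intensity tolerance; the tolerance value, the manual seed and the toy "lesion" image are illustrative, whereas the paper initializes seeds automatically):

import numpy as np
from collections import deque

def region_grow(img, seed, tol=20.0):
    # Accept 4-neighbours whose intensity stays within `tol` of the running
    # region mean, starting from the seed pixel (row, col).
    h, w = img.shape
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    total, count = float(img[seed]), 1
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < h and 0 <= nc < w and not mask[nr, nc]
                    and abs(img[nr, nc] - total / count) <= tol):
                mask[nr, nc] = True
                total, count = total + float(img[nr, nc]), count + 1
                queue.append((nr, nc))
    return mask

img = np.full((32, 32), 200.0)
img[8:24, 8:24] = 60.0                            # dark "lesion" on lighter skin
print(region_grow(img, seed=(16, 16)).sum(), "pixels grown from the seed")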

Journal ArticleDOI
TL;DR: This work proposes a mathematical model using Load Balancing Mutation particle swarm optimization (LBMPSO)-based scheduling and allocation for cloud computing that takes into account reliability, execution time, transmission time, makespan, round trip time, transmission cost and load balancing between tasks and virtual machines.
Abstract: The most important requirement in a cloud computing environment is task scheduling, which plays the key role in the efficiency of the whole cloud computing facility. Task scheduling in cloud computing means allocating the best suitable resources for the task to be executed, with consideration of different parameters such as time, cost, scalability, makespan, reliability, availability, throughput, resource utilization and so on. The proposed algorithm considers reliability and availability; most scheduling algorithms do not, because of the complexity of achieving these parameters. We propose a mathematical model using Load Balancing Mutation particle swarm optimization (LBMPSO)-based scheduling and allocation for cloud computing that takes into account reliability, execution time, transmission time, makespan, round trip time, transmission cost and load balancing between tasks and virtual machines. LBMPSO can play a role in achieving the reliability of a cloud computing environment by considering the resources available and rescheduling tasks that fail to be allocated. Our approach, LBMPSO, is compared with standard PSO, a random algorithm and the Longest Cloudlet to Fastest Processor (LCFP) algorithm to show that LBMPSO can save on makespan, execution time, round trip time and transmission cost.
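
To make the optimization concrete, a plain-PSO sketch of makespan-driven task-to-VM assignment (task lengths and VM speeds are made up; the paper's load-balancing mutation step is not reproduced here):

import numpy as np

rng = np.random.default_rng(0)
n_tasks, n_vms = 20, 4
length = rng.uniform(1, 10, n_tasks)              # hypothetical task lengths
speed = rng.uniform(0.5, 2.0, n_vms)              # hypothetical VM speeds

def makespan(pos):
    # Round each dimension to a VM index; fitness is the busiest VM's finish time.
    vm = np.clip(pos.round().astype(int), 0, n_vms - 1)
    return max(length[vm == v].sum() / speed[v] for v in range(n_vms))

n_particles, iters, w, c1, c2 = 30, 200, 0.7, 1.5, 1.5
x = rng.uniform(0, n_vms - 1, (n_particles, n_tasks))   # particle = one assignment
v = np.zeros_like(x)
pbest, pbest_f = x.copy(), np.array([makespan(p) for p in x])
gbest = pbest[pbest_f.argmin()].copy()
for _ in range(iters):
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    x = np.clip(x + v, 0, n_vms - 1)
    f = np.array([makespan(p) for p in x])
    better = f < pbest_f
    pbest[better], pbest_f[better] = x[better], f[better]
    gbest = pbest[pbest_f.argmin()].copy()
print("best makespan found:", round(pbest_f.min(), 2))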