Author

Sunita Soni

Bio: Sunita Soni is an academic researcher from Bhilai Institute of Technology – Durg. The author has contributed to research in topics: Association rule learning & Computer science. The author has an h-index of 8 and has co-authored 13 publications receiving 632 citations.

Papers
Journal Article
TL;DR: A survey of current knowledge discovery and data mining techniques used in today's medical research, particularly in heart disease prediction, reveals that Decision Tree outperforms the other methods, that Bayesian classification sometimes has similar accuracy, and that other predictive methods do not perform well.
Abstract: The successful application of data mining in highly visible fields like e-business, marketing and retail has led to its application in other industries and sectors. Healthcare is among the sectors just discovering it. The healthcare environment is still "information rich" but "knowledge poor". There is a wealth of data available within healthcare systems. However, there is a lack of effective analysis tools to discover hidden relationships and trends in data. This research paper intends to provide a survey of current techniques of knowledge discovery in databases using data mining techniques that are in use in today's medical research, particularly in heart disease prediction. A number of experiments have been conducted to compare the performance of predictive data mining techniques on the same dataset, and the outcome reveals that Decision Tree outperforms the other methods and that Bayesian classification sometimes has accuracy similar to that of Decision Tree, but other predictive methods such as KNN, Neural Networks and classification based on clustering do not perform well. The second conclusion is that the accuracy of Decision Tree and Bayesian classification further improves after applying a genetic algorithm to reduce the actual data size to the optimal subset of attributes sufficient for heart disease prediction.

573 citations

Journal Article
TL;DR: The combined approach that integrates association rule mining and classification rule mining, called Associative Classification (AC), is introduced. Combining advanced association rule mining with classifiers gives a new type of associative classifier, with a small refinement in the definition of support and confidence that satisfies the downward closure property.
Abstract: Association rule mining is one of the most important and well-researched data mining techniques for descriptive tasks, initially used for market basket analysis. It finds all the rules in the transactional database that satisfy some minimum support and minimum confidence constraints. Classification using association rule mining is another major predictive analysis technique that aims to discover a small set of rules in the database that forms an accurate classifier. In this paper, we introduce the combined approach that integrates association rule mining and classification rule mining, called Associative Classification (AC). This is a new classification approach. The integration is done by focusing on mining a special subset of association rules called classification association rules (CARs). Classification is then performed using these CARs. Using association rule mining for constructing classification systems is a promising approach. Given the readability of associative classifiers, they are especially suited to applications where the model may assist domain experts in their decisions. The medical field is a good example where such applications may appear. Consider an example where a physician has to examine a patient. There is a considerable amount of information associated with the patient (e.g. personal data, medical tests, etc.). A classification system can assist the physician in this process. The system can predict whether the patient is likely to have a certain disease or to present an incompatibility with some treatments. Considering the output of the classification model, the physician can make a better decision on the treatment to be applied to this patient. Many associative classification approaches have been proposed recently, such as CBA, CMAR, CPAR, MCAR and MMAC. In addition, combining advanced association rule mining with classifiers gives a new type of associative classifier, with a small refinement in the definition of support and confidence that satisfies the downward closure property. We will discuss advanced associative classifiers proposed in recent years to provide better accuracy compared to traditional classifiers.
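
The idea of mining classification association rules under minimum support and minimum confidence can be sketched in Python as follows. This is only an illustrative sketch, not the method of any of the named systems: the toy patient records, the thresholds, and the helper names `mine_cars` and `classify` are all hypothetical, and picking the matching rule with the highest confidence (then support) is a simplification.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy records: (set of attribute=value items, class label).
RECORDS = [
    ({"age=old", "bp=high"}, "sick"),
    ({"age=old", "bp=normal"}, "healthy"),
    ({"age=young", "bp=high"}, "sick"),
    ({"age=old", "bp=high"}, "sick"),
    ({"age=young", "bp=normal"}, "healthy"),
]


def mine_cars(records, min_support=0.4, min_confidence=0.8, max_len=2):
    """Return {(itemset, label): (support, confidence)} for rules X -> label
    that meet both thresholds (brute-force enumeration, fine for toy data)."""
    n = len(records)
    itemset_counts = Counter()  # how often itemset X occurs
    rule_counts = Counter()     # how often X occurs together with a label
    for items, label in records:
        for k in range(1, max_len + 1):
            for combo in combinations(sorted(items), k):
                itemset_counts[combo] += 1
                rule_counts[(combo, label)] += 1
    cars = {}
    for (combo, label), count in rule_counts.items():
        support = count / n                      # fraction of all records
        confidence = count / itemset_counts[combo]  # fraction of records with X
        if support >= min_support and confidence >= min_confidence:
            cars[(combo, label)] = (support, confidence)
    return cars


def classify(cars, items, default="unknown"):
    """Pick the label of the best matching rule (highest confidence, then support)."""
    matching = [
        (conf, sup, label)
        for (combo, label), (sup, conf) in cars.items()
        if set(combo) <= items
    ]
    return max(matching)[2] if matching else default


cars = mine_cars(RECORDS)
prediction = classify(cars, {"age=young", "bp=high"})
```

On this toy data, the rule from "age=old" to "sick" is dropped because its confidence is 2/3, while the rule from "bp=high" to "sick" is kept, which shows how the two thresholds prune the rule set before classification.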

60 citations

Journal Article
TL;DR: The performance of a machine learning tool, the Naive Bayes classifier, with a new weighted approach to classifying breast cancer is investigated, and experiments show that the weighted Naive Bayes approach outperforms Naive Bayes.
Abstract: In this paper, the performance of a machine learning tool, the Naive Bayes classifier, with a new weighted approach to classifying breast cancer is investigated. Naive Bayes is one of the most effective classification algorithms. In many decision-making systems, ranking performance is a more interesting and desirable concept than classification alone. So, to extend traditional Naive Bayes and to improve its performance, a weighting concept is incorporated. Domain-knowledge-based weight assignment is explored on the breast cancer dataset from the UCI machine learning repository, as breast cancer is considered to be the second leading cause of death in women today. The experiments show that the weighted Naive Bayes approach outperforms Naive Bayes. Keywords: Mining, Breast cancer, Naive Bayes classifier, Domain based weight, Weights, Posterior probability, UCI machine learning repository, Prediction.

49 citations

Journal Article
TL;DR: This work designs a Graphical User Interface for entering a patient's screening record and estimating the probability that she will develop breast cancer in the future, using the Naive Bayes classifier, a probabilistic classifier.
Abstract: Naive Bayes is one of the most effective statistical and probabilistic classification algorithms. The health care environment is “information loaded” but “knowledge deprived”, so effective analysis tools are constructed to extract knowledge by discovering hidden relationships in data. The aim of this work is to design a Graphical User Interface for entering a patient's screening record and estimating the probability that she will develop breast cancer in the future, using the Naive Bayes classifier, a probabilistic classifier. Breast cancer is considered to be the second leading cause of cancer deaths in women today, so early detection can improve the survival rate of women. The prediction is performed by mining the patient’s historical data or a data repository. The experimental results show that the Naive Bayes classifier provides improved accuracy with low computational effort and very high speed. The system has been implemented on the Java platform and trained using benchmark data from the UCI machine learning repository. The system can be extended to new datasets.
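
How a Naive Bayes classifier turns a screening record into a class probability can be sketched as below. This is a minimal illustration in Python rather than the Java system described in the abstract; the toy records, the feature values, the Laplace smoothing constant `alpha`, and the function names `train` and `posterior` are assumptions made for the example and are not taken from the UCI dataset or the paper.

```python
from collections import Counter, defaultdict


def train(rows, labels, alpha=1.0):
    """Count class and per-feature value frequencies for categorical data."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    values_seen = defaultdict(set)       # feature index -> distinct values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            values_seen[i].add(value)
    return class_counts, value_counts, values_seen, alpha


def posterior(model, row):
    """Return P(class | row) for every class, assuming independent features."""
    class_counts, value_counts, values_seen, alpha = model
    total = sum(class_counts.values())
    scores = {}
    for label, n_label in class_counts.items():
        score = n_label / total  # prior P(class)
        for i, value in enumerate(row):
            # Laplace-smoothed likelihood P(value | class)
            numerator = value_counts[(i, label)][value] + alpha
            denominator = n_label + alpha * len(values_seen[i])
            score *= numerator / denominator
        scores[label] = score
    normaliser = sum(scores.values())
    return {label: score / normaliser for label, score in scores.items()}


# Hypothetical toy records: (tumour size, margin shape).
ROWS = [
    ("large", "irregular"),
    ("large", "regular"),
    ("small", "regular"),
    ("small", "regular"),
]
LABELS = ["malignant", "malignant", "benign", "benign"]

model = train(ROWS, LABELS)
probs = posterior(model, ("large", "irregular"))
```

The posterior is the product of the class prior and the per-feature likelihoods, normalised over the classes, which is why the method needs only simple counts and runs quickly.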

41 citations

Journal Article
29 Feb 2012
TL;DR: A theoretical model is proposed to introduce a new associative classifier that takes advantage of Fuzzy Weighted Association rule mining and can be used to generate strong rules instead of weak, irrelevant rules.
Abstract: In this paper we extend the problem of classification using Fuzzy Association Rule Mining and propose the concept of a Fuzzy Weighted Associative Classifier (FWAC). Classification based on association rules is considered to be effective and advantageous in many cases. Associative classifiers are especially suited to applications where the model may assist domain experts in their decisions. Weighted associative classifiers that take advantage of weighted association rule mining have already been proposed. However, there is a so-called "sharp boundary" problem in association rule mining with quantitative attribute domains. This paper proposes a new Fuzzy Weighted Associative Classifier (FWAC) that generates classification rules using a Fuzzy Weighted Support and Confidence framework. The naive approach can be used to generate strong rules instead of weak, irrelevant rules, where fuzzy logic is used in partitioning the domains. The problem of invalidation of the downward closure property is solved, and the concept of a Fuzzy Weighted Support and Fuzzy Weighted Confidence framework for Boolean and quantitative items with a weighted setting is generalized. We propose a theoretical model to introduce a new associative classifier that takes advantage of Fuzzy Weighted Association rule mining.
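
The "sharp boundary" problem and the effect of fuzzy partitioning can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the attribute (age), the cutoff values, the linear ramp membership function, and in particular the definition of `fuzzy_weighted_support` as an item weight times the mean membership degree, which is one simple reading and not necessarily the definition used in the paper.

```python
def crisp_old(age, cutoff=60):
    """Crisp partition: a patient is either fully 'old' or not at all."""
    return 1.0 if age >= cutoff else 0.0


def fuzzy_old(age, low=50, high=70):
    """Linear ramp: membership rises from 0 at `low` to 1 at `high`."""
    if age <= low:
        return 0.0
    if age >= high:
        return 1.0
    return (age - low) / (high - low)


def fuzzy_weighted_support(ages, weight, membership=fuzzy_old):
    """Assumed definition: item weight times the mean membership degree."""
    return weight * sum(membership(age) for age in ages) / len(ages)


# Two nearly identical patients land on opposite sides of the crisp cutoff.
crisp_pair = (crisp_old(59), crisp_old(60))
fuzzy_pair = (fuzzy_old(59), fuzzy_old(60))

support = fuzzy_weighted_support([40, 55, 60, 80], weight=0.8)
```

With the crisp partition, ages 59 and 60 receive memberships 0.0 and 1.0, while the ramp gives 0.45 and 0.5, so a one-year difference no longer flips the item in or out of a rule.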

14 citations


Cited by
Journal Article
TL;DR: To determine how the selection of instances and attributes, the use of different classification algorithms and the date when data is gathered affect the accuracy and comprehensibility of the prediction, a new Moodle module for gathering forum indicators was developed and different executions were carried out.
Abstract: On-line discussion forums constitute communities of people learning from each other, which not only inform the students about their peers' doubts and problems but can also inform instructors about their students' knowledge of the course contents. In fact, nowadays there is increasing interest in the use of discussion forums as an indicator of student performance. In this respect, this paper proposes the use of different data mining approaches for improving prediction of students' final performance starting from quantitative, qualitative and social network indicators of forum participation. Our objective is to determine how the selection of instances and attributes, the use of different classification algorithms and the date when data is gathered affect the accuracy and comprehensibility of the prediction. A new Moodle module for gathering forum indicators was developed and different executions were carried out using real data from 114 university students during a first-year course in computer science. A representative set of traditional classification algorithms has been used and compared with classification via clustering algorithms for predicting whether students will pass or fail the course on the basis of data about their forum usage. The results obtained indicate the suitability of performing both a final prediction at the end of the course and an early prediction before the end of the course; of applying clustering plus class association rule mining instead of traditional classification for obtaining highly interpretable student performance models; and of using a subset of attributes instead of all available attributes, and not all forum messages but only students' messages with content related to the subject of the course, for improving classification accuracy.

485 citations

Journal Article
31 Oct 2013
TL;DR: This survey explores the utility of various Data Mining techniques such as classification, clustering, association and regression in the health domain, and gives a brief introduction to these techniques and their advantages and disadvantages.
Abstract: Data Mining is one of the most motivating areas of research and is becoming increasingly popular in health organizations. Data Mining plays an important role in uncovering new trends in healthcare organizations, which is in turn helpful for all the parties associated with this field. This survey explores the utility of various Data Mining techniques such as classification, clustering, association and regression in the health domain. In this paper, we present a brief introduction to these techniques and their advantages and disadvantages. This survey also highlights applications, challenges and future issues of Data Mining in healthcare. Recommendations regarding the suitable choice of available Data Mining techniques are also discussed in this paper.

415 citations

Journal Article
TL;DR: This work proposes a highly accurate hybrid method for the diagnosis of coronary artery disease that increases the performance of a neural network by approximately 10% by enhancing its initial weights with a genetic algorithm, which suggests better weights for the neural network.

343 citations