Author

Shelly Gupta

Bio: Shelly Gupta is an academic researcher from Amity University. The author has contributed to research in topics: Tree (data structure) & Cancer. The author has an h-index of 4 and has co-authored 7 publications receiving 170 citations.

Papers
01 Jan 2011
TL;DR: An overview of the current research being carried out on various breast cancer datasets using the data mining techniques to enhance the breast cancer diagnosis and prognosis is presented.
Abstract: Breast cancer diagnosis and prognosis are two medical applications that pose a great challenge to researchers. The use of machine learning and data mining techniques has revolutionized the whole process of breast cancer diagnosis and prognosis. Breast cancer diagnosis distinguishes benign from malignant breast lumps, and breast cancer prognosis predicts when breast cancer is likely to recur in patients who have had their cancers excised. Thus, these two problems fall mainly within the scope of classification problems. This study paper summarizes various review and technical articles on breast cancer diagnosis and prognosis. In this paper we present an overview of the current research being carried out using data mining techniques to enhance breast cancer diagnosis and prognosis. Breast cancer has become the leading cause of death in women in developed countries. The most effective way to reduce breast cancer deaths is to detect the disease earlier. Early diagnosis requires an accurate and reliable diagnosis procedure that allows physicians to distinguish benign breast tumors from malignant ones without resorting to surgical biopsy. The objective of these predictions is to assign patients to either a "benign" group that is non-cancerous or a "malignant" group that is cancerous. The prognosis problem concerns the long-term outlook for the disease in patients whose cancer has been surgically removed. In this problem a patient is classified as a "recur" if the disease is observed at some time subsequent to tumor excision, while a patient for whom cancer has not recurred, and may never recur, is treated as a non-recurring case. The objective of these predictions is to handle cases for which cancer has not recurred (censored data) as well as cases for which cancer has recurred at a specific time. Thus, breast cancer diagnostic and prognostic problems fall mainly within the scope of the widely discussed classification problems.
These problems have attracted many researchers in the computational intelligence, data mining, and statistics fields. Although cancer research is generally clinical and/or biological in nature, data-driven statistical research has become a common complement. Predicting the outcome of a disease is one of the most interesting and challenging tasks for which to develop data mining applications. With the use of computers equipped with automated tools, large volumes of medical data are being collected and made available to medical research groups. As a result, Knowledge Discovery in Databases (KDD), which includes data mining techniques, has become a popular research tool for medical researchers to identify and exploit patterns and relationships among large numbers of variables, enabling them to predict the outcome of a disease using the historical cases stored within datasets. The objective of this study is to summarise various review and technical articles on diagnosis and prognosis of breast cancer. It gives an overview of the current research being carried out on various breast cancer datasets using data mining techniques to enhance breast cancer diagnosis and prognosis.

140 citations

Journal ArticleDOI
TL;DR: The present study aimed to do the performance analysis of several data mining classification techniques using three different machine learning tools over the healthcare datasets.
Abstract: Healthcare data includes patient-centric data, treatment data and resource management data. It is massive and information-rich. Valuable knowledge, i.e. hidden relationships and trends in data, can be discovered by applying data mining techniques to healthcare data. Data mining techniques have been used in healthcare research and are known to be effective. The present study aimed to analyse the performance of several data mining classification techniques using three different machine learning tools over healthcare datasets. In this study, different data mining classification techniques have been tested on four different healthcare datasets. The evaluation criteria are the percentage accuracy and the error rate of every applied classification technique. The experiments are done using the 10-fold cross-validation method. A suitable technique for a particular dataset is chosen based on the highest classification accuracy and the lowest error rate.
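
The evaluation procedure described in this abstract (10-fold cross-validation scored by accuracy and error rate) can be sketched roughly as follows. This is only an illustrative sketch, not the study's actual setup: the function names are hypothetical, the folds are contiguous and unshuffled, and a trivial majority-class predictor stands in for the classification techniques and tools the study compared.

```python
from collections import Counter


def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs for k contiguous, unshuffled folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, test_idx
        start += size


def majority_class_predict(train_labels, n_test):
    """Placeholder classifier (an assumption): always predict the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return [most_common] * n_test


def cross_validated_accuracy(labels, k=10):
    """Return (accuracy, error_rate) pooled over all k test folds."""
    correct = 0
    for train_idx, test_idx in k_fold_indices(len(labels), k):
        train_labels = [labels[i] for i in train_idx]
        predictions = majority_class_predict(train_labels, len(test_idx))
        correct += sum(p == labels[i] for p, i in zip(predictions, test_idx))
    accuracy = correct / len(labels)
    return accuracy, 1.0 - accuracy
```

In practice the placeholder would be replaced by each classifier under comparison, and the technique with the highest accuracy (equivalently, the lowest error rate) would be selected for the dataset.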

31 citations

Proceedings ArticleDOI
01 Feb 2016
TL;DR: This study paper is aimed to provide a brief review of the various research papers and journals on recent research done for cluster analysis of gene microarray data.
Abstract: In bioinformatics research, the biggest challenge is to extract information from large datasets according to one's interestingness criteria. For research in genetics, DNA microarray technology provides better results than the standard approach, as it has computerized the parallel analysis of thousands of genes for monitoring their expression levels. Thus, for biologists, the challenge of analyzing gene data consisting of a huge number of measurements falls within the scope of clustering techniques. To this end, various clustering techniques have so far been developed and applied to gene microarray data. This study paper therefore aims to provide a brief review of the various research papers and journals on recent research done for cluster analysis of gene microarray data.

8 citations

Journal ArticleDOI
TL;DR: The goal of this research is to collect all such reviews from the web and generate ratings based on them with the help of a data mining classification method, namely the decision tree.
Abstract: Websites like TripAdvisor have become an integral part of our lives. They give us various reviews of hotels and help us decide which hotels to stay in or visit. Customer reviews play a major role in influencing others. Hence, these reviews become very important in controlling the quality of a hotel. The goal of this research is to collect all such reviews from the web and generate ratings based on them with the help of a data mining classification method, namely the decision tree. The C4.5 decision tree method is applied for this purpose, using the Tanagra machine learning tool to generate the data model. The advantage of using the decision tree algorithm is that the rule set can be easily generated, and by analyzing each level of the tree we can improve a particular service quality.
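
C4.5 is commonly described as choosing each split by gain ratio, that is, information gain normalised by the entropy of the split itself. The sketch below illustrates that criterion for a single categorical attribute. It is an assumption-laden illustration, not the paper's model or Tanagra's implementation: the function names and the toy hotel attribute values are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Gain ratio of splitting `labels` on a categorical feature."""
    total = len(labels)
    # Group the class labels by the feature value they co-occur with.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected entropy remaining after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalises attributes with many distinct values.
    split_info = entropy(feature_values)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: a review attribute and the rating class it should predict.
cleanliness = ["clean", "clean", "dirty", "dirty"]
rating = ["high", "high", "low", "low"]
ratio = gain_ratio(cleanliness, rating)
```

An attribute that separates the rating classes well gets a high ratio and would sit near the top of the tree, which is why reading the tree level by level points to the service qualities that matter most.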

5 citations

Journal ArticleDOI
TL;DR: There is a drop in the number of people who are ready to take the vaccine as compared to the number before the arrival of the vaccine, which may lead to the conclusion that the faith of general people of India has declined in the COVID-19 vaccine post vaccination drive.
Abstract: The acceptance of any vaccine relies on beliefs and perceptions about it. After a wait of almost 10 months, the COVID-19 vaccine is ready, with the first phase in progress in India. The aim of this study is to assess the impact on the acceptance intentions of the COVID-19 vaccine among the general population of India after the vaccine has been administered to health care workers in the first phase. An empirical study was conducted by analyzing the data collected by a self-administered questionnaire. The variables addressed were the socio-demographic variables, past behavior of participants towards seasonal influenza vaccines, awareness about the vaccine and adoption intention post vaccination drive. Logistic regression was used to identify the association between various variables and the predicting variables for vaccine acceptance. The majority of participants were ready for the COVID-19 vaccine. However, there was a decline in the acceptance rate post vaccination drive. Age, gender and region were found to be the major factors affecting this decision. Aims: To analyse, 1. The shift in confidence level in the COVID-19 vaccine; 2. The role of Social Influence (SI) towards the COVID-19 vaccine; 3. The role of past behavior towards seasonal influenza vaccines (Swine Flu, Ebola or similar) in acceptance of the COVID-19 vaccine; 4. The association between awareness of the COVID-19 vaccine and rate of adoption. Materials and Methods: The study was conducted by analyzing the data collected by a self-administered questionnaire that was shared online across India between January 2021 and February 2021. The variables addressed through the questionnaire were the socio-demographic variables, past behavior of participants towards seasonal influenza vaccines, awareness about the vaccine and adoption intention post vaccination drive. Associations between various variables were observed during analysis. Logistic regression was also used to identify the predicting

5 citations


Cited by
01 Jan 2002

9,314 citations

Journal ArticleDOI
31 Oct 2013
TL;DR: This survey explores the utility of various Data Mining techniques such as classification, clustering, association, regression in health domain and a brief introduction of these techniques and their advantages and disadvantages.
Abstract: Data Mining is one of the most motivating areas of research and is becoming increasingly popular in health organizations. Data Mining plays an important role in uncovering new trends in healthcare organizations, which in turn is helpful for all the parties associated with this field. This survey explores the utility of various Data Mining techniques such as classification, clustering, association and regression in the health domain. In this paper, we present a brief introduction to these techniques and their advantages and disadvantages. This survey also highlights applications, challenges and future issues of Data Mining in healthcare. Recommendations regarding the suitable choice of available Data Mining techniques are also discussed in this paper.

415 citations

Journal ArticleDOI
TL;DR: The proposed WAUCE model achieves a higher accuracy with a significantly lower variance for breast cancer diagnosis compared to five other ensemble mechanisms and two common ensemble models, i.e., adaptive boosting and bagging classification tree.

284 citations

Journal ArticleDOI
TL;DR: Healthcare data mining can enable healthcare organizations to predict trends in the patient conditions and their behaviors, which is accomplished by data analysis from different perspectives and discovering connections and relations from seemingly unrelated information.
Abstract: The tendency towards data mining applications in healthcare today is strong, because the healthcare sector is rich with information and data mining is becoming a necessity. Healthcare organizations produce and collect large volumes of information on a daily basis. The use of information technologies allows the automation of processes for extracting data that help to obtain interesting knowledge and regularities. This means the elimination of manual tasks and easier extraction of data directly from electronic records, which are transferred onto a secure electronic system of medical records that will save lives and reduce the cost of healthcare services, as well as enabling early discovery of contagious diseases through advanced data collection. Data mining can enable healthcare organizations to predict trends in patient conditions and behaviors, which is accomplished by analyzing data from different perspectives and discovering connections and relations in seemingly unrelated information. Raw data from healthcare organizations are voluminous and heterogeneous. They need to be collected and stored in organized forms, and their integration enables the formation of a hospital information system. Healthcare data mining provides countless possibilities for investigating hidden patterns in these data sets. These patterns can be used by physicians to determine diagnoses, prognoses and treatments for patients in healthcare organizations.

151 citations

Journal ArticleDOI
30 Apr 2012
TL;DR: In this paper, the authors have discussed various data mining approaches that have been utilized for breast cancer diagnosis and prognosis and discussed the current research being carried out using the data mining techniques.
Abstract: Breast cancer is one of the leading cancers for women in developed countries including India. It is the second most common cause of cancer death in women. The incidence of breast cancer in women has increased significantly in recent years. In this paper we discuss various data mining approaches that have been utilized for breast cancer diagnosis and prognosis. Breast cancer diagnosis distinguishes benign from malignant breast lumps, and breast cancer prognosis predicts when breast cancer is likely to recur in patients who have had their cancers excised. This study paper summarizes various review and technical articles on breast cancer diagnosis and prognosis. We also focus on current research being carried out using data mining techniques to enhance breast cancer diagnosis and prognosis.

117 citations