An effective hybridized classifier for breast cancer diagnosis

doi:10.1109/AIM.2015.7222674

Proceedings Article•DOI•

Dishant Mittal¹, Dev Gaurav¹, Sanjiban Sekhar Roy¹•Institutions (1)

VIT University¹

07 Jul 2015-pp 1026-1031

TL;DR: The paper proposes an effective hybridized classifier for breast cancer diagnosis made by combining an unsupervised artificial neural network method named self organizing maps (SOM) with a supervised classifier called stochastic gradient descent (SGD).

Abstract: After lung cancer, breast cancer is known to be the greatest cause of death among females [20]. Medical practitioners are giving increasing importance to the improving effectiveness of machine learning approaches for breast cancer diagnosis. This paper proposes an effective hybridized classifier for breast cancer diagnosis. The classifier combines an unsupervised artificial neural network (ANN) method, self-organizing maps (SOM), with a supervised classifier, stochastic gradient descent (SGD). A comparative analysis is also performed between the proposed approach and three supervised state-of-the-art machine learning techniques: decision trees (DTs), random forests (RF), and support vector machines (SVM). Initially, the SGD method is used in isolation for the classification task; it is then made to perform the classification after being hybridized with the unsupervised ANN technique on the Wisconsin Breast Cancer Database (WBCD) [10]. The comparison is based on classification accuracy, computed from a confusion matrix. To verify the consistency of the accuracy values, the classification task was repeated on the Internet Advertisements Dataset [11]. In the classification experiments, the hybridization of SOM with SGD gives results clearly superior to SGD in isolation. All accuracy values were computed with ten-fold cross-validation on both datasets to further verify the classifier's performance.
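The abstract does not say how the SOM and the SGD classifier are coupled, so the following is only a minimal sketch of one plausible arrangement, not the authors' method. It assumes the hybrid appends a one-hot code of each sample's best-matching SOM unit to the input features before SGD classification. The `SOMFeatures` class, its grid size and its learning schedule are hypothetical. scikit-learn's bundled `load_breast_cancer` data (the Wisconsin Diagnostic set) stands in for the WBCD used in the paper. Accuracy is read off a confusion matrix built from ten-fold cross-validated predictions, as the abstract describes.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import SGDClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


class SOMFeatures(TransformerMixin, BaseEstimator):
    """Hypothetical helper: a tiny SOM that appends one-hot best-matching-unit codes."""

    def __init__(self, rows=4, cols=4, n_iter=2000, lr=0.5, sigma=1.5, seed=0):
        self.rows = rows
        self.cols = cols
        self.n_iter = n_iter
        self.lr = lr
        self.sigma = sigma
        self.seed = seed

    def fit(self, X, y=None):
        rng = np.random.default_rng(self.seed)
        # Grid coordinates of every unit, shape (rows * cols, 2).
        self.grid_ = np.array(
            [(r, c) for r in range(self.rows) for c in range(self.cols)], dtype=float
        )
        # Initialise unit weights from randomly chosen training samples.
        self.weights_ = X[rng.integers(0, len(X), len(self.grid_))].astype(float)
        for t in range(self.n_iter):
            decay = np.exp(-t / self.n_iter)  # shrinks learning rate and radius
            x = X[rng.integers(0, len(X))]
            bmu = np.argmin(((self.weights_ - x) ** 2).sum(axis=1))
            # Gaussian neighbourhood on the grid, centred on the best-matching unit.
            d2 = ((self.grid_ - self.grid_[bmu]) ** 2).sum(axis=1)
            h = np.exp(-d2 / (2 * (self.sigma * decay) ** 2))
            self.weights_ += (self.lr * decay) * h[:, None] * (x - self.weights_)
        return self

    def transform(self, X):
        # Squared distance from each sample to each unit, shape (n_samples, n_units).
        d2 = ((X[:, None, :] - self.weights_[None, :, :]) ** 2).sum(axis=2)
        one_hot = np.eye(len(self.grid_))[d2.argmin(axis=1)]
        return np.hstack([X, one_hot])


def cv_accuracy(model, X, y):
    """Ten-fold cross-validated confusion matrix and the accuracy derived from it."""
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
    pred = cross_val_predict(model, X, y, cv=cv)
    cm = confusion_matrix(y, pred)
    return cm, np.trace(cm) / cm.sum()  # correct predictions lie on the diagonal


X, y = load_breast_cancer(return_X_y=True)  # stand-in dataset, not the paper's WBCD
sgd_only = make_pipeline(StandardScaler(), SGDClassifier(random_state=0))
hybrid = make_pipeline(StandardScaler(), SOMFeatures(), SGDClassifier(random_state=0))

cm_sgd, acc_sgd = cv_accuracy(sgd_only, X, y)
cm_hyb, acc_hyb = cv_accuracy(hybrid, X, y)
print("SGD alone :", round(acc_sgd, 4))
print("SOM + SGD :", round(acc_hyb, 4))
```

Placing the SOM inside the pipeline means it is refit on the training part of every fold, so no test-fold information leaks into the map. The accuracy values this sketch prints are not comparable to the figures reported in the paper, because the dataset and the coupling scheme are both assumptions.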

Citations

Journal Article•DOI•

Reviewing ensemble classification methods in breast cancer.

Mohamed Hosni, Ibtissam Abnane, Ali Idri, Juan Manuel Carrillo de Gea¹, José Luis Fernández Alemán¹•Institutions (1)

University of Murcia¹

01 Aug 2019-Computer Methods and Programs in Biomedicine

TL;DR: This study found that of the six medical tasks that exist, the diagnosis medical task was that most frequently researched, and that the experiment-based empirical type and evaluation-based research type were the most dominant approaches adopted in the selected studies.

128 citations

Book Chapter•DOI•

A Deep Learning Based Artificial Neural Network Approach for Intrusion Detection

Sanjiban Sekhar Roy¹, Abhinav Mallik¹, Rishab Gulati¹, Mohammad S. Obaidat², Mohammad S. Obaidat³, Parimala Venkata Krishna⁴•Institutions (4)

VIT University¹, Fordham University², University of Jordan³, Sri Padmavati Mahila Visvavidyalayam⁴

17 Jan 2017

TL;DR: The experimental results show that the accuracy of intrusion detection using Deep Neural Network is satisfactory and the potential capability of Deep Neural network as a classifier for the different types of intrusion attacks is checked.

Abstract: Security of data is considered to be one of the most important concerns in today’s world. Data is vulnerable to various types of intrusion attacks that may reduce the utility of any network or system. The constantly changing and complicated nature of intrusion activities on computer networks cannot be dealt with by the IDSs that are currently operational. Identifying and preventing such attacks is one of the most challenging tasks. Deep Learning is one of the most effective machine learning techniques and has recently been gaining popularity. This paper checks the potential capability of a Deep Neural Network as a classifier for the different types of intrusion attacks. A comparative study has also been carried out with a Support Vector Machine (SVM). The experimental results show that the accuracy of intrusion detection using a Deep Neural Network is satisfactory.

102 citations

Journal Article•DOI•

A New Ensemble-Based Intrusion Detection System for Internet of Things

Adeel Abbas¹, Muazzam A. Khan¹, Muazzam A. Khan², Shahid Latif³, Maria Ajaz¹, Awais Aziz Shah⁴, Jawad Ahmad⁵•Institutions (5)

Quaid-i-Azam University¹, Pakistan Academy of Sciences², Fudan University³, Instituto Politécnico Nacional⁴, Edinburgh Napier University⁵

30 Aug 2021-Arabian Journal for Science and Engineering

TL;DR: An ensemble-based intrusion detection model is proposed in which logistic regression, naive Bayes, and decision tree are deployed with a voting classifier after analyzing the model’s performance with some prominent existing state-of-the-art techniques; the results illustrate a significant improvement in accuracy compared to existing models.

Abstract: The domain of Internet of Things (IoT) has witnessed immense adaptability over the last few years by drastically transforming human lives to automate their ordinary daily tasks. This is achieved by interconnecting heterogeneous physical devices with different functionalities. Consequently, the rate of cyber threats has also risen with the expansion of IoT networks, which puts data integrity and stability at stake. In order to secure data from misuse and unusual attempts, several intrusion detection systems (IDSs) have been proposed to detect malicious activities on the basis of predefined attack patterns. The rapid increase in such attacks requires improvements in existing IDSs. Machine learning has become the key solution to improve intrusion detection systems. In this study, an ensemble-based intrusion detection model has been proposed. In the proposed model, logistic regression, naive Bayes, and decision tree have been deployed with a voting classifier after analyzing the model’s performance with some prominent existing state-of-the-art techniques. Moreover, the effectiveness of the proposed model has been analyzed using the CICIDS2017 dataset. The results illustrate a significant improvement in accuracy compared to existing models in both binary and multi-class classification scenarios.

43 citations

Book Chapter•DOI•

Predicting Ozone Layer Concentration Using Multivariate Adaptive Regression Splines, Random Forest and Classification and Regression Tree

Sanjiban Sekhar Roy¹, Chitransh Pratyush¹, Cornel Barna²•Institutions (2)

VIT University¹, Aurel Vlaicu University of Arad²

24 Aug 2016

TL;DR: Evaluation of the prediction models indicates that the Multivariate Adaptive Regression Splines model describes the dataset better and has achieved significantly better prediction accuracy as compared to the Random Forest and Classification and Regression Tree.

Abstract: Air pollution is one of the major environmental concerns in recent times. An abrupt increase in the concentration of any gas leads to air pollution. Cities are the most affected because of their dense populations. One of the worst gaseous pollutants is ozone (O3). In this paper, we propose three predictive models for estimating the concentration of ozone in the air: Random Forest, Multivariate Adaptive Regression Splines, and Classification and Regression Tree. Evaluation of the prediction models indicates that the Multivariate Adaptive Regression Splines model describes the dataset better and has achieved significantly better prediction accuracy as compared to the Random Forest and Classification and Regression Tree. A detailed comparative study has been carried out on the performances of Random Forest, Multivariate Adaptive Regression Splines and Classification and Regression Tree. MARS gives its result by considering fewer variables than the other two. Moreover, Random Forest takes a little more time to build the tree, as the elapsed time was calculated to be 45 s in this case. In addition, variable importance for each model has been predicted. Across all the graphs, Multivariate Adaptive Regression Splines gives the closest curves for the train and test sets when compared. It can be concluded that multivariate adaptive regression splines can be a valuable tool for predicting ozone in the future.

12 citations

Book Chapter•DOI•

Prediction of Customer Satisfaction Using Naive Bayes, MultiClass Classifier, K-Star and IBK

Sanjiban Sekhar Roy¹, Deeksha Kaul¹, Reetika Roy¹, Cornel Barna², Suhasini Mehta¹, Anusha Misra¹•Institutions (2)

VIT University¹, Aurel Vlaicu University of Arad²

24 Aug 2016

TL;DR: The classification models Naive Bayes, MultiClass Classifier, K-Star and IBK are adopted as potential classifiers for prediction of customer satisfaction at San Francisco International Airport, and their predictions are compared to find the least deviation from the actual values.

Abstract: Customer satisfaction is an important term in business as well as marketing, as it indicates how well customer expectations have been met by the product or the service. Thus a good prediction model for customer satisfaction can help any organization make better decisions with respect to its services and work in a more informed manner to improve them. The problem considered in this study is optimization of customer satisfaction for the customers of San Francisco International Airport. This paper adopts the classification models Naive Bayes, MultiClass Classifier, K-Star and IBK as potential classifiers for prediction of customer satisfaction. Customer satisfaction depends on various factors. The factors which we consider are the user ratings for artwork and exhibitions, restaurants, variety stores, concessions, signage, directions inside SFO, information booths near baggage claim and departure, Wi-Fi, parking facilities, walkways, air train and an overall rating for the airport services. The ratings are obtained from a detailed customer survey conducted by the mentioned airport in 2015. The original survey included questions on airlines, destination airport, flight delays, conveyance to and from the airport, security/immigration etc., but our study focuses on the previously mentioned questions. Graphs are plotted for actual and predicted values and compared to find the least amount of deviation from the actual values. The model which shows the least deviation from actual values is considered optimal for the above-mentioned problem.

12 citations

References

Journal Article•

Scikit-learn: Machine Learning in Python

Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel¹, Peter Prettenhofer², Ron Weiss³, Vincent Dubourg, Jake Vanderplas⁴, Alexandre Passos⁵, David Cournapeau, Matthieu Brucher⁶, Matthieu Perrot, Edouard Duchesnay•Institutions (6)

Kobe University¹, Bauhaus University, Weimar², Google³, University of Washington⁴, University of Massachusetts Amherst⁵, Total S.A.⁶

01 Feb 2011-Journal of Machine Learning Research

TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.

Abstract: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems. This package focuses on bringing machine learning to non-specialists using a general-purpose high-level language. Emphasis is put on ease of use, performance, documentation, and API consistency. It has minimal dependencies and is distributed under the simplified BSD license, encouraging its use in both academic and commercial settings. Source code, binaries, and documentation can be downloaded from http://scikit-learn.sourceforge.net.

47,974 citations

Posted Content•

Scikit-learn: Machine Learning in Python

Fabian Pedregosa¹, Gaël Varoquaux¹, Alexandre Gramfort¹, Vincent Michel¹, Bertrand Thirion¹, Olivier Grisel, Mathieu Blondel, Andreas Müller², Joel Nothman, Gilles Louppe², Peter Prettenhofer, Ron Weiss, Vincent Dubourg, Jake Vanderplas, Alexandre Passos, David Cournapeau, Matthieu Brucher, Matthieu Perrot, Edouard Duchesnay•Institutions (2)

French Institute for Research in Computer Science and Automation¹, University of Liège²

02 Jan 2012-arXiv: Learning

TL;DR: Scikit-learn as mentioned in this paper is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems.

28,898 citations

Book•

World Cancer Report

B. W. Stewart, Paul Kleihues

01 Apr 2003

3,950 citations

Book Chapter•DOI•

Stochastic Gradient Descent Tricks

Léon Bottou¹•Institutions (1)

Microsoft¹

01 Jan 2012

TL;DR: This chapter provides background material, explains why SGD is a good learning algorithm when the training set is large, and provides useful recommendations.

Abstract: Chapter 1 strongly advocates the stochastic back-propagation method to train neural networks. This is in fact an instance of a more general technique called stochastic gradient descent (SGD). This chapter provides background material, explains why SGD is a good learning algorithm when the training set is large, and provides useful recommendations.

1,666 citations

"An effective hybridized classifier ..." refers background in this paper

...Donald and Robert [5] conducted a classification on the data set of dense canopy pine plantation....

Journal Article•DOI•

Breast Cancer: Magnitude of the Problem and Descriptive Epidemiology

Jennifer L. Kelsey¹, Pamela L. Horn-Ross•Institutions (1)

Stanford University¹

01 Jan 1993-Epidemiologic Reviews

TL;DR: The most notable characteristic of the descriptive epidemiology of breast cancer in recent years is perhaps the rapidly increasing incidence rates in developing countries.

Abstract: Breast cancer is the most common cancer among women in the United States. Knowledge of the descriptive epidemiology of breast cancer is useful both in suggesting etiologic hypotheses and, if preventive measures can be identified, in delineating high-risk groups to be targeted for preventive efforts. Demographic risk factors include increasing age (in Western countries), being white for breast cancer diagnosed at age 45 years or more, being black for breast cancer diagnosed at less than 40 years of age, high socioeconomic status, having never married, being of the Jewish faith, urban residence, and residence in the northern (as compared with the southern) United States. Incidence rates are generally highest in North American and Northern European countries, intermediate in Southern and Eastern European and South American countries, and lowest in Asia and Africa. The most notable characteristic of the descriptive epidemiology of breast cancer in recent years is perhaps the rapidly increasing incidence rates in developing countries. Identification of specific reasons for these increasing rates would contribute substantially to our understanding of the epidemiology of breast cancer.

414 citations