Investigations on Impact of Feature Normalization Techniques on Classifier's Performance in Breast Tumor Classification

doi:10.5120/20443-2793

Journal Article•DOI•

Bikesh Kumar Singh, Kesari Verma, A. S. Thoke

22 Apr 2015-International Journal of Computer Applications (Foundation of Computer Science (FCS))-Vol. 116, Iss: 19, pp 11-15

TL;DR: This paper evaluates several popular feature normalization techniques and their impact on classifier performance for breast tumor classification using ultrasound images, and shows that feature normalization has a significant effect on classification accuracy.

Abstract: Feature extraction and feature normalization are important preprocessing steps, usually employed before classification. Feature normalization restricts the values of all features to predetermined ranges. However, the choice of normalization technique and normalization range is an important issue, since applying normalization to the input can change the structure of the data and thereby affect the outcome of multivariate analysis and calibration used in data mining and pattern recognition problems. This paper investigates and evaluates some popular feature normalization techniques and studies their impact on classifier performance, with application to breast tumor classification using ultrasound images. Back-propagation artificial neural network (BPANN) and support vector machine (SVM) classifier models are used to evaluate the normalization techniques. Results show that normalization of features has a significant effect on classification accuracy.

General Terms: Pattern Recognition, Medical Image Processing.
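
The abstract does not name the normalization techniques it compares. As a minimal sketch, assuming two commonly used ones (min-max scaling to a chosen range and z-score standardization), restricting a single feature to a predetermined range might look like this. The function names and the sample values are illustrative and do not come from the paper.

```python
from statistics import mean, pstdev


def min_max_scale(values, lo=0.0, hi=1.0):
    """Linearly map values onto [lo, hi]; a constant feature maps to lo."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        return [lo for _ in values]
    span = v_max - v_min
    return [lo + (v - v_min) * (hi - lo) / span for v in values]


def z_score(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical values of one feature across four samples.
feature = [2.0, 4.0, 6.0, 10.0]

scaled = min_max_scale(feature)                 # range [0, 1]
scaled_sym = min_max_scale(feature, -1.0, 1.0)  # range [-1, 1]
standardized = z_score(feature)
```

The two min-max calls differ only in the target range, which is the "normalization range" choice the abstract refers to. In a train/test setting, the minimum, maximum, mean, and standard deviation would normally be computed on the training data and then reused for the test data.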

Citations

Journal Article•DOI•

Automatic detection of Parkinson’s disease based on acoustic analysis of speech

Diogo Braga¹, Ana Madureira¹, Luís Pinto Coelho¹, Reuel Ajith²•Institutions (2)

International Student Exchange Programs¹, Vilnius University²

01 Jan 2019-Engineering Applications of Artificial Intelligence

TL;DR: The results reveal the potential in using Random Forest (RF) or Support Vector Machine (SVM) techniques for estimating the presence of PD with a very high accuracy.

93 citations

Journal Article•DOI•

Massive LMS log data analysis for the early prediction of course-agnostic student performance

Moises Riestra-González¹, Maria del Puerto Paule-Ruíz², Francisco Ortin³, Francisco Ortin²•Institutions (3)

Accenture¹, University of Oviedo², Cork Institute of Technology³

01 Apr 2021-Computers in Education

TL;DR: This work uses machine learning to create models for the early prediction of students’ performance in solving LMS assignments, by just analyzing the LMS log files generated up to the moment of prediction, and detects at-risk, fail and excellent students in the early stages of the course.

Abstract: The early prediction of students' performance is a valuable resource to improve their learning. If we are able to detect at-risk students in the initial stages of the course, we will have more time to improve their performance. Likewise, excellent students could be motivated with customized additional activities. This is why there are research works aimed at the early detection of students' performance. Some of them try to achieve it with the analysis of LMS log files, which store information about student interaction with the LMS. Many works create predictive models with the log files generated for the whole course, but those models are not useful for early prediction because the actual log information used for predicting is different from the one used to train the models. Other works do create predictive models with the log information retrieved at the early stages of courses, but they are just focused on a particular type of course. In this work, we use machine learning to create models for the early prediction of students' performance in solving LMS assignments, by just analyzing the LMS log files generated up to the moment of prediction. Moreover, our models are course agnostic, because the datasets are created with all the University of Oviedo courses for one academic year. We predict students' performance at 10%, 25%, 33% and 50% of the course length. Our objective is not to predict the exact student's mark in LMS assignments, but to detect at-risk, fail and excellent students in the early stages of the course. That is why we create different classification models for each of those three student groups. Decision tree, naive Bayes, logistic regression, multilayer perceptron (MLP) neural network, and support vector machine models are created and evaluated. Accuracies of all the models grow as the moment of prediction increases. Although all the algorithms but naive Bayes show accuracy differences lower than 5%, MLP obtains the best performance: from 80.1% accuracy when 10% of the course has been delivered to 90.1% when half of it has taken place. We also discuss the LMS log entries that most influence the students' performance. By using a clustering algorithm, we detect six different clusters of students regarding their interaction with the LMS. Analyzing the interaction patterns of each cluster, we find that those patterns are repeated in all the early stages of the course. Finally, we show how four out of those six student-LMS interaction patterns have a strong correlation with students' performance.

55 citations

Cites methods from "Investigations on Impact of Feature..."

...We normalize all the feature values between 0 and 1, since some classifiers such as MLP and SVM show better performance with normalized features [38]....

Journal Article•DOI•

DWTLSTM for electronic nose signal processing in beef quality monitoring

Dedy Rahman Wijaya¹, Riyanarto Sarno², Enny Zulaika²•Institutions (2)

Telkom University¹, Sepuluh Nopember Institute of Technology²

01 Jan 2021-Sensors and Actuators B-chemical

TL;DR: Results indicate that the DWTLSTM outperforms conventional methods such as k-nearest neighbor (k-NN), linear discriminant analysis (LDA), support vector machine/support vector regression (SVM/SVR), multilayer perceptron (MLP), and even standard long-short term memory (LSTM).

Abstract: A smart packaging system is needed to continuously monitor beef quality and microbial population, for both the meat industry and end consumers. Several feasibility studies of the electronic nose (e-nose) for rapid beef quality assessment have also been conducted in recent years. The e-nose is fast, cheap, and easy to use, which makes it suitable and scalable for beef quality monitoring applications. It also has the potential to be integrated with consumer electronics such as refrigerators and meat chillers. However, the inevitable challenge is how to handle time-series data that is contaminated with noise. In this paper, a combination of discrete wavelet transform and long short-term memory (DWTLSTM) is proposed to handle noise-contaminated e-nose signals in beef quality monitoring. In the beef quality classification task, the proposed method performs favorably, with 94.83% average accuracy and 85.05% average F-measure. Moreover, it performs satisfactorily in predicting microbial population (RMSE = 0.0515 and R² = 0.9712). These results indicate that the DWTLSTM outperforms conventional methods such as k-nearest neighbor (k-NN), linear discriminant analysis (LDA), support vector machine/support vector regression (SVM/SVR), multilayer perceptron (MLP), and even standard long short-term memory (LSTM).
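
The abstract does not say which wavelet, decomposition depth, or thresholding rule is used. As a rough sketch of the general idea only, assuming a one-level Haar transform with soft thresholding of the detail coefficients, wavelet denoising of a noisy time series might look like this. All function names, the threshold, and the sample signal are hypothetical.

```python
from math import sqrt


def haar_forward(signal):
    """One-level Haar DWT: pairwise scaled sums (approx) and differences (detail)."""
    s = sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward, interleaving the reconstructed sample pairs."""
    s = sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t; magnitudes below t become zero."""
    return [max(abs(c) - t, 0.0) * (1.0 if c >= 0 else -1.0) for c in coeffs]


def denoise(signal, t):
    """Threshold only the detail coefficients, then reconstruct the signal."""
    if len(signal) % 2:
        raise ValueError("signal length must be even for one Haar level")
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, t))


# Hypothetical sensor readings: two plateaus with small fluctuations.
noisy = [1.0, 1.2, 0.9, 1.1, 5.0, 5.3, 4.8, 5.1]
smooth = denoise(noisy, 1.0)
```

With this threshold every detail coefficient is suppressed, so each pair of samples is replaced by its mean while the step between the two plateaus is kept. In a pipeline like the one the abstract describes, the denoised series would then be passed to the sequence model.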

54 citations

Journal Article•DOI•

Fuzzy cluster based neural network classifier for classifying breast tumors in ultrasound images

Bikesh Kumar Singh¹, Kesari Verma, A. S. Thoke•Institutions (1)

National Institute of Technology, Raipur¹

30 Dec 2016-Expert Systems With Applications

TL;DR: The empirical results suggest that eliminating doubtful training examples can improve the decision making performance of expert systems; the proposed approach shows promising results and needs further evaluation in other applications of expert and intelligent systems.

Highlights: A new classification approach is developed. The proposed approach eliminates ambiguous and doubtful samples from the training dataset. The proposed approach is used to classify breast tumors in ultrasound images. The proposed classifier outperforms conventional ones. The proposed method achieves high classification accuracy.

Abstract: The performance of supervised classification algorithms is highly dependent on the quality of training data. Ambiguous training patterns may misguide the classifier, leading to poor classification performance. Further, the manual exploration of class labels is an expensive and time-consuming process. An automatic method is needed to identify noisy samples in the training data to improve the decision making process. This article presents a new classification technique that combines an unsupervised learning technique (fuzzy c-means clustering (FCM)) and a supervised learning technique (back-propagation artificial neural network (BPANN)) to categorize benign and malignant tumors in breast ultrasound images. Unsupervised learning is employed to identify ambiguous examples in the training data. Experiments were conducted on 178 B-mode breast ultrasound images, containing 88 benign and 90 malignant cases, on the MATLAB software platform. A total of 457 features were extracted from the ultrasound images, followed by feature selection to determine the most significant features. Accuracy, sensitivity, specificity, area under the receiver operating characteristic curve (AUC) and Matthews correlation coefficient (MCC) were used to assess the performance of different classifiers. The results show that the proposed approach achieves a classification accuracy of 95.862% when all 457 features are used for classification. However, the accuracy is reduced to 94.138% when only the 19 most relevant features selected by a multi-criterion feature selection approach are used. The results are discussed in light of some recently reported studies. The empirical results suggest that eliminating doubtful training examples can improve the decision making performance of expert systems. The proposed approach shows promising results and needs further evaluation in other applications of expert and intelligent systems.
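
Four of the performance measures named in this abstract (accuracy, sensitivity, specificity, and MCC) follow directly from the four cells of a binary confusion matrix. The sketch below uses the standard definitions, with malignant assumed to be the positive class. The counts are made up for illustration: they add up to 90 malignant and 88 benign cases, but they are not results from the paper.

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Sensitivity: fraction of actual positives that were detected.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of actual negatives that were correctly rejected.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # MCC: correlation between predicted and true labels, in [-1, 1].
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "mcc": mcc,
    }


# Hypothetical counts (not from the paper): 90 malignant, 88 benign.
metrics = binary_metrics(tp=85, fp=8, tn=80, fn=5)
```

AUC is left out because it needs the classifier's continuous scores rather than a single confusion matrix.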

49 citations

Journal Article•DOI•

Breast Cancer Detection, Segmentation and Classification on Histopathology Images Analysis: A Systematic Review

R. Krithiga¹, P. Geetha¹•Institutions (1)

Anna University¹

01 Jun 2021-Archives of Computational Methods in Engineering

TL;DR: This study starts with an overview of tissue preparation, analysis of stained images, and a prognosis for cancer patients, and the performance of the machine learning and deep learning techniques applied to predict breast cancer recurrence rates is evaluated.

Abstract: Digital pathology represents a major evolution in modern medicine. Pathological examinations constitute the standard in medical protocols and the law, and call for specific action in the diagnostic process. Advances in digital pathology have made it possible for image analysis to take advantage of the information analysis from hematoxylin and eosin stained images. In spite of concern, it is recorded in the majority of breast cancer datasets, which makes research more difficult in prediction. The objective of our work is to evaluate the performance of the machine learning and deep learning techniques applied to predict breast cancer recurrence rates. This study starts with an overview of tissue preparation, analysis of stained images, and a prognosis for cancer patients. The high accuracy results recorded are compromised in terms of sensitivity and specificity. The missing loss function and class imbalance problems are rarely addressed, and most often the chosen performance measures are context-inappropriate. The challenge that presents itself is to analyse whole slide images for the content imaging required with diagnostic biomarkers, and prognosis support backed by digital pathology.

47 citations

References

Journal Article•DOI•

Textural Features for Image Classification

Robert M. Haralick, K. Shanmugam¹, Its'hak Dinstein²•Institutions (2)

Wichita State University¹, University of Kansas²

01 Nov 1973

TL;DR: These results indicate that the easily computable textural features based on gray-tone spatial dependencies probably have a general applicability for a wide variety of image-classification applications.

Abstract: Texture is one of the important characteristics used in identifying objects or regions of interest in an image, whether the image be a photomicrograph, an aerial photograph, or a satellite image. This paper describes some easily computable textural features based on gray-tone spatial dependencies, and illustrates their application in category-identification tasks of three different kinds of image data: photomicrographs of five kinds of sandstones, 1:20 000 panchromatic aerial photographs of eight land-use categories, and Earth Resources Technology Satellite (ERTS) multispectral imagery containing seven land-use categories. We use two kinds of decision rules: one for which the decision regions are convex polyhedra (a piecewise linear decision rule), and one for which the decision regions are rectangular parallelepipeds (a min-max decision rule). In each experiment the data set was divided into two parts, a training set and a test set. Test set identification accuracy is 89 percent for the photomicrographs, 82 percent for the aerial photographic imagery, and 83 percent for the satellite imagery. These results indicate that the easily computable textural features probably have a general applicability for a wide variety of image-classification applications.
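
The "gray-tone spatial dependencies" in this abstract are commonly computed as a gray-level co-occurrence matrix (GLCM): a table of how often gray level i appears next to gray level j at a fixed pixel offset. The sketch below builds a symmetric, normalized GLCM for horizontal neighbors and derives two widely used features from it, contrast and angular second moment. The function names and the toy 4x4 image are illustrative, and only two of the many possible texture features are shown.

```python
def glcm(image, levels, dr=0, dc=1):
    """Symmetric, normalized co-occurrence matrix for pixel offset (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]


def contrast(p):
    """Sum of (i - j)^2 * p[i][j]; large when neighboring levels differ a lot."""
    return sum((i - j) ** 2 * v for i, row in enumerate(p) for j, v in enumerate(row))


def angular_second_moment(p):
    """Sum of p[i][j]^2; large for uniform textures with few distinct pairs."""
    return sum(v * v for row in p for v in row)


# Toy 4x4 image with four gray levels (0..3).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

p = glcm(image, levels=4)
texture_contrast = contrast(p)
texture_asm = angular_second_moment(p)
```

For this image the twelve horizontal neighbor pairs give 24 symmetric counts, so the contrast works out to 14/24 and the angular second moment to 84/576. Other offsets, such as vertical or diagonal neighbors, are obtained by changing `dr` and `dc`.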

20,442 citations

"Investigations on Impact of Feature..." refers background in this paper

...Summary of texture and shape features used in classification of breast tumor [10-23]...

Book•

Digital Image Processing Using MATLAB

Rafael C. Gonzalez, Richard E. Woods, Steven L. Eddins

01 Dec 2003

TL;DR: 1. Fundamentals of Image Processing, 2. Intensity Transformations and Spatial Filtering, and 3. Frequency Domain Processing.

Abstract: 1. Introduction. 2. Fundamentals. 3. Intensity Transformations and Spatial Filtering. 4. Frequency Domain Processing. 5. Image Restoration. 6. Color Image Processing. 7. Wavelets. 8. Image Compression. 9. Morphological Image Processing. 10. Image Segmentation. 11. Representation and Description. 12. Object Recognition.

6,306 citations

Journal Article•DOI•

Statistical and structural approaches to texture

Robert M. Haralick¹•Institutions (1)

Virginia Tech¹

01 Jan 1979

TL;DR: This survey reviews the image processing literature on the various approaches and models investigators have used for texture, including statistical approaches of autocorrelation function, optical transforms, digital transforms, textural edgeness, structural element, gray tone cooccurrence, run lengths, and autoregressive models.

Abstract: In this survey we review the image processing literature on the various approaches and models investigators have used for texture. These include statistical approaches of autocorrelation function, optical transforms, digital transforms, textural edgeness, structural element, gray tone cooccurrence, run lengths, and autoregressive models. We discuss and generalize some structural approaches to texture based on more complex primitives than gray tone. We conclude with some structural-statistical generalizations which apply the statistical techniques to the structural primitives.

5,112 citations

A comparative study of texture measures for terrain classification.

J. S. Weszka, A. Rosenfeld

01 Mar 1975

TL;DR: Three standard approaches to automatic texture classification make use of features based on the Fourier power spectrum, on second-order gray level statistics, and on first-order statistics of gray level differences, respectively; it was found that the Fourier features generally performed more poorly, while the other feature sets all performed comparably.

Abstract: Three standard approaches to automatic texture classification make use of features based on the Fourier power spectrum, on second-order gray level statistics, and on first-order statistics of gray level differences, respectively. Feature sets of these types, all designed analogously, were used to classify two sets of terrain samples. It was found that the Fourier features generally performed more poorly, while the other feature sets all performed comparably.

1,526 citations

"Investigations on Impact of Feature..." refers background in this paper

...Summary of texture and shape features used in classification of breast tumor [10-23]...

Journal Article•DOI•

A Comparative Study of Texture Measures for Terrain Classification

Joan S. Weszka¹, Charles R. Dyer¹, Azriel Rosenfeld¹•Institutions (1)

University of Maryland, College Park¹

01 Apr 1976

TL;DR: In this paper, three standard approaches to automatic texture classification make use of features based on the Fourier power spectrum, on second-order gray level statistics, and on first-order statistics of gray level differences, respectively.

1,379 citations