A clustering approach for autistic trait classification.

TL;DR: A new semi-supervised ML framework called Clustering-based Autistic Trait Classification (CATC) is proposed. It applies a clustering technique, validates the resulting classifiers using classification techniques, and identifies potential autism cases based on their similarity traits rather than the scoring function used by many ASD screening tools.
Abstract: Machine learning (ML) techniques can be utilized by physicians, clinicians, as well as other users, to discover Autism Spectrum Disorder (ASD) symptoms based on historical cases and controls to enhance autism screening efficiency and accuracy. The aim of this study is to improve the performance of detecting ASD traits by reducing data dimensionality and eliminating redundancy in the autism dataset. To achieve this, a new semi-supervised ML framework approach called Clustering-based Autistic Trait Classification (CATC) is proposed that uses a clustering technique and that validates classifiers using classification techniques. The proposed method identifies potential autism cases based on their similarity traits as opposed to a scoring function used by many ASD screening tools. Empirical results on different datasets involving children, adolescents, and adults were verified and compared to other common machine learning classification techniques. The results showed that CATC offers classifiers with higher predictive accuracy, sensitivity, and specificity rates than those of other intelligent classification approaches such as Artificial Neural Network (ANN), Random Forest, Random Trees, and Rule Induction. These classifiers are useful as they are exploited by diagnosticians and other stakeholders involved in ASD screening.

Summary (3 min read)

1: Introduction

  • Autism Spectrum Disorder (ASD) is a neurodevelopmental condition that contributes to the delay of social and communication behaviors of individuals.
  • The official diagnosis process of ASD involves multiple examinations, which in turn cause the waiting time for patients to be lengthy [40].
  • Most of these screening methods have been developed using existing clinical autism diagnosis methods and are represented as questionnaires in which each question is associated with a few possible answers in a multiple-choice fashion.
  • The screening of ASD traits can be considered a classification problem in which historical data that have been already classified with and without ASD traits is utilized as an input to construct a classification system.
  • Thus, applying clustering at the pre-processing phase will enhance the predictability of the classification algorithm and improve the classifier's accuracy, sensitivity, specificity, and error rates, among others.

2: Literature Review

  • Crane et al. [17] highlighted some of the challenges to a timely and adequate ASD diagnosis, including the inadequacy of the tools used to aid ASD screening.
  • Thabtah et al. [41] improved the efficiency of the screening process by reducing the number of items in the self-assessment screening tool called AQ-10 [3].
  • The authors applied their datasets to Random Forests classifiers.
  • The author also pointed out that while the studies showed promising results, none were embedded in a screening tool.
  • The authors were able to prove that only ten items can be used for screening first level ASD traits.

3: The Proposed Clustering based Autistic Trait Classification (CATC)

  • The authors discuss the proposed CATC method based on the architecture shown in Figure 1 below.
  • Three datasets (adult, adolescent, and child) are collected via a mobile screening app called ASDTests [37, 38].
  • The data are then cleaned for the experiments and run through an unsupervised machine learning clustering algorithm.
  • The result of this process is used as their initial model that is loaded to a classifier for the predictive phase.
  • Further details for each of the steps are outlined in the subsections that follow.

3.1: Data Collection

  • Initially, data is collected using a mobile screening tool called ASDTests [37, 38].
  • The child, adolescent and adult datasets that have been collected contain instances for individuals between 4-11 years old, 12-16 years old and above 16 years respectively.
  • Based on [3], a score of 6 or above indicates that the individual has some ASD traits, and the class label is set to YES.
  • Otherwise, the class is given a value of NO.
  • The size of the datasets varies between the three groups.
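The labelling rule described above can be sketched as a short function. This is only an illustration: the function name, the assumption that there are ten items each scored 0 or 1, and the sample answers are hypothetical and not taken from the ASDTests data. Only the cut-off of 6 and the YES/NO labels come from the text.

```python
from typing import Sequence

# Cut-off stated in the text: a score of 6 or above maps to the class YES.
SCORE_CUTOFF = 6


def label_from_items(item_scores: Sequence[int], cutoff: int = SCORE_CUTOFF) -> str:
    """Sum binary item scores and apply the cut-off rule (hypothetical helper)."""
    if any(s not in (0, 1) for s in item_scores):
        raise ValueError("each item score must be 0 or 1")
    total = sum(item_scores)
    return "YES" if total >= cutoff else "NO"


# Made-up answers to ten items, for illustration only.
print(label_from_items([1, 1, 1, 0, 1, 1, 0, 1, 0, 0]))  # total is 6
print(label_from_items([1, 0, 1, 0, 1, 0, 0, 1, 0, 0]))  # total is 4
```

This is exactly the kind of plain summation that CATC is meant to complement: the rule looks only at the total and ignores any relationship between individual items and the class.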

Ethical Considerations

  • The data have been published and made public [25] by their respective authors, Thabtah et al. [40].
  • The authors of the datasets had obtained ethical approval from the University of Huddersfield, Huddersfield, UK.

3.2: The initial Dataset and Data Transformation

  • The initial datasets are multivariate, with categorical, continuous and binary attributes, and contain a total of 23 features (see Table 2).
  • A "slightly disagree" or "definitely disagree" had a score of "1" on all remaining questions.
  • The authors modified the dataset to include only 18 attributes by removing features marked 16-22 in Table 2 below in the three datasets.
  • The said features are general questions regarding the user and the app.
  • The "Screening Score" (Feature #19 in Table.

3.3: Clustering Phase

  • The datasets are pre-processed by applying an unsupervised machine learning clustering method.
  • The authors employ the OMCOKE algorithm which groups all items into two clusters.
  • The centroids are recomputed and the process is repeated until there is no movement or change in the assignment of data points to their closest centroid.
  • Algorithm 1 below summarizes the OMCOKE clustering.
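The assign-and-recompute loop described above can be illustrated with a minimal sketch. Note that this is plain two-cluster k-means, swapped in for illustration; it is not the OMCOKE algorithm, whose details are not given in this summary. The function name, the random initialisation, and the toy points are assumptions.

```python
import math
import random


def two_cluster_kmeans(points, max_iter=100, seed=0):
    """Plain k-means with k=2 (a stand-in for OMCOKE, not OMCOKE itself)."""
    rng = random.Random(seed)
    # Assumption: start from two randomly chosen data points.
    centroids = [list(p) for p in rng.sample(list(points), 2)]
    assignment = [-1] * len(points)
    for _ in range(max_iter):
        # Assign every point to its closest centroid.
        new_assignment = [
            min((0, 1), key=lambda k: math.dist(p, centroids[k])) for p in points
        ]
        # Stop when no point changes cluster.
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # Recompute each centroid as the mean of its members.
        for k in (0, 1):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# Toy data with two well-separated groups (made up for illustration).
toy = [[0, 0], [0, 1], [1, 0], [9, 9], [9, 10], [10, 9]]
labels, centers = two_cluster_kmeans(toy)
print(labels)
print(centers)
```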

3.4: Clustering Phase

  • The datasets contain a Boolean attribute named "Class" that has a value of YES/NO based on a score.
  • This attribute Class is used to assess whether the user has been screened to have ASD or not and is used in the supervised learning algorithm for their predictions.
  • These assignments are then compared to the attribute Class to see if they match.
  • Where there is a match the authors keep that instance, otherwise they discard it and remove it from the dataset.
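The match-and-discard step above can be sketched as follows. The summary does not say how a cluster is tied to YES or NO, so this sketch assumes each cluster takes the most common Class value among its members; that mapping, the function name, and the sample data are all hypothetical.

```python
from collections import Counter


def keep_matching_instances(rows, class_labels, cluster_ids):
    """Keep instances whose cluster agrees with their Class attribute."""
    # Assumption: a cluster "means" the majority Class value of its members.
    votes = {}
    for label, cid in zip(class_labels, cluster_ids):
        votes.setdefault(cid, Counter())[label] += 1
    cluster_to_label = {cid: c.most_common(1)[0][0] for cid, c in votes.items()}

    # Keep an instance only when its cluster's label matches its Class value.
    kept = [
        (row, label)
        for row, label, cid in zip(rows, class_labels, cluster_ids)
        if cluster_to_label[cid] == label
    ]
    return kept, cluster_to_label


# Made-up example: instance "c" sits in a mostly-YES cluster but is labelled NO.
rows = ["a", "b", "c", "d", "e"]
classes = ["YES", "YES", "NO", "NO", "NO"]
clusters = [0, 0, 0, 1, 1]
kept, mapping = keep_matching_instances(rows, classes, clusters)
print(mapping)
print([row for row, _ in kept])
```

The filtered instances are what would then be passed to the supervised learner in place of the original dataset.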

Key features of applying the CATC process include:

  • (1) Grouping the data items into two clusters based on their strong attributes.
  • The clustering algorithm has assisted in identifying relevant and strong features that were only used in the supervised learning models.
  • (2) Reducing data dimensionality by eliminating redundancy.
  • The authors adopt the clustering based autistic traits dataset which has been efficiently streamlined and enhanced to be used in the learning phase in the machine learning process.
  • Assume the following simple dataset represented in figure 4 below as their original data.

4.1: Experimental Settings

  • The authors' experiments are conducted on real-life ASD screening datasets to measure the effectiveness of the enhanced screening data used to identify and predict diagnosis.
  • The three datasets (adult, adolescent, and child) have a wide diversity in ethnicity, language, and age group and are all in the application domain of the study, making them suitable for use as benchmarks.
  • The authors utilized a number of evaluation measures to show the benefits and negatives of the proposed algorithm when compared with other classification algorithms in ML.
  • For ML predictive models, a matrix called the error table, or the confusion matrix, has been adopted.
  • Once this data has been pre-processed, then it is run using the classification algorithms above.
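The evaluation measures named in this section (accuracy, sensitivity, specificity, and error rate) follow directly from the four cells of the confusion matrix. A minimal sketch using the standard definitions; the function name and the counts in the example are made up and are not results from the paper.

```python
def screening_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Standard confusion-matrix measures for a binary (YES/NO) classifier."""
    total = tp + fp + tn + fn
    if total == 0 or (tp + fn) == 0 or (tn + fp) == 0:
        raise ValueError("need at least one actual positive and one actual negative")
    return {
        "accuracy": (tp + tn) / total,      # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),      # true positive rate
        "specificity": tn / (tn + fp),      # true negative rate
        "error_rate": (fp + fn) / total,    # share of all cases misclassified
    }


# Hypothetical counts, for illustration only.
for name, value in screening_metrics(tp=40, fp=5, tn=50, fn=5).items():
    print(f"{name}: {value:.3f}")
```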

4.2: Results and Analysis

  • The experiments were conducted for the three datasets i.e. adult, adolescent, and child.
  • No significant change is noted in the ANN method.
  • This shows overall better accuracy and lower error rates for all datasets including those that have large numbers of instances, i.e. adult dataset, and those with a lower number of instances, i.e. the adolescent dataset.
  • These cases tend to confuse the learning algorithm in the classification process hence causing large false positives and false negatives.
  • The specificity rates shown in Figure 7 improved by 2.2%, 0.8%, 4.7% and 12% for the adult dataset on the classifiers RIPPER, PART, Random Forest, and Random Tree respectively when CATC was applied.

Figure 9. ROC Area of the classifiers

  • The authors also note that the number of rules generated while running the three datasets on RIPPER and PART decreases when CATC is applied, as shown in Figure 10.
  • This can be attributed to the fact that redundant rules have been removed in the building of the classifier, because the datasets were pre-processed and clustered based on their strong attributes.
  • Thus, pre-processing with the clustering algorithm has assisted in identifying relevant and strong features that were only used in the supervised learning models.
  • This is useful for diagnosticians as fewer rules could mean a reduced amount of time needed in the screening of autism patients.

5: Conclusion

  • The utilization of clustering and classification together as a semi-supervised learning is rare in autism screening research.
  • The authors proposed a method that utilizes both clustering and classification in autism screening, a first that they are aware of.
  • (4) Clustering the data before application in the learning phase streamlined the data based on only strong features resulting in reduced number of rules generated by the classifiers.
  • The datasets were limited in size and the adult dataset was slightly imbalanced.
  • In conclusion, the paper shows that employing CATC in the screening phase significantly improved the performance of the classifiers in all measures, especially the accuracy and sensitivity rates.

University of Huddersfield Repository
Baadel, Said, Thabtah, Fadi and Lu, Joan
A Clustering Approach for Autism based Autistic Trait Classification
Original Citation
Baadel, Said, Thabtah, Fadi and Lu, Joan (2019) A Clustering Approach for Autism based Autistic
Trait Classification. Informatics for Health and Social Care, 45 (3). pp. 309-326. ISSN 1753-8165
This version is available at http://eprints.hud.ac.uk/id/eprint/35055/

A Clustering Approach for Autism based Autistic Trait Classification
Said Baadel 1,2 *, Fadi Thabtah 3, Joan Lu 1
1. Faculty of Engineering and Computing Science, University of Huddersfield, Huddersfield, UK.
2. Faculty of Communication, Arts and Sciences, Canadian University Dubai, Dubai, UAE
3. Dept of Digital Technologies, Manukau Institute of Technology, Manukau, New Zealand
* s.baadel@gmail.com

Machine learning (ML) techniques can be utilized by physicians, clinicians, as well as other
users, to discover Autism Spectrum Disorder (ASD) symptoms based on historical cases
and controls to enhance autism screening efficiency and accuracy. The aim of this study is
to improve the performance of detecting ASD traits by reducing data dimensionality and
eliminating redundancy in the autism dataset. To achieve this, a new semi-supervised ML
framework approach called Clustering-based Autistic Trait Classification (CATC) is
proposed that uses a clustering technique and validates the resulting classifiers using
classification techniques. The proposed method identifies potential autism cases based on
their similarity traits as opposed to a scoring function used by many ASD screening tools.
Empirical results on different datasets involving children, adolescents, and adults were
verified and compared to other common machine learning classification techniques. The
results showed that CATC offers classifiers with higher predictive accuracy, sensitivity, and
specificity rates than those of other intelligent classification approaches such as Artificial
Neural Network (ANN), Random Forest, Random Trees, and Rule Induction. These
classifiers are useful as they are exploited by diagnosticians and other stakeholders involved
in ASD screening.
Keywords: Autism Diagnosis; Classification; Clustering; Machine Learning; OMCOKE;
Predictive Models
1: Introduction
Autism Spectrum Disorder (ASD) is a neurodevelopmental condition that contributes to the
delay of social and communication behaviors of individuals [8, 10]. Typically, ASD diagnosis is
done by clinicians in a clinical setup using observable behavioral indicators in a process
referred to as clinical judgment (CJ) [37, 45]. The official diagnosis process of ASD involves
multiple examinations, which in turn cause the waiting time for patients to be lengthy [40]. For
instance, the waiting time for an ASD diagnosis in the UK averages over 3 years [16]. Therefore, it
is vital that the administration time needed for both screening and diagnosis be reduced to cater
for the growing number of ASD patients [25, 27].
Autism screening is a fundamental step that addresses whether individuals exhibit
potential autistic traits related to communication, social or repeated behaviour [1]. This step is
crucial as the individual and the concerned family become aware of the possibility of ASD traits
early and hence can search for the needed formal assessments. There are many ASD screening
tools developed by researchers such as Autism Spectrum Quotient (AQ) and Childhood Autism
Rating Scale (CARS) [6, 7, 24, 38]. Most of these screening methods have been developed using
existing clinical autism diagnosis methods and are represented as questionnaires in which each
question is associated with a few possible answers in a multiple-choice fashion. The
questionnaires used contain measurable indicators (variables/questions) that address
communication, behavior and social skills of individuals. For example, the Child Behavior
Checklist (CBCL) screening method contains more than 100 questions [2], and the AQ method
contains 50 questions [7]. These make the process of screening lengthy as well as inaccessible, as
most existing screening methods are not available on easily accessible platforms such as
mobile [40, 41].
Most of the existing autism screening methods utilize scoring functions that compute a
final score based on the answers given by users undergoing the screening (caregivers, parents,
medical staff, teachers or even the adult patients). To be specific, the screening methods take
the answers given in the questionnaire as an input for the scoring function, which in turn
processes the input and computes a final score to reflect whether the individual is associated
with ASD traits. For instance, in the AQ method, a score larger than 32 is an indication of
autistic traits [4, 7]. Therefore, the final decision of having ASD traits lies solely on the score
calculated by the function. This function in most cases just sums up the behavioural indicators'
answers and does not attempt to seek correlations among these indicators and the target class
(ASD traits).
To address these shortcomings, there is a need for intelligent methods that can replace
the scoring function and improve the efficiency of the screening. Since ASD screening involves
forecasting whether individuals have the possibility of ASD traits based on a predefined
characterized variable, this issue can be treated as a predictive analysis problem in ML. The
screening of ASD traits can be considered a classification problem in which historical data that
have already been classified with and without ASD traits are utilized as an input to construct a
classification system. This system is then used to guess whether a new individual exhibits any
autistic traits. ML can be utilized for ASD screening to improve the classification of the
screening and to reduce the screening time. More importantly, ML may provide models that
contain useful information about ASD traits for diagnosticians, especially the correlation
among behavioral indicators and how they relate to ASD screening. ML techniques use artificial
intelligence and statistics to create intelligent models by discovering hidden patterns in data, so
users can improve decisions [41].
There have been recent attempts to adopt ML techniques in autism screening and
diagnosis, e.g. [1, 9, 11, 15, 25, 37, 40]. These studies focused primarily on improving time and
accuracy and on reducing the dimensionality of the dataset by pinpointing influential autistic
symptoms. Thabtah et al. [41] proposed a new feature selection method called Variable Analysis
(Va) to determine the most influential features related to ASD based on datasets related to adults,
adolescents, and children. The authors were able to minimize the number of features to 5-7 based
on predictive analysis and filter methods. Abbas et al. [1] used Random Forest to improve the
diagnosis process of autism, and Levy et al. [25] compared 17 different classification-based ML
algorithms to seek improvements on the diagnosis performance of autism for children.
In this paper, we propose a new semi-supervised learning method called Clustering
based Autistic Trait Classification (CATC), to improve the accuracy of the autism screening
problem. The utilization of clustering and classification together as a semi-supervised learning is
rare in autism screening research. Unlike existing methods that primarily focused on the
classification phase of cases and controls, we intend to utilize clustering with classification to

Citations
Journal ArticleDOI
TL;DR: This study investigates and analyzes up-to-date studies on machine learning methods for feature selection and classification of autism and recommends methods to enhance machine learning’s speedy execution for processing complex data for conceptualization and implementation in ASD diagnostic research.
Abstract: Autism Spectrum Disorder (ASD), according to DSM-5 in the American Psychiatric Association, is a neurodevelopmental disorder that includes deficits of social communication and social interaction with the presence of restricted and repetitive behaviors. Children with ASD have difficulties in joint attention and social reciprocity, using non-verbal and verbal behavior for communication. Due to these deficits, children with autism are often socially isolated. Researchers have emphasized the importance of early identification and early intervention to improve the level of functioning in language, communication, and well-being of children with autism. However, due to limited local assessment tools to diagnose these children, limited speech-language therapy services in rural areas, etc., these children do not get the rehabilitation they need until they get into compulsory schooling at the age of seven years old. Hence, efficient approaches towards early identification and intervention through speedy diagnostic procedures for ASD are required. In recent years, advanced technologies like machine learning have been used to analyze and investigate ASD to improve diagnostic accuracy, time, and quality without complexity. These machine learning methods include artificial neural networks, support vector machines, a priori algorithms, and decision trees, most of which have been applied to datasets connected with autism to construct predictive models. Meanwhile, the selection of features remains an essential task before developing a predictive model for ASD classification. This review mainly investigates and analyzes up-to-date studies on machine learning methods for feature selection and classification of ASD. We recommend methods to enhance machine learning’s speedy execution for processing complex data for conceptualization and implementation in ASD diagnostic research. 
This study can significantly benefit future research in autism using a machine learning approach for feature selection, classification, and processing imbalanced data.

52 citations

Journal ArticleDOI
TL;DR: In this paper, an improved transfer-learning-based autism face recognition framework is proposed to identify kids with ASD in the early stages more precisely, and the improved MobileNet-V1 model showed the highest accuracy (92.10%) for k = 2 autism subtypes.
Abstract: Autism spectrum disorder (ASD) is a complex neuro-developmental disorder that affects social skills, language, speech and communication. Early detection of ASD individuals, especially children, could help to devise and strategize right therapeutic plan at right time. Human faces encode important markers that can be used to identify ASD by analyzing facial features, eye contact, and so on. In this work, an improved transfer-learning-based autism face recognition framework is proposed to identify kids with ASD in the early stages more precisely. Therefore, we have collected face images of children with ASD from the Kaggle data repository, and various machine learning and deep learning classifiers and other transfer-learning-based pre-trained models were applied. We observed that our improved MobileNet-V1 model demonstrates the best accuracy of 90.67% and the lowest 9.33% value of both fall-out and miss rate compared to the other classifiers and pre-trained models. Furthermore, this classifier is used to identify different ASD groups investigating only autism image data using k-means clustering technique. Thus, the improved MobileNet-V1 model showed the highest accuracy (92.10%) for k = 2 autism sub-types. We hope this model will be useful for physicians to detect autistic children more explicitly at the early stage.

23 citations

Book ChapterDOI
17 Sep 2021
TL;DR: In this article, a machine learning model was proposed to generate autism subtypes and identify discriminatory factors among them, which achieved the highest results for primary dataset and achieved the greatest results for ROS and SMOTENC datasets.
Abstract: Autism spectrum disorder (ASD) is a neuro-developmental disease that has a lifetime impact on a person’s ability to interact and communicate with others. Early discovery of autism can assist to prepare a plan for suitable therapy and reduce its impact on patients at an appropriate time. The aim of this work is to propose a machine learning model which generates autism subtypes and identifies discriminatory factors among them. In this work, we use Quantitative Checklist for Autism in Toddlers-10 (Q-CHAT-10) of toddler and Autism Spectrum Quotient-10 (AQ-10) datasets of child, adolescent, and adult screening datasets respectively. Then, only autism records are merged and implemented k-means algorithm to extract various autism subtypes. According to Silhoutte score, we select the best autism dataset and balance its subtypes using random oversampling (ROS) and synthetic minority oversampling technique for numeric and categorical values (SMOTENC). Afterwards, various classifiers are employed into both primary dataset and its balanced subtypes. In this work, logistic regression shows the highest result for primary dataset. Also, it achieves the greatest results for ROS and SMOTENC datasets. Hence, shapely adaptive explanation (SHAP) technique is used to rank features and scrutinized discriminatory factors of these autism subtypes.

14 citations

Journal ArticleDOI
TL;DR: In this article, the authors systematically reviewed recent articles on the application of ML in the behavioral assessment of ASD, and highlighted common challenges in the studies, and proposed vital considerations for real-life implementation of ML-based ASD screening and diagnostic systems.
Abstract: Autism spectrum disorder (ASD) is associated with significant social, communication, and behavioral challenges. The insufficient number of trained clinicians coupled with limited accessibility to quick and accurate diagnostic tools resulted in overlooking early symptoms of ASD in children around the world. Several studies have utilized behavioral data in developing and evaluating the performance of machine learning (ML) models toward quick and intelligent ASD assessment systems. However, despite the good evaluation metrics achieved by the ML models, there is not enough evidence on the readiness of the models for clinical use. Specifically, none of the existing studies reported the real-life application of the ML-based models. This might be related to numerous challenges associated with the data-centric techniques utilized and their misalignment with the conceptual basis upon which professionals diagnose ASD. The present work systematically reviewed recent articles on the application of ML in the behavioral assessment of ASD, and highlighted common challenges in the studies, and proposed vital considerations for real-life implementation of ML-based ASD screening and diagnostic systems. This review will serve as a guide for researchers, neuropsychiatrists, psychologists, and relevant stakeholders on the advances in ASD screening and diagnosis using ML.

14 citations

References
Journal ArticleDOI
01 Oct 2001
TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
Abstract: Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International conference, aaa, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.

79,257 citations

Book
15 Oct 1992
TL;DR: A complete guide to the C4.5 system as implemented in C for the UNIX environment, which starts from simple core learning methods and shows how they can be elaborated and extended to deal with typical problems such as missing data and over hitting.
Abstract: From the Publisher: Classifier systems play a major role in machine learning and knowledge-based systems, and Ross Quinlan's work on ID3 and C4.5 is widely acknowledged to have made some of the most significant contributions to their development. This book is a complete guide to the C4.5 system as implemented in C for the UNIX environment. It contains a comprehensive guide to the system's use , the source code (about 8,800 lines), and implementation notes. The source code and sample datasets are also available on a 3.5-inch floppy diskette for a Sun workstation. C4.5 starts with large sets of cases belonging to known classes. The cases, described by any mixture of nominal and numeric properties, are scrutinized for patterns that allow the classes to be reliably discriminated. These patterns are then expressed as models, in the form of decision trees or sets of if-then rules, that can be used to classify new cases, with emphasis on making the models understandable as well as accurate. The system has been applied successfully to tasks involving tens of thousands of cases described by hundreds of properties. The book starts from simple core learning methods and shows how they can be elaborated and extended to deal with typical problems such as missing data and over hitting. Advantages and disadvantages of the C4.5 approach are discussed and illustrated with several case studies. This book and software should be of interest to developers of classification-based intelligent systems and to students in machine learning and expert systems courses.

21,674 citations


"A clustering approach for autistic ..." refers background in this paper

  • ...5.(36,37) The results analysis showed that VA selected influential features for the three datasets (6, 8, and 8 respectively) without compromising on the specificity, sensitivity, and prediction accuracies measurements....

    [...]

Journal ArticleDOI
TL;DR: Algorithm sensitivities and specificities for autism and PD DNOS relative to nonspectrum disorders were excellent, with moderate differentiation of autism from PDDNOS.
Abstract: The Autism Diagnostic Observation Schedule-Generic (ADOS-G) is a semistructured, standardized assessment of social interaction, communication, play, and imaginative use of materials for individuals suspected of having autism spectrum disorders. The observational schedule consists of four 30-minute modules, each designed to be administered to different individuals according to their level of expressive language. Psychometric data are presented for 223 children and adults with Autistic Disorder (autism), Pervasive Developmental Disorder Not Otherwise Specified (PDDNOS) or nonspectrum diagnoses. Within each module, diagnostic groups were equivalent on expressive language level. Results indicate substantial interrater and test-retest reliability for individual items, excellent interrater reliability within domains and excellent internal consistency. Comparisons of means indicated consistent differentiation of autism and PDDNOS from nonspectrum individuals, with some, but less consistent, differentiation of autism from PDDNOS. A priori operationalization of DSM-IV/ICD-10 criteria, factor analyses, and ROC curves were used to generate diagnostic algorithms with thresholds set for autism and broader autism spectrum/PDD. Algorithm sensitivities and specificities for autism and PDDNOS relative to nonspectrum disorders were excellent, with moderate differentiation of autism from PDDNOS.

7,012 citations

Journal ArticleDOI
01 Mar 2002
TL;DR: This presentation discusses the design and implementation of machine learning algorithms in Java, as well as some of the techniques used to develop and implement these algorithms.
Abstract: 1. What's It All About? 2. Input: Concepts, Instances, Attributes 3. Output: Knowledge Representation 4. Algorithms: The Basic Methods 5. Credibility: Evaluating What's Been Learned 6. Implementations: Real Machine Learning Schemes 7. Moving On: Engineering The Input And Output 8. Nuts And Bolts: Machine Learning Algorithms In Java 9. Looking Forward

5,936 citations

Frequently Asked Questions (14)
Q1. What contributions have the authors mentioned in the paper "A clustering approach for autism based autistic trait classification" ?

Their future work will be to build a mobile screening app that will embed their clustering algorithm to assist clinicians in the diagnosis process of ASD in a clinical setting by considering wider options of diagnosis methods. 

On the adolescent dataset, when CATC was applied, the percentage increments of the RIPPER, PART, Random Forest, and Random Tree classifiers are 21.2%, 0.5%, 2.4%, and 11.5% respectively.

Classification algorithms are generally divided into a two-step process where the dataset is divided into training data and testing data.

The authors adopted the RIPPER [17], PART [21], Random Forest [13], Random Trees [19], and Artificial Neural Network (ANN) [45] algorithms to process the considered autism datasets with and without clustering.

In addition, PART classifier's sensitivity rate went up by 0.9%, 6.9% and 7.5% on the adult, adolescent, and child respectively, when CATC was integrated. 

Since ASD screening involves forecasting whether individuals have the possibility of ASD traits based on a predefined characterized variable, this issue can be treated as a predictive analysis problem in ML.

They concluded that SVM and logistic regression performed best with ROC of 93% and 92% respectively and logistic regression and Lasso performed best on module 3 with a ROC of 93%. 

They concluded that function based algorithms such as regression models performed better with high classification accuracy compared to the decision tree based algorithms such as Random Forest. 

The researchers used the support vector machine algorithm and could predict the ASD diagnosis with a classification accuracy of 79%. 

In addition, PART predictive accuracy has improved by 0.8%, 3.8% and 5.5% on the three datasets respectively when CATC was applied. 

In conclusion, the paper shows that employing CATC in the screening phase significantly improved the performance of the classifiers in all measures, especially the accuracy and sensitivity rates.

a good predictor of the model performance would be the true positive rate (sensitivity) and the true negative rate (specificity). 

Their results suggest that combining the video and questionnaire into a single assessment boosted the sensitivity and specificity rates and overall performance of the study sample.