
Showing papers on "Dysarthria published in 2021"


Journal ArticleDOI
TL;DR: In this article, the authors provide initial recommendations for a standardized way of recording the voice and speech of patients with hypokinetic or hyperkinetic dysarthria; thus allowing clinicians and researchers to reliably collect, acoustically analyze, and compare vocal data across different centers and patient cohorts.
Abstract: Most patients with movement disorders have speech impairments resulting from sensorimotor abnormalities that affect phonatory, articulatory, and prosodic speech subsystems. There is widespread cross-discipline use of speech recordings for diagnostic and research purposes, despite which there are no specific guidelines for a standardized method. This review aims to combine the specific clinical presentations of patients with movement disorders, existing acoustic assessment protocols, and technological advances in capturing speech to provide a basis for future research in this field and to improve the consistency of clinical assessments. We considered 3 areas: the recording environment (room, seating, background noise), the recording process (instrumentation, vocal tasks, elicitation of speech samples), and the acoustic outcome data. Four vocal tasks, namely, sustained vowel, sequential and alternating motion rates, reading passage, and monologues, are integral aspects of motor speech assessment. Fourteen acoustic vocal speech features, including their hypothesized pathomechanisms with regard to typical occurrences in hypokinetic or hyperkinetic dysarthria, are hereby recommended for quantitative exploratory analysis. Using these acoustic features and experimental speech data, we demonstrated that the hyperkinetic dysarthria group had more affected speech dimensions compared with the healthy controls than had the hypokinetic speakers. Several contrasting speech patterns between both dysarthrias were also found. This article is the first attempt to provide initial recommendations for a standardized way of recording the voice and speech of patients with hypokinetic or hyperkinetic dysarthria; thus allowing clinicians and researchers to reliably collect, acoustically analyze, and compare vocal data across different centers and patient cohorts. © 2020 International Parkinson and Movement Disorder Society.
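The recommended protocol builds on quantitative measures computed from recordings of tasks such as the sustained vowel. As a minimal sketch (not the article's 14-feature protocol; the feature names and frame parameters here are illustrative assumptions), the following Python computes two simple measures from a sustained-vowel recording: total phonation duration and frame-level intensity variability.

```python
import numpy as np

def sustained_vowel_features(signal, sr, frame_len=0.025, hop=0.010):
    """Illustrative extraction of two simple vocal measures from a
    sustained-vowel recording: total phonation duration and frame-level
    intensity variability in dB (a rough proxy for loudness instability).
    These are stand-ins, not the article's recommended feature set."""
    n_frame = int(frame_len * sr)
    n_hop = int(hop * sr)
    frames = [signal[i:i + n_frame]
              for i in range(0, len(signal) - n_frame + 1, n_hop)]
    rms = np.array([np.sqrt(np.mean(f ** 2)) for f in frames])
    rms_db = 20 * np.log10(rms + 1e-12)  # frame intensity in dB
    return {
        "duration_s": len(signal) / sr,
        "intensity_sd_db": float(np.std(rms_db)),
    }

# Synthetic 2-second "vowel": a 120 Hz tone with slow amplitude drift
sr = 16000
t = np.arange(2 * sr) / sr
signal = (1 + 0.2 * np.sin(2 * np.pi * 0.5 * t)) * np.sin(2 * np.pi * 120 * t)
feats = sustained_vowel_features(signal, sr)
```

A real assessment pipeline would add pitch- and articulation-based measures on top of such frame-level statistics.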

58 citations


Journal ArticleDOI
TL;DR: The authors used simple speech recording and high-end pattern analysis to provide sensitive and reliable noninvasive biomarkers of prodromal versus manifest α-synucleinopathy in patients with idiopathic rapid eye movement sleep behavior disorder (iRBD) and early-stage Parkinson disease (PD).
Abstract: OBJECTIVE This multilanguage study used simple speech recording and high-end pattern analysis to provide sensitive and reliable noninvasive biomarkers of prodromal versus manifest α-synucleinopathy in patients with idiopathic rapid eye movement sleep behavior disorder (iRBD) and early-stage Parkinson disease (PD). METHODS We performed a multicenter study across the Czech, English, German, French, and Italian languages at 7 centers in Europe and North America. A total of 448 participants (337 males), including 150 with iRBD (mean duration of iRBD across language groups 0.5-3.4 years), 149 with PD (mean duration of disease across language groups 1.7-2.5 years), and 149 healthy controls were recorded; 350 of the participants completed the 12-month follow-up. We developed a fully automated acoustic quantitative assessment approach for the 7 distinctive patterns of hypokinetic dysarthria. RESULTS No differences in language that impacted clinical parkinsonian phenotypes were found. Compared with the controls, we found significant abnormalities of an overall acoustic speech severity measure via composite dysarthria index for both iRBD (p = 0.002) and PD (p < 0.001). However, only PD (p < 0.001) was perceptually distinct in a blinded subjective analysis. We found significant group differences between PD and controls for monopitch (p < 0.001), prolonged pauses (p < 0.001), and imprecise consonants (p = 0.03); only monopitch was able to differentiate iRBD patients from controls (p = 0.004). At the 12-month follow-up, a slight progression of overall acoustic speech impairment was noted for the iRBD (p = 0.04) and PD (p = 0.03) groups. INTERPRETATION Automated speech analysis might provide a useful additional biomarker of parkinsonism for the assessment of disease progression and therapeutic interventions. ANN NEUROL 2021;90:62-75.
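The study's composite dysarthria index is not specified in this excerpt. A generic way to build such an overall acoustic severity measure (an assumption about the construction, not the paper's exact method) is to z-score each acoustic feature against the healthy-control distribution and average:

```python
import numpy as np

def composite_index(patient_features, control_features):
    """Combine several acoustic measures into one severity score by
    z-scoring each feature column against the healthy-control
    distribution and averaging across features. Generic construction;
    the published index may weight or sign features differently."""
    mu = control_features.mean(axis=0)
    sd = control_features.std(axis=0, ddof=1)
    z = (patient_features - mu) / sd
    return z.mean(axis=1)  # one composite score per speaker

rng = np.random.default_rng(0)
controls = rng.normal(0.0, 1.0, size=(30, 7))   # 7 acoustic features
patients = rng.normal(1.5, 1.0, size=(30, 7))   # shifted = more impaired
ci_controls = composite_index(controls, controls)
ci_patients = composite_index(patients, controls)
```

With this construction, group differences like the iRBD and PD effects reported above reduce to comparing the distributions of the composite scores.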

55 citations


Journal ArticleDOI
30 Apr 2021
TL;DR: In this article, a dysarthric-specific ASR system called Speech Vision (SV) is proposed, which uses visual data augmentation techniques and leverages transfer learning to address the data scarcity problem.
Abstract: Dysarthria is a disorder that affects an individual’s speech intelligibility due to the paralysis of muscles and organs involved in the articulation process. As the condition is often associated with physically debilitating disabilities, not only do such individuals face communication problems, but interactions with digital devices can also become a burden. For these individuals, automatic speech recognition (ASR) technologies can make a significant difference in their lives, as computing and portable digital devices can become an interaction medium, enabling them to communicate with others and with computers. However, ASR technologies have performed poorly in recognizing dysarthric speech, especially severe dysarthria, due to multiple challenges facing dysarthric ASR systems. We identified that these challenges arise from the alternation and inaccuracy of dysarthric phonemes, the scarcity of dysarthric speech data, and imprecision in phoneme labeling. This paper reports on our second dysarthric-specific ASR system, called Speech Vision (SV), which tackles these challenges by adopting a novel approach to dysarthric ASR in which speech features are extracted visually; SV then learns to see the shape of the words pronounced by dysarthric individuals. This visual acoustic modeling feature of SV eliminates phoneme-related challenges. To address the data scarcity problem, SV adopts visual data augmentation techniques, generates synthetic dysarthric acoustic visuals, and leverages transfer learning. Benchmarked against the other state-of-the-art dysarthric ASR systems considered in this study, SV outperformed them, improving recognition accuracy for 67% of UA-Speech speakers, with the biggest improvements achieved for severe dysarthria.
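SV's exact visual front end is not detailed in this abstract. A common way to obtain an image-like representation of speech, sketched below with illustrative parameters (`n_fft` and `hop` are assumptions, not the paper's settings), is a normalized log-magnitude spectrogram:

```python
import numpy as np

def spectrogram_image(signal, sr, n_fft=512, hop=160):
    """Turn a waveform into a log-magnitude spectrogram 'image' of the
    kind an image-based recognizer could consume. Parameters are
    illustrative; the paper's front end may differ."""
    window = np.hanning(n_fft)
    frames = [signal[i:i + n_fft] * window
              for i in range(0, len(signal) - n_fft + 1, hop)]
    spec = np.abs(np.fft.rfft(np.array(frames), axis=1))  # (time, freq)
    img = 20 * np.log10(spec + 1e-10)
    # min-max normalize to [0, 1], like a grayscale image
    img = (img - img.min()) / (img.max() - img.min())
    return img.T  # (freq bins, time frames)

sr = 16000
t = np.arange(sr) / sr  # 1 second of audio
img = spectrogram_image(np.sin(2 * np.pi * 440 * t), sr)
```

Standard image augmentations (shifts, masking, synthetic examples) can then be applied to such arrays, which is what makes visual data augmentation applicable to scarce dysarthric speech.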

37 citations


Journal ArticleDOI
TL;DR: A new version of the four-level framework (FLF) of speech sensorimotor control is presented, which further explicates and differentiates between the speech motor planning, programming, and execution levels or phases of processing and identifies the loci and nature of disruption in the motor planning phase that could explain the pathophysiology and core features of AOS.
Abstract: Background: The complexity of speech motor control, and the incomplete conceptualisation of phases in the transformation of the speech code from linguistic symbols to a code amenable to a motor sys...

29 citations


Proceedings ArticleDOI
06 Jun 2021
TL;DR: This paper proposed a data augmentation method using voice conversion that allows dysarthric ASR systems to accurately recognize words outside of the training set vocabulary, and demonstrated that a small amount of disordered speech data can be used to capture the relevant vocal characteristics of a speaker with dysarthria through a parallel voice conversion system.
Abstract: Dysarthria is a condition where people experience a reduction in speech intelligibility due to a neuromotor disorder. Previous works in dysarthric speech recognition have focused on accurate recognition of words encountered in training data. Due to the rarity of dysarthria in the general population, a relatively small amount of publicly-available training data exists for dysarthric speech. The number of unique words in these datasets is small, so ASR systems trained with existing dysarthric speech data are limited to recognition of those words. In this paper, we propose a data augmentation method using voice conversion that allows dysarthric ASR systems to accurately recognize words outside of the training set vocabulary. We demonstrate that a small amount of dysarthric speech data can be used to capture the relevant vocal characteristics of a speaker with dysarthria through a parallel voice conversion system. We show that it’s possible to synthesize utterances of new words that were never recorded by speakers with dysarthria, and that these synthesized utterances can be used to train a dysarthric ASR system.

24 citations


Journal ArticleDOI
TL;DR: In this article, a Residual Network (ResNet)-based technique was proposed to detect dysarthria severity level from short speech segments, which can help improve the performance and applicability of speech-based interactive systems.

21 citations


Journal ArticleDOI
TL;DR: Experimental results showed good classification accuracies for the glottal features, indicating their effectiveness in the intelligibility level assessment in speakers with dysarthria even in the challenging coded condition.

21 citations


Journal ArticleDOI
TL;DR: In this paper, an ensemble of convolutional neural networks (CNNs) was used for the detection of Parkinson's disease from the voice recordings of 50 healthy people and 50 people with PD obtained from PC-GITA, a publicly available database.

20 citations


Journal ArticleDOI
TL;DR: In this paper, the authors examined the acoustic features and feature selection methods that can be used to improve the classification of dysarthric speech based on the severity of the impairment, employing six classification algorithms: Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), Artificial Neural Network (ANN), Classification and Regression Tree (CART), Naive Bayes (NB), and Random Forest (RF).
Abstract: The automatic speech recognition (ASR) system is increasingly being applied as assistive technology in the speech-impaired community, for individuals with physical disabilities such as dysarthric speakers. However, the effectiveness of the ASR system in recognizing dysarthric speech can be disadvantaged by data sparsity, either in the coverage of the language or in the size of the existing speech database, not to mention the severity of the speech impairment. This study examines the acoustic features and feature selection methods that can be used to improve the classification of dysarthric speech based on the severity of the impairment. For the purpose of this study, we incorporated four acoustic feature groups (prosody, spectral, cepstral, and voice quality) and seven feature selection methods: Interaction Capping (ICAP), Conditional Infomax Feature Extraction (CIFE), Conditional Mutual Information Maximization (CMIM), Double Input Symmetrical Relevance (DISR), Joint Mutual Information (JMI), Conditional Redundancy (Condred), and Relief. Further, we engaged six classification algorithms: Support Vector Machine (SVM), Linear Discriminant Analysis (LDA), Artificial Neural Network (ANN), Classification and Regression Tree (CART), Naive Bayes (NB), and Random Forest (RF). The classification accuracy of our experiments ranges from 40.41% to 95.80%.
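The feature selection methods compared in the study (ICAP, CMIM, JMI, etc.) are all filter-style selectors: they score features and keep the top k before any classifier is trained. As a minimal sketch of that idea (using a crude effect-size score rather than the information-theoretic criteria the study evaluates):

```python
import numpy as np

def select_top_k(X, y, k):
    """Filter-style feature selection: score each feature by the
    absolute difference in class means divided by the pooled standard
    deviation (a simple stand-in for mutual-information criteria),
    then keep the k highest-scoring columns."""
    scores = []
    for j in range(X.shape[1]):
        a, b = X[y == 0, j], X[y == 1, j]
        pooled = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2) + 1e-12
        scores.append(abs(a.mean() - b.mean()) / pooled)
    return np.argsort(scores)[::-1][:k]

rng = np.random.default_rng(1)
n = 100
y = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, 10))
X[y == 1, :3] += 2.0          # only the first 3 features are informative
kept = select_top_k(X, y, 3)
```

Any of the six classifiers named above could then be trained on `X[:, kept]`; the study's accuracy spread (40.41% to 95.80%) reflects how strongly results depend on this selector/classifier pairing.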

19 citations


Proceedings ArticleDOI
24 Jan 2021
TL;DR: In this paper, a detailed study on dysarthria severity classification using various deep learning architectural choices, namely deep neural network (DNN), convolutional neural network (CNN), and long short-term memory network (LSTM), is carried out.
Abstract: Dysarthria is a neuro-motor speech disorder that renders speech unintelligible in proportion to its severity. Assessing the severity level of dysarthria, apart from being a diagnostic step to evaluate the patient's improvement, can also aid automatic dysarthric speech recognition systems. In this paper, a detailed study on dysarthria severity classification using various deep learning architectural choices, namely deep neural network (DNN), convolutional neural network (CNN), and long short-term memory network (LSTM), is carried out. Mel frequency cepstral coefficients (MFCCs) and their derivatives are used as features. Performance of these models is compared with a baseline support vector machine (SVM) classifier using the UA-Speech corpus and the TORGO database. The highest classification accuracies of 96.18% and 93.24% are reported for TORGO and UA-Speech respectively. Detailed analysis of the performance of these models shows that a proper choice of deep learning architecture can ensure better performance than the conventionally used SVM classifier.
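The "derivatives" appended to MFCCs are delta features, conventionally computed with a regression over a small window of neighboring frames. A self-contained sketch of the standard formula (window width 2 is a typical choice, assumed here):

```python
import numpy as np

def deltas(feat, width=2):
    """Delta (first-derivative) features for a (frames, coeffs) MFCC
    matrix via the standard regression formula:
    d[t] = sum_w w*(feat[t+w] - feat[t-w]) / (2 * sum_w w^2),
    with edge frames padded by repetition."""
    denom = 2 * sum(w * w for w in range(1, width + 1))
    padded = np.pad(feat, ((width, width), (0, 0)), mode="edge")
    out = np.zeros_like(feat, dtype=float)
    for t in range(feat.shape[0]):
        acc = np.zeros(feat.shape[1])
        for w in range(1, width + 1):
            acc += w * (padded[t + width + w] - padded[t + width - w])
        out[t] = acc / denom
    return out

# Toy (5 frames, 4 coefficients) matrix increasing by 4 per frame,
# so interior delta values should equal 4.
mfcc = np.arange(20, dtype=float).reshape(5, 4)
d = deltas(mfcc)
```

Stacking `[mfcc, deltas(mfcc), deltas(deltas(mfcc))]` along the coefficient axis yields the feature matrix typically fed to DNN/CNN/LSTM severity classifiers.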

18 citations


Journal ArticleDOI
TL;DR: In this paper, the prevalence of speech abnormalities in the de novo PD cohort was 56% for male and 65% for female patients, mainly manifested with monopitch, monoloudness, and articulatory decay.
Abstract: BACKGROUND The mechanisms underlying speech abnormalities in Parkinson's disease (PD) remain poorly understood, with most of the available evidence based on male patients. This study aimed to estimate the occurrence and characteristics of speech disorder in early, drug-naive PD patients in relation to gender and dopamine transporter imaging. METHODS Speech samples from 60 male and 40 female de novo PD patients as well as 60 male and 40 female age-matched healthy controls were analyzed. Quantitative acoustic vocal assessment of 10 distinct speech dimensions related to phonation, articulation, prosody, and speech timing was performed. All patients were evaluated using [123I]-2β-carbomethoxy-3β-(4-iodophenyl)-N-(3-fluoropropyl)nortropane single-photon emission computed tomography and the Montreal Cognitive Assessment. RESULTS The prevalence of speech abnormalities in the de novo PD cohort was 56% for male and 65% for female patients, mainly manifested as monopitch, monoloudness, and articulatory decay. Automated speech analysis enabled discrimination between PD and controls with an area under the curve of 0.86 in men and 0.93 in women. No gender-specific speech dysfunction in de novo PD was found. Regardless of disease status, females generally showed better performance in voice quality, consonant articulation, and pause production than males, who were better only in loudness variability. The extent of monopitch was correlated with nigro-putaminal dopaminergic loss in men (r = 0.39, p = 0.003), and the severity of imprecise consonants was related to cognitive deficits in women (r = -0.44, p = 0.005). CONCLUSIONS Speech abnormalities represent a frequent and early marker of motor abnormalities in PD. Despite some gender differences, our findings demonstrate that speech difficulties are associated with nigro-putaminal dopaminergic deficits.
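The discrimination figures above (AUC 0.86 in men, 0.93 in women) are areas under the ROC curve. For a scalar severity score, AUC equals the Mann-Whitney probability that a randomly chosen patient outscores a randomly chosen control, which gives a very short implementation (the toy scores below are invented for illustration):

```python
import numpy as np

def auc(scores_pos, scores_neg):
    """Area under the ROC curve via the Mann-Whitney U formulation:
    the probability that a random positive (patient) score exceeds a
    random negative (control) score, counting ties as half."""
    pos = np.asarray(scores_pos, dtype=float)[:, None]
    neg = np.asarray(scores_neg, dtype=float)[None, :]
    return float((pos > neg).mean() + 0.5 * (pos == neg).mean())

# Toy severity scores: patients tend to score higher than controls
patients = [0.9, 0.8, 0.7, 0.4]
controls = [0.5, 0.3, 0.2, 0.1]
```

An AUC of 0.5 corresponds to chance-level discrimination; 1.0 to perfect separation of patients from controls.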

Journal ArticleDOI
TL;DR: In this paper, the authors used support vector machines to analyze prosodic, articulatory, and phonemic identifiability features to identify patients with Parkinson's disease in cognitively heterogeneous, cognitively preserved, and cognitively impaired groups through tasks with low (reading) and high (retelling) processing demands.
Abstract: Background Dysarthric symptoms in Parkinson's disease (PD) vary greatly across cohorts. Abundant research suggests that such heterogeneity could reflect subject-level and task-related cognitive factors. However, the interplay of these variables during motor speech remains underexplored, let alone by administering validated materials to carefully matched samples with varying cognitive profiles and combining automated tools with machine learning methods. Objective We aimed to identify which speech dimensions best identify patients with PD in cognitively heterogeneous, cognitively preserved, and cognitively impaired groups through tasks with low (reading) and high (retelling) processing demands. Methods We used support vector machines to analyze prosodic, articulatory, and phonemic identifiability features. Patient groups were compared with healthy control subjects and against each other in both tasks, using each measure separately and in combination. Results Relative to control subjects, patients in cognitively heterogeneous and cognitively preserved groups were best discriminated by combined dysarthric signs during reading (accuracy = 84% and 80.2%). Conversely, patients with cognitive impairment were maximally discriminated from control subjects when considering phonemic identifiability during retelling (accuracy = 86.9%). This same pattern maximally distinguished between cognitively spared and impaired patients (accuracy = 72.1%). Also, cognitive (executive) symptom severity was predicted by prosody in cognitively preserved patients and by phonemic identifiability in cognitively heterogeneous and impaired groups. No measure predicted overall motor dysfunction in any group. Conclusions Predominant dysarthric symptoms appear to be best captured through undemanding tasks in cognitively heterogeneous and preserved cohorts and through cognitively loaded tasks in patients with cognitive impairment. 
Further applications of this framework could enhance dysarthria assessments in PD. © 2021 International Parkinson and Movement Disorder Society.

Journal ArticleDOI
TL;DR: In this article, the effects of dopaminergic treatment on speech in patients with Parkinson's disease (PD) are often mixed and unclear, and the aim of this study was to better elucidate those discrepancies.
Abstract: Importance: The effects of dopaminergic treatment on speech in patients with Parkinson's disease (PD) are often mixed and unclear. The aim of this study was to better elucidate those discrepancies. Methods: Full retrospective data from advanced PD patients before and after an acute levodopa challenge were collected. Acoustic analysis of spontaneous monologue and sustained phonation, including several quantitative parameters [i.e., maximum phonation time (MPT); shimmer local dB], as well as the Unified Parkinson's Disease Rating Scale (UPDRS) (total scores, subscores, and items) and the Clinical Dyskinesia Rating Scale (CDRS), was performed in both the defined-OFF and -ON conditions. The primary outcome was the change in speech parameters after levodopa intake. Secondary outcomes included the analysis of possible correlations of motor features and levodopa-induced dyskinesia (LID) with acoustic speech parameters. Statistical analysis included paired t-tests between the ON and OFF data (calculated separately for male and female subgroups) and Pearson correlation between speech and motor data. Results: In 50 PD patients (male: 32; female: 18), levodopa significantly increased the MPT of sustained phonation in female patients (p < 0.01). In the OFF-state, the UPDRS part-III speech item negatively correlated with MPT (p = 0.02), whereas in the ON-state, it correlated positively with shimmer local dB (p = 0.01), an expression of poorer voice quality. The total CDRS score and axial subscores strongly correlated with the ON-state shimmer local dB (p = 0.01 and p < 0.01, respectively). Conclusions: Our findings emphasize that levodopa has a poor effect on speech acoustic parameters. The intensity and location of LID negatively influenced speech quality.

Journal ArticleDOI
TL;DR: The authors tested the performance of three state-of-the-art ASR platforms on two groups of people with neurodegenerative disease and healthy controls, and suggest potential methods to improve ASR for those with impaired speech.
Abstract: Automatic speech recognition (ASR) could potentially improve communication by providing transcriptions of speech in real time. ASR is particularly useful for people with progressive disorders that lead to reduced speech intelligibility or difficulties performing motor tasks. ASR services are usually trained on healthy speech and may not be optimized for impaired speech, creating a barrier for accessing augmented assistance devices. We tested the performance of three state-of-the-art ASR platforms on two groups of people with neurodegenerative disease and healthy controls. We further examined individual differences that may explain errors in ASR services within groups, such as age and sex. Speakers were recorded while reading a standard text. Speech was elicited from individuals with multiple sclerosis, Friedreich’s ataxia, and healthy controls. Recordings were manually transcribed and compared to ASR transcriptions using Amazon Web Services, Google Cloud, and IBM Watson. Accuracy was measured as the proportion of words that were correctly classified. ASR accuracy was higher for controls than clinical groups, and higher for multiple sclerosis compared to Friedreich’s ataxia for all ASR services. Amazon Web Services and Google Cloud yielded higher accuracy than IBM Watson. ASR accuracy decreased with increased disease duration. Age and sex did not significantly affect ASR accuracy. ASR faces challenges for people with neuromuscular disorders. Until improvements are made in recognizing less intelligible speech, the true value of ASR for people requiring augmented assistance devices and alternative communication remains unrealized. We suggest potential methods to improve ASR for those with impaired speech.
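Accuracy "as the proportion of words that were correctly classified" is conventionally derived from a word-level edit-distance alignment between the manual transcript and the ASR output. A simplified sketch (this computes 1 minus word error rate, a close proxy, and is not the study's exact scoring script):

```python
def word_accuracy(reference, hypothesis):
    """Proportion of reference words correctly recognized, estimated as
    1 - WER from a word-level edit-distance (Levenshtein) alignment
    counting substitutions, deletions, and insertions, floored at 0."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution/match
    errors = dp[len(ref)][len(hyp)]
    return max(0.0, 1.0 - errors / len(ref))

acc = word_accuracy("the quick brown fox", "the quick brow fox")
```

Comparisons like "Amazon Web Services and Google Cloud yielded higher accuracy than IBM Watson" amount to averaging this per-recording score across speakers within each group.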

Journal ArticleDOI
TL;DR: In this article, the authors proposed to automatically discriminate between healthy and dysarthric speakers exploiting spectro-temporal subspaces of speech using singular value decomposition and applied a subspace-based discriminant analysis.
Abstract: To assist the clinical diagnosis and treatment of speech dysarthria, automatic dysarthric speech detection techniques providing reliable and cost-effective assessment are indispensable. Based on clinical evidence on spectro-temporal distortions associated with dysarthric speech, we propose to automatically discriminate between healthy and dysarthric speakers exploiting spectro-temporal subspaces of speech. Spectro-temporal subspaces are extracted using singular value decomposition, and dysarthric speech detection is achieved by applying a subspace-based discriminant analysis. Experimental results on databases of healthy and dysarthric speakers for different languages and pathologies show that the proposed subspace-based approach using temporal subspaces is more advantageous than using spectral subspaces, also outperforming several state-of-the-art automatic dysarthric speech detection techniques.
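The subspace idea can be made concrete with a small numpy sketch: the temporal subspace of a (frequency x time) spectrogram is spanned by its top right singular vectors, and two subspaces can be compared through their principal angles. The discriminant analysis itself is omitted; the rank-3 toy "spectrogram" below is a synthetic stand-in for real speech data.

```python
import numpy as np

def temporal_subspace(spectrogram, k=3):
    """Extract a k-dimensional temporal subspace of a (freq x time)
    spectrogram via SVD: the top-k right singular vectors span the
    dominant temporal patterns of the utterance."""
    _, _, vt = np.linalg.svd(spectrogram, full_matrices=False)
    return vt[:k].T  # (time, k), orthonormal columns

def subspace_similarity(A, B):
    """Mean squared cosine of the principal angles between two
    equal-dimension orthonormal subspaces (1 = identical span)."""
    s = np.linalg.svd(A.T @ B, compute_uv=False)
    return float(np.mean(s ** 2))

rng = np.random.default_rng(2)
# Strongly rank-3 'spectrogram' plus noise: its temporal subspace is
# stable under small perturbations, as the similarity check shows.
base = rng.normal(size=(64, 3)) @ rng.normal(size=(3, 40)) * 5
U1 = temporal_subspace(base + 0.1 * rng.normal(size=(64, 40)))
U2 = temporal_subspace(base + 0.1 * rng.normal(size=(64, 40)))
```

A subspace-based detector would compute such similarities (or principal-angle distances) between a test speaker's subspace and reference subspaces from healthy and dysarthric training speakers.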

Journal ArticleDOI
TL;DR: The authors found that poor communication is a central feature of SETBP1 haploinsufficiency disorder, confirming this gene as a strong candidate for speech and language disorders, with childhood apraxia of speech (CAS) being the most common diagnosis.
Abstract: Expressive communication impairment is associated with haploinsufficiency of SETBP1, as reported in small case series. Heterozygous pathogenic loss-of-function (LoF) variants in SETBP1 have also been identified in independent cohorts ascertained for childhood apraxia of speech (CAS), warranting further investigation of the roles of this gene in speech development. Thirty-one participants (12 males, aged 0;8 to 23;2 years;months, 28 with pathogenic SETBP1 LoF variants, 3 with 18q12.3 deletions) were assessed for speech, language and literacy abilities. Broader development was examined with standardised motor, social and daily life skills assessments. Gross and fine motor deficits (94%) and intellectual impairments (68%) were common. Protracted and aberrant speech development was consistently seen, regardless of motor or intellectual ability. We expand the linguistic phenotype associated with SETBP1 LoF syndrome (SETBP1 haploinsufficiency disorder), revealing a striking speech presentation that implicates both motor (CAS, dysarthria) and language (phonological errors) systems, with CAS (80%) being the most common diagnosis. In contrast to past reports, the understanding of language was rarely better preserved than language expression (29%). Language was typically low to moderately impaired, with commensurate expression and comprehension ability. Children were sociable with a strong desire to communicate. Minimally verbal children (32%) augmented speech with sign language, gestures or digital devices. Overall, relative to general development, spoken language and literacy were poorer than social, daily living, motor and adaptive behaviour skills. Our findings show that poor communication is a central feature of SETBP1 haploinsufficiency disorder, confirming this gene as a strong candidate for speech and language disorders.

Journal ArticleDOI
TL;DR: The authors characterize the relationship between disease severity and speech metrics through perceptual (listener-based) and objective acoustic analysis, and examine deviations of acoustic metrics in people with no perceivable dysarthria.
Abstract: BACKGROUND AND PURPOSE: Objective measurement of speech has shown promising results for monitoring disease state in multiple sclerosis. In this study, we characterize the relationship between disease severity and speech metrics through perceptual (listener-based) and objective acoustic analysis. We further look at deviations of acoustic metrics in people with no perceivable dysarthria. METHODS: Correlations and regressions were calculated between speech measurements and disability scores, brain volume, lesion load, and quality of life. Speech measurements were further compared between three subgroups of increasing overall neurological disability: mild (as rated by the Expanded Disability Status Scale ≤2.5), moderate (≥3 and ≤5.5), and severe (≥6). RESULTS: Clinical speech impairment occurred predominantly in people with severe disability. An experimental acoustic composite score differentiated mild from moderate (P < 0.001) and moderate from severe subgroups (P = 0.003), and correlated with overall neurological disability (r = 0.6, P < 0.001), quality of life (r = 0.5, P < 0.001), white matter volume (r = 0.3, P = 0.007), and lesion load (r = 0.3, P = 0.008). Acoustic metrics also correlated with disability scores in people with no perceivable dysarthria. CONCLUSIONS: Acoustic analysis offers valuable insight into the development of speech impairment in multiple sclerosis. These results highlight the potential of automated analysis of speech to assist in monitoring disease progression and treatment response.

Journal ArticleDOI
TL;DR: Practical applicability of this technology is realistic: the mPASS platform is accurate, and it could be easily used by individuals with dysarthria.
Abstract: Introduction The use of commercially available automatic speech recognition (ASR) software is challenged when dysarthria accompanies a physical disability. To overcome this issue, a mobile and personal speech assistant (mPASS) platform was developed, using speaker-dependent ASR software. Objective The aim of this study was to evaluate the performance of the proposed platform and to compare mPASS recognition accuracy to that of a commercial speaker-independent ASR software. Secondary aims were to investigate the relationship between severity of dysarthria and accuracy and to explore the perceptions of people with dysarthria regarding the proposed platform. Methods Fifteen individuals with dysarthric speech and 20 individuals with nondysarthric speech recorded 24 words and 5 sentences in a clinical environment. Differences in recognition accuracy between the two systems were evaluated. In addition, mPASS usability was assessed with a technology acceptance model (TAM) questionnaire. Results In both groups, mean accuracy rates were significantly higher with mPASS than with the commercial ASR for both words and sentences. mPASS reached good levels of usefulness and ease of use according to the TAM questionnaire. Conclusions Practical application of this technology is realistic: the mPASS platform is accurate and could be easily used by individuals with dysarthria.

Journal ArticleDOI
TL;DR: In this paper, the speech and language phenotype of individuals with FOXP1-related disorder was delineated using a standardized test battery examining speech and oral motor function, receptive and expressive language, non-verbal cognition, and adaptive behaviour.
Abstract: AIM To delineate the speech and language phenotype of a cohort of individuals with FOXP1-related disorder. METHOD We administered a standardized test battery to examine speech and oral motor function, receptive and expressive language, non-verbal cognition, and adaptive behaviour. Clinical history and cognitive assessments were analysed together with speech and language findings. RESULTS Twenty-nine patients (17 females, 12 males; mean age 9y 6mo; median age 8y [range 2y 7mo-33y]; SD 6y 5mo) with pathogenic FOXP1 variants (14 truncating, three missense, three splice site, one in-frame deletion, eight cytogenetic deletions; 28 out of 29 were de novo variants) were studied. All had atypical speech, with 21 being verbal and eight minimally verbal. All verbal patients had dysarthric and apraxic features, with phonological deficits in most (14 out of 16). Language scores were low overall. In the 21 individuals who carried truncating or splice site variants and small deletions, expressive abilities were relatively preserved compared with comprehension. INTERPRETATION FOXP1-related disorder is characterized by a complex speech and language phenotype with prominent dysarthria, broader motor planning and programming deficits, and linguistic-based phonological errors. Diagnosis of the speech phenotype associated with FOXP1-related dysfunction will inform early targeted therapy. What this paper adds: Individuals with FOXP1-related disorder have a complex speech and language phenotype. Dysarthria, which impairs intelligibility, is the dominant feature of the speech profile. No participants were receiving speech therapy for dysarthria, but all were good candidates for therapy. Features of speech apraxia occur alongside persistent phonological errors. Language abilities are low overall; however, expressive language is a relative strength.

Journal ArticleDOI
TL;DR: In this paper, the authors create an empirical classification system for speech severity in patients with dysarthria secondary to amyotrophic lateral sclerosis (ALS).
Abstract: Purpose: The main purpose of this study was to create an empirical classification system for speech severity in patients with dysarthria secondary to amyotrophic lateral sclerosis (ALS) by explorin...

Journal ArticleDOI
TL;DR: In this article, the authors compare speech disorders in early-onset Parkinson's disease (EOPD) and late-onset PD (LOPD) in drug-naive patients at early stages of disease.
Abstract: Substantial variability and severity of dysarthric patterns across Parkinson’s disease (PD) patients may reflect distinct phenotypic differences. We aimed to compare patterns of speech disorder in early-onset PD (EOPD) and late-onset PD (LOPD) in drug-naive patients at early stages of disease. Speech samples were acquired from a total of 96 participants, including two subgroups of 24 de-novo PD patients and two subgroups of 24 age- and sex-matched young and old healthy controls. The EOPD group included patients with age at onset below 51 (mean 42.6, standard deviation 6.1) years and LOPD group patients with age at onset above 69 (mean 73.9, standard deviation 3.0) years. Quantitative acoustic vocal assessment of 10 unique speech dimensions related to respiration, phonation, articulation, prosody, and speech timing was performed. Despite similar perceptual dysarthria severity in both PD subgroups, EOPD showed weaker inspirations (p = 0.03), while LOPD was characterized by decreased voice quality (p = 0.02) and imprecise consonant articulation (p = 0.03). In addition, age-independent occurrence of monopitch (p < 0.001), monoloudness (p = 0.008), and articulatory decay (p = 0.04) was observed in both PD subgroups. The worsening of consonant articulation was correlated with the severity of axial gait symptoms (r = 0.38, p = 0.008). Speech abnormalities in EOPD and LOPD share common features but also show phenotype-specific characteristics, likely reflecting the influence of aging on the process of neurodegeneration. The distinct pattern of imprecise consonant articulation can be interpreted as an axial motor symptom of PD.

Journal ArticleDOI
TL;DR: In this article, the authors analyzed cognitive functions, language comprehension, and speech in SMA1 children according to age and subtypes, to develop cognitive and language benchmarks that provide outcomes for clinical medication trials that are changing SMA 1 course/trajectory.
Abstract: Spinal muscular atrophy (SMA) is a chronic, neuromuscular disease characterized by degeneration of spinal cord motor neurons, resulting in progressive muscular atrophy and weakness. SMA1 is the most severe form, characterized by significant bulbar, respiratory, and motor dysfunction. SMA1 prevents children from speaking a clearly understandable and fluent language, with their communication being mainly characterized by eye movements, guttural sounds, and anarthria (type 1a); severe dysarthria (type 1b); and nasal voice and dyslalia (type 1c). The aim of this study was to analyze for the first time cognitive functions, language comprehension, and speech in natural-history SMA1 children according to age and subtype, to develop cognitive and language benchmarks that provide outcomes for the clinical medication trials that are changing the SMA1 course/trajectory. This is a retrospective study including 22 children with SMA1 (10 affected by subtype 1a-1b: AB, and 12 by 1c: C) aged 3–11 years, in clinically stable condition, with a coded way to communicate “yes” and “no”. Data from the following assessments were retrieved from patient charts: one-dimensional Raven test (RCPM), to evaluate cognitive development (IQ); ALS Severity Score (ALSSS), to evaluate speech disturbances; Brown Bellugy modified for Italian standards (TCGB), to evaluate language comprehension; and Children’s Hospital of Philadelphia Infant Test of Neuromuscular Disorders (CHOP-INTEND), to assess motor functioning. SMA 1AB and 1C children were similar in age, with the former characterized by lower CHOP-INTEND scores than the latter. All 22 children collaborated on the RCPM, and their median IQ was 120, with no difference (p = 0.945) between AB and C. The global median score on the speech domain of the ALSSS was 5; however, it was 2 in AB children, significantly lower than in C (6.5, p < 0.001). The TCGB test was completed by 13 children, with morphosyntactic comprehension being in the normal range (50).
Although the ALSSS did not correlate with either IQ or TCGB, it had a strong (p < 0.001) correlation with CHOP-INTEND, described by an exponential rise to maximum. Although speech and motor function were severely compromised, children with SMA1 showed general intelligence and language comprehension in the normal range. Speech impairment was strictly related to global motor impairment.

Journal ArticleDOI
TL;DR: In this article, the authors examined the effect of dual-focus speech treatment targeting increased articulatory excursion and vocal intensity on intelligibility of narrative speech, speech acoustics, and communicative participation in children with dysarthria.
Abstract: Purpose Children with dysarthria secondary to cerebral palsy may experience reduced speech intelligibility and diminished communicative participation. However, minimal research has been conducted examining the outcomes of behavioral speech treatments in this population. This study examined the effect of Speech Intelligibility Treatment (SIT), a dual-focus speech treatment targeting increased articulatory excursion and vocal intensity, on intelligibility of narrative speech, speech acoustics, and communicative participation in children with dysarthria. Method American English-speaking children with dysarthria (n = 17) received SIT in a 3-week summer camplike setting at Columbia University. SIT follows motor-learning principles to train the child-friendly, dual-focus strategy, "Speak with your big mouth and strong voice." Children produced a story narrative at baseline, immediate posttreatment (POST), and at 6-week follow-up (FUP). Outcomes were examined via blinded listener ratings of ease of understanding (n = 108 adult listeners), acoustic analyses, and questionnaires focused on communicative participation. Results SIT resulted in significant increases in ease of understanding at POST, that were maintained at FUP. There were no significant changes to vocal intensity, speech rate, or vowel spectral characteristics, with the exception of an increase in second formant difference between vowels following SIT. Significantly enhanced communicative participation was evident at POST and FUP. Considerable variability in response to SIT was observed between children. Conclusions Dual-focus treatment shows promise for improving intelligibility and communicative participation in children with dysarthria, although responses to treatment vary considerably across children. Possible mechanisms underlying the intelligibility gains, enhanced communicative participation, and variability in treatment effects are discussed.

Journal ArticleDOI
01 Mar 2021-Stroke
TL;DR: In this article, the authors assessed frequencies, co-occurrence, and cooccurrence patterns of dysphagia, dysarthria, and aphasia following adult stroke.
Abstract: Background and Purpose: Following adult stroke, dysphagia, dysarthria, and aphasia are common sequelae. Little is known about these impairments in pediatric stroke. We assessed frequencies, co-occu...

Journal ArticleDOI
TL;DR: This article investigated the impact of lexical and articulatory characteristics of quasirandomly selected target words on intelligibility in a large sample of dysarthric speakers under clinical examination conditions.
Abstract: Purpose The clinical assessment of intelligibility must be based on a large repository and extensive variation of test materials, to render test stimuli unpredictable and thereby avoid expectancies and familiarity effects in the listeners. At the same time, it is essential that test materials are systematically controlled for factors influencing intelligibility. This study investigated the impact of lexical and articulatory characteristics of quasirandomly selected target words on intelligibility in a large sample of dysarthric speakers under clinical examination conditions. Method Using the clinical assessment tool KommPaS, a total of 2,700 sentence-embedded target words, quasirandomly drawn from a large corpus, were spoken by a group of 100 dysarthric patients and later transcribed by listeners recruited via online crowdsourcing. Transcription accuracy was analyzed for influences of lexical frequency, phonological neighborhood structure, articulatory complexity, lexical familiarity, word class, stimulus length, and embedding position. Classification and regression analyses were performed using random forests and generalized linear mixed models. Results Across all degrees of severity, target words with higher frequency, fewer and less frequent phonological neighbors, higher articulatory complexity, and higher lexical familiarity received significantly higher intelligibility scores. In addition, target words were more challenging sentence-initially than in medial or final position. Stimulus length had mixed effects; word length and word class had no effect. Conclusions In a large-scale clinical examination of intelligibility in speakers with dysarthria, several well-established influences of lexical and articulatory parameters could be replicated, and the roles of new factors were discussed. 
This study provides clues about how experimental rigor can be combined with clinical requirements in the diagnostics of communication impairment in patients with dysarthria.
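The classification analysis described above (random forests predicting transcription accuracy from lexical predictors) can be sketched in a few lines. The data below are simulated, not the KommPaS corpus, and the variable names (log_freq, neighbors, complexity) are illustrative stand-ins for the lexical factors the study reports:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)
n = 2700  # one row per transcribed target word, mirroring the study's sample size

# Hypothetical lexical predictors for each target word.
log_freq   = rng.normal(3.0, 1.0, n)   # lexical frequency (log scale)
neighbors  = rng.poisson(8, n)         # phonological neighborhood density
complexity = rng.integers(1, 6, n)     # articulatory complexity score
X = np.column_stack([log_freq, neighbors, complexity])

# Simulate the reported direction of effects: higher frequency and higher
# complexity raise the odds of correct transcription; denser neighborhoods lower them.
logit = 0.8 * log_freq - 0.15 * neighbors + 0.3 * complexity - 1.0
correct = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, correct)
for name, imp in zip(["log_freq", "neighbors", "complexity"], rf.feature_importances_):
    print(f"{name}: {imp:.2f}")
```

In practice the study pairs this with generalized linear mixed models, since listener and speaker identities introduce random effects that a plain random forest does not model.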

Proceedings ArticleDOI
06 Jun 2021
TL;DR: It is confirmed that the proposed method can partially transfer the individuality of the target dysarthric speaker while maintaining the intelligibility of the source speech.
Abstract: This paper presents a high-intelligibility speech synthesis method for persons with dysarthria caused by athetoid cerebral palsy. The muscular control of such speakers is unstable because of their athetoid symptoms, and their pronunciation is unclear, which makes it difficult for them to communicate. In this paper, we present a method for generating highly intelligible speech that preserves the individuality of dysarthric speakers by combining Transformer-TTS, CycleVAE-VC, and an LPCNet vocoder. Rather than repairing prosody from the dysarthric speech, this method transfers the dysarthric speaker’s individuality to the speech of a healthy person generated by TTS synthesis. This task is both important and challenging. From the results of our evaluation experiments, we confirmed that the proposed method can partially transfer the individuality of the target dysarthric speaker while maintaining the intelligibility of the source speech.

Journal ArticleDOI
TL;DR: In this article, the authors investigated the effect of medication on speech phonemes of 24 patients with Parkinson's disease during medication off and on stages, and from 22 healthy participants.
Abstract: Background: Parkinson’s disease (PD) is a multi-symptom neurodegenerative disease generally managed with medications, of which levodopa is the most effective. Determining the dosage of levodopa requires regular meetings where motor function can be observed. Speech impairment is an early symptom in PD and has been proposed for early detection and monitoring of the disease. However, findings from previous research on the effect of levodopa on speech have not shown a consistent picture. Method: This study investigated the effect of medication on PD patients for three sustained phonemes, /a/, /o/, and /m/, which were recorded from 24 PD patients during medication-off and medication-on stages, and from 22 healthy participants. The differences were statistically investigated, and the features were classified using a Support Vector Machine (SVM). Results: The results show that medication has a significant effect on the change of time and amplitude perturbation (jitter and shimmer) and harmonics of /m/, which was the most sensitive individual phoneme to the levodopa response. /m/ and /o/ performed at a comparable level in discriminating PD-off from control recordings. However, SVM classifications based on the combined use of the three phonemes /a/, /o/, and /m/ showed the best classification, both for medication effect and for separating PD from control voice. The SVM classification for PD-off versus PD-on achieved an AUC of 0.81. Conclusion: Studies of phonation by computerized voice analysis in PD should employ recordings of multiple phonemes. Our findings are potentially relevant to research to identify early parkinsonian dysarthria and to the tele-monitoring of the levodopa response in patients with established PD.
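The SVM-plus-AUC evaluation the abstract describes can be sketched as follows. This is a minimal illustration on simulated jitter/shimmer/HNR features, not the study's actual data or pipeline; the feature values and group separations are invented for the example:

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 60  # recordings per condition (hypothetical)

# Simulated per-recording perturbation features: [jitter %, shimmer %, HNR dB].
X_off = rng.normal([1.2, 8.0, 18.0], [0.3, 1.5, 2.0], size=(n, 3))  # PD-off
X_on  = rng.normal([0.9, 6.5, 20.0], [0.3, 1.5, 2.0], size=(n, 3))  # PD-on
X = np.vstack([X_off, X_on])
y = np.array([0] * n + [1] * n)

# Standardize, fit an RBF-kernel SVM, and score with cross-validated
# probabilities so the AUC is not inflated by training on the test folds.
clf = make_pipeline(StandardScaler(), SVC(probability=True, kernel="rbf"))
scores = cross_val_predict(clf, X, y, cv=5, method="predict_proba")[:, 1]
print(f"cross-validated AUC: {roc_auc_score(y, scores):.2f}")
```

Cross-validated scoring matters here: with only a few dozen recordings per condition, a resubstitution AUC would substantially overstate how well the phoneme features separate the medication states.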

Journal ArticleDOI
TL;DR: In this article, the authors conducted a scoping review on the use of ASR technology by people living with ALS and identified research gaps in the existing literature, and concluded that future research needs to investigate how plwALS use ASR, how accurate it needs to be to be functionally useful, and how useful it may be over time as the disease progresses.
Abstract: BACKGROUND More than 80% of people living with Amyotrophic Lateral Sclerosis (plwALS) develop difficulties with their speech, affecting communication, self-identity, and quality of life. Automatic speech recognition (ASR) technology is becoming a common way to interact with a broad range of devices, to find information, and to control the environment. ASR can be problematic for people with acquired neurogenic motor speech difficulties (dysarthria). Given that the field is rapidly developing, a scoping review is warranted. AIMS This study undertakes a scoping review of the use of ASR technology by plwALS and identifies research gaps in the existing literature. MATERIALS AND METHODS Electronic databases and relevant grey literature were searched from 1990 to 2020. Eleven research papers and articles were identified that included participants living with ALS using ASR technology. Relevant data were extracted from the included sources, and a narrative summary of the findings is presented. OUTCOMES AND RESULTS Eleven publications used recordings of plwALS to assess word recognition rate (WRR), word error rate (WER), or phoneme error rate (PER) and the appropriacy of responses by ASR devices. All were found to be linked to the severity of dysarthria and the ASR technology used. One article examined how speech modification may improve ASR accuracy. The final article completed a thematic analysis of Amazon.com reviews for the Amazon Echo, and plwALS were reported to use ASR devices to control the environment and summon assistance. CONCLUSIONS There are gaps in the evidence base: understanding the expectations of plwALS and how they use ASR technology; how WER/PER/WRR relates to usability; and how ASR use changes as ALS progresses. IMPLICATIONS FOR REHABILITATION Devices that people can interact with using speech are becoming ubiquitous.
As movement and mobility are likely to be affected by ALS and to progress over time, speech interaction could be very helpful for accessing information and for environmental control. However, many people living with ALS (plwALS) also have impaired speech (dysarthria) and experience trouble using voice-interaction technology because it may not understand them. Although advances in automatic speech recognition (ASR) technology promise better understanding of dysarthric speech, future research needs to investigate how plwALS use ASR, how accurate it needs to be to be functionally useful, and how useful it may be over time as the disease progresses.
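The WER metric used across these publications is a Levenshtein edit distance over word tokens, normalized by reference length. A minimal self-contained sketch (the function name and example sentence are illustrative, not taken from any of the reviewed studies):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count,
    computed via dynamic-programming Levenshtein distance over word tokens."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution or match
    return dp[-1][-1] / len(ref)

# One dropped word out of five in the reference gives WER 0.2.
print(word_error_rate("speak with your big mouth", "speak with your mouth"))  # 0.2
```

Phoneme error rate (PER) is the same computation applied to phoneme sequences instead of word tokens, which is why the two metrics often appear side by side in ASR evaluations of dysarthric speech.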

Journal ArticleDOI
TL;DR: Preliminary findings suggest that speakers with dysarthria with reduced intelligibility are at risk to be negatively judged, particularly on their physical and mental capability.
Abstract: Purpose: To explore the influence of listener profession on impressions of speakers with dysarthria with varying intelligibility using semantic differential scales. Method: Three listener groups (u...

Journal ArticleDOI
TL;DR: In this paper, the effects of speech intelligibility treatment (SIT) on intelligibility and naturalness of narrative speech produced by francophone children with dysarthria due to cerebral palsy were examined.
Abstract: Purpose This study examined the effects of Speech Intelligibility Treatment (SIT) on intelligibility and naturalness of narrative speech produced by francophone children with dysarthria due to cerebral palsy. Method Ten francophone children with dysarthria were randomized to one of two treatments, SIT or Hand-Arm Bimanual Intensive Therapy Including Lower Extremities, a physical therapy (PT) treatment. Both treatments were conducted in a camp setting and were comparable in dosage. The children were recorded pre- and posttreatment producing a story narrative. Intelligibility was measured by means of 60 blinded listeners' orthographic transcription accuracy (percentage of words transcribed correctly). The listeners also rated the children's naturalness on a visual analogue scale. Results A significant pre- to posttreatment increase in intelligibility was found for the SIT group, but not for the PT group, with great individual variability observed among the children. No significant changes were found for naturalness ratings or sound pressure level in the SIT group or the PT group posttreatment. Articulation rate increased in both treatment groups, although not differentially across treatments. Conclusions Findings from this first treatment study on intelligibility in francophone children with dysarthria suggest that SIT shows promise for increasing narrative intelligibility in this population. Acoustic contributors to the increased intelligibility remain to be explored further. Supplemental Material https://doi.org/10.23641/asha.14161943.