
Showing papers in "ACM Transactions on Accessible Computing in 2015"


Journal ArticleDOI
TL;DR: A new scalable method for collecting bus stop location and landmark descriptions by combining online crowdsourcing and Google Street View (GSV), which reemphasizes the importance of landmarks in nonvisual navigation, and demonstrates that GSV is a viable bus stop audit dataset.
Abstract: Low-vision and blind bus riders often rely on known physical landmarks to help locate and verify bus stop locations (e.g., by searching for an expected shelter, bench, or newspaper bin). However, there are currently few, if any, methods to determine this information a priori via computational tools or services. In this article, we introduce and evaluate a new scalable method for collecting bus stop location and landmark descriptions by combining online crowdsourcing and Google Street View (GSV). We conduct and report on three studies: (i) a formative interview study of 18 people with visual impairments to inform the design of our crowdsourcing tool, (ii) a comparative study examining differences between physical bus stop audit data and audits conducted virtually with GSV, and (iii) an online study of 153 crowd workers on Amazon Mechanical Turk to examine the feasibility of crowdsourcing bus stop audits using our custom tool with GSV. Our findings reemphasize the importance of landmarks in nonvisual navigation, demonstrate that GSV is a viable bus stop audit dataset, and show that minimally trained crowd workers can find and identify bus stop landmarks with 82.5% accuracy across 150 bus stop locations (87.3% with simple quality control).
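To make the "simple quality control" step concrete, redundant crowd answers can be aggregated by majority vote. The sketch below is a generic illustration, not the pipeline used in the article; the worker label sets and the quorum threshold are made up.

```python
from collections import Counter

def majority_vote(worker_labels):
    """Aggregate redundant crowd answers for one bus stop.

    worker_labels: one set of reported landmarks per worker, e.g.
        [{"shelter", "bench"}, {"shelter"}, {"shelter", "trash can"}]
    A landmark is kept only if more than half of the workers reported it.
    """
    counts = Counter(label for labels in worker_labels for label in labels)
    quorum = len(worker_labels) / 2
    return {label for label, n in counts.items() if n > quorum}

# Example: three workers audit the same Street View scene.
print(majority_vote([{"shelter", "bench"},
                     {"shelter"},
                     {"shelter", "trash can"}]))   # -> {'shelter'}
```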

87 citations


Journal ArticleDOI
TL;DR: This work presents a modular system with dedicated procedures for syntactic and lexical simplification that are grounded on the analysis of a corpus manually simplified for people with special needs, and shows that sentence meaning is preserved in most cases.
Abstract: The way in which a text is written can be a barrier for many people. Automatic text simplification is a natural language processing technology that, when mature, could be used to produce texts that are adapted to the specific needs of particular users. Most research in the area of automatic text simplification has dealt with the English language. In this article, we present results from the Simplext project, which is dedicated to automatic text simplification for Spanish. We present a modular system with dedicated procedures for syntactic and lexical simplification that are grounded on the analysis of a corpus manually simplified for people with special needs. We carried out an automatic evaluation of the system’s output, taking into account the interaction between three different modules dedicated to different simplification aspects. One evaluation is based on readability metrics for Spanish and shows that the system is able to reduce the lexical and syntactic complexity of the texts. We also show, by means of a human evaluation, that sentence meaning is preserved in most cases. Although our work represents the first automatic text simplification system for Spanish that addresses different linguistic aspects, our results are comparable to the state of the art in English Automatic Text Simplification.
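As an illustration of the lexical side of such a pipeline, a frequency-based substitution step can be sketched as follows. The synonym lexicon and frequency counts are toy placeholders; Simplext's actual modules are corpus-grounded and considerably richer.

```python
# Toy lexical simplification: replace a word with its most frequent synonym,
# but only if that synonym is more common than the original word.
SYNONYMS = {"utilizar": ["usar"], "finalizar": ["terminar", "acabar"]}
FREQ = {"utilizar": 120, "usar": 950, "finalizar": 80,
        "terminar": 400, "acabar": 700}

def simplify_word(word):
    candidates = SYNONYMS.get(word, [])
    best = max(candidates, key=lambda w: FREQ.get(w, 0), default=word)
    return best if FREQ.get(best, 0) > FREQ.get(word, 0) else word

def simplify_sentence(tokens):
    return [simplify_word(t) for t in tokens]

print(simplify_sentence(["hay", "que", "finalizar", "el", "informe"]))
# -> ['hay', 'que', 'acabar', 'el', 'informe']
```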

82 citations


Journal ArticleDOI
TL;DR: How collaborative assistive technologies, housed on off-the-shelf, low-cost platforms such as the iPad, can be used to facilitate social relationships in children with autism spectrum disorder is described.
Abstract: This article describes how collaborative assistive technologies, housed on off-the-shelf, low-cost platforms such as the iPad, can be used to facilitate social relationships in children with autism spectrum disorder (ASD). Through an empirical study of the use of a collaborative iPad game, Zody, we explore how assistive technologies can be used to support social relationships, even without intervention from adults. We discuss how specific design choices can encourage three levels of social relationship: membership, partnership, and friendship. This work contributes to research on both assistive technologies and collaborative gaming through a framework that describes how specific in-game elements can foster social skill development for children with ASD.

75 citations


Journal ArticleDOI
TL;DR: Surprisingly, although no humanoid aspect was introduced in the system, the senior participants were inclined to embody the system; users were also disturbed by the rigid structure of the grammar and were eager to adapt it to their own preferences.
Abstract: This article presents an experiment with seniors and people with visual impairment in a voice-controlled smart home using the Sweet-Home system. The experiment shows some weaknesses in automatic speech recognition that must be addressed, as well as the need for better adaptation to the user and the environment. Users were disturbed by the rigid structure of the grammar and were eager to adapt it to their own preferences. Surprisingly, while no humanoid aspect was introduced in the system, the senior participants were inclined to embody the system. Despite these aspects to improve, the system has been favorably assessed as diminishing most participant fears related to the loss of autonomy.

73 citations


Journal ArticleDOI
TL;DR: A baseline assessment of the types of technical and communicative challenges that will need to be overcome for robots to be used effectively in the home for speech-based assistance with daily living is provided.
Abstract: Increases in the prevalence of dementia and Alzheimer’s disease (AD) are a growing challenge in many nations where healthcare infrastructures are ill-prepared for the upcoming demand for personal caregiving. To help individuals with AD live at home for longer, we are developing a mobile robot, called ED, intended to assist with activities of daily living through visual monitoring and verbal prompts in cases of difficulty. In a series of experiments, we study speech-based interactions between ED and each of 10 older adults with AD as the latter complete daily tasks in a simulated home environment. Traditional automatic speech recognition is evaluated in this environment, along with rates of verbal behaviors that indicate confusion or trouble with the conversation. Analysis reveals that speech recognition remains a challenge in this setup, especially during household tasks with individuals with AD. Across the verbal behaviors that indicate confusion, older adults with AD are very likely to simply ignore the robot, which accounts for over 40% of all such behaviors when interacting with the robot. This work provides a baseline assessment of the types of technical and communicative challenges that will need to be overcome for robots to be used effectively in the home for speech-based assistance with daily living.

56 citations


Journal ArticleDOI
TL;DR: A dwell-free eye typing technique that filters out unintentionally selected letters from the sequence of letters looked at by the user, ranks possible words based on their length and frequency of use, and suggests them to the user.
Abstract: The ability to use the movements of the eyes to write is extremely important for individuals with a severe motor disability. With eye typing, a virtual keyboard is shown on the screen and the user enters text by gazing at the intended keys one at a time. With dwell-based eye typing, a key is selected by continuously gazing at it for a specific amount of time. However, this approach has two possible drawbacks: unwanted selections and slow typing rates. In this study, we propose Filteryedping, a dwell-free eye typing technique that filters out unintentionally selected letters from the sequence of letters looked at by the user. It ranks possible words based on their length and frequency of use and suggests them to the user. We evaluated Filteryedping with a series of experiments. First, we recruited participants without disabilities to compare it with another potential dwell-free technique and with a dwell-based eye typing interface. The results indicate that it is a fast technique, allowing an average of 15.95 words per minute after 100 minutes of typing. Then, we improved the technique through iterative design and evaluation with individuals who have severe motor disabilities. This phase helped to identify and create parameters that allow the technique to be adapted to different users.
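The filter-then-rank idea can be sketched in a few lines: keep only dictionary words that survive as subsequences of the letters the user looked at, then order them by length and frequency. The tiny lexicon, the stray letters, and the sort key below are illustrative and are not the parameters reported in the article.

```python
def is_subsequence(word, gaze_letters):
    """True if `word` can be formed by deleting letters from the gaze sequence."""
    it = iter(gaze_letters)
    return all(ch in it for ch in word)

def rank_candidates(gaze_letters, lexicon):
    """Keep words that survive the filtering step, longest and most frequent first."""
    matches = [(w, f) for w, f in lexicon.items() if is_subsequence(w, gaze_letters)]
    return sorted(matches, key=lambda wf: (-len(wf[0]), -wf[1]))

lexicon = {"hello": 500, "help": 300, "hero": 200, "he": 900}
# The user glanced over stray letters (x, q) on the way to the intended keys.
print(rank_candidates("hxeqllo", lexicon))
# -> [('hello', 500), ('he', 900)]
```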

55 citations


Journal ArticleDOI
TL;DR: Two methods of employing novice Web workers to author descriptions of science, technology, engineering, and mathematics images to make them accessible to individuals with visual and print-reading disabilities are compared.
Abstract: This article compares two methods of employing novice Web workers to author descriptions of science, technology, engineering, and mathematics images to make them accessible to individuals with visual and print-reading disabilities. The goal is to identify methods of creating image descriptions that are inexpensive, effective, and follow established accessibility guidelines. The first method explicitly presented the guidelines to the worker, then the worker constructed the image description in an empty text box and table. The second method queried the worker for image information and then used responses to construct a template-based description according to established guidelines. The descriptions generated through queried image description (QID) were more likely to include information on the image category, title, caption, and units. They were also more similar to one another, based on Jaccard distances of q-grams, indicating that their word usage and structure were more standardized. Last, the workers preferred describing images using QID and found the task easier. Therefore, explicit instruction on image-description guidelines is not sufficient to produce quality image descriptions when using novice Web workers. Instead, it is better to provide information about images, then generate descriptions from responses using templates.
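The similarity measure mentioned above, Jaccard distance over character q-grams, is easy to state precisely; the sketch below uses an arbitrary q and two made-up descriptions.

```python
def qgrams(text, q=3):
    """Set of overlapping character q-grams in a description."""
    text = text.lower()
    return {text[i:i + q] for i in range(len(text) - q + 1)}

def jaccard_distance(a, b, q=3):
    """1 - |A intersect B| / |A union B| over the q-gram sets of two descriptions."""
    A, B = qgrams(a, q), qgrams(b, q)
    union = A | B
    if not union:
        return 0.0
    return 1 - len(A & B) / len(union)

d1 = "Bar chart showing rainfall in millimetres per month"
d2 = "Bar chart of monthly rainfall in millimetres"
print(round(jaccard_distance(d1, d2), 2))  # smaller distance = more similar wording
```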

47 citations


Journal ArticleDOI
TL;DR: It is shown that intelligibility assessments work best if there is a pre-existing set of words annotated for intelligibility from the speaker to be evaluated, which can be used for training the system.
Abstract: Automated intelligibility assessments can support speech and language therapists in determining the type of dysarthria presented by their clients. Such assessments can also help predict how well a person with dysarthria might cope with a voice interface to assistive technology. Our approach to intelligibility assessment is based on iVectors, a set of measures that capture many aspects of a person’s speech, including intelligibility. The major advantage of iVectors is that they compress all acoustic information contained in an utterance into a reduced number of measures, and they are very suitable to be used with simple predictors. We show that intelligibility assessments work best if there is a pre-existing set of words annotated for intelligibility from the speaker to be evaluated, which can be used for training our system. We discuss the implications of our findings for practice.
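The appeal of pairing iVectors with simple predictors is that, once each utterance is reduced to a fixed-length vector, an off-the-shelf regressor suffices. The sketch below assumes 100-dimensional vectors have already been extracted (real extraction needs a trained universal background model and total-variability matrix) and uses purely synthetic data to show the shape of the problem.

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
ivectors = rng.normal(size=(200, 100))           # one 100-dim iVector per utterance (synthetic)
intelligibility = rng.uniform(0, 100, size=200)  # annotated intelligibility scores (synthetic)

# Fit a simple linear predictor on utterances from the speaker to be evaluated,
# then score held-out utterances.
model = Ridge(alpha=1.0).fit(ivectors[:150], intelligibility[:150])
predicted = model.predict(ivectors[150:])
print(predicted[:5])
```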

38 citations


Journal ArticleDOI
TL;DR: Conversion of whispers into natural-sounding phonated speech as a noninvasive prosthetic aid for people with voice impairments who can only whisper is considered.
Abstract: Whispering is a natural, unphonated, secondary aspect of speech communications for most people. However, it is the primary mechanism of communications for some speakers who have impaired voice production mechanisms, such as partial laryngectomees, as well as for those prescribed voice rest, which often follows surgery or damage to the larynx. Unlike most people, who choose when to whisper and when not to, these speakers may have little choice but to rely on whispers for much of their daily vocal interaction. Even though most speakers will whisper at times, and some speakers can only whisper, the majority of today’s computational speech technology systems assume or require phonated speech. This article considers conversion of whispers into natural-sounding phonated speech as a noninvasive prosthetic aid for people with voice impairments who can only whisper. As a by-product, the technique is also useful for unimpaired speakers who choose to whisper. Speech reconstruction systems can be classified into those requiring training and those that do not. Among the latter, a recent parametric reconstruction framework is explored and then enhanced through a refined estimation of plausible pitch from weighted formant differences. The improved reconstruction framework, with proposed formant-derived artificial pitch modulation, is validated through subjective and objective comparison tests alongside state-of-the-art alternatives.

34 citations


Journal ArticleDOI
TL;DR: The results suggest that the participants overwhelmingly preferred the search engine method to the two browsing conditions, and the broad structure resulted in significantly higher failure rates than the search engine condition and the deep structure condition.
Abstract: The ability to gather information online has become increasingly important in the past decades. Previous research suggests that people with cognitive disabilities experience challenges when finding information on websites. Although a number of studies examined the impact of various design guidelines on information search by people with cognitive disabilities, our knowledge in this topic remains limited. To date, no study has been conducted to examine how people with cognitive disabilities navigate in different content structures. We completed an empirical study to investigate the impact of different search methods and content structures on the search behavior of people with cognitive disabilities. 23 participants with various cognitive disabilities completed 15 information search tasks under three conditions: browsing a website with a deep structure (4 × 4 × 4 × 4), browsing a website with a broad structure (16 × 16), and searching through a search engine. The results suggest that the participants overwhelmingly preferred the search engine method to the two browsing conditions. The broad structure resulted in significantly higher failure rates than the search engine condition and the deep structure condition. The causes of failed search tasks were analyzed in detail. Participants frequently visited incorrect categories in both the deep structure and the broad structure conditions. However, it was more difficult to recover from incorrect categories on the lower-level pages in the broad structure than in the deep structure. Under the search engine condition, failed tasks were mainly caused by difficulty in selecting the correct link from the returned list, misspellings, and difficulty in generating appropriate search keywords.

28 citations


Journal ArticleDOI
TL;DR: KINECTWheels, a toolkit designed to integrate wheelchair movements into motion-based games, is presented; it has the potential to encourage players of all ages to develop a positive relationship with their wheelchair.
Abstract: People using wheelchairs have access to fewer sports and other physically stimulating leisure activities than nondisabled persons, and often lead sedentary lifestyles that negatively influence their health. While motion-based video games have demonstrated great potential of encouraging physical activity among nondisabled players, the accessibility of motion-based games is limited for persons with mobility disabilities, thus also limiting access to the potential health benefits of playing these games. In our work, we address this issue through the design of wheelchair-accessible motion-based game controls. We present KINECTWheels, a toolkit designed to integrate wheelchair movements into motion-based games. Building on the toolkit, we developed Cupcake Heaven, a wheelchair-based video game designed for older adults using wheelchairs, and we created Wheelchair Revolution, a motion-based dance game that is accessible to both persons using wheelchairs and nondisabled players. Evaluation results show that KINECTWheels can be applied to make motion-based games wheelchair-accessible, and that wheelchair-based games engage broad audiences in physically stimulating play. Through the application of the wheelchair as an enabling technology in games, our work has the potential of encouraging players of all ages to develop a positive relationship with their wheelchair.

Journal ArticleDOI
TL;DR: The results support the feasibility of the system as a complement to traditional face-to-face therapy through the use of mobile tools and automated speech analysis algorithms.
Abstract: We present a multitier system for the remote administration of speech therapy to children with apraxia of speech. The system uses a client-server architecture model and facilitates task-oriented remote therapeutic training in both in-home and clinical settings. The system allows a speech language pathologist (SLP) to remotely assign speech production exercises to each child through a web interface and the child to practice these exercises in the form of a game on a mobile device. The mobile app records the child's utterances and streams them to a back-end server for automated scoring by a speech-analysis engine. The SLP can then review the individual recordings and the automated scores through a web interface, provide feedback to the child, and adapt the training program as needed. We have validated the system through a pilot study with children diagnosed with apraxia of speech, their parents, and SLPs. Here, we describe the overall client-server architecture, middleware tools used to build the system, speech-analysis tools for automatic scoring of utterances, and present results from a clinical study. Our results support the feasibility of the system as a complement to traditional face-to-face therapy through the use of mobile tools and automated speech analysis algorithms.
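The record-upload-score loop of such a client-server design can be illustrated with a minimal web endpoint. Flask, the /upload route, and score_utterance are illustrative stand-ins, not the authors' middleware or speech-analysis engine.

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def score_utterance(wav_bytes):
    """Placeholder for an automated speech-analysis engine."""
    return {"goodness": 0.87}           # e.g. a per-utterance pronunciation score

@app.route("/upload", methods=["POST"])
def upload():
    wav_bytes = request.get_data()      # raw audio streamed by the mobile app
    child_id = request.args.get("child_id", "unknown")
    score = score_utterance(wav_bytes)
    # In a full system the recording and score would be stored for SLP review.
    return jsonify({"child_id": child_id, **score})

if __name__ == "__main__":
    app.run(port=5000)
```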

Journal ArticleDOI
TL;DR: This article investigates automatic speech processing approaches dedicated to the detection and localization of abnormal acoustic phenomena in speech signal produced by people with speech disorders and proposes two different approaches that obtain very encouraging results.
Abstract: Perceptual evaluation is still the most common method in clinical practice for diagnosing and following the progression of the condition of people with speech disorders. Although a number of studies have addressed the acoustic analysis of speech productions exhibiting impairments, additional descriptive analysis is required to manage interperson variability, considering speakers with the same condition or across different conditions. In this context, this article investigates automatic speech processing approaches dedicated to the detection and localization of abnormal acoustic phenomena in speech signal produced by people with speech disorders. This automatic process aims at enhancing the manual investigation of human experts while at the same time reducing the extent of their intervention by calling their attention to specific parts of the speech considered as atypical from an acoustical point of view. Two different approaches are proposed in this article. The first approach models only the normal speech, whereas the second models both normal and dysarthric speech. Both approaches are evaluated following two strategies: one consists of a strict phone comparison between a human annotation of abnormal phones and the automatic output, while the other uses a “one-phone delay” for the comparison. The experimental evaluation of both approaches for the task of detecting acoustic anomalies was conducted on two different corpora composed of French dysarthric speakers and control speakers. These approaches obtain very encouraging results and their potential for clinical uses with different types of dysarthria and neurological diseases is quite promising.
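The difference between the strict comparison and the "one-phone delay" comparison is simply the tolerance allowed when aligning automatic detections with the expert annotation. A minimal sketch, with made-up phone indices:

```python
def matches_with_delay(annotated, detected, delay=1):
    """Count detected anomalies that fall within `delay` phone positions of a
    human-annotated abnormal phone (delay=0 is the strict comparison)."""
    hits = 0
    for pos in detected:
        if any(abs(pos - ref) <= delay for ref in annotated):
            hits += 1
    return hits

annotated_abnormal = {4, 11, 12}        # phone indices flagged by the expert
detected_abnormal = [3, 11, 20]         # phone indices flagged automatically
print(matches_with_delay(annotated_abnormal, detected_abnormal, delay=1))  # -> 2
print(matches_with_delay(annotated_abnormal, detected_abnormal, delay=0))  # -> 1
```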

Journal ArticleDOI
TL;DR: The findings confirm past work with sighted users showing that the hand results in faster pointing than the phone, and highlight the potential of on-body input to support accessible nonvisual mobile computing.
Abstract: On-body interaction, in which users employ their own body as an input surface, has the potential to provide efficient mobile computing access for blind users. It offers increased tactile and proprioceptive feedback compared to a phone and, because it is always available, it should allow for quick audio output control without having to retrieve the phone from a pocket or bag. Despite this potential, there has been little investigation of on-body input for users with visual impairments. To assess blind users’ performance with on-body input versus touchscreen input, we conducted a controlled lab study with 12 sighted and 11 blind participants. Study tasks included basic pointing and drawing of more complex shape gestures. Our findings confirm past work with sighted users showing that the hand results in faster pointing than the phone. Most important, we also show that: (1) the performance gain of the hand applies to blind users as well, (2) the accuracy of where the pointing finger first lands is higher with the hand than the phone, (3) on-hand pointing performance is affected by the location of targets, and (4) shape gestures drawn on the hand result in higher gesture recognition rates than those on the phone. Our findings highlight the potential of on-body input to support accessible nonvisual mobile computing.

Journal ArticleDOI
TL;DR: Two techniques to teach touchscreen gestures to users with visual impairments are proposed and evaluated: gesture sonification to generate sound based on finger touches, creating an audio representation of a gesture; and corrective verbal feedback that combined automatic analysis of the user's drawn gesture with speech feedback.
Abstract: While sighted users may learn to perform touchscreen gestures through observation (e.g., of other users or video tutorials), such mechanisms are inaccessible for users with visual impairments. As a result, learning to perform gestures without visual feedback can be challenging. We propose and evaluate two techniques to teach touchscreen gestures to users with visual impairments: (1) gesture sonification to generate sound based on finger touches, creating an audio representation of a gesture; and (2) corrective verbal feedback that combines automatic analysis of the user's drawn gesture with speech feedback. To refine and evaluate the techniques, we conducted three controlled laboratory studies. The first study, with 12 sighted participants, compared parameters for sonifying gestures in an eyes-free scenario. We identified pitch+stereo panning as the best combination. In the second study, ten blind and low-vision participants completed gesture replication tasks for single-stroke, multistroke, and multitouch gestures using the gesture sonification feedback. We found that multistroke gestures were more difficult to understand in sonification, but that playing each finger sound serially may improve understanding. In the third study, six blind and low-vision participants completed gesture replication tasks with both the sonification and corrective verbal feedback techniques. Subjective data and preliminary performance findings indicated that the techniques offer complementary advantages: although verbal feedback was preferred overall primarily due to the precision of its instructions, almost all participants appreciated the sonification for certain situations (e.g., to convey speed). This article extends our previous publication on gesture sonification by applying these techniques to multistroke and multitouch gestures. These findings provide a foundation for nonvisual training systems for touchscreen gestures.
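The pitch + stereo panning combination identified in the first study amounts to mapping the two touch coordinates onto two audio parameters. The screen size, frequency range, and pan law below are illustrative values, not those used in the study.

```python
def sonify_touch(x, y, width=1080, height=1920,
                 f_low=220.0, f_high=880.0):
    """Map a touch point to (frequency in Hz, stereo pan in [-1, 1])."""
    pitch = f_low + (1 - y / height) * (f_high - f_low)   # higher on screen = higher pitch
    pan = 2 * (x / width) - 1                             # left edge = -1, right edge = +1
    return pitch, pan

print(sonify_touch(x=270, y=480))   # upper-left quadrant -> higher pitch, panned left
```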

Journal ArticleDOI
TL;DR: This study quantitatively evaluated the effect of using language models with location knowledge on the efficiency of a word and sentence prediction system and introduced a second location-aware strategy that combines the location-specific approach with the all-purpose approach.
Abstract: In recent years, some works have discussed the conception of location-aware Augmentative and Alternative Communication (AAC) systems with very positive feedback from participants. However, in most cases, complementary quantitative evaluations have not been carried out to confirm those results. To contribute to clarifying the validity of these approaches, our study quantitatively evaluated the effect of using language models with location knowledge on the efficiency of a word and sentence prediction system. Using corpora collected for three different locations (classroom, school cafeteria, home), location-specific language models were trained with sentences from each location and compared with a traditional all-purpose language model, trained on all corpora. User tests showed a modest mean improvement of 2.4% and 1.3% for Words Per Minute (WPM) and Keystroke Saving Rate (KSR), respectively, but the differences were not statistically significant. Since our text prediction system relies on the concept of sentence reuse, we ran a set of simulations with language models having different sentence knowledge levels (0%, 25%, 50%, 75%, 100%). We also introduced in the comparison a second location-aware strategy that combines the location-specific approach with the all-purpose approach (mixed approach). The mixed language models performed better under low sentence-reuse conditions (0%, 25%, 50%) with 1.0%, 1.3%, and 1.2% KSR improvements, respectively. The location-specific language models performed better under high sentence-reuse conditions (75%, 100%) with 1.7% and 1.5% KSR improvements, respectively.
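Keystroke Saving Rate is conventionally defined as the relative reduction in keystrokes compared with typing every character; the article reports differences in this quantity between language models. A minimal version of the conventional formula, with made-up numbers:

```python
def keystroke_saving_rate(keystrokes_with_prediction, characters_in_text):
    """KSR: percentage of keystrokes saved relative to typing every character
    (conventional definition; the numbers below are illustrative)."""
    return 100 * (1 - keystrokes_with_prediction / characters_in_text)

# A 120-character message produced with 78 keystrokes thanks to prediction:
print(keystroke_saving_rate(78, 120))   # -> 35.0
```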

Journal ArticleDOI
TL;DR: This article evaluates the data quality obtained from optical motion capture of isolated signs from Swedish sign language with a large number of low-cost cameras, and presents a novel dual-sensor approach to combine the data with low-cost, five-sensor instrumented gloves to provide a recording method with low manual postprocessing.
Abstract: Motion capture of signs provides unique challenges in the field of multimodal data collection. The dense packaging of visual information requires high fidelity and high bandwidth of the captured data. Even though marker-based optical motion capture provides many desirable features such as high accuracy, global fitting, and the ability to record body and face simultaneously, it is not widely used to record finger motion, especially not for articulated and syntactic motion such as signs. Instead, most signing avatar projects use costly instrumented gloves, which require long calibration procedures. In this article, we evaluate the data quality obtained from optical motion capture of isolated signs from Swedish sign language with a large number of low-cost cameras. We also present a novel dual-sensor approach to combine the data with low-cost, five-sensor instrumented gloves to provide a recording method with low manual postprocessing. Finally, we evaluate the collected data and the dual-sensor approach as transferred to a highly stylized avatar. The application of the avatar is a game-based environment for training Key Word Signing (KWS) as augmented and alternative communication (AAC), intended for children with communication disabilities.

Journal ArticleDOI
TL;DR: Results from both experiments tend to validate the use of GOP to measure speech capability loss, a dimension that could be used as a complement to physiological measures in pathologies causing speech disorders.
Abstract: In this article, we report on the use of an automatic technique to assess pronunciation in the context of several types of speech disorders. Even if such tools already exist, they are more widely used in a different context, namely, Computer-Assisted Language Learning, in which the objective is to assess nonnative pronunciation by detecting learners’ mispronunciations at segmental and/or suprasegmental levels. In our work, we sought to determine if the Goodness of Pronunciation (GOP) algorithm, which aims to detect phone-level mispronunciations by means of automatic speech recognition, could also detect segmental deviances in disordered speech. Our main experiment is an analysis of speech from people with unilateral facial palsy. This pathology may impact the realization of certain phonemes such as bilabial plosives and sibilants. Speech read by 32 speakers at four different clinical severity grades was automatically aligned and GOP scores were computed for each phone realization. The highest scores, which indicate large dissimilarities with standard phone realizations, were obtained for the most severely impaired speakers. The corresponding speech subset was manually transcribed at phone level; 8.3% of the phones differed from standard pronunciations extracted from our lexicon. The GOP technique allowed the detection of 70.2% of mispronunciations with an equal rate of about 30% of false rejections and false acceptances. Finally, to broaden the scope of the study, we explored the correlation between GOP values and speech comprehensibility scores on a second corpus, composed of sentences recorded by six people with speech impairments due to cancer surgery or neurological disorders. Strong correlations were achieved between GOP scores and subjective comprehensibility scores (about 0.7 absolute). Results from both experiments tend to validate the use of GOP to measure speech capability loss, a dimension that could be used as a complement to physiological measures in pathologies causing speech disorders.
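The GOP idea can be summarized as a duration-normalized gap between the likelihood of the phone expected from forced alignment and that of the best competing phone. The sketch below follows that spirit with made-up frame log-likelihoods; it is not the exact scoring used in the article.

```python
import numpy as np

def gop_score(frame_loglik_expected, frame_loglik_best_competitor):
    """Duration-normalized Goodness of Pronunciation for one phone realization:
    the average per-frame gap between the log-likelihood of the expected phone
    (from forced alignment) and that of the best competing phone.
    Larger values indicate larger deviation from the expected realization."""
    expected = np.asarray(frame_loglik_expected)
    best = np.asarray(frame_loglik_best_competitor)
    return float(np.abs(expected - best).mean())

# Example: a 5-frame phone whose acoustics fit a competing phone better.
print(gop_score([-7.1, -6.8, -7.4, -7.0, -6.9],
                [-5.2, -5.0, -5.6, -5.1, -5.3]))
```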

Journal ArticleDOI
TL;DR: The Human Signal Intelligibility Model, a new conceptual model useful for informing evaluations of video intelligibility, is developed, along with a methodology for creating linguistically accessible web surveys for deaf people.
Abstract: Mobile sign language video conversations can become unintelligible if high video transmission rates cause network congestion and delayed video. In an effort to understand the perceived lower limits of intelligible sign language video intended for mobile communication, we evaluated sign language video transmitted at four low frame rates (1, 5, 10, and 15 frames per second [fps]) and four low fixed bit rates (15, 30, 60, and 120 kilobits per second [kbps]) at a constant spatial resolution of 320 × 240 pixels. We discovered an “intelligibility ceiling effect,” in which increasing the frame rate above 10fps did not improve perceived intelligibility, and increasing the bit rate above 60kbps produced diminishing returns. Given the study parameters, our findings suggest that relaxing the recommended frame rate and bit rate to 10fps at 60kbps will provide intelligible video conversations while reducing total bandwidth consumption to 25% of the ITU-T standard (at least 25fps and 100kbps). As part of this work, we developed the Human Signal Intelligibility Model, a new conceptual model useful for informing evaluations of video intelligibility and our methodology for creating linguistically accessible web surveys for deaf people. We also conducted a battery-savings experiment quantifying battery drain when sign language video is transmitted at the lower frame rates and bit rates. Results confirmed that increasing the transmission rates monotonically decreased the battery life.

Journal ArticleDOI
TL;DR: The iterative design and field trial of Amail, an email client specifically designed for people with aphasia who have problems expressing themselves verbally, indicated that, over time, persons with aphasia were able to improve their email communication.
Abstract: In this article, we describe the iterative design and field trial of Amail, an email client specifically designed for people with aphasia who have problems expressing themselves verbally. We conducted a 3-month study with eight persons with aphasia to better understand how people with aphasia could integrate Amail in their daily life. Subjective data (questionnaires, interviews, and diaries) and objective data (usage logs) were collected to gain understanding of the usage patterns. All persons with aphasia in our study were able to use Amail independently, and four participants continued using Amail after the study period. The usage patterns, especially the frequency and length of the composed email messages, indicated that, over time, persons with aphasia were able to improve their email communication. The email partners also had the impression that their correspondents with aphasia were improving gradually. Last but not least, the use of Amail positively influenced the number and quality of social contacts for the persons with aphasia. We also report some of the challenges encountered while conducting the field trial.

Journal ArticleDOI
TL;DR: A phoneme-categorized subdictionary and a dictionary selection method using NMF is proposed to reduce the mismatching of phoneme alignment in a voice conversion method for a person with an articulation disorder resulting from athetoid cerebral palsy.
Abstract: We present a voice conversion (VC) method for a person with an articulation disorder resulting from athetoid cerebral palsy. The movements of such speakers are limited by their athetoid symptoms and their consonants are often unstable or unclear, which makes it difficult for them to communicate. Exemplar-based spectral conversion using Nonnegative Matrix Factorization (NMF) is applied to a voice from a speaker with an articulation disorder. In our conventional work, we used a combined dictionary that was constructed from the source speaker’s vowels and the consonants from a target speaker without articulation disorders in order to preserve the speaker’s individuality. However, this conventional exemplar-based approach needs to use all the training exemplars (frames), and it may cause mismatching of phonemes between input signals and selected exemplars. In order to reduce the mismatching of phoneme alignment, we propose a phoneme-categorized subdictionary and a dictionary selection method using NMF. The effectiveness of this method was confirmed by comparing its effectiveness with that of a conventional Gaussian Mixture Model (GMM)-based and a conventional exemplar-based method.
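The exemplar-based conversion step can be sketched with plain NMF: activations estimated against a source dictionary are reused with a parallel target dictionary. The dictionaries and spectra below are random placeholders, and the article's phoneme-categorized subdictionaries and dictionary selection are not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(1)
D_src = rng.random((64, 200))      # source exemplars (spectral frames as columns)
D_tgt = rng.random((64, 200))      # parallel target exemplars
X = rng.random((64, 30))           # input spectrogram to convert (30 frames)

# Estimate activations H with multiplicative updates, keeping the dictionary fixed
# (standard NMF update for minimizing the Frobenius reconstruction error).
H = rng.random((200, 30))
for _ in range(100):
    H *= (D_src.T @ X) / (D_src.T @ (D_src @ H) + 1e-9)

X_converted = D_tgt @ H            # impose the source activations on target exemplars
print(X_converted.shape)           # -> (64, 30)
```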

Journal ArticleDOI
TL;DR: Three articles are presented that are extended versions of conference papers presented at the 15th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS’13), which was held in Bellevue, Washington, October 21 to 23, 2013.
Abstract: We are pleased to present three articles that are extended versions of conference papers presented at the 15th International ACM SIGACCESS Conference on Computers and Accessibility (ASSETS’13), which was held in Bellevue, Washington, October 21 to 23, 2013. Authors of several top papers from the conference submitted manuscripts for consideration, which underwent a full review process for the ACM Transactions on Accessible Computing. The guest editors for these articles include Jonathan Lazar (Towson University) and Richard Ladner (University of Washington), who served as Program Chair for ASSETS’13. This issue includes the first three of these articles that have been accepted to this special issue of TACCESS; additional articles based on ASSETS’13 papers may appear in a future issue of TACCESS. We thank the authors for their excellent submissions, and we also thank all of the reviewers for the journal who contributed their time and expertise to this process. The first article, “Experiences of Someone with a Neuromuscular Disease in Operating a PC (and Ways to Successfully Overcome Challenges),” is based on an Experience Report from the conference. It gives a first-hand account of the development and use of a computer access technology for someone who has a severe motor disability who cannot effectively use a standard keyboard and mouse. The second article, “Improving Public Transit Accessibility for Blind Riders by Crowdsourcing Bus Stop Landmark Locations with Google Street View: An Extended Analysis,” gives strong evidence that crowd workers can accurately identify features of bus stops that are particularly useful to blind transit riders. The conference version of this article won the Best Paper Award at the conference. The third article, “Designing Wheelchair-Based Movement Games,” describes several wheelchair games, based on Microsoft Kinect technology, that can provide fun and exercise to people with limited exercise choices. The conference version of this article won the Best Student Paper Award at the conference.

Journal ArticleDOI
TL;DR: This work illustrates some of the difficulties the first author, who has considerable motor problems, usually faces when operating a computer, and how the innovative approach to human-computer interaction embodied in the software tool OnScreenDualScribe addresses them.
Abstract: This article describes the experiences of the first author, who was diagnosed with the neuromuscular disease Friedreich's Ataxia more than 25 years ago, with the innovative approach to human-computer interaction characterized by the software tool OnScreenDualScribe. Originally developed by (and for!) the first author, the tool replaces the standard input devices—that is, keyboard and mouse—with a small numeric keypad, making optimal use of his abilities. This work attempts to illustrate some of the difficulties the first author usually has to face when operating a computer, due to considerable motor problems. The article will discuss what he tried in the past, and why OnScreenDualScribe, offering various assistive techniques—including word prediction, an ambiguous keyboard, and stepwise pointing operations—is indeed a viable alternative. In a pilot study that was repeated multiple times with slight variations over a period of 3 years, the first author's entry rate with OnScreenDualScribe (including early versions of the tool) increased from 1.38wpm to 6.16wpm, while his achievable typing rate went from 12wpm to 3.5wpm in the course of 24 years. However, the ultimate goal is to help not just one single person, but to make the system—which not only accelerates entry, but also clearly reduces the required effort—available to anyone with similar conditions.