scispace - formally typeset
Author

Matt Huenerfauth

Bio: Matt Huenerfauth is an academic researcher at the Rochester Institute of Technology. He has contributed to research on American Sign Language and sign language, has an h-index of 24, and has co-authored 120 publications receiving 1,929 citations. His previous affiliations include Gallaudet University and the University of Rochester.


Papers
Proceedings Article
23 Aug 2010
TL;DR: Features based on in-domain language models are found to have the highest predictive power; entity-density and POS features, in particular nouns, are individually very useful but highly correlated.
Abstract: Several sets of explanatory variables - including shallow, language modeling, POS, syntactic, and discourse features - are compared and evaluated in terms of their impact on predicting the grade level of reading material for primary school students. We find that features based on in-domain language models have the highest predictive power. Entity-density (a discourse feature) and POS features, in particular nouns, are individually very useful but highly correlated. Average sentence length (a shallow feature) is more useful - and less expensive to compute - than individual syntactic features. A judicious combination of features examined here results in a significant improvement over the state of the art.
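The abstract singles out average sentence length as a cheap, effective shallow feature. A minimal sketch of how such a feature can be computed is shown below; the regex-based sentence splitter is an illustrative assumption, not the tokenization the paper used.

```python
import re

def average_sentence_length(text):
    """Shallow readability feature: mean number of word tokens per sentence.
    Sentence splitting here is a naive punctuation heuristic for illustration."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    word_counts = [len(s.split()) for s in sentences]
    return sum(word_counts) / len(word_counts)

print(average_sentence_length("The cat sat. It purred loudly and slept."))  # 4.0
```

In a full system, features like this would be combined with language-model and POS features in a supervised grade-level classifier.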

237 citations

Proceedings ArticleDOI
24 Oct 2019
TL;DR: The results of an interdisciplinary workshop are presented, providing key background that is often overlooked by computer scientists, a review of the state-of-the-art, a set of pressing challenges, and a call to action for the research community.
Abstract: Developing successful sign language recognition, generation, and translation systems requires expertise in a wide range of fields, including computer vision, computer graphics, natural language processing, human-computer interaction, linguistics, and Deaf culture. Despite the need for deep interdisciplinary knowledge, existing research occurs in separate disciplinary silos, and tackles separate portions of the sign language processing pipeline. This leads to three key questions: 1) What does an interdisciplinary view of the current landscape reveal? 2) What are the biggest challenges facing the field? and 3) What are the calls to action for people working in the field? To help answer these questions, we brought together a diverse group of experts for a two-day workshop. This paper presents the results of that interdisciplinary workshop, providing key background that is often overlooked by computer scientists, a review of the state-of-the-art, a set of pressing challenges, and a call to action for the research community.

237 citations

Proceedings ArticleDOI
30 Mar 2009
TL;DR: Experiments show that discourse-level, cognitively motivated features improve automatic readability assessment; the authors develop and evaluate a tool for automatically rating the readability of texts for adults with intellectual disabilities.
Abstract: We investigate linguistic features that correlate with the readability of texts for adults with intellectual disabilities (ID). Based on a corpus of texts (including some experimentally measured for comprehension by adults with ID), we analyze the significance of novel discourse-level features related to the cognitive factors underlying our users' literacy challenges. We develop and evaluate a tool for automatically rating the readability of texts for these users. Our experiments show that our discourse-level, cognitively-motivated features improve automatic readability assessment.

147 citations

01 Jan 2006
TL;DR: To evaluate the functionality and scalability of the most novel portion of this English-to-ASL MT design, the project implemented a prototype version of the planning-based classifier predicate generator.
Abstract: A majority of deaf 18-year-olds in the United States have an English reading level below that of a typical 10-year-old student, and so machine translation (MT) software that could translate English text into American Sign Language (ASL) animations could significantly improve these individuals' access to information, communication, and services. Previous English-to-ASL MT projects have made limited progress by restricting their output to subsets of ASL phenomena---thus avoiding important linguistic and animation issues. None of these systems have shown how to generate classifier predicates (CPs), a phenomenon in which signers use special hand movements to indicate the location and movement of invisible objects (representing entities under discussion) in space around their bodies. CPs are frequent in ASL and are necessary for conveying many concepts. This project has created an English-to-ASL MT design capable of producing classifier predicates. The classifier predicate generator inside this design has a planning-based architecture that uses a 3D "visualization" model of the arrangement of objects in a scene discussed by the English input text. This generator would be one pathway in a multi-path English-to-ASL MT design, with separate processing pathways used to generate classifier predicates, other ASL sentences, and animations of Signed English (if the system lacked lexical resources for some input). Instead of representing the ASL animation as a string (of individual signs to perform), this system encodes the multimodal language signal as multiple channels that are hierarchically structured and coordinated over time.
While this design feature and others have been prompted by the unique requirements of generating a sign language, these technologies have applications for the machine translation of written languages, the representation of other multimodal language signals, and the production of meaningful gestures by other animated virtual human characters. To evaluate the functionality and scalability of the most novel portion of this English-to-ASL MT design, this project has implemented a prototype version of the planning-based classifier predicate generator. The classifier predicate animations produced by the system have been shown to native ASL signers to evaluate the output.
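The abstract contrasts a flat string of signs with a multi-channel representation coordinated over a shared timeline. A minimal sketch of such a structure is given below; the channel names and the glosses ("CL:3-vehicle-move", "gaze-at-locus") are hypothetical examples, not the system's actual inventory.

```python
from dataclasses import dataclass, field

@dataclass
class ChannelEvent:
    # One articulator action, with start/end times (seconds) on the shared timeline.
    name: str
    start: float
    end: float

@dataclass
class MultiChannelScore:
    # Parallel channels (e.g. right hand, eye gaze) coordinated over time.
    channels: dict = field(default_factory=dict)

    def add(self, channel, name, start, end):
        self.channels.setdefault(channel, []).append(ChannelEvent(name, start, end))

    def active_at(self, t):
        # Report what every channel is doing at time t -- the cross-channel
        # coordination that a flat string of signs cannot express.
        return {ch: [e.name for e in evs if e.start <= t < e.end]
                for ch, evs in self.channels.items()}

score = MultiChannelScore()
score.add("right_hand", "CL:3-vehicle-move", 0.0, 1.2)
score.add("eye_gaze", "gaze-at-locus", 0.2, 1.0)
print(score.active_at(0.5))
# {'right_hand': ['CL:3-vehicle-move'], 'eye_gaze': ['gaze-at-locus']}
```

A planner can schedule events independently per channel while still querying the combined state at any instant, which is the property a string encoding lacks.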

107 citations

Proceedings ArticleDOI
18 Jun 2018
TL;DR: A novel hybrid model, 3D recurrent convolutional neural networks (3DRCNN), is proposed to recognize American Sign Language (ASL) gestures and localize their temporal boundaries within continuous videos, by fusing multi-modality features.
Abstract: In this paper, we propose a novel hybrid model, 3D recurrent convolutional neural networks (3DRCNN), to recognize American Sign Language (ASL) gestures and localize their temporal boundaries within continuous videos by fusing multi-modality features. Our proposed 3DRCNN model integrates a 3D convolutional neural network (3DCNN) and an enhanced fully connected recurrent neural network (FC-RNN), where the 3DCNN learns multi-modality features from RGB, motion, and depth channels, and the FC-RNN captures the temporal information among short clips divided from the original video. Consecutive clips with the same semantic meaning are singled out by applying a sliding-window approach to segment the clips over the entire video sequence. To evaluate our method, we collected a new ASL dataset which contains two types of videos: Sequence videos (in which a human performs a list of specific ASL words) and Sentence videos (in which a human performs ASL sentences containing multiple ASL words). The dataset is fully annotated for each semantic region (i.e., the time duration of each word that the human signer performs) and contains multiple input channels. Our proposed method achieves 69.2% accuracy on the Sequence videos for 27 ASL words, which demonstrates its effectiveness at detecting ASL gestures in continuous videos.
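The network itself is beyond a short sketch, but the post-processing step the abstract describes - merging consecutive clips that share a semantic label into one temporal segment - can be illustrated with a few lines; the word labels below are hypothetical.

```python
def merge_consecutive_clips(clip_labels):
    """Group consecutive clip-level predictions that share a label into
    (label, start_clip, end_clip) segments, recovering word boundaries
    from per-clip classifications."""
    segments = []
    for i, label in enumerate(clip_labels):
        if segments and segments[-1][0] == label:
            prev = segments[-1]
            segments[-1] = (label, prev[1], i)  # extend the current segment
        else:
            segments.append((label, i, i))      # start a new segment
    return segments

print(merge_consecutive_clips(["HELLO", "HELLO", "THANKS", "THANKS", "THANKS"]))
# [('HELLO', 0, 1), ('THANKS', 2, 4)]
```

Mapping clip indices back to frame times then yields the temporal boundary of each recognized word.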

62 citations


Cited by
Book
01 Jan 2009
TL;DR: A brief overview of the status of the Convention as at 3 August 2007 is presented and recent efforts of the United Nations and agencies to disseminate information on the Convention and the Optional Protocol are described.
Abstract: The present report is submitted in response to General Assembly resolution 61/106, by which the Assembly adopted the Convention on the Rights of Persons with Disabilities and the Optional Protocol thereto. As requested by the Assembly, a brief overview of the status of the Convention as at 3 August 2007 is presented. The report also contains a brief description of technical arrangements on staff and facilities made necessary for the effective performance of the functions of the Conference of States Parties and the Committee under the Convention and the Optional Protocol, and a description on the progressive implementation of standards and guidelines for the accessibility of facilities and services of the United Nations system. Recent efforts of the United Nations and agencies to disseminate information on the Convention and the Optional Protocol are also described.

2,115 citations

Proceedings Article
22 Aug 1999
TL;DR: The accessibility, usability, and, ultimately, acceptability of Information Society Technologies by anyone, anywhere, at any time, and through any media and device are addressed.
Abstract: ▶ Addresses the accessibility, usability, and, ultimately, acceptability of Information Society Technologies by anyone, anywhere, at anytime, and through any media and device. ▶ Focuses on theoretical, methodological, and empirical research, of both technological and non-technological nature. ▶ Features papers that report on theories, methods, tools, empirical results, reviews, case studies, and best-practice examples.

752 citations