
Showing papers on "Utterance" published in 2011


Journal ArticleDOI
TL;DR: A survey of speech emotion classification addressing three important aspects of designing a speech emotion recognition system: the choice of suitable features for speech representation, the design of an appropriate classification scheme, and the proper preparation of an emotional speech database for evaluating system performance.

1,735 citations



Proceedings Article
19 Jun 2011
TL;DR: This work reports on a method for constructing a corpus of sarcastic Twitter messages in which determination of the sarcasm of each message has been made by its author and uses this reliable corpus to compare sarcastic utterances in Twitter to utterances that express positive or negative attitudes without sarcasm.
Abstract: Sarcasm transforms the polarity of an apparently positive or negative utterance into its opposite. We report on a method for constructing a corpus of sarcastic Twitter messages in which determination of the sarcasm of each message has been made by its author. We use this reliable corpus to compare sarcastic utterances in Twitter to utterances that express positive or negative attitudes without sarcasm. We investigate the impact of lexical and pragmatic factors on machine learning effectiveness for identifying sarcastic utterances and we compare the performance of machine learning techniques and human judges on this task. Perhaps unsurprisingly, neither the human judges nor the machine learning techniques perform very well.
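
To make the kind of lexical-plus-pragmatic feature setup described above concrete, here is a minimal Python sketch (not the authors' actual pipeline): a few invented tweets, some assumed pragmatic cue markers (exclamations, ellipses, emoticons, all-caps words), and a scikit-learn bag-of-words classifier.

# A minimal sketch of combining lexical unigram features with a few pragmatic
# cues for sarcasm classification on toy data. The tweets, labels and marker
# tokens are illustrative assumptions, not the paper's corpus or features.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

def add_pragmatic_markers(text: str) -> str:
    """Append marker tokens for pragmatic cues so they become features."""
    markers = []
    if "!" in text:
        markers.append("_EXCLAIM_")
    if "..." in text:
        markers.append("_ELLIPSIS_")
    if any(tok in text for tok in (":)", ":(", ";)")):
        markers.append("_EMOTICON_")
    if any(w.isupper() and len(w) > 2 for w in text.split()):
        markers.append("_ALLCAPS_")
    return text + " " + " ".join(markers)

# Toy corpus: 1 = sarcastic, 0 = sincere (labels are invented).
tweets = [
    ("Oh GREAT, another Monday... just what I needed! #sarcasm", 1),
    ("I just LOVE waiting an hour for the bus :)", 1),
    ("Had a wonderful time at the concert last night!", 0),
    ("The new library opens tomorrow morning.", 0),
]
texts = [add_pragmatic_markers(t) for t, _ in tweets]
labels = [y for _, y in tweets]

vectorizer = CountVectorizer(lowercase=True)
X = vectorizer.fit_transform(texts)
clf = LogisticRegression().fit(X, labels)

test = add_pragmatic_markers("Oh sure, ANOTHER meeting... can't wait!")
print(clf.predict(vectorizer.transform([test])))  # should print [1] for this toy setup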

592 citations


Book
25 Apr 2011
TL;DR: This book surveys spoken language understanding (SLU) for both human/machine and human/human conversations, including the role of SLU in commercial and research spoken dialogue systems.
Abstract: List of Contributors. Foreword. Preface.
1 Introduction (Gokhan Tur and Renato De Mori): A Brief History of Spoken Language Understanding; Organization of the Book.
Part 1: Spoken Language Understanding for Human/Machine Interactions.
2 History of Knowledge and Processes for Spoken Language Understanding (Renato De Mori): Introduction; Meaning Representation and Sentence Interpretation; Knowledge Fragments and Semantic Composition; Probabilistic Interpretation in SLU Systems; Interpretation with Partial Syntactic Analysis; Classification Models for Interpretation; Advanced Methods and Resources for Semantic Modeling and Interpretation; Recent Systems; Conclusions.
3 Semantic Frame-based Spoken Language Understanding (Ye-Yi Wang, Li Deng and Alex Acero): Background; Knowledge-based Solutions; Data-driven Approaches; Summary.
4 Intent Determination and Spoken Utterance Classification (Gokhan Tur and Li Deng): Background; Task Description; Technical Challenges; Benchmark Data Sets; Evaluation Metrics; Technical Approaches; Discussion and Conclusions.
5 Voice Search (Ye-Yi Wang, Dong Yu, Yun-Cheng Ju and Alex Acero): Background; Technology Review; Summary.
6 Spoken Question Answering (Sophie Rosset, Olivier Galibert and Lori Lamel): Introduction; Specific Aspects of Handling Speech in QA Systems; QA Evaluation Campaigns; Question-answering Systems; Projects Integrating Spoken Requests and Question Answering; Conclusions.
7 SLU in Commercial and Research Spoken Dialogue Systems (David Suendermann and Roberto Pieraccini): Why Spoken Dialogue Systems (Do Not) Have to Understand; Approaches to SLU for Dialogue Systems; From Call Flow to POMDP: How Dialogue Management Integrates with SLU; Benchmark Projects and Data Sets; Time is Money: The Relationship between SLU and Overall Dialogue System Performance; Conclusion.
8 Active Learning (Dilek Hakkani-Tur and Giuseppe Riccardi): Introduction; Motivation; Learning Architectures; Active Learning Methods; Combining Active Learning with Semi-supervised Learning; Applications; Evaluation of Active Learning Methods; Discussion and Conclusions.
Part 2: Spoken Language Understanding for Human/Human Conversations.
9 Human/Human Conversation Understanding (Gokhan Tur and Dilek Hakkani-Tur): Background; Human/Human Conversation Understanding Tasks; Dialogue Act Segmentation and Tagging; Action Item and Decision Detection; Addressee Detection and Co-reference Resolution; Hot Spot Detection; Subjectivity, Sentiment, and Opinion Detection; Speaker Role Detection; Modeling Dominance; Argument Diagramming; Discussion and Conclusions.
10 Named Entity Recognition (Frederic Bechet): Task Description; Challenges Using Speech Input; Benchmark Data Sets, Applications; Evaluation Metrics; Main Approaches for Extracting NEs from Text; Comparative Methods for NER from Speech; New Trends in NER from Speech; Conclusions.
11 Topic Segmentation (Matthew Purver): Task Description; Basic Approaches, and the Challenge of Speech; Applications and Benchmark Datasets; Evaluation Metrics; Technical Approaches; New Trends and Future Directions.
12 Topic Identification (Timothy J. Hazen): Task Description; Challenges Using Speech Input; Applications and Benchmark Tasks; Evaluation Metrics; Technical Approaches; New Trends and Future Directions.
13 Speech Summarization (Yang Liu and Dilek Hakkani-Tur): Task Description; Challenges when Using Speech Input; Data Sets; Evaluation Metrics; General Approaches; More Discussions on Speech versus Text Summarization; Conclusions.
14 Speech Analytics (I. Dan Melamed and Mazin Gilbert): Introduction; System Architecture; Speech Transcription; Text Feature Extraction; Acoustic Feature Extraction; Relational Feature Extraction; DBMS; Media Server and Player; Trend Analysis; Alerting System; Conclusion.
15 Speech Retrieval (Ciprian Chelba, Timothy J. Hazen, Bhuvana Ramabhadran and Murat Saraclar): Task Description; Applications; Challenges Using Speech Input; Evaluation Metrics; Benchmark Data Sets; Approaches; New Trends; Discussion and Conclusions.
Each chapter concludes with references. Index.

577 citations


Proceedings Article
23 Jun 2011
TL;DR: It is argued that fictional dialogs offer a way to study this question, since authors create the conversations but don't receive the social benefits (rather, the imagined characters do); significant coordination across many families of function words is found in a large movie-script corpus.
Abstract: Conversational participants tend to immediately and unconsciously adapt to each other's language styles: a speaker will even adjust the number of articles and other function words in their next utterance in response to the number in their partner's immediately preceding utterance. This striking level of coordination is thought to have arisen as a way to achieve social goals, such as gaining approval or emphasizing difference in status. But has the adaptation mechanism become so deeply embedded in the language-generation process as to become a reflex? We argue that fictional dialogs offer a way to study this question, since authors create the conversations but don't receive the social benefits (rather, the imagined characters do). Indeed, we find significant coordination across many families of function words in our large movie-script corpus. We also report suggestive preliminary findings on the effects of gender and other features; e.g., surprisingly, for articles, on average, characters adapt more to females than to males.
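
A rough Python sketch of the underlying idea of function-word coordination, under simplifying assumptions (a single word family, adjacency-pair exchanges, and a plain conditional-probability difference rather than the paper's exact measure):

# How much more likely is a reply to contain a word class (here, articles)
# when the preceding turn contained it? Positive values suggest coordination.
# The dialog below is invented for illustration.
ARTICLES = {"a", "an", "the"}

def contains_class(utterance, word_class):
    return any(tok in word_class for tok in utterance.lower().split())

def coordination(exchanges, word_class):
    """exchanges: list of (prompt_utterance, reply_utterance) pairs."""
    replies_with = sum(contains_class(r, word_class) for _, r in exchanges)
    base_rate = replies_with / len(exchanges)
    triggered = [(p, r) for p, r in exchanges if contains_class(p, word_class)]
    if not triggered:
        return 0.0
    cond_rate = sum(contains_class(r, word_class) for _, r in triggered) / len(triggered)
    return cond_rate - base_rate  # positive = coordination toward the prompt

dialog = [
    ("Did you see the report?", "Yes, the numbers look odd."),
    ("Lunch later?", "Sure, sounds good."),
    ("The meeting moved to a new room.", "I saw the email about it."),
]
print(coordination(dialog, ARTICLES))  # 0.333... on this toy data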

373 citations


Proceedings Article
01 Jan 2011
TL;DR: Evaluation showed that the performance of this HTK-based aligner is comparable to human alignment and to other existing alignment tools.
Abstract: We provide a user-friendly automatic phonetic alignment tool for continuous speech, named EasyAlign. It is developed as a plug-in of Praat, the popular speech analysis software, and it is freely available. Its main advantage is that one can easily align speech from an orthographic transcription. It requires a few minor manual steps, and the result is a multi-level annotation within a TextGrid composed of phonetic, syllabic, lexical and utterance tiers. Evaluation showed that the performance of this HTK-based aligner is comparable to human alignment and to other existing alignment tools. It was originally fully available for French and English. Community interest in extending it to other languages helped to develop a straightforward methodology for adding new languages. While Spanish and Taiwan Min were recently added, other languages are under development.

280 citations


Book ChapterDOI
01 Jan 2011
TL;DR: Discourse Representation Theory, or DRT, is one of a number of theories of dynamic semantics, which have come upon the scene in the course of the past twenty years to account for the context dependence of meaning.
Abstract: Discourse Representation Theory, or DRT, is one of a number of theories of dynamic semantics, which have come upon the scene in the course of the past twenty years. The central concern of these theories is to account for the context dependence of meaning. It is a ubiquitous feature of natural languages that utterances are interpretable only when the interpreter takes account of the contexts in which they are made – utterance meaning depends on context. Moreover, the interaction between context and utterance is reciprocal.

278 citations


Journal ArticleDOI
TL;DR: Four-year-olds were shown pictures in which three out of three objects fit a description and were asked to evaluate statements that relied on context-independent alternatives or contextual alternatives; the results support the hypothesis that children's difficulties with scalar implicature are due to a failure to generate relevant alternatives for specific scales.

253 citations


Patent
05 Jan 2011
TL;DR: A computer-implemented method in which a transcription of an utterance is presented in an area of a display of a computing device; when the user selects at least one word of the transcription, the device presents one or more alternate words for the selected word and/or a remove control, and the transcription is updated based on the user's choice.
Abstract: A computer-implemented method comprising: providing a transcription of an utterance for presentation in an area of a display of a computing device; receiving a user selection of at least one word of the transcription of the utterance; in response to receiving the user selection of the at least one word of the transcription of the utterance, presenting at the display of the computing device one or more of: (i) one or more alternate words for the user-selected at least one word of the transcription of the utterance and (ii) a remove control to remove the user-selected at least one word of the transcription of the utterance from the transcription of the utterance; receiving a user selection of the remove control or an alternate word from among the one or more alternate words; and updating the transcription of the utterance presented in the area of the display of the computing device based at least on the user selection. A system and computer program are also described.
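
A simplified, hypothetical Python sketch of the interaction flow the claim describes; the class name, alternate-word table and example transcription are illustrative, not taken from the patent.

# Select a word in a displayed transcription, offer alternate words or a
# "remove" action, and update the transcription accordingly.
class TranscriptionEditor:
    def __init__(self, words, alternates):
        self.words = list(words)        # current transcription tokens
        self.alternates = alternates    # word -> candidate replacements (assumed given)

    def options_for(self, index):
        """Return the choices shown when the user selects word `index`."""
        word = self.words[index]
        return {"alternates": self.alternates.get(word, []), "remove": True}

    def apply(self, index, choice):
        """`choice` is either 'remove' or one of the alternate words."""
        if choice == "remove":
            del self.words[index]
        else:
            self.words[index] = choice
        return " ".join(self.words)

editor = TranscriptionEditor(
    ["call", "dakota", "at", "noon"],
    {"dakota": ["Dakota", "the coda", "Dakotah"]},
)
print(editor.options_for(1))      # alternate words for the selected word, plus a remove control
print(editor.apply(1, "Dakota"))  # -> "call Dakota at noon"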

228 citations


Patent
18 Jan 2011
TL;DR: A speech translation system and methods for cross-lingual communication that enable users to improve and modify the content and usage of the system and to easily abort or reset translation are described.
Abstract: A speech translation system and methods for cross-lingual communication that enable users to improve and modify content and usage of the system and easily abort or reset translation. The system includes a speech recognition module configured for accepting an utterance, a machine translation module, an interface configured to communicate the utterance and proposed translation, a correction module and an abort action unit that removes any hypotheses or partial hypotheses and terminates translation. The system also includes modules for storing favorites, changing language mode, automatically identifying language, providing language drills, viewing third party information relevant to conversation, among other things.

193 citations


Patent
30 Aug 2011
TL;DR: In this paper, an action is performed in a spoken dialog system in response to a user's spoken utterance, and a policy which maps belief states of user intent to actions is retrieved or created.
Abstract: An action is performed in a spoken dialog system in response to a user's spoken utterance. A policy which maps belief states of user intent to actions is retrieved or created. A belief state is determined based on the spoken utterance, and an action is selected based on the determined belief state and the policy. The action is performed, and in one embodiment, involves requesting clarification of the spoken utterance from the user. Creating a policy may involve simulating user inputs and spoken dialog system interactions, and modifying policy parameters iteratively until a policy threshold is satisfied. In one embodiment, a belief state is determined by converting the spoken utterance into text, assigning the text to one or more dialog slots associated with nodes in a probabilistic ontology tree (POT), and determining a joint probability based on probability distribution tables in the POT and on the dialog slot assignments.
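
A toy Python sketch of the general pattern of mapping a belief state over user intents to an action, with clarification requested when the top hypothesis is uncertain; the intents, likelihoods and threshold are invented, and the probabilistic ontology tree machinery is not modeled here.

import math  # not strictly needed; kept for clarity if log-space scoring is preferred

def update_belief(prior, likelihoods):
    """Bayes-style update: posterior(intent) proportional to prior * P(utterance | intent)."""
    posterior = {i: prior[i] * likelihoods.get(i, 1e-6) for i in prior}
    total = sum(posterior.values())
    return {i: p / total for i, p in posterior.items()}

def select_action(belief, confidence_threshold=0.7):
    """Policy sketch: act on the top intent, or ask for clarification if uncertain."""
    intent, prob = max(belief.items(), key=lambda kv: kv[1])
    if prob < confidence_threshold:
        return f"clarify({intent})"   # request clarification from the user
    return f"execute({intent})"

prior = {"book_flight": 0.5, "check_status": 0.5}
likelihoods = {"book_flight": 0.6, "check_status": 0.2}  # assumed scores from the recognized utterance
belief = update_belief(prior, likelihoods)
print(belief)                 # {'book_flight': 0.75, 'check_status': 0.25}
print(select_action(belief))  # execute(book_flight)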

Journal ArticleDOI
TL;DR: A procedure is offered for constructing the context of utterance, insofar as it is relevant for quantity reasoning, as a game between speaker and hearer, and a new solution concept is given that improves on classical equilibrium approaches in that it uniquely selects the desired "empirically correct" play in these interpretation games by a chain of back-and-forth reasoning about players' behavior.
Abstract: Quantity implicatures are inferences triggered by an utterance based on what other utterances a speaker could have made instead. Using ideas and formalisms from game theory, I demonstrate that these inferences can be explained in a strictly Gricean sense as *rational behavior*. To this end, I offer a procedure for constructing the context of utterance insofar as it is relevant for quantity reasoning as a game between speaker and hearer. I then give a new solution concept that improves on classical equilibrium approaches in that it uniquely selects the desired "empirically correct" play in these interpretation games by a chain of back-and-forth reasoning about players' behavior. To make this formal approach more accessible to a wider audience, I give a simple algorithm with the help of which the model's solution can be computed without having to do heavy calculations of probabilities, expected utilities and the like. This rationalistic approach subsumes and improves on recent exhaustivity-based approaches. It makes correct and uniform predictions for quantity implicatures of various epistemic varieties, free choice readings of disjunctions, as well as a phenomenon tightly related to the latter, namely so-called "simplification of disjunctive antecedents". doi:10.3765/sp.4.1
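
As a worked illustration of this style of back-and-forth reasoning (not the paper's full formal model), the following Python sketch runs a few levels of iterated best response on the classic "some"/"all" game and recovers the "some, but not all" reading.

# States, messages and their literal meanings for the toy scalar game.
STATES = ["all", "some_not_all"]
MESSAGES = {"all": {"all"}, "some": {"all", "some_not_all"}}

def literal_hearer(msg):
    """Level-0 hearer: uniform over states where the message is literally true."""
    compatible = MESSAGES[msg]
    return {s: (1 / len(compatible) if s in compatible else 0.0) for s in STATES}

def speaker(state):
    """Level-1 speaker: pick the true message that the literal hearer resolves best."""
    truthful = [m for m, sem in MESSAGES.items() if state in sem]
    return max(truthful, key=lambda m: literal_hearer(m)[state])

def pragmatic_hearer(msg):
    """Level-2 hearer: Bayesian update against the level-1 speaker's choices (flat prior)."""
    weights = {s: 1.0 if speaker(s) == msg else 0.0 for s in STATES}
    total = sum(weights.values()) or 1.0
    return {s: w / total for s, w in weights.items()}

print(pragmatic_hearer("some"))  # all probability on "some_not_all": the implicature
print(pragmatic_hearer("all"))   # all probability on "all"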

Journal ArticleDOI
TL;DR: The authors examined the growth of receptive lexical skills in preschoolers over an academic year in relation to teacher speech and found that vocabulary growth was positively related to the total number of words produced by the teacher, but negatively related to the number of words per utterance.
Abstract: The present study examined the growth of receptive lexical skills in preschoolers over an academic year in relation to teacher speech. The participating students were English language learners and their monolingual English-speaking peers from the same classrooms. The measures of teacher input included indicators of the amount of speech (total number of words), lexical richness (number of different word types), and structural complexity (number of words per utterance). These measures were based on a speech sample collected during a classroom observation. For English language learners, vocabulary growth was positively related to the total number of words produced by the teacher, but negatively related to the number of words per utterance. For monolingual speakers, vocabulary growth was positively related to the number of word types produced by the teacher. The findings underscore the importance of considering different aspects of verbal input for understanding individual variability in language growth of preschool students.
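
A minimal Python sketch of the three input measures used (amount of speech, lexical richness, structural complexity), computed over an invented sample of teacher utterances.

def teacher_input_measures(utterances):
    tokens = [w.lower().strip(".,!?") for u in utterances for w in u.split()]
    total_words = len(tokens)                            # amount of speech
    word_types = len(set(tokens))                        # lexical richness
    words_per_utterance = total_words / len(utterances)  # structural complexity
    return total_words, word_types, words_per_utterance

sample = [
    "Okay friends, let's sit on the rug.",
    "Who remembers what we read yesterday?",
    "Yes, the caterpillar ate the apple.",
]
print(teacher_input_measures(sample))  # (19, 17, 6.33...) for this toy sample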

Patent
Srinivas Bangalore1
06 Dec 2011
TL;DR: In this article, a system is configured to monitor user utterances to generate a conversation context, and then the system receives a current user utterance independent of non-natural language input intended to trigger speech processing.
Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for processing speech. A system configured to practice the method monitors user utterances to generate a conversation context. Then the system receives a current user utterance independent of non-natural language input intended to trigger speech processing. The system compares the current user utterance to the conversation context to generate a context similarity score, and if the context similarity score is above a threshold, incorporates the current user utterance into the conversation context. If the context similarity score is below the threshold, the system discards the current user utterance. The system can compare the current user utterance to the conversation context based on an n-gram distribution, a perplexity score, and a perplexity threshold. Alternately, the system can use a task model to compare the current user utterance to the conversation context.
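
A simplified Python sketch of the perplexity-based variant of this idea; the unigram model, add-one smoothing and threshold value are assumptions rather than the patented method.

import math
from collections import Counter

class ConversationContext:
    def __init__(self, perplexity_threshold=15.0):
        self.counts = Counter()
        self.total = 0
        self.threshold = perplexity_threshold

    def _perplexity(self, utterance):
        """Unigram perplexity with add-one smoothing over the context vocabulary."""
        tokens = utterance.lower().split()
        vocab = max(len(self.counts), 1)
        log_prob = sum(
            math.log((self.counts[t] + 1) / (self.total + vocab)) for t in tokens
        )
        return math.exp(-log_prob / max(len(tokens), 1))

    def offer(self, utterance):
        """Incorporate the utterance if it fits the context, else discard it."""
        if self.total and self._perplexity(utterance) > self.threshold:
            return False  # discarded: too unlike the ongoing conversation
        self.counts.update(utterance.lower().split())
        self.total = sum(self.counts.values())
        return True

ctx = ConversationContext()
ctx.offer("what movies are playing tonight")
ctx.offer("any movies playing near me tonight")
print(ctx.offer("movies playing downtown tonight"))               # True: fits the context
print(ctx.offer("purple elephants juggle quantum spreadsheets"))  # False: discarded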

Journal ArticleDOI
TL;DR: The results reveal a quantified benefit-disruption spectrum of gaze on utterance comprehension and show that gaze is used, even during the initial movement phase, to restrict the spatial domain of potential referents.

Journal ArticleDOI
Duna Sabri1
TL;DR: The authors examine the dominance and sacralisation of the discourse of the student experience and question its positioning as a means of discriminating between the value of different experiences of education, arguing that it homogenises students and deprives them of agency at the same time as apparently giving them "voice".
Abstract: Speaking about ‘the student experience’ has become common-place in higher education and the phrase has acquired the aura of a sacred utterance in UK higher education policy over the last decade. A critical discourse analysis of selected higher education policy texts reveals what ‘the student experience’ has come to signify, and how it structures relations between students and academics, institutions and academics, and higher education institutions and government. ‘The student experience’ homogenises students and deprives them of agency at the same time as apparently giving them ‘voice’. This paper examines the dominance and sacralisation of the discourse of ‘the student experience’ and questions its positioning as a means of discriminating between the value of different experiences of education.

Patent
25 Mar 2011
TL;DR: In this article, an utterance is received from a user in reply to a text message, via a microphone that converts the reply utterance into a speech signal, which is processed using at least one processor to extract acoustic data from the speech signal.
Abstract: A method of automatic speech recognition. An utterance is received from a user in reply to a text message, via a microphone that converts the reply utterance into a speech signal. The speech signal is processed using at least one processor to extract acoustic data from the speech signal. An acoustic model is identified from a plurality of acoustic models to decode the acoustic data, and using a conversational context associated with the text message. The acoustic data is decoded using the identified acoustic model to produce a plurality of hypotheses for the reply utterance.

Patent
Jason D. Williams1, Ethan Selfridge1
01 Sep 2011
TL;DR: In this article, the authors present systems, methods, and non-transitory computer-readable storage media for advanced turn-taking in an interactive spoken dialog system that incrementally process speech prior to completion of the speech utterance, and can communicate partial speech recognition results upon finding particular conditions.
Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for advanced turn-taking in an interactive spoken dialog system A system configured according to this disclosure can incrementally process speech prior to completion of the speech utterance, and can communicate partial speech recognition results upon finding particular conditions A first condition which, if found, allows the system to communicate partial speech recognition results, is that the most recent word found in the partial results is statistically likely to be the termination of the utterance, also known as a terminal node A second condition is the determination that all search paths within a speech lattice converge to a common node, also known as a pinch node, before branching out again Upon finding either condition, the system can communicate the partial speech recognition results Stability and correctness probabilities can also determine which partial results are communicated
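
A toy Python illustration of the "pinch node" idea: in a small hand-built word lattice, find the interior nodes that every recognition path passes through, after which partial results could safely be released. The lattice and node labels are invented, and this brute-force path enumeration is only practical for tiny examples.

def all_paths(lattice, start, end, path=None):
    path = (path or []) + [start]
    if start == end:
        return [path]
    return [p for nxt in lattice.get(start, []) for p in all_paths(lattice, nxt, end, path)]

def pinch_nodes(lattice, start, end):
    paths = all_paths(lattice, start, end)
    common = set(paths[0]).intersection(*map(set, paths[1:])) if paths else set()
    return common - {start, end}   # interior nodes shared by every path

# Hypothetical lattice: competing hypotheses "flights to boston/austin today".
lattice = {
    "0": ["1"],           # "flights"
    "1": ["2"],           # "to"
    "2": ["3a", "3b"],    # "boston" vs "austin"
    "3a": ["4"],
    "3b": ["4"],
    "4": ["5"],           # "today"
}
print(pinch_nodes(lattice, "0", "5"))  # e.g. {'1', '2', '4'}: safe points to commit partial results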

Journal ArticleDOI
TL;DR: This investigation suggests that statistical learning mechanisms actually benefit from variability in utterance length, and provides the first evidence that isolated words and longer utterances act in concert to support infant word segmentation.
Abstract: Infants are adept at tracking statistical regularities to identify word boundaries in pause-free speech. However, researchers have questioned the relevance of statistical learning mechanisms to language acquisition, since previous studies have used simplified artificial languages that ignore the variability of real language input. The experiments reported here embraced a key dimension of variability in infant-directed speech. English-learning infants (8–10 months) listened briefly to natural Italian speech that contained either fluent speech only or a combination of fluent speech and single-word utterances. Listening times revealed successful learning of the statistical properties of target words only when words appeared both in fluent speech and in isolation; brief exposure to fluent speech alone was not sufficient to facilitate detection of the words’ statistical properties. This investigation suggests that statistical learning mechanisms actually benefit from variability in utterance length, and provides the first evidence that isolated words and longer utterances act in concert to support infant word segmentation.
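
For readers unfamiliar with the statistical cue at issue, here is a bare-bones Python sketch of segmentation by forward transitional probabilities over an invented syllable stream; the threshold and toy "words" are illustrative only.

from collections import Counter

def transitional_probabilities(syllable_stream):
    pair_counts = Counter(zip(syllable_stream, syllable_stream[1:]))
    first_counts = Counter(syllable_stream[:-1])
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

def segment(syllable_stream, threshold=0.75):
    """Insert a word boundary wherever the forward transitional probability dips."""
    tp = transitional_probabilities(syllable_stream)
    words, current = [], [syllable_stream[0]]
    for a, b in zip(syllable_stream, syllable_stream[1:]):
        if tp[(a, b)] < threshold:
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words

# Invented stream: "fuga melo fuga bici melo bici fuga melo bici".
stream = ["fu", "ga", "me", "lo", "fu", "ga", "bi", "ci", "me", "lo",
          "bi", "ci", "fu", "ga", "me", "lo", "bi", "ci"]
print(segment(stream))  # ['fuga', 'melo', 'fuga', 'bici', 'melo', 'bici', 'fuga', 'melo', 'bici']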

01 Jan 2011
TL;DR: It is argued that an utterance of an imperative creates an obligation for the addressee, a view explicitly espoused by Lewis (1969), just as performatively used necessity modals and explicit performatives constituting promises or orders bring about obligations for their addressees or speakers.
Abstract: Certain types of utterances, by virtue of being made, bring about obligations on their speakers or addressees. An utterance of a performatively used necessity modal brings about an obligation for the addressee (Kamp 1973). Explicitly performative utterances constituting promises or orders do the same for the speaker and addressee, respectively (Searle 1964; Alston 2000; Truckenbrodt 2009). It would seem that in the same fashion an utterance of an imperative creates an obligation for the addressee, a view explicitly espoused by Lewis (1969).

01 Jan 2011
TL;DR: This paper explores the viability of a syntactic analysis of a range of empirical data that have so far received scarce attention in the syntactic literature, namely pragmatic markers which appear on either the left or the right edge of the utterance, such as those illustrated in (1), from Romanian (R) and West Flemish (WF), and whose distribution is also shown to interact with that of vocatives.
Abstract: The goal of this paper is to explore the viability of a syntactic analysis of a range of empirical data that have so far received scarce attention in the syntactic literature, namely pragmatic markers which appear either on the left or the right edge of the utterance, as those illustrated in (1), from Romanian (R) and West Flemish (WF), and whose distribution will also be shown to interact with that of vocatives:

Journal ArticleDOI
TL;DR: Gracie is the first spoken dialog system that recognizes a user's emotional state from his or her speech and gives a response with appropriate emotional coloring, and shows that dialog systems can tap into this important level of interpersonal interaction using today's technology.

Journal ArticleDOI
TL;DR: The aim of this paper is to contribute to reviving the discussion of terminology, to encourage more analyses of signing and aided communication, and to promote greater use of conversation excerpts in the AAC Journal and elsewhere.
Abstract: There is a need for a continuous discussion about what terms one should use within the field of augmentative and alternative communication. When talking and thinking about people in their role as users of alternative communication forms, the terms should reflect their communicative ways and means, their achievements and what they are doing, rather than focus on what they cannot do. There are rather few articles analyzing utterance construction and dialogue processes involving children and adults using manual and graphic communication systems. The aim of this paper was to contribute to reviving the discussion of terminology and to more analyses of signing and aided communication and an increase in the use of conversation excerpts in the AAC Journal and elsewhere.

Journal ArticleDOI
11 May 2011
TL;DR: This paper examines the phenomenon of split utterances, from the perspective of Dynamic Syntax, to further probe the necessity of full intention recognition/formation in communication, and illustrates how many dialogue phenomena can be seen as direct consequences of the grammar architecture, as long as this is presented within an incremental, goal-directed/predictive model.
Abstract: Ever since dialogue modelling first developed relative to broadly Gricean assumptions about utterance interpretation (Clark, 1996), it has remained an open question whether the full complexity of higher-order intention computation is made use of in everyday conversation. In this paper we examine the phenomenon of split utterances, from the perspective of Dynamic Syntax, to further probe the necessity of full intention recognition/formation in communication: we do so by exploring the extent to which the interactive coordination of dialogue exchange can be seen as emergent from low-level mechanisms of language processing, without needing representation by interlocutors of each other's mental states, or fully developed intentions as regards messages to be conveyed. We thus illustrate how many dialogue phenomena can be seen as direct consequences of the grammar architecture, as long as this is presented within an incremental, goal-directed/predictive model.

Patent
Thomas M. Soemo1, Leo Soong1, Michael H. Kim1, Chad R. Heinemann1, Dax Hawkins1 
02 Sep 2011
TL;DR: A system for integrating local speech recognition with cloud-based speech recognition in order to provide an efficient natural user interface is described in this article, where a computing device determines a direction associated with a particular person within an environment and generates an audio recording associated with the direction.
Abstract: A system for integrating local speech recognition with cloud-based speech recognition in order to provide an efficient natural user interface is described. In some embodiments, a computing device determines a direction associated with a particular person within an environment and generates an audio recording associated with the direction. The computing device then performs local speech recognition on the audio recording in order to detect a first utterance spoken by the particular person and to detect one or more keywords within the first utterance. The first utterance may be detected by applying voice activity detection techniques to the audio recording. The first utterance and the one or more keywords are subsequently transferred to a server which may identify speech sounds within the first utterance associated with the one or more keywords and adapt one or more speech recognition techniques based on the identified speech sounds.

Yi Xu1
01 Jan 2011
TL;DR: This paper presents a brief review of the current state of the art in the investigation of PFC, and discusses a number of hypotheses in regard to this typological division among the world languages.
Abstract: One of the most important acoustic correlates of prosodic focus is post-focus compression (PFC) — the reduction of pitch range and amplitude of all post-focus components in an utterance. PFC has been found in many Indo-European, Altaic languages, and interestingly, also in Mandarin Chinese. Meanwhile, there have also been reports that many other languages do not have PFC, or lack any clear prosodic marking of focus. This paper presents a brief review of the current state of the art in the investigation of PFC, and discusses a number of hypotheses in regard to this typological division among the world languages. In particular, the idea is explored that the distribution of PFC is related to the historical development of the world languages.
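
An illustrative Python sketch of how post-focus compression might be quantified, comparing the post-focus pitch range of a focused utterance against a neutral rendition; the F0 values and the simple semitone-range measure are made up for illustration.

import math

def semitone_range(f0_hz):
    """Pitch range of an F0 contour, in semitones."""
    return 12 * math.log2(max(f0_hz) / min(f0_hz))

def pfc_ratio(post_focus_f0, neutral_f0):
    """Values well below 1.0 indicate compression of the post-focus pitch range."""
    return semitone_range(post_focus_f0) / semitone_range(neutral_f0)

# Hypothetical F0 contours (Hz) over the words following the (would-be) focused word.
neutral_post = [210, 190, 230, 180, 200]   # neutral rendition
focused_post = [165, 160, 172, 158, 163]   # after a focused word: flattened and lowered

print(round(semitone_range(neutral_post), 2))           # wider range (about 4.2 st)
print(round(semitone_range(focused_post), 2))           # narrower range (about 1.5 st)
print(round(pfc_ratio(focused_post, neutral_post), 2))  # well below 1: compression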

Journal ArticleDOI
TL;DR: The present study suggests that the communicatively driven and the prosodically driven hyper-articulation are intricately intertwined in ways that reflect not only interactions of principles of gestural economy and contrast enhancement, but also language-specific prosodic systems, which further modulate how the three kinds of hyper-articulation are phonetically expressed.

DOI
11 May 2011
TL;DR: A framework for incremental interpretation and prediction of utterance meaning in dialogue systems is presented and a method for determining when a system has reached a point of maximal understanding of an ongoing user utterance is presented.
Abstract: We present techniques for the incremental interpretation and prediction of utterance meaning in dialogue systems. These techniques open possibilities for systems to initiate responsive overlap behaviors during user speech, such as interrupting, acknowledging, or completing a user's utterance while it is still in progress. In an implemented system, we show that relatively high accuracy can be achieved in understanding of spontaneous utterances before utterances are completed. Further, we present a method for determining when a system has reached a point of maximal understanding of an ongoing user utterance, and show that this determination can be made with high precision. Finally, we discuss a prototype implementation that shows how systems can use these abilities to strategically initiate system completions of user utterances. More broadly, this framework facilitates the implementation of a range of overlap behaviors that are common in human dialogue, but have been largely absent in dialogue systems.
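
A toy Python sketch of one way to operationalize a "point of maximal understanding" offline: the earliest prefix of the utterance after which a (here, trivially keyword-based) interpreter's top hypothesis no longer changes. The intents and scoring are invented, not the paper's model.

INTENT_KEYWORDS = {
    "get_directions": {"directions", "route", "drive", "navigate"},
    "book_table": {"table", "reservation", "book", "dinner"},
}

def interpret(prefix_tokens):
    """Score each intent by keyword overlap with the prefix seen so far."""
    scores = {intent: len(set(prefix_tokens) & kws) for intent, kws in INTENT_KEYWORDS.items()}
    return max(scores, key=scores.get)

def point_of_maximal_understanding(tokens):
    prefixes = [tokens[:i] for i in range(1, len(tokens) + 1)]
    guesses = [interpret(p) for p in prefixes]
    final = guesses[-1]
    for i, guess in enumerate(guesses):
        if all(g == final for g in guesses[i:]):
            return i + 1, final   # word count at which the interpretation stabilizes
    return len(tokens), final

tokens = "could you book us a table for dinner tomorrow night".split()
print(point_of_maximal_understanding(tokens))  # (3, 'book_table'): stable after "book"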

Journal ArticleDOI
TL;DR: The authors explored the relationship between laughter responses and the turns which they orient to and found that laughter is not simply a reaction to the perception of humour, but an action in its own right.
Abstract: In this article I explore the relationship between laugh responses and the turns which they orient to. I consider whether it is possible to identify properties of the prior turns that the recipient may be orienting to in laughing. Thus, I begin by briefly exploring the relationship between laughter and humour in interaction. But I point to some of the difficulties in identifying what it is that makes some discourse humorous, and I argue that laughter is not simply a reaction to the perception of humour. Laughter should be considered as an action in its own right, the occurrence of which may have nothing to do with the presence of humour. Consequently, I consider the notion of the “laughable” and whilst I agree that “(v)irtually any utterance or action could draw laughter, under the right (or wrong) circumstances” (Glenn 2003: 49), I argue it is often possible to identify recurrent properties of turns treated as laughables. These properties concern the design, action and the sequential position of the turns. Thus, it seems that speakers draw from a range of resources in constructing laughables. I illustrate this by exploring a collection of instances of figurative phrases followed by laugh responses from telephone calls. I argue that in responding with laughter, recipients may orient to a cluster of properties in the prior turn. However, because laughter is an action with its own sequential implications, rather than simply a response to a prior turn, whether a recipient orients to a prior candidate laughable by laughing will depend on the nature of his or her contribution to the action sequence.

Journal ArticleDOI
TL;DR: This article provided a detailed account of the pragmatic functions of final then in spoken English, based on corpus data from the British component of the International Corpus of English, and provided a descriptive schema that captures the functions of the final then as a modal particle.