scispace - formally typeset

Showing papers on "Utterance published in 2010"


PatentDOI
TL;DR: In this paper, a system is presented for receiving speech and non-speech communications of natural language questions and commands, transcribing the speech and non-speech communications to textual messages, and executing the questions and/or commands.
Abstract: Systems and methods are provided for receiving speech and non-speech communications of natural language questions and/or commands, transcribing the speech and non-speech communications to textual messages, and executing the questions and/or commands. The invention applies context, prior information, domain knowledge, and user-specific profile data to achieve a natural environment for one or more users presenting questions or commands across multiple domains. The systems and methods create, store, and use extensive personal profile information for each user, thereby improving the reliability of determining the context of the speech and non-speech communications and presenting the expected results for a particular question or command.

1,164 citations


Patent
22 Feb 2010
TL;DR: In this article, a system and method for processing multi-modal device interactions in a natural language voice services environment is presented, in which context relating to the non-voice interaction and the natural language utterance may be extracted and combined to determine an intent of the multi-modal device interaction, and a request may then be routed to one or more of the electronic devices based on the determined intent.
Abstract: A system and method for processing multi-modal device interactions in a natural language voice services environment may be provided. In particular, one or more multi-modal device interactions may be received in a natural language voice services environment that includes one or more electronic devices. The multi-modal device interactions may include a non-voice interaction with at least one of the electronic devices or an application associated therewith, and may further include a natural language utterance relating to the non-voice interaction. Context relating to the non-voice interaction and the natural language utterance may be extracted and combined to determine an intent of the multi-modal device interaction, and a request may then be routed to one or more of the electronic devices based on the determined intent of the multi-modal device interaction.

321 citations


Patent
16 Sep 2010
TL;DR: In this paper, a system and method for hybrid processing in a natural language voice services environment that includes a plurality of multi-modal devices may be provided; in particular, the hybrid processing may generally include the multi-modal devices cooperatively interpreting and processing one or more natural language utterances included in one or more multi-modal requests.
Abstract: A system and method for hybrid processing in a natural language voice services environment that includes a plurality of multi-modal devices may be provided. In particular, the hybrid processing may generally include the plurality of multi-modal devices cooperatively interpreting and processing one or more natural language utterances included in one or more multi-modal requests. For example, a virtual router may receive various messages that include encoded audio corresponding to a natural language utterance contained in a multi-modal interaction provided to one or more of the devices. The virtual router may then analyze the encoded audio to select a cleanest sample of the natural language utterance and communicate with one or more other devices in the environment to determine an intent of the multi-modal interaction. The virtual router may then coordinate resolving the multi-modal interaction based on the intent of the multi-modal interaction.

231 citations


Journal ArticleDOI
01 Oct 2010
TL;DR: An account of metaphor understanding which covers the full range of cases has to allow for two routes or modes of processing, one of which requires a greater focus on the literal meaning of sentences or texts, which is metarepresented as a whole and subjected to more global, reflective pragmatic inference.
Abstract: I propose that an account of metaphor understanding which covers the full range of cases has to allow for two routes or modes of processing. One is a process of rapid, local, on-line concept construction that applies quite generally to the recovery of word meaning in utterance comprehension. The other requires a greater focus on the literal meaning of sentences or texts, which is metarepresented as a whole and subjected to more global, reflective pragmatic inference. The questions whether metaphors convey a propositional content and what role imagistic representation plays receive somewhat different answers depending on the processing route.

201 citations


Book ChapterDOI
01 Jan 2010
TL;DR: The authors found that older adults often find it difficult to communicate, especially in group situations, because they are unable to keep up with the flow of conversation or are too slow in comprehending what they are hearing.
Abstract: Older individuals often find it difficult to communicate, especially in group situations, because they are unable to keep up with the flow of conversation or are too slow in comprehending what they are hearing. These communication difficulties are often exacerbated by negative stereotypes held by their communication partners who often perceive older adults as less competent than they actually are (Ryan et al. 1986). Sometimes, older adults’ communication problems motivate them, often at the prompting of their family and friends, to seek help from hearing specialists (O’Mahoney et al. 1996). Quite often, however, older adults and/or their family members wonder if these comprehension difficulties are a sign of cognitive decline. Such uncertainty on the part of both older adults and their family members with respect to the source of communication difficulties is understandable given that age-related changes in the comprehension of spoken language could be due to age-related changes in hearing, to age-related declines in cognitive functioning, or to interactions between these two levels of processing. To participate effectively in a multitalker conversation, listeners need to do more than simply recognize and repeat speech. They have to keep track of who said what, extract the meaning of each utterance, store it in memory for future use, integrate the incoming information with what each conversational participant has said in the past, and draw on the listener’s own knowledge of the topic under consideration to extract general themes and formulate responses. In other words, effective communication requires not only an intact auditory system but also an intact cognitive system.

183 citations


Proceedings Article
09 Oct 2010
TL;DR: It is shown that a speaker model that acts optimally with respect to an explicit, embedded listener model substantially outperforms one that is trained to directly generate spatial descriptions.
Abstract: Language is sensitive to both semantic and pragmatic effects. To capture both effects, we model language use as a cooperative game between two players: a speaker, who generates an utterance, and a listener, who responds with an action. Specifically, we consider the task of generating spatial references to objects, wherein the listener must accurately identify an object described by the speaker. We show that a speaker model that acts optimally with respect to an explicit, embedded listener model substantially outperforms one that is trained to directly generate spatial descriptions.

156 citations
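The speaker-as-optimizer idea above can be illustrated with a toy sketch: a literal listener spreads belief over the objects an utterance is true of, and the speaker picks the utterance that maximizes the listener's chance of identifying the target. The objects, descriptions, and meanings below are invented for illustration; this is not the paper's actual model or data.

```python
# Toy world: three objects, and candidate spatial descriptions mapped to
# the set of objects each description literally applies to (all invented).
OBJECTS = ["mug", "book", "lamp"]
MEANINGS = {
    "left of the book": {"mug"},
    "on the table":     {"mug", "book"},
    "near the lamp":    {"book", "lamp"},
}

def literal_listener(utterance):
    """Distribute belief uniformly over the objects the utterance is true of."""
    consistent = MEANINGS[utterance]
    return {o: (1 / len(consistent) if o in consistent else 0.0)
            for o in OBJECTS}

def pragmatic_speaker(target):
    """Choose the utterance that maximizes the embedded listener's
    probability of picking the intended target object."""
    return max(MEANINGS, key=lambda u: literal_listener(u)[target])

print(pragmatic_speaker("mug"))  # "left of the book": uniquely identifies it
```

Note that a speaker trained only to produce literally true descriptions might emit the ambiguous "on the table"; optimizing against the embedded listener selects the uniquely identifying description instead.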


Patent
11 Mar 2010
TL;DR: In this article, a decoder estimates an intention in the content of an utterance based on a language score of each of the language models calculated by the language score calculating section.
Abstract: A speech recognition device includes one or more intention-extracting language models in which an intention of a focused specific task is inherent, an absorbing language model in which no intention of the task is inherent, a language score calculating section that calculates a language score indicating the linguistic similarity between the content of an utterance and each of the intention-extracting and absorbing language models, and a decoder that estimates an intention in the content of an utterance based on the language scores calculated by the language score calculating section.

136 citations
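The decoding step described above amounts to scoring the utterance under each intent-specific model and a background ("absorbing") model, then picking the best-scoring one. A minimal sketch follows, with unigram log-probabilities standing in for real language-model scores; the intent names, vocabularies, and probabilities are all illustrative assumptions, not the patent's implementation.

```python
import math

# Illustrative unigram models: per-word probabilities for each intent,
# plus an "absorbing" background model with a flat per-word probability.
MODELS = {
    "weather_intent": {"weather": 0.3, "today": 0.2, "what": 0.1, "is": 0.1},
    "music_intent":   {"play": 0.3, "music": 0.25, "some": 0.1},
    "absorbing":      {},
}
BACKOFF = 0.01  # probability assigned to words unseen by an intent model
FLAT = 0.05     # the absorbing model's flat per-word probability

def language_score(model_name, words):
    """Sum of per-word log-probabilities under the named model."""
    if model_name == "absorbing":
        return sum(math.log(FLAT) for _ in words)
    probs = MODELS[model_name]
    return sum(math.log(probs.get(w, BACKOFF)) for w in words)

def estimate_intent(utterance):
    """Decode an intent by taking the best-scoring language model."""
    words = utterance.lower().split()
    return max(MODELS, key=lambda m: language_score(m, words))

print(estimate_intent("what is the weather today"))  # weather_intent
```

Utterances that fit no intent model score best under the flat absorbing model, which is what lets it soak up out-of-task speech.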


Journal ArticleDOI
TL;DR: In this paper, a personality-based user adaptation framework is proposed to generate user-adaptive utterance variations, and evaluation results indicate that humans perceive the personality of system utterances in the way that the system intended.
Abstract: Conversation is an essential component of social behavior, one of the primary means by which humans express intentions, beliefs, emotions, attitudes and personality. Thus the development of systems to support natural conversational interaction has been a long term research goal. In natural conversation, humans adapt to one another across many levels of utterance production via processes variously described as linguistic style matching, entrainment, alignment, audience design, and accommodation. A number of recent studies strongly suggest that dialogue systems that adapted to the user in a similar way would be more effective. However, a major research challenge in this area is the ability to dynamically generate user-adaptive utterance variations. As part of a personality-based user adaptation framework, this article describes personage, a highly parameterizable generator which provides a large number of parameters to support adaptation to a user's linguistic style. We show how we can systematically apply results from psycholinguistic studies that document the linguistic reflexes of personality, in order to develop models to control personage's parameters, and produce utterances matching particular personality profiles. When we evaluate these outputs with human judges, the results indicate that humans perceive the personality of system utterances in the way that the system intended.

113 citations


Journal ArticleDOI
TL;DR: In this article, a prosodic contrast is defined as a statistically reliable shift between adjacent phrasal units in at least one of five acoustic dimensions (mean fundamental frequency, fundamental frequency variability, mean amplitude, amplitude variability, and mean syllable duration).
Abstract: Prosodic features in spontaneous speech help disambiguate implied meaning not explicit in linguistic surface structure, but little research has examined how these signals manifest themselves in real conversations. Spontaneously produced verbal irony utterances generated between familiar speakers in conversational dyads were acoustically analyzed for prosodic contrasts. A prosodic contrast was defined as a statistically reliable shift between adjacent phrasal units in at least 1 of 5 acoustic dimensions (mean fundamental frequency, fundamental frequency variability, mean amplitude, amplitude variability, and mean syllable duration). Overall, speakers contrasted prosodic features in ironic utterances with utterances immediately preceding them at a higher rate than between adjacent nonironic utterance pairs from the same interactions. Across multiple speakers, ironic utterances were spoken significantly slower than preceding speech, but no other acoustic dimensions changed consistently. This is the first aco...

111 citations
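The definition above can be sketched as a simple check over the five acoustic dimensions. The study defined a contrast as a statistically reliable shift between adjacent phrasal units; the version below substitutes a crude proportional threshold for a proper statistical test, and all measurement values are invented.

```python
# The five acoustic dimensions named in the abstract.
DIMENSIONS = ["mean_f0", "f0_variability", "mean_amplitude",
              "amplitude_variability", "mean_syllable_duration"]

def prosodic_contrast(prev_unit, unit, threshold=0.20):
    """Return the dimensions in which `unit` shifts by more than
    `threshold` (as a proportion) relative to the preceding unit.
    A stand-in for the paper's statistical reliability criterion."""
    shifted = []
    for dim in DIMENSIONS:
        prev, cur = prev_unit[dim], unit[dim]
        if prev and abs(cur - prev) / abs(prev) > threshold:
            shifted.append(dim)
    return shifted

# Invented measurements for an ironic utterance and the speech before it.
preceding = {"mean_f0": 180.0, "f0_variability": 25.0, "mean_amplitude": 62.0,
             "amplitude_variability": 4.0, "mean_syllable_duration": 0.18}
ironic = {"mean_f0": 185.0, "f0_variability": 24.0, "mean_amplitude": 63.0,
          "amplitude_variability": 4.2, "mean_syllable_duration": 0.25}

# Slower speech (longer mean syllable duration) was the one consistent cue.
print(prosodic_contrast(preceding, ironic))  # ['mean_syllable_duration']
```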


20 Sep 2010
TL;DR: The present study investigates the validity of the common assumption that online platforms such as email and instant messaging mirror informal spoken language by examining discourse structures in IM conversations between American college students.
Abstract: Both users of CMC and the popular press commonly assume that online platforms such as email and instant messaging (IM) mirror informal spoken language. The present study investigates the validity of this assumption by examining discourse structures in IM conversations between American college students. Linguistic features of spoken and written language were first compared both paradigmatically and empirically, drawing particularly on research on intonation units by Chafe (1980, 1994). A subsequent fine-grained analysis of the grammatical points at which subjects chunked their IM turns into multiple transmissions revealed that while IM conversations between male dyads tended to resemble spoken discourse according to this dimension, IM conversations between females bore more similarities to traditional written language.

92 citations


Patent
23 Dec 2010
TL;DR: The authors describe word-dependent language models, as well as their creation and use, which can be useful in many contexts, including those where one or more letters of the expected phrase are known to the speaker.
Abstract: This document describes word-dependent language models, as well as their creation and use. A word-dependent language model can permit a speech-recognition engine to accurately verify that a speech utterance matches a multi-word phrase. This is useful in many contexts, including those where one or more letters of the expected phrase are known to the speaker.

Journal ArticleDOI
TL;DR: This article treats the social practice of indirect reports as a case of language games and proposes a number of principles, such as the Paraphrasis/Form Principle: the that-clause embedded under the verb "say" is a paraphrasis of what Y said and meets the following constraint: should Y hear what X said he (Y) had said, he would not take issue with it as to content, but would approve of it as a fair paraphrase of his original utterance.

Journal ArticleDOI
TL;DR: A model of speech segmentation is presented that aims to reveal important sources of information for speech segmentation and to capture psycholinguistic constraints on children's language perception; the model constructs a lexicon based on information about utterance boundaries and deduces phonotactic constraints from the discovered lexicon.
Abstract: There are numerous models of how speech segmentation may proceed in infants acquiring their first language. We present a framework for considering the relative merits and limitations of these various approaches. We then present a model of speech segmentation that aims to reveal important sources of information for speech segmentation, and to capture psycholinguistic constraints on children's language perception. The model constructs a lexicon based on information about utterance boundaries and deduces phonotactic constraints from the discovered lexicon. Compared to other models of speech segmentation, our model performs well in terms of accuracy, computational tractability and the number of components of the model. Finally, our model also reflects the psycholinguistic effects of language learning, in terms of the early advantage for segmentation provided by the child's name, and by revealing the overlap in usefulness of information for segmentation and for grammatical categorization of the language.
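The bootstrapping idea of building a lexicon from utterance boundaries can be sketched as follows: whole utterances seed the lexicon, and when a known entry appears inside a later utterance, the remainder is peeled off as a new candidate word. This is a deliberate simplification of the model described above (it ignores phonotactics, among other things), and the toy "corpus" is invented.

```python
def segment_corpus(utterances):
    """Bootstrap a lexicon from utterance boundaries: each utterance is
    split wherever an already-known word occurs, and the resulting
    segments become new lexical entries."""
    lexicon = set()
    for utt in utterances:
        segments = [utt]
        # Try longer known words first so they are not broken up.
        for word in sorted(lexicon, key=len, reverse=True):
            new_segments = []
            for seg in segments:
                if word in seg and word != seg:
                    left, _, right = seg.partition(word)
                    new_segments += [s for s in (left, word, right) if s]
                else:
                    new_segments.append(seg)
            segments = new_segments
        lexicon.update(segments)
    return lexicon

# Unsegmented child-directed utterances (spaces removed for illustration).
print(sorted(segment_corpus(["doggy", "thedoggy", "thekitty"])))
# ['doggy', 'kitty', 'the']
```

Once "doggy" is learned from an isolated utterance, "thedoggy" yields "the", which in turn lets "thekitty" yield "kitty"; an early-learned high-frequency item (such as the child's name) plays exactly this seeding role in the model's reported results.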

Journal ArticleDOI
TL;DR: The enhancements and optimizations of a speech-based emotion recognizer jointly operating with automatic speech recognition are described and it is argued that the knowledge about the textual content of an utterance can improve the recognition of the emotional content.
Abstract: The involvement of emotional states in intelligent spoken human-computer interfaces has evolved into a recent field of research. In this article we describe the enhancements and optimizations of a speech-based emotion recognizer jointly operating with automatic speech recognition. We argue that knowledge about the textual content of an utterance can improve the recognition of the emotional content. Having outlined the experimental setup we present results and demonstrate the capability of a post-processing algorithm combining multiple speech-emotion recognizers. For the dialogue management we propose a stochastic approach comprising a dialogue model and an emotional model interfering with each other in a combined dialogue-emotion model. These models are trained from dialogue corpora and, being assigned different weighting factors, they determine the course of the dialogue.

01 Jan 2010
TL;DR: A number of approaches for detecting whether a textual utterance is of objective or subjective nature and in the latter case detecting the polarity of the utterance (i.e. positive vs. negative) are studied.
Abstract: The ability to correctly identify the existence and polarity of emotion in informal, textual communication is a very important part of a realistic and immersive 3D environment where people communicate with one another through avatars or with an automated system. Such a feature would provide the system the ability to realistically represent the mood and intentions of the participants, thus greatly enhancing their experience. In this paper, we study and compare a number of approaches for detecting whether a textual utterance is of objective or subjective nature and in the latter case detecting the polarity of the utterance (i.e. positive vs. negative). Experiments are carried out on a real corpus of social exchanges in cyberspace and general conclusions are presented.

Journal ArticleDOI
TL;DR: In this article, an agent-centred approach to embodied cognition and communication is presented in a study of carpentry practices among English woodwork tutors and trainees, focusing on the way that received somatic information is interpreted from the body.
Abstract: The approach to embodied cognition and communication presented in this study of carpentry practices among English woodwork tutors and trainees is an agent-centred one. I describe the cognitive operations that make possible both the enactment and understanding of practice by focussing on the way that received somatic information is ‘interpreted’ from the body. It is proposed that the flow of human movement, like the stream of words in an utterance, is segmentable in that it can be broken down into component actions, gestures and postures that unfold dynamically in space and time. The point of my argument is that physical practice communicates and therefore, like language, its component elements can be parsed by an observing party and acquired as mental representations by their motor domains of cognition. Motor representations yield embodied simulations of actions, or they can be systematically re-combined with the effect of producing physical imitation or novel articulations of knowledge-in-practice. Notably the compositional nature of mental representations underlies the ongoing and novel production of knowledge.

Harry Bunt1
01 Jan 2010
TL;DR: In this article, the authors studied the multifunctionality of dialogue utterances, i.e., the phenomenon that utterances in dialogue often have more than one communicative function, by analyzing the participation in dialogue as involving the performance of several types of activity in parallel, relating to different dimensions of communication.
Abstract: This paper studies the multifunctionality of dialogue utterances, i.e. the phenomenon that utterances in dialogue often have more than one communicative function. It is argued that this phenomenon can be explained by analyzing the participation in dialogue as involving the performance of several types of activity in parallel, relating to different dimensions of communication. The multifunctionality of dialogue utterances is studied by (1) redefining the notion of 'utterance' in a rigorous manner (calling the revised notion 'functional segment'), and (2) empirically investigating the multifunctionality of functional segments in a corpus of dialogues, annotated with a rich, multidimensional annotation schema. It is shown that, when communicative functions are assigned to functional segments, thereby eliminating every form of segmentation-related multifunctionality, an average multifunctionality is found between 1.8 and 3.6, depending on what is considered to count as a segment's communicative function. Moreover, a good understanding of the nature of the relations among the various multiple functions that a segment may have, and of the relations between functional segments and other units in dialogue segmentation, opens the way for defining a multidimensional computational update semantics for dialogue interpretation.
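The reported multifunctionality figure is simply the mean number of communicative functions per functional segment. A toy recomputation, with invented segments and function labels:

```python
# Invented functional segments with multiple communicative functions each,
# loosely in the spirit of a multidimensional dialogue-act annotation.
segments = [
    {"text": "yes, okay",  "functions": ["answer", "accept", "turn-take"]},
    {"text": "uh, well",   "functions": ["stalling", "turn-keep"]},
    {"text": "thank you",  "functions": ["thanking"]},
]

# Average number of functions per segment.
avg = sum(len(s["functions"]) for s in segments) / len(segments)
print(f"average multifunctionality: {avg:.1f}")  # 2.0, within the 1.8-3.6 range
```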

Journal ArticleDOI
TL;DR: Two experiments aimed at selecting utterances from lists of responses indicate that the decoding process can be improved by optimizing the language model and the acoustic models, thus reducing the utterance error rate from 29–26% to 10–8%.
Abstract: Computer-Assisted Language Learning (CALL) applications for improving the oral skills of low-proficient learners have to cope with non-native speech that is particularly challenging. Since unconstrained non-native ASR is still problematic, a possible solution is to elicit constrained responses from the learners. In this paper, we describe experiments aimed at selecting utterances from lists of responses. The first experiment on utterance selection indicates that the decoding process can be improved by optimizing the language model and the acoustic models, thus reducing the utterance error rate from 29-26% to 10-8%. Since giving feedback on incorrectly recognized utterances is confusing, we verify the correctness of the utterance before providing feedback. The results of the second experiment on utterance verification indicate that combining duration-related features with a likelihood ratio (LR) yields an equal error rate (EER) of 10.3%, which is significantly better than the EER for the other measures in isolation.
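The equal error rate quoted above is the operating point where the false-reject and false-accept rates coincide. A minimal sketch of computing it by sweeping a threshold over verification scores; the scores are invented, and a real evaluation would use far more data.

```python
def equal_error_rate(genuine, impostor):
    """Sweep a decision threshold over the pooled scores and return the
    error rate at the point where false rejects and false accepts are
    closest (approximating the EER on a finite sample)."""
    best_gap, eer = 1.0, 1.0
    for t in sorted(set(genuine + impostor)):
        frr = sum(s < t for s in genuine) / len(genuine)     # correct utterances rejected
        far = sum(s >= t for s in impostor) / len(impostor)  # incorrect utterances accepted
        if abs(frr - far) < best_gap:
            best_gap, eer = abs(frr - far), (frr + far) / 2
    return eer

genuine  = [0.9, 0.8, 0.75, 0.6, 0.4]  # scores for correctly recognized utterances
impostor = [0.5, 0.35, 0.3, 0.2, 0.1]  # scores for misrecognized utterances
print(equal_error_rate(genuine, impostor))  # 0.2
```

Lowering the threshold accepts more misrecognized utterances; raising it rejects more correct ones. The EER summarizes a verifier with a single number, which is how the paper compares duration features, the likelihood ratio, and their combination.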

Patent
23 Nov 2010
TL;DR: In this paper, a first spoken input (an "original utterance") is received from a user of an electronic device, a first set of character string candidates that each represent the original utterance converted to textual characters is determined, and a selection of one or more of the character string candidates is provided in a format for display to the user.
Abstract: Subject matter described in this specification can be embodied in methods, computer program products and systems relating to speech-to-text conversion. A first spoken input is received from a user of an electronic device (an “original utterance”). Based on the original utterance, a first set of character string candidates are determined that each represent the original utterance converted to textual characters and a selection of one or more of the character string candidates are provided in a format for display to the user. A second spoken input is received from the user and a determination is made that the second spoken input is a repeat utterance of the original utterance. Based on this determination and using the original utterance and the repeat utterance, a second set of character string candidates is determined.

DOI
17 Jun 2010
TL;DR: The relationship between written and spoken academic and research speech has been examined in this article, where the authors discuss the similarities and differences between the two modes of communication, in all their uses and varieties, in the context of English as a primary medium for the transmission and exchange of academic knowledge.
Abstract: It is a fact universally acknowledged that English has emerged in recent decades as the premier vehicle for the communication of scholarship, research and advanced postgraduate training. The causes of this rise have, however, been the subject of considerable controversy, with a particularly strenuous debate between Phillipson (e.g. 1999) and Crystal (2000), which is fully reprised and extended in Seidlhofer (2003). Whatever the merits of the various arguments, whether, for example, Crystal’s 1997 account is ‘triumphalist’ or not, there can be no doubt that English has become the principal medium for the transmission and exchange of academic knowledge, just as there can be no doubt that the global number of academic communications, both in English and in other languages, has greatly increased in recent decades. And this applies not simply to the number of research articles and scholarly books published each year, but also to the number of international and more local academic conferences held annually, as well as to other kinds of cross-national academic and research exchange, such as multinational research projects and the growing numbers of students spending study periods outside their home countries (Fortanet-Gomez and Raisanen 2008). In other words, the increasing use of academic English is not confined to the printed word, but equally applies to the spoken utterance. Aspects of the similarities and differences between written and spoken academic and research speech – in all their uses and varieties – will surface at various points in this chapter. At the outset, however, it is pertinent to consider, and perhaps reconsider, the relationship between these two primary modes. For a number of reasons, the written mode was long privileged by analysts and researchers of academic discourse, as it was by instructors in the applied field of English for Academic Purposes.
For one thing, it is written work that is primarily assessed and evaluated, both for students as they journey towards their higher degrees, and for academics as they apply for better jobs or come up for evaluation and potential contract renewal, tenure or promotion. Second, written exemplars have been much easier to get hold of and to get a handle on; they also more readily lend themselves to the traditional methods of linguistic analysis. Third, the recent rise of interest in courses, workshops and manuals designed to develop academic language skills has also been largely focused on the written side. For instance, there is now a considerable body of material designed to help students, with English either as a first language or as an additional language (EAL), in the writing of Master’s and PhD theses (e.g. Swales and Feak 2000), but, at present, there is relatively little available to help them with the oral presentation and defence of their work. So, not only has the written side of things been privileged, but, in addition, it has tended to become detached from the various speech events and episodes in which the development of academic text is typically immersed.
Over the last decade, however, there has been something of a change in both perception and outcome with regard to the speech-writing ‘divide’. One motivating force is increasing interest on the part of applied linguists and others in ethnographic studies of the academy. An important and influential work in this regard is Prior’s Writing/Disciplinarity (Prior 1998). His case studies offer insights into the lived experience of post-graduate seminars in which talk emerges as a crucial element in textualizing processes and also as a negotiated ground that undermines the traditional institutional power imbalance between professors and their post-graduate students. Later work along these lines includes Tardy (2005) and Seloni (2008). Another has been the creation of corpora of spoken academic and research English (e.g. T2K-SWAL), and the more widely available Michigan Corpus in Academic Spoken English (MICASE), and the many publications that have been based on them, such as Biber et al. (2002) and Perez-Llantada and Ferguson (2006). A third development, very much centred on Europe, has been interest by discourse analysts in the conference presentation, over and beyond the traditional research focus on the written research article. A key work here is the outstanding collection edited by Ventola et al. (2002). If the balance of attention between spoken and written genres is now being readjusted, there are other affordances that work for an even greater rapprochement. One requires a recognition of the Bakhtinian notion of ‘inner’ or ‘private’ speech. Every time we are faced with a non-trivial speaking or writing task, we run through options in our minds as we prepare to either address an audience (as when preparing to ask a question) or place our fingers on the keyboard (as when composing a conference abstract). We mentally rehearse, as we try to imagine the effects of possible spoken or written offerings. In effect, there are cognitive and rhetorical correspondences here. Another type of affordance derives from the ongoing development of hybrid communicative styles in electronic genres such as emails and blogs, and in those parts of websites that deal with such part-genres as FAQs (Bloch 2008). A third is essentially sociological, or at least socio-academic. A major change in the perception of academia originated in science studies in the 1970s, when sociologists and anthropologists turned their attention to scientific work. Instead of asking scientists what they did and taking their word for it, they observed the activities that scientists actually engaged in. This constituted a major break with the traditional provinces of the philosophy or the history of science.
The reorientation in seeing academia coincided with changes in academic practices: while scientists had worked in teams for centuries, scholars in the ‘soft’ sciences had remained solitary individuals, each on their own projects. Over the past twenty years, the concept of the individual scholar toiling away in her solipsistic ivory tower, or of the lonely PhD student immured in her library carrel, has been replaced by a growing speech-writing interconnectedness of the individual members of the academic world, mainly through formal sub-groupings of researchers and research students, as well as via various kinds of informal collectives for study, information or mutual support, not excluding various specialized ‘lists’ on the web. In consequence of all this, the older models of speaking-writing interaction that tended to consider the oral component as subordinate, preparatory or merely evaluative in a post hoc kind of way (as in a thesis defence or a promotion committee) are being replaced. As Rubin and Kang interestingly propose: ‘A more apt model might be a double helix with a writing strand and a speaking strand intertwined. At any particular page one strand may be the focal outcome, drawing upon the other. But as a whole, the two strands are reciprocally supportive and leading in the same direction.’

Journal ArticleDOI
TL;DR: Findings were taken to suggest that, although word-level influences cannot be discounted, utterance-level influences contribute to the loci of stuttering in preschool-age children and may help account for developmental changes in those loci.

Journal Article
TL;DR: The authors look at some issues raised by the idea that addressees infer ad hoc concepts as part of the on-line comprehension process, and some tentative suggestions are made of questions arising from work in theoretical lexical pragmatics which might be amenable to investigation at the neuro-pragmatic level.
Abstract: Ostensive communication, the paradigm case of which is verbal communication, is the domain of a dedicated cognitive system, according to Relevance Theory (RT). This ‘pragmatics’ module is responsible for inferring the content or meaning that the communicator intends by his/her ostensive stimulus. An important sub-process within the system is the adjustment or modulation of lexically-encoded meaning, which makes it possible for speakers to communicate a vastly greater range of concepts than those that are stably encoded in their linguistic system. This includes the meaning communicated by at least some cases of metaphorically-used language. Taking a broadly Fodorian view that lexical concepts are atomic (unstructured), this paper looks at some issues raised by the idea that addressees infer ad hoc concepts as part of the on-line comprehension process. As a cognitive-scientific theory, RT is open to evidence from a range of sources, including native speaker-hearer intuitions, recorded instances of linguistic communication (corpus data) from both communicatively typical and atypical populations, results from relevant psychological and psycholinguistic experiments, and findings in cognitive neuroscience on brain activation during both utterance production and comprehension. At the end of the paper, some tentative suggestions are made of questions arising from work in theoretical lexical pragmatics which might be amenable to investigation at the neuropragmatic level.

Journal ArticleDOI
TL;DR: Small but significant results partially supported the predictions, suggesting a link between eyebrow raising and spoken language and possible linguistic functions are proposed, namely the structuring and emphasising of information in the verbal message.

Journal ArticleDOI
TL;DR: The authors identify the relation between the interpretation of epistemic parentheticals in discourse and their prosodic realisation, and make an important contribution to the understanding of how prosody conveys apparently subtle shades of meaning that are nonetheless crucial for utterance interpretation.
Abstract: The aim of this study is to identify the relation between the interpretation of epistemic parentheticals in discourse and their prosodic realisation. Data drawn from a corpus of British English speech suggests that epistemic parentheticals (comment clauses such as I think, I believe) convey a spectrum of meaning from propositional to interpersonal. They have long been categorised simply as sentence adverbials with a meaning that relates to the truth value of the proposition. However, a study of their prosodic realisation suggests that they occupy a transitional place in the process of semantic change. They can express a wide range of meanings from propositional (sentential) meaning, through discourse meaning to the status of verbal filler. The analysis draws on theories of discourse, historical change and prosody. It makes an important contribution to the understanding of how prosody conveys apparently subtle shades of meaning that are nonetheless crucial for utterance interpretation, including degrees of speaker certainty, the identification of disfluency and the expression of politeness.

Journal ArticleDOI
TL;DR: This book traces Weigand's linguistic career from its beginning to today and comprises a selection of articles which take the reader on a vivid and fascinating journey through the most important stages of her theorizing.
Abstract: With her theory of ‘Language as Dialogue’, Edda Weigand has opened up a new and promising perspective in linguistic research and its neighbouring disciplines. Her model of ‘competence-in-performance’ solved the problem of how to bridge the gap between competence and performance and thus substantially shaped the way in which people look at language today. This book traces Weigand’s linguistic career from its beginning to today and comprises a selection of articles which take the reader on a vivid and fascinating journey through the most important stages of her theorizing. The initial stage when a model of communicative competence was developed is followed by a gradual transition period which finally resulted in the theory of the dialogic action game as a mixed game or the Mixed Game Model. The articles cover a wide range of linguistic topics including, among others, speech act theory, lexical semantics, utterance grammar, emotions, the media, rhetoric and institutional communication. Editorial introductions give further information on the origin and theoretical background of the articles included.

Journal ArticleDOI
01 Jun 2010-Noûs
TL;DR: The standard view of linguistic communication is that a speaker has a thought, such as a belief that Fichte was a philosopher, which she would like to convey to her audience as discussed by the authors.
Abstract: According to a view implicit in much of twentieth-century philosophy of language, linguistic communication conforms to the following model: a speaker has a thought—say, a belief that Fichte was a philosopher—which she would like to convey to her audience. This thought has a certain proposition as its content, a proposition which we might specify using the ‘that’-clause ‘that Fichte was a philosopher’. If the speaker knows that her audience is a competent speaker of a shared language such as English, she can choose some form of words which makes manifest to her audience the proposition she intends to communicate. Perhaps she utters ‘Uncle Johann was a philosopher’, or ‘He [pointing to a painting of Fichte] was a philosopher’. The speaker’s audience recognizes her communicative intentions, and communication is successful, only if her audience thereby entertains a thought whose content is the proposition the speaker meant by her utterance. Let’s call this the standard view. Stated at this level of generality one might be tempted to think that some version of the standard view is obviously true—perhaps hardly worth stating. I believe that this temptation ought to be resisted: communication rarely, if ever, works in the way characterized by the standard view. In this paper, my primary aim is to make some headway towards establishing this rather sweeping claim by discussing some cases in which the standard view apparently fails, focusing on the phenomena of quantifier domain restriction and non-sentential assertion, illustrated by (1) and (2), respectively:

Journal ArticleDOI
TL;DR: The results indicate that shadowed utterances sounded more similar to the model’s utterances than did subjects’ nonshadowed read utterances, suggesting that speech alignment can be based on visual speech.
Abstract: Speech alignment is the tendency for interlocutors to unconsciously imitate one another's speaking style. Alignment also occurs when a talker is asked to shadow recorded words (e.g., Shockley, Sabadini, & Fowler, 2004). In two experiments, we examined whether alignment could be induced with visual (lipread) speech and with auditory speech. In Experiment 1, we asked subjects to lipread and shadow out loud a model silently uttering words. The results indicate that shadowed utterances sounded more similar to the model's utterances than did subjects' nonshadowed read utterances. This suggests that speech alignment can be based on visual speech. In Experiment 2, we tested whether raters could perceive alignment across modalities. Raters were asked to judge the relative similarity between a model's visual (silent video) utterance and subjects' audio utterances. The subjects' shadowed utterances were again judged as more similar to the model's than were read utterances, suggesting that raters are sensitive to cross-modal similarity between aligned words.

Proceedings Article
01 Jan 2010
TL;DR: A method of on-demand language model interpolation in which contextual information about each utterance determines interpolation weights among a number of n-gram language models is presented.
Abstract: Google offers several speech features on the Android mobile operating system: search by voice, voice input to any text field, and an API for application developers. As a result, our speech recognition service must support a wide range of usage scenarios and speaking styles: relatively short search queries, addresses, business names, dictated SMS and e-mail messages, and a long tail of spoken input to any of the applications users may install. We present a method of on-demand language model interpolation in which contextual information about each utterance determines interpolation weights among a number of n-gram language models. On-demand interpolation results in an 11.2% relative reduction in WER compared to using a single language model to handle all traffic. Index Terms: language modeling, interpolation, mobile
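The core idea above, combining several n-gram language models with per-utterance weights, can be illustrated with a minimal sketch. Everything here is hypothetical: the toy unigram "models", the domain names, and the weight vectors are invented for illustration and do not reflect the paper's actual models or weighting scheme.

```python
def interpolate(lms, weights, word):
    """Linearly interpolated probability: P(w) = sum_i lambda_i * P_i(w).

    `lms` is a list of dicts mapping words to probabilities; unseen words
    get a tiny floor probability so the mixture never returns zero.
    """
    assert abs(sum(weights) - 1.0) < 1e-9, "interpolation weights must sum to 1"
    return sum(lam * lm.get(word, 1e-9) for lam, lm in zip(weights, lms))

# Toy unigram "models" for two hypothetical domains:
# short search queries vs. dictated SMS messages.
search_lm = {"pizza": 0.02, "near": 0.015, "me": 0.01}
sms_lm    = {"pizza": 0.005, "love": 0.02, "you": 0.03}

# On-demand interpolation: weights are chosen per utterance from context,
# e.g. which application issued the recognition request.
weights_for_search = [0.9, 0.1]   # utterance came from the search box
weights_for_sms    = [0.2, 0.8]   # utterance came from the SMS composer

p_in_search_context = interpolate([search_lm, sms_lm], weights_for_search, "pizza")
p_in_sms_context    = interpolate([search_lm, sms_lm], weights_for_sms, "pizza")
```

With these made-up numbers, "pizza" scores higher when the context indicates a search query than when it indicates SMS dictation, which is the effect the on-demand weighting is meant to capture.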

Journal ArticleDOI
Peng Li1, Yong Guan1, Shijin Wang1, Bo Xu1, Wenju Liu1 
TL;DR: Recognition results on the 2006 Speech Separation Challenge corpus show that the proposed system can significantly improve the robustness of ASR.

Book
15 Jan 2010
TL;DR: Prashant Parikh argues that equilibrium, or balance among multiple interacting forces, is a key attribute of language and meaning and shows how to derive the meaning of an utterance from first principles by modeling it as a system of interdependent games.
Abstract: In Language and Equilibrium, Prashant Parikh offers a new account of meaning for natural language. He argues that equilibrium, or balance among multiple interacting forces, is a key attribute of language and meaning and shows how to derive the meaning of an utterance from first principles by modeling it as a system of interdependent games. His account results in a novel view of semantics and pragmatics and describes how both may be integrated with syntax. It considers many aspects of meaning, including literal meaning and implicature, and advances a detailed theory of definite descriptions as an application of the framework. Language and Equilibrium is intended for a wide readership in the cognitive sciences, including philosophers, linguists, and artificial intelligence researchers as well as neuroscientists, psychologists, and economists interested in language and communication.