
Showing papers on "Phrase" published in 1997


Journal ArticleDOI
01 Sep 1997
TL;DR: A tutorial on the design and development of automatic speaker-recognition systems is presented, and a new automatic speaker-recognition system is given that performs with 98.9% correct identification.
Abstract: A tutorial on the design and development of automatic speaker-recognition systems is presented. Automatic speaker recognition is the use of a machine to recognize a person from a spoken phrase. These systems can operate in two modes: to identify a particular person or to verify a person's claimed identity. Speech processing and the basic components of automatic speaker-recognition systems are shown and design tradeoffs are discussed. Then, a new automatic speaker-recognition system is given. This recognizer performs with 98.9% correct identification. Last, the performances of various systems are compared.

1,686 citations


Journal ArticleDOI
TL;DR: SEQUITUR as mentioned in this paper is an algorithm that infers a hierarchical structure from a sequence of discrete symbols by replacing repeated phrases with a grammatical rule that generates the phrase, and continuing this process recursively.
Abstract: SEQUITUR is an algorithm that infers a hierarchical structure from a sequence of discrete symbols by replacing repeated phrases with a grammatical rule that generates the phrase, and continuing this process recursively. The result is a hierarchical representation of the original sequence, which offers insights into its lexical structure. The algorithm is driven by two constraints that reduce the size of the grammar, and produce structure as a by-product. SEQUITUR breaks new ground by operating incrementally. Moreover, the method's simple structure permits a proof that it operates in space and time that is linear in the size of the input. Our implementation can process 50,000 symbols per second and has been applied to an extensive range of real world sequences.
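
To make the digram-replacement idea concrete, here is a minimal offline sketch in Python. It repeatedly rewrites the most frequent repeated symbol pair as a new grammar rule; this is only an approximation of SEQUITUR, which is incremental, maintains digram uniqueness and rule utility as invariants, and runs in linear time. All names here are invented for illustration.

```python
from collections import Counter

def infer_grammar(seq):
    """Repeatedly replace the most frequent repeating digram with a new
    nonterminal -- an offline simplification of SEQUITUR's digram rule
    (the real algorithm does this incrementally, in linear time)."""
    rules, next_id, seq = {}, 0, list(seq)
    while True:
        digrams = Counter(zip(seq, seq[1:]))
        pair, count = max(digrams.items(), key=lambda kv: kv[1]) if digrams else (None, 0)
        if count < 2:
            break
        nt = f"R{next_id}"
        next_id += 1
        rules[nt] = pair
        out, i = [], 0
        while i < len(seq):                 # greedy left-to-right rewrite
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(nt); i += 2
            else:
                out.append(seq[i]); i += 1
        seq = out
    return seq, rules

top, rules = infer_grammar("abcabcabc")
print(top, rules)
# ['R2', 'R1'] {'R0': ('a', 'b'), 'R1': ('R0', 'c'), 'R2': ('R1', 'R1')}
```

The hierarchical structure emerges as rules reference other rules (R2 expands to two R1s, each of which expands through R0), mirroring the paper's recursive replacement.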

650 citations


Patent
26 Jun 1997
TL;DR: A word recognition system can: respond to the input of a character string from a user by limiting the words it will recognize to words having a related, but not necessarily the same, string; score signals generated after a user has been prompted to generate a given word against words other than the prompted word to determine if the signal should be used to train the prompted word.
Abstract: A word recognition system can: respond to the input of a character string from a user by limiting the words it will recognize to words having a related, but not necessarily the same, string; score signals generated after a user has been prompted to generate a given word against words other than the prompted word to determine if the signal should be used to train the prompted word; vary the number of signals a user is prompted to generate to train a given word as a function of how well the training signals score against each other or prior models for the prompted word; create a new acoustic model of a phrase by concatenating prior acoustic models of the words in the phrase; obtain information from another program running on the same computer, such as its commands or the context of text being entered into it, and use that information to vary which words it can recognize; determine which program unit, such as an application program or dialog box, currently has input focus on its computer and create a vocabulary state associated with that program unit into which vocabulary words which will be made active when that program group has the focus can be put; detect the available computational resources and alter the instructions it executes in response; test if its ability to respond to voice input has been shut off without user confirmation, and, if so, turn that ability back on and prompt the user to confirm if that ability is to be turned off, store both a first and a second set of models for individual vocabulary words and enable a user to selectively cause the recognizer to disregard the second set of models for a selected word; and/or score a signal representing a given word against models for that word from different word model sets to select which model should be used for future recognition.
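
Of the many functions claimed, the first — restricting the active vocabulary to words related to a typed string — is the easiest to picture. A minimal sketch, treating "related" as a simple substring test, which is narrower than the patent's notion:

```python
def restrict_vocabulary(vocab, typed):
    """Limit the recognizer's candidate words to those related to the
    character string the user has entered -- here a case-insensitive
    substring test, a stand-in for the patent's looser matching."""
    t = typed.lower()
    return [w for w in vocab if t in w.lower()]

print(restrict_vocabulary(["phrase", "phrasing", "praise", "parse"], "phras"))
# ['phrase', 'phrasing']
```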

276 citations


Patent
25 Nov 1997
TL;DR: In this paper, the frequency of occurrence of each word of each document is derived and a new query phrase is derived from a good keyword and replaces the good keyword to narrow a search.
Abstract: A user views search results and subjectively determines if a document is desirable or undesirable. Only documents categorized by the user are analyzed for deriving a list of prospective keywords. The frequency of occurrence of each word of each document is derived. Keywords that occur only in desirable documents are good keywords. Keywords that occur only in undesirable documents are bad keywords. Keywords that occur in both types are dirty keywords. The best keywords are the good keywords with the highest frequency of occurrence. The worst keywords are the bad keywords with the highest frequency of occurrence. A new query phrase includes the highest ranked good keywords and performs filtering using the highest ranked bad keywords. Key phrases are derived to clean dirty keywords into good key phrases. A key phrase also is derived from a good keyword and replaces the good keyword to narrow a search.
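
A minimal sketch of that keyword bookkeeping, with documents represented as lists of words. The +/- query syntax and the use of document frequency as the "frequency of occurrence" are illustrative assumptions, not the patent's specification:

```python
from collections import Counter

def refine_query(desirable, undesirable, k=3):
    """Split keywords into good (only in desirable documents), bad (only
    in undesirable ones) and implicitly dirty (in both), then build a new
    query from the top good keywords, filtering on the top bad ones."""
    d_freq, u_freq = Counter(), Counter()
    for doc in desirable:
        d_freq.update(set(doc))        # count document frequency
    for doc in undesirable:
        u_freq.update(set(doc))
    good = [w for w, _ in d_freq.most_common() if w not in u_freq]
    bad = [w for w, _ in u_freq.most_common() if w not in d_freq]
    return " ".join(["+" + w for w in good[:k]] + ["-" + w for w in bad[:k]])

print(refine_query([["jaguar", "cat", "habitat"]], [["jaguar", "car", "dealer"]]))
# '+cat +habitat -car -dealer'   ("jaguar" is dirty, so it is dropped)
```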

266 citations


Proceedings Article
01 Jan 1997
TL;DR: This paper presents an algorithm for automatically assigning phrase breaks to unrestricted text for use in a text-to-speech synthesizer and reports a variety of experiments investigating part-of-speech tag-sets, Markov model structure and smoothing.
Abstract: This paper presents an algorithm for automatically assigning phrase breaks to unrestricted text for use in a text-to-speech synthesizer. Text is first converted into a sequence of part-of-speech tags. Next a Markov model is used to give the most likely sequence of phrase breaks for the input part-of-speech tags. In the Markov model, states represent types of phrase break and the transitions between states represent the likelihoods of sequences of phrase types occurring. The paper reports a variety of experiments investigating part-of-speech tag-sets, Markov model structure and smoothing. The best setup correctly identifies 79% of breaks in the test corpus.
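
A hedged sketch of the decoding step: a first-order HMM whose hidden states are break types and whose observations are POS tags, decoded with Viterbi. The toy probabilities stand in for corpus-estimated ones, and the paper's actual models are richer (larger tag-sets, higher-order structure, proper smoothing):

```python
import math

def viterbi_breaks(tags, states, init, trans, emit):
    """Most likely sequence of phrase-break labels for a POS-tag sequence.
    init[s], trans[s1][s2], emit[s][tag] are probabilities; unseen tags
    get a small floor instead of real smoothing."""
    floor = 1e-9
    V = [{s: math.log(init[s]) + math.log(emit[s].get(tags[0], floor))
          for s in states}]
    back = []
    for tag in tags[1:]:
        col, ptr = {}, {}
        for s in states:
            prev = max(states, key=lambda p: V[-1][p] + math.log(trans[p][s]))
            ptr[s] = prev
            col[s] = (V[-1][prev] + math.log(trans[prev][s])
                      + math.log(emit[s].get(tag, floor)))
        V.append(col)
        back.append(ptr)
    s = max(states, key=lambda x: V[-1][x])
    path = [s]
    for ptr in reversed(back):              # follow back-pointers
        s = ptr[s]
        path.append(s)
    return path[::-1]

states = ("NB", "B")                        # no break / break after this word
init = {"NB": 0.8, "B": 0.2}
trans = {"NB": {"NB": 0.7, "B": 0.3}, "B": {"NB": 0.9, "B": 0.1}}
emit = {"NB": {"DT": 0.4, "JJ": 0.3, "NN": 0.3},
        "B":  {"NN": 0.7, "VB": 0.3}}
print(viterbi_breaks(["DT", "JJ", "NN", "VB", "NN"], states, init, trans, emit))
```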

243 citations


Patent
17 Dec 1997
TL;DR: In this article, a speaker-independent, continuous speech, multilingual, multi-dialect, Automatic Speech Recognition (ASR) technology is used to identify key words in telephone conversations.
Abstract: The present invention comprises speaker-independent, continuous speech, multilingual, multi-dialect, Automatic Speech Recognition (ASR) technology. In particular, the present application integrates the ASR technology into call control technology such that it will identify key words in two ways. First, it will "Listen" to live conversations of any or all telephone lines controlled by the call control system. Second, it will "Listen" to recorded conversations of any or all voice recorder channels so that previously recorded telephone conversations can be quickly scanned to find key words or phrases spoken in the past. The unique aspect of this application is that it is being applied to the corrections industry for the purpose of spotting key words or phrases for investigative purposes or inmate control purposes which then can alert or trigger remedial action.

238 citations


Proceedings Article
14 Aug 1997
TL;DR: This work addresses the problem of discovering trends in text databases by defining a trend as a specific subsequence of the history of a phrase that satisfies the users' query over the histories.
Abstract: We address the problem of discovering trends in text databases. Trends can be used, for example, to discover that a company is shifting interests from one domain to another. We are given a database V of documents. Each document consists of one or more text fields and a timestamp. The unit of text is a word and a phrase is a list of words. (We defer the discussion of more complex structures till the "Methodology" section.) Associated with each phrase is a history of the frequency of occurrence of the phrase, obtained by partitioning the documents based upon their timestamps. The frequency of occurrence in a particular time period is the number of documents that contain the phrase. (Other measures of frequency are possible, e.g. counting each occurrence of the phrase in a document.) A trend is a specific subsequence of the history of a phrase that satisfies the users' query over the histories. For example, the user may specify a "spike" query to find those phrases whose frequency of occurrence increased and then decreased.
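
A toy version of the two central notions, assuming documents arrive pre-partitioned by time period and a phrase is matched by substring search; the paper's actual machinery (sequential patterns plus a shape query language) is far more general:

```python
def phrase_history(partitions, phrase):
    """History of a phrase: number of documents containing it in each
    chronologically ordered time partition."""
    return [sum(phrase in doc for doc in docs) for docs in partitions]

def is_spike(history):
    """A toy 'spike' shape query: frequency rises to an interior peak
    and then falls."""
    peak = history.index(max(history))
    rising = all(a <= b for a, b in zip(history[:peak], history[1:peak + 1]))
    falling = all(a >= b for a, b in zip(history[peak:], history[peak + 1:]))
    return 0 < peak < len(history) - 1 and rising and falling

parts = [["alpha beta"], ["data mining rocks", "data mining"], ["beta"]]
print(phrase_history(parts, "data mining"))            # [0, 2, 0]
print(is_spike(phrase_history(parts, "data mining")))  # True
```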

233 citations


Journal ArticleDOI
TL;DR: Sentence comprehension and production were evaluated for chronic aphasic patients who have been shown to demonstrate one of three patterns in the relative ease of retrieval of nouns and verbs; the results are interpreted as indicating multiple contributions to patients' sentence processing impairments.

194 citations


Journal ArticleDOI
TL;DR: The authors found that interference from an intervening plural depends on a close syntactic link to the head noun phrase (e.g., The owner of the house who charmed the realtors).

187 citations


Patent
23 Jun 1997
TL;DR: In this paper, a trainable method for keyword extraction of one or more words is described, in which every word within a document that is not a stop word is stemmed and evaluated and receives a score, which is then replaced by a word phrase that is delimited by punctuation or stop words.
Abstract: A trainable method of extracting keywords of one or more words is disclosed According to the method, every word within a document that is not a stop word is stemmed and evaluated and receives a score The scoring is performed based on a plurality of parameters which are adjusted through training prior to use of the method for keyword extraction Each word having a high score is then replaced by a word phrase that is delimited by punctuation or stop words The word phrase is selected from word phrases having the stemmed word therein Repeated keywords are removed The keywords are expanded and capitalisation is determined The resulting list forms extracted keywords
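
A compressed sketch of that pipeline: crude stemming, scoring each non-stop word (here frequency plus an early-position bonus, standing in for the patent's trained parameters), and expanding top-scoring words to phrases delimited by punctuation or stop words. The stemmer, stop list, and scoring function are all illustrative stand-ins:

```python
import re
from collections import Counter

STOP = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "for", "with"}

def stem(w):                        # crude suffix stripping, not a real stemmer
    return re.sub(r"(ing|ed|es|s)$", "", w)

def candidate_phrases(text):
    """Word phrases delimited by punctuation or stop words."""
    out = []
    for chunk in re.split(r"[.,;:!?()]", text.lower()):
        cur = []
        for tok in re.findall(r"[a-z]+", chunk):
            if tok in STOP:
                if cur:
                    out.append(cur)
                cur = []
            else:
                cur.append(tok)
        if cur:
            out.append(cur)
    return out

def extract_keywords(text, k=3, w_freq=1.0, w_pos=1.0):
    """Score stems, then replace each high-scoring word by a phrase that
    contains it; repeated keywords are skipped."""
    words = [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP]
    freq, first = Counter(), {}
    for i, w in enumerate(words):
        freq[stem(w)] += 1
        first.setdefault(stem(w), i)
    score = {s: w_freq * freq[s] + w_pos / (1 + first[s]) for s in freq}
    phrases, keywords = candidate_phrases(text), []
    for s in sorted(score, key=score.get, reverse=True):
        for ph in phrases:
            if any(stem(t) == s for t in ph):
                phrase = " ".join(ph)
                if phrase not in keywords:
                    keywords.append(phrase)
                break
        if len(keywords) == k:
            break
    return keywords

print(extract_keywords("Trainable keyword extraction: the method extracts "
                       "keywords and scores keyword candidates."))
# ['trainable keyword extraction', 'method extracts keywords',
#  'scores keyword candidates']
```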

180 citations


PatentDOI
TL;DR: A key-phrase detection and verification method that can be advantageously used to realize understanding of flexible (i.e., unconstrained) speech is presented in this paper; verified key-phrase candidates are connected into sentence hypotheses, which are themselves verified to yield an understanding of the utterance.
Abstract: A key-phrase detection and verification method that can be advantageously used to realize understanding of flexible ( i.e., unconstrained) speech. A "multiple pass" procedure is applied to a spoken utterance comprising a sequence of words ( i.e., a "sentence"). First, a plurality of key-phrases are detected (i.e., recognized) based on a set of phrase sub-grammars which may, for example, be specific to the state of the dialogue. These key-phrases are then verified by assigning confidence measures thereto and comparing these confidence measures to a threshold, resulting in a set of verified key-phrase candidates. Next, the verified key-phrase candidates are connected into sentence hypotheses based upon the confidence measures and predetermined (e.g., task-specific) semantic information. And, finally, one or more of these sentence hypotheses are verified to produce a verified sentence hypothesis and, from that, a resultant understanding of the spoken utterance.

01 Jan 1997
TL;DR: This paper showed that the syntactic subject-object preference is not as strong as has previously been assumed: the discourse-related properties of the NPs also play a role in determining order preferences.
Abstract: Various clause types in Dutch and German are at least temporarily ambiguous with respect to the order of subject and object. A number of previous studies regarding the processing of such subject-object ambiguities have reported a preference for a subject-object interpretation. This order preference has generally been attributed to a syntactic generalization, that is, a generalization which abstracts away from specific properties of the NPs and the verb in the clause. The results of the present experiments suggest, however, that the syntactic subject-object preference is not as strong as has previously been assumed: the discourse-related properties of the NPs also play a role in determining order preferences. First, the subject-object preference for main clauses is much weaker when the first NP is a wh-phrase than when it is a non-deictic definite NP; second, embedded wh-questions may even show an object-subject preference when the second NP is a pronoun. However, whether this non-structural information has an effect, and to what extent, depends on other factors, such as the manner of disambiguation (case, number information), and the point of disambiguation. In Chapter 2 an overview was given of the current literature on subject-object ambiguities in Dutch and German. With only a few exceptions, a preference for the subject-object order has been found, even in cases where plausibility or contextual information favored an object-subject interpretation. Several syntactic accounts of this order preference were discussed. Next, it was argued that information which is not purely structural in nature may also affect the order preference. Several predictions were formulated that were experimentally investigated in Chapters 3 and 4. In Chapter 3 the processing of subject-object and object-subject main clauses was investigated. Declarative clauses in which the first NP was a non-deictic definite NP were compared with wh-questions in which the first NP was a which-N (welke-N) phrase. Object-subject declaratives impose more restrictions on the discourse context than subject-object declaratives do. Subject- and object-initial wh-questions do not differ in this respect. In addition, subject- and object-initial declaratives have been claimed to differ in terms of phrase structure in a way subject- and object-initial wh-phrases do not. A weaker subject-object preference was therefore expected for the wh-clauses compared to the declaratives. Self-paced reading times (Experiment 1) showed a subject-object preference starting immediately at the disambiguating auxiliary. The difference between wh-clauses and declaratives with respect to the order preference became apparent one word later, suggesting that the nature of the first NP affects ambiguity resolution somewhat later than the overall syntactic subject-object bias. In Chapter 4 the impact of the discourse-related properties of the second NP was investigated using temporarily ambiguous embedded wh-questions. Pronouns differ from non-pronominal definite NPs in the frequency with which they are used in the subject position. This is related to the discourse-status of the elements they refer to. Non-pronominal definite NPs can either refer to given information or introduce new entities into the discourse. In contrast, pronouns are generally used to refer to given entities in the discourse which are salient. Given, salient entities are generally also the topic of discussion. The prototypical position for a topic is the subject position.
Pronouns therefore bias towards a subject interpretation. This bias is much weaker for definite NPs, especially if sentences are presented in the absence of a discourse context and the definite NP is taken to introduce new entities. A pronoun in second position thus introduces a bias for the object-subject order. This bias is in competition with the syntactic bias for the subject-object order. If the discourse-related properties of the NPs are taken into account during the processing of order ambiguities, a weak subject-object preference, or even a preference for an object-subject order is expected if the second NP is a pronoun. First, an off-line completion study (Experiment 2) was conducted, showing that the syntactic preference for the subject-object order also holds for embedded wh-clauses. Next, three experiments were carried out on wh-clauses in which the second NP was a case-marked pronoun. An off-line questionnaire study (Experiment 3) showed that people choose the nominative form (object-subject interpretation) more often than the accusative form (subject-object order). The preference for an object-subject order was replicated in two on-line studies. Self-paced grammaticality decision times (Experiment 4) and self-paced reading times (Experiment 5) showed an increase for the subject-object order relative to the object-subject order starting at or immediately after the disambiguating pronoun. No preference for the subject-object order was seen. The object-subject preference was partially replicated in Experiment 6. In this experiment, the wh-clauses were disambiguated by number information at the finite auxiliary in penultimate position. The pronoun itself was ambiguous. In addition, the length of the ambiguous region was manipulated: either one or six words separated the second NP pronoun from the disambiguating auxiliary. Again, an object-subject preference was found, but only in the conditions with a long ambiguous region. In the short conditions, subject-object clauses were responded to faster than object-subject clauses, but this difference was not significant. Finally, in Experiment 7, wh-clauses containing a case-ambiguous pronoun were compared with clauses containing a non-pronominal definite NP in second position. This time, four words separated the second NP from the disambiguating auxiliary. The clauses with a definite NP showed a tendency for a subject-object preference. No preference for either order was found for clauses containing the ambiguous pronoun, in contrast to the object-subject preference found in conditions with a case-marked pronoun (Experiments 3-5) or in conditions in which the ambiguous region was six words in length (Experiment 6). These results suggest that the discourse-related properties of the NPs can indeed have an effect on order preference. However, the time course and strength of this effect depends on other factors such as the manner and point of disambiguation. In Chapter 5 the frequencies of occurrence of subject- and object-initial wh-clauses were investigated in a sample of written Dutch texts. Collapsing across the various types of predicates, the subject-initial order is the most frequent. However, when counts are restricted to transitive and ditransitive predicates only, the object-subject order is the most frequent. The nature of the second NP appears to be of influence: the object-subject order is significantly more frequent in clauses containing a pronoun than in clauses containing a definite NP or an indefinite NP.
These frequency data are interesting in the light of frequency-based theories of sentence processing. These theories predict a correspondence between processing difficulty and frequency: the most frequent solution to the ambiguity should elicit the least processing difficulties. An important issue in this respect is the grain-size problem: which categories can be distinguished in terms of frequency, and on the basis of which information? The present data suggest that a grain-size according to which transitive welke-questions are treated as one, separate class cannot be correct for the following reason. For transitive welke-questions in general, the object-subject order is the most frequent. Transitive welke-clauses containing a definite NP, however, showed a reading time advantage for the subject-object order (Experiment 6). Tabulating frequencies separately for welke-questions containing a definite NP will not solve this problem: the object-subject order for such clauses is still more frequent than the subject-object order, in spite of the reversed parsing preference. A possible solution is to assume either that grain-size is even finer, or that grain-size is not fixed, but that several levels of abstraction are taken into consideration during processing. The results of the experiments and the corpus study were summarized in Chapter 6. The data suggest that not only syntactic and discourse-related preferences play a role in determining the order preference, but that also the manner and point of disambiguation are of importance. Which order is ultimately preferred, the strength of this preference and the development of the preference over time are determined by the interplay of these and other factors. It was shown that these factors do not have an equally strong contribution; rather some factors or combinations of factors are stronger than others. Finally, four current theories of sentence processing were discussed. Garden-path theories and constraint-based theories account for the Dutch data most readily. These two approaches differ with respect to the modularity of syntactic processing: according to garden-path theories an initial, informationally encapsulated syntactic stage of processing can be distinguished; non-syntactic information may affect processing only somewhat later. According to constraint-based theories, all kinds of information are made use of immediately when available. Future research should be directed at constructing quantitative models which capture the relative impact of various sources of information. Only then can precise predictions be made which can be used to decide between garden-path and constraint-based approaches to sentence processing.

Patent
21 Jan 1997
TL;DR: A method that uses a carefully assembled list of partition elements to partition text into chunks and selects phrases from the chunks according to a small number of frequency-based definitions; it can also incorporate additional processes such as categorization of proper names to enhance phrase recognition.
Abstract: A phrase recognition method breaks streams of text into text "chunks" and selects certain chunks as "phrases" useful for automated full text searching. The phrase recognition method uses a carefully assembled list of partition elements to partition the text into the chunks, and selects phrases from the chunks according to a small number of frequency based definitions. The method can also incorporate additional processes such as categorization of proper names to enhance phrase recognition. The method selects phrases quickly and efficiently, referring simply to the phrases themselves and the frequency with which they are encountered, rather than relying on complex, time-consuming, resource-consuming grammatical analysis, or on collocation schemes of limited applicability, or on heuristical text analysis of limited reliability or utility.
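
The core loop is easy to sketch: scan tokens, cut a chunk whenever a partition element (punctuation or a function word from a hand-built list) appears, then keep chunks that recur. The tiny partition list and frequency threshold below are placeholders for the patent's carefully assembled ones:

```python
import re
from collections import Counter

PARTITION = {"the", "a", "an", "of", "and", "or", "to", "in", "on",
             "for", "with", "is", "are", "was"}

def select_phrases(text, min_freq=2):
    """Break the token stream into chunks at partition elements and
    punctuation; select as index phrases the chunks occurring at least
    min_freq times -- a purely frequency-based definition, with no
    grammatical analysis."""
    chunks, cur = [], []
    for tok in re.findall(r"[a-z']+|[.,;:!?]", text.lower()):
        if tok in PARTITION or not tok[0].isalpha():
            if cur:
                chunks.append(" ".join(cur))
                cur = []
        else:
            cur.append(tok)
    if cur:
        chunks.append(" ".join(cur))
    return [c for c, n in Counter(chunks).most_common() if n >= min_freq]

text = "Full text searching is hard. Tools for full text searching are rare."
print(select_phrases(text))   # ['full text searching']
```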

Book
01 Jan 1997
TL;DR: This book covers baseline ASU systems, prosody, prosodic labeling and speech corpora, prosodic phrase models, and the integration of prosodic attributes into automatic speech understanding (ASU).
Abstract: Contents: Basic approaches - The baseline ASU systems - Prosody - Prosodic labeling and speech corpora - Preprocessing and classification - Prosodic phrase models - Integration of the prosodic attributes in ASU - Future work - Summary

Journal ArticleDOI
TL;DR: In this paper, the structural and word order properties of the adjectival projection of Dutch adjectives have been investigated and a strong empirical and theoretical basis for extending the functional head hypothesis to the adjective system has been established.
Abstract: This paper is concerned with the phrase structural and word order properties of the (extended) adjectival projection, a phrase structural domain which has received relatively little attention in the generative literature. Focusing on the internal syntax of Dutch adjective phrases, I will come to the following conclusions. First, there is a strong empirical (and theoretical) basis for extending the functional head hypothesis to the adjectival system (i.e. for adopting the DegP-hypothesis). Secondly, a distinction should be made between two types of functional degree categories: Deg(P) and Q(P). This split is represented structurally, with Deg selecting QP and Q selecting AP (the split degree system hypothesis). Thirdly, there is empirical support for the existence of a third functional projection, AgrP, within the adjectival domain. Fourthly, as regards directionality of headedness within the Dutch functional system, it is concluded that Deg and Q take their complements to the right, whereas Agr takes its complement to the left. It is proposed that this asymmetry of headedness within the functional structure of the adjectival projection relates to the nominal orientation of Deg and Q and the verbal orientation of Agr. Finally, three movement operations will be identified within the Dutch adjectival system: A-to-Q raising, A-to-Agr raising and leftward scrambling. The latter two are at the basis of the word order variation which is found within the Dutch adjectival system.

Journal ArticleDOI
TL;DR: An online reading paradigm was used to examine the working memory capacity-constrained sentence processing model of M.C. MacDonald, M.A. Just, and P.A. Carpenter (1992); working memory span, type of syntactic ambiguity (ambiguous vs. unambiguous), and type of syntactic ambiguity resolution (main verb vs. relative clause) interacted to influence younger and older adults' on-line reading times and off-line sentence comprehension.
Abstract: Off-line studies of younger and older adults' processing of syntactically complex sentences have shown that there is a consistent negative relationship between task performance and working memory for older adults. However, it is not evident from these studies whether working memory affects the immediate syntactic analysis of a sentence, off-line processes, or both. In the current study an online reading paradigm was used to examine the working memory capacity-constrained sentence processing model from M.C. MacDonald, M.A. Just, and P.A. Carpenter (1992). Working memory span, type of syntactic ambiguity (ambiguous vs. unambiguous), and type of syntactic ambiguity resolution (main verb vs. relative clause) interacted to influence younger and older adults' on-line reading times and off-line sentence comprehension.

Journal ArticleDOI
TL;DR: It is argued that grammatical categories constitute an organizing parameter of representation and/or processing for each of the independent, modality-specific lexicons and that these observations contribute to the growing evidence that access to the orthographic and phonological forms of words can occur independently.

Proceedings ArticleDOI
07 Jul 1997
TL;DR: The results support the need to distinguish homonymy and polysemy and suggest where natural language processing can provide further improvements in retrieval performance.
Abstract: This paper discusses research on distinguishing word meanings in the context of information retrieval systems. We conducted experiments with three sources of evidence for making these distinctions: morphology, part-of-speech, and phrases. We have focused on the distinction between homonymy and polysemy (unrelated vs. related meanings). Our results support the need to distinguish homonymy and polysemy. We found: 1) grouping morphological variants makes a significant improvement in retrieval performance, 2) that more than half of all words in a dictionary that differ in part-of-speech are related in meaning, and 3) that it is crucial to assign credit to the component words of a phrase. These experiments provide better understanding of word-based methods, and suggest where natural language processing can provide further improvements in retrieval performance.

Journal ArticleDOI
TL;DR: The results suggest that readers encode focused information more carefully, either upon first encountering it or during a second-pass reading of it, and that the enhanced memory representations for focused information found in previous studies may be due in part to differences in reading patterns.
Abstract: In two experiments, we explored how readers encode information that is linguistically focused. Subjects read sentences in which a word or phrase was focused by a syntactic manipulation (Experiment 1) or by a preceding context (Experiment 2) while their eye movements were monitored. Readers had longer reading times while reading a region of the sentence that was focused than when the same region was not focused. The results suggest that readers encode focused information more carefully, either upon first encountering it or during a second-pass reading of it. We conclude that the enhanced memory representations for focused information found in previous studies may be due in part to differences in reading patterns for focused information.

Patent
12 Aug 1997
TL;DR: In this paper, a method and apparatus for mining text databases, employing sequential pattern phrase identification and shape queries, to discover trends is presented, where a maximum and minimum gap between words in the phrases and the minimum support all phrases must meet for the selected time period are specified.
Abstract: A method and apparatus for mining text databases, employing sequential pattern phrase identification and shape queries, to discover trends. The method passes over a desired database using a dynamically generated shape query. Documents within the database are selected based on specific classifications and user defined partitions. Once a partition is specified, transaction IDs are assigned to the words in the text documents depending on their placement within each document. The transaction IDs encode both the position of each word within the document as well as representing sentence, paragraph, and section breaks, and are represented in one embodiment as long integers with the sentence boundaries. A maximum and minimum gap between words in the phrases and the minimum support all phrases must meet for the selected time period may be specified. A generalized sequential pattern method is used to generate those phrases in each partition that meet the minimum support threshold. The shape query engine takes the set of phrases for the partition of interest and selects those that match a given shape query. A query may take the form of requesting a trend such as "recent upwards trend", "recent spikes in usage", "downward trends", and "resurgence of usage". Once the phrases matching the shape query are found, they are presented to the user.
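
One way to picture the transaction-ID encoding: IDs increase by one within a sentence but jump at sentence and paragraph breaks, so a maximum-gap constraint between the words of a phrase automatically prevents phrases from spanning those boundaries. The particular step sizes below are illustrative, not the patent's:

```python
def transaction_ids(doc, sent_step=100, para_step=10_000):
    """Number the words of a document so that consecutive words within a
    sentence differ by 1, while sentence and paragraph boundaries force
    large jumps; a max-gap constraint between phrase words then cannot be
    satisfied across a boundary."""
    ids, tid = [], 0
    for para in doc.split("\n\n"):
        tid = (tid // para_step + 1) * para_step    # jump at paragraph break
        for sent in para.split("."):
            tid = (tid // sent_step + 1) * sent_step  # jump at sentence break
            for word in sent.split():
                tid += 1
                ids.append((word, tid))
    return ids

print(transaction_ids("Data mining. Text trends.\n\nNew paragraph here."))
# [('Data', 10101), ('mining', 10102), ('Text', 10201), ('trends', 10202),
#  ('New', 20101), ('paragraph', 20102), ('here', 20103)]
```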


Journal ArticleDOI
TL;DR: This paper shows how database tomography can be used to enhance information retrieval from large textual databases through the newly developed process of simulated nucleation.
Abstract: Database tomography is an information extraction and analysis system which operates on textual databases. Its primary use to date has been to identify pervasive technical thrusts and themes, and the interrelationships among these themes and sub-themes, which are intrinsic to large textual databases. Its two main algorithmic components are multiword phrase frequency analysis and phrase proximity analysis. This paper shows how database tomography can be used to enhance information retrieval from large textual databases through the newly developed process of simulated nucleation. The principles of simulated nucleation are presented, and the advantages for information retrieval are delineated. An application is described of developing, from Science Citation Index and Engineering Compendex, a database of journal articles focused on near-Earth space science and technology.
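
The two algorithmic components are simple to sketch. Below, phrase frequency analysis counts all n-word sequences across documents, and phrase proximity analysis counts the words clustering within a window around a target phrase; both are illustrative reductions of the full system:

```python
import re
from collections import Counter

def tokens(doc):
    return re.findall(r"[a-z]+", doc.lower())

def phrase_frequency(docs, n=2):
    """Multiword phrase frequency analysis: counts of all n-grams."""
    freq = Counter()
    for d in docs:
        t = tokens(d)
        freq.update(tuple(t[i:i + n]) for i in range(len(t) - n + 1))
    return freq

def phrase_proximity(docs, target, window=5):
    """Phrase proximity analysis: words occurring within `window` tokens
    of each occurrence of the target phrase."""
    target, near = tuple(target), Counter()
    for d in docs:
        t = tokens(d)
        for i in range(len(t) - len(target) + 1):
            if tuple(t[i:i + len(target)]) == target:
                lo, hi = max(0, i - window), i + len(target) + window
                near.update(t[lo:i] + t[i + len(target):hi])
    return near

docs = ["Space science payloads orbit the Earth.",
        "Near-Earth space science drives new payloads."]
print(phrase_frequency(docs).most_common(2))            # ('space','science') tops
print(phrase_proximity(docs, ["space", "science"]).most_common(3))
```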

Proceedings ArticleDOI
21 Apr 1997
TL;DR: The main components of the speech understanding system are described: the large vocabulary recognizer and the language understanding module performing the call-type classification, and automatic algorithms for selecting phrases from a training corpus in order to enhance the prediction power of the standard word n-gram.
Abstract: We are interested in the problem of understanding fluently spoken language. In particular, we consider people's responses to the open-ended prompt of "How may I help you?". We then further restrict the problem to classifying and automatically routing such a call, based on the meaning of the user's response. Thus, we aim at extracting a relatively small number of semantic actions from the utterances of a very large set of users who are not trained to the system's capabilities and limitations. In this paper, we describe the main components of our speech understanding system: the large vocabulary recognizer and the language understanding module performing the call-type classification. In particular, we propose automatic algorithms for selecting phrases from a training corpus in order to enhance the prediction power of the standard word n-gram. The phrase language models are integrated into stochastic finite state machines which outperform standard word n-gram language models. From the speech recognizer output we recognize and exploit automatically-acquired salient phrase fragments to make a call-type classification. This system is evaluated on a database of 10 K fluently spoken utterances collected from interactions between users and human agents.
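
One simple stand-in for the phrase-selection idea is to promote adjacent word pairs with high pointwise mutual information to single tokens before training a standard n-gram model; the paper itself uses its own selection criteria and integrates the phrases into stochastic finite-state machines. A sketch:

```python
import math
from collections import Counter

def pmi_phrases(corpus, min_count=2, min_pmi=1.5):
    """Candidate phrases = adjacent word pairs whose pointwise mutual
    information exceeds a threshold. corpus: list of token lists."""
    uni, bi = Counter(), Counter()
    for sent in corpus:
        uni.update(sent)
        bi.update(zip(sent, sent[1:]))
    n = sum(uni.values())
    return {p for p, c in bi.items()
            if c >= min_count
            and math.log(c * n / (uni[p[0]] * uni[p[1]])) >= min_pmi}

def retokenize(sent, phrases):
    """Greedily merge selected pairs into single tokens such as
    'collect_call', ready for ordinary n-gram training."""
    out, i = [], 0
    while i < len(sent):
        if i + 1 < len(sent) and (sent[i], sent[i + 1]) in phrases:
            out.append(sent[i] + "_" + sent[i + 1])
            i += 2
        else:
            out.append(sent[i])
            i += 1
    return out

corpus = [["i", "want", "to", "make", "a", "collect", "call"],
          ["a", "collect", "call", "please"],
          ["collect", "call", "to", "a", "friend"]]
print(retokenize(corpus[0], pmi_phrases(corpus)))
# ['i', 'want', 'to', 'make', 'a', 'collect_call']
```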

Book ChapterDOI
01 Jan 1997
TL;DR: A new computational model of prosody aimed at recognizing detailed intonation patterns, both pitch accent and phrase boundary location and their specific tonal markers, using a multi-level representation to capture acoustic feature dependence at different time scales is described.
Abstract: Prosodic patterns can be an important source of information for interpreting an utterance, but because the suprasegmental nature poses a challenge to computational modelling, prosody has seen limited use in automatic speech understanding. This work describes a new computational model of prosody aimed at recognizing detailed intonation patterns, both pitch accent and phrase boundary location and their specific tonal markers, using a multi-level representation to capture acoustic feature dependence at different time scales. The model assumes that an utterance is a sequence of phrases, each of which is composed of a sequence of syllable-level tone labels, which are in turn realized as a sequence of acoustic feature vectors (fundamental frequency and energy) depending in part on the segmental composition of the syllable. The variable lengths are explicitly modelled in a probabilistic representation of the complete sequence, using a dynamical system model at the syllable level that builds on existing models of intonation. Recognition and training algorithms are described, and initial experimental results are reported for prosodic labelling of radio news speech.

Journal ArticleDOI
TL;DR: Three speech modifications were experimentally manipulated in order to investigate their individual and combined effects on sentence comprehension in patients with Alzheimer's disease; comprehension improved when a sentence was repeated in verbatim or paraphrased form, but not when it was presented at a slow speech rate.
Abstract: Caregivers of patients diagnosed with Alzheimer's disease (AD) are often advised to modify their speech to facilitate the patients' sentence comprehension. Three common recommendations are to (a) speak in simple sentences, (b) speak slowly, and (c) repeat one's utterance, using the same words. These three speech modifications were experimentally manipulated in order to investigate their individual and combined effects on sentence comprehension in AD. Fifteen patients with mild to moderate AD and 20 healthy older persons were tested on a sentence comprehension task with sentences varying in terms of (a) degree of grammatical complexity, (b) rate of presentation (normal vs. slow), and (c) form of repetition (verbatim vs. paraphrase). The results indicated a significant decline in sentence comprehension for the AD group. Sentence comprehension improved, however, after the sentence was repeated in either verbatim or paraphrased form. However, the patients' comprehension did not improve for sentences presented at the slow speech rate. This pattern of results is explained vis-à-vis the patients' working memory loss. The findings challenge the appropriateness of several clinical recommendations.


Journal ArticleDOI
TL;DR: The basic definitions of construal are presented, illustrating the theory with some already published evidence, and the question of what kind of mechanism operates in the process of interpreting a nonprimary phrase (a phrase that receives an underspecified syntactic analysis) is raised.
Abstract: Is there underspecification in the syntactic phrase marker constructed during on-line sentence analysis? According to the construal hypothesis (Frazier & Clifton, 1996), a very limited amount and type of structural underspecification is available to the human sentence parsing mechanism. Here we present the basic definitions of construal, illustrating the theory with some already published evidence. We also discuss several new pieces of evidence, from our laboratory and elsewhere, that support the construal hypothesis. We end by raising the question of what kind of mechanism operates in the process of interpreting a nonprimary phrase (a phrase that receives an underspecified syntactic analysis), and conclude that it is not a process of competition between multiple activated possible analyses but instead is a process in which the sheer existence of ambiguity need not result in increased processing cost.

Proceedings Article
17 Sep 1997
TL;DR: An inclusive notion of bilexical grammars is formalized, and it is shown that they can be parsed in time only O(n^3 gtm) ≈ O(n^3), where g, t, and m are bounded by the grammar and are typically small.
Abstract: Computational linguistics has a long tradition of lexicalized grammars, in which each grammatical rule is specialized for some individual word. The earliest lexicalized rules were word-specific subcategorization frames. It is now common to find fully lexicalized versions of many grammatical formalisms, such as context-free and tree-adjoining grammars [Schabes et al. 1988]. Other formalisms, such as dependency grammar [Mel'cuk 1988] and head-driven phrase-structure grammar [Pollard & Sag 1994], are explicitly lexical from the start. Lexicalized grammars have two well-known advantages. Where syntactic acceptability is sensitive to the quirks of individual words, lexicalized rules are necessary for linguistic description. Lexicalized rules are also computationally cheap for parsing written text: a parser may ignore those rules that do not mention any input words. More recently, a third advantage of lexicalized grammars has emerged. Even when syntactic acceptability is not sensitive to the particular words chosen, syntactic distribution may be [Resnik 1993]. Certain words may be able but highly unlikely to modify certain other words. Such facts can be captured by a probabilistic lexicalized grammar, where they may be used to resolve ambiguity in favor of the most probable analysis, and also to speed parsing by avoiding ("pruning") unlikely search paths. Accuracy and efficiency can therefore both benefit. Recent work along these lines includes [Charniak 1995, Collins 1996, Eisner 1996b, Collins 1997], who reported state-of-the-art parsing accuracy. Related models are proposed without evaluation in [Lafferty et al. 1992, Alshawi 1996]. This recent flurry of probabilistic lexicalized parsers has focused on what one might call bilexical grammars, in which each grammatical rule is specialized for not one but two individual words. The central insight is that specific words subcategorize to some degree for other specific words: tax is a good object for the verb raise. Accordingly, these models estimate, for example, the probability that (a phrase headed by) word y modifies word x, for any two words x, y in the vocabulary V. At first blush, probabilistic bilexical grammars appear to carry a substantial computational penalty. Chart parsers derived directly from CKY or Earley's algorithm take time O(n^3 min(n, |V|)^2), which amounts to O(n^5) in practice. Such algorithms implicitly or explicitly regard the grammar as a context-free grammar in which a noun phrase headed by tiger bears the special nonterminal NP_tiger. Such ≈ O(n^5) algorithms are explicitly used by [Alshawi 1996, Collins 1996], and almost certainly by [Charniak 1995] as well. The present paper formalizes an inclusive notion of bilexical grammars, and shows that they can be parsed in time only O(n^3 gtm) ≈ O(n^3), where g, t, and m are bounded by the grammar and are typically small. (g is the maximum number of senses per input word, t measures the degree of lexical interdependence that the grammar allows among the several children of a word, and m bounds the number of modifier relations that the parser need distinguish for a given pair of words.) The new algorithm also reduces space requirements to O(n^2 gt) ≈ O(n^2), from the ≈ O(n^3) required by CKY-style approaches to bilexical grammar. The parsing algorithm finds the highest-scoring analysis or analyses generated by the grammar, under a probabilistic or other measure. Non-probabilistic grammars may be treated as a special case.
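
The flavor of the cubic-time result can be conveyed with a stripped-down relative: first-order projective dependency parsing over arc scores, using the span-based dynamic program (no word senses, automaton states, or multiple modifier relations, so the g, t, and m factors disappear). A minimal sketch, not the paper's full algorithm:

```python
def eisner_best(score):
    """Best projective dependency tree score in O(n^3) time and O(n^2)
    space. score[h][d] is the score of head h taking dependent d; word 0
    is an artificial root."""
    n = len(score)
    NEG = float("-inf")
    # C[i][j][d]: best complete span, I[i][j][d]: best incomplete span,
    # with d = 0 (head at right end j) or d = 1 (head at left end i).
    C = [[[0.0, 0.0] for _ in range(n)] for _ in range(n)]
    I = [[[NEG, NEG] for _ in range(n)] for _ in range(n)]
    for w in range(1, n):
        for i in range(n - w):
            j = i + w
            # incomplete spans: join two complete halves and add an arc
            best = max(C[i][k][1] + C[k + 1][j][0] for k in range(i, j))
            I[i][j][0] = best + score[j][i]      # arc j -> i
            I[i][j][1] = best + score[i][j]      # arc i -> j
            # complete spans: attach a finished half-constituent
            C[i][j][0] = max(C[i][k][0] + I[k][j][0] for k in range(i, j))
            C[i][j][1] = max(I[i][k][1] + C[k][j][1] for k in range(i + 1, j + 1))
    return C[0][n - 1][1]    # best tree rooted at word 0

# toy run: root = 0, words 1..2; best tree is root -> 1, 1 -> 2
scores = [[0, 2, 1],
          [0, 0, 3],
          [0, 1, 0]]
print(eisner_best(scores))   # 5.0
```

The triple loop over (i, k, j) is where the n^3 comes from; the grammar factors g, t, and m multiply in once senses and automaton states are tracked per span endpoint.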

Patent
11 Jun 1997
TL;DR: An advertising device is composed of an interaction recording part (2) recording chat contents, a word extracting part (3) extracting words from the recorded chat, an advertising dictionary (4) storing advertising contents, an advertisement retrieving part (5) selecting advertisements whose registered words/phrases match an extracted word, and an advertisement providing part (6) writing the selected advertisement into the chat record.
Abstract: PROBLEM TO BE SOLVED: To enable a speaker participating in an electronic conversation to obtain desired information at the desired point in time, and to advertise efficiently, by allowing the advertising side to provide the information a speaker needs at the moment it is needed. SOLUTION: An advertising device 1 is composed of an interaction recording part 2 recording chat contents, a word extracting part 3 extracting words from the part 2, an advertising dictionary 4 storing advertising contents, an advertisement retrieving part 5 selecting advertisements concerning a word/phrase coincident with an extracted word from the dictionary 4, and an advertisement providing part 6 recording the selected advertisement in the part 2. When a word within a sentence inputted by a chat participant coincides with a registered word/phrase, an advertising message matching it is provided.
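
The pipeline reduces to a few lines: record each utterance, extract its words, look them up in an advertising dictionary, and write any matching ad back into the conversation. The dictionary contents below are invented for illustration:

```python
AD_DICTIONARY = {   # registered word/phrase -> advertising message (illustrative)
    "pizza": "Try Mario's Pizza -- two for one tonight!",
    "ski": "Snowline Resort: early-season lift passes on sale.",
}

def advertise(chat_line, log):
    """Record the chat line; if any word matches a registered word/phrase
    in the advertising dictionary, append the matching ad to the log --
    the record/extract/retrieve/provide flow the patent describes."""
    log.append(chat_line)
    for word in chat_line.lower().split():
        ad = AD_DICTIONARY.get(word.strip(".,!?"))
        if ad:
            log.append("[AD] " + ad)

log = []
advertise("Anyone want pizza after the movie?", log)
print("\n".join(log))
# Anyone want pizza after the movie?
# [AD] Try Mario's Pizza -- two for one tonight!
```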

PatentDOI
TL;DR: In this article, a speaker identification system includes a speaker-independent phrase recognizer and a score processor coupled to the outputs of the speaker independent phrase recognition and the speaker-dependent phrase recognition is used to determine a putative identity.
Abstract: A speaker identification system includes a speaker-independent phrase recognizer. The speaker-independent phrase recognizer scores a password utterance against all the sets of phonetic transcriptions in a lexicon database to determine the N best speaker-independent scores, determines the N best sets of phonetic transcriptions based on the N best speaker-independent scores, and determines the N best possible identities. A speaker-dependent phrase recognizer retrieves the hidden Markov model corresponding to each of the N best possible identities, and scores the password utterance against each of the N hidden Markov models to generate a speaker-dependent score for each of the N best possible identities. A score processor coupled to the outputs of the speaker-independent phrase recognizer and the speaker-dependent phrase recognizer determines a putative identity. A verifier coupled to the score processor authenticates the determined putative identity.