
Showing papers on "Natural language published in 2005"


Journal ArticleDOI
TL;DR: The Principles-and-Parameters approach opened the way to accounting for properties of language in terms of general considerations of computational efficiency, eliminating some of the technology postulated as specific to language and providing a more principled explanation of linguistic phenomena.
Abstract: The biolinguistic perspective regards the language faculty as an “organ of the body,” along with other cognitive systems. Adopting it, we expect to find three factors that interact to determine (I-) languages attained: genetic endowment (the topic of Universal Grammar), experience, and principles that are language- or even organism-independent. Research has naturally focused on I-languages and UG, the problems of descriptive and explanatory adequacy. The Principles-and-Parameters approach opened the possibility for serious investigation of the third factor, and the attempt to account for properties of language in terms of general considerations of computational efficiency, eliminating some of the technology postulated as specific to language and providing more principled explanation of linguistic phenomena.

1,409 citations


Book
01 Jan 2005

996 citations


Journal ArticleDOI
TL;DR: This paper explores the detection of domain-specific emotions using language and discourse information in conjunction with acoustic correlates of emotion in speech signals on a case study of detecting negative and non-negative emotions using spoken language data obtained from a call center application.
Abstract: The importance of automatically recognizing emotions from human speech has grown with the increasing role of spoken language interfaces in human-computer interaction applications. This paper explores the detection of domain-specific emotions using language and discourse information in conjunction with acoustic correlates of emotion in speech signals. The specific focus is on a case study of detecting negative and non-negative emotions using spoken language data obtained from a call center application. Most previous studies in emotion recognition have used only the acoustic information contained in speech. In this paper, a combination of three sources of information-acoustic, lexical, and discourse-is used for emotion recognition. To capture emotion information at the language level, an information-theoretic notion of emotional salience is introduced. Optimization of the acoustic correlates of emotion with respect to classification error was accomplished by investigating different feature sets obtained from feature selection, followed by principal component analysis. Experimental results on our call center data show that the best results are obtained when acoustic and language information are combined. Results show that combining all the information, rather than using only acoustic information, improves emotion classification by 40.7% for males and 36.4% for females (linear discriminant classifier used for acoustic information).

959 citations
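
The paper's "emotional salience" is information-theoretic: a word is salient to the extent that observing it shifts the distribution over emotion classes away from the prior. Below is a minimal sketch of that computation; the toy call-center-style data and the smoothing constant are assumptions for illustration, not the paper's exact estimator.

```python
import math
from collections import Counter, defaultdict

def emotional_salience(utterances, alpha=0.5):
    """utterances: list of (word_list, emotion_label) pairs."""
    class_counts = Counter(label for _, label in utterances)
    word_class = defaultdict(Counter)
    for words, label in utterances:
        for w in set(words):
            word_class[w][label] += 1
    classes = list(class_counts)
    total = sum(class_counts.values())
    prior = {c: class_counts[c] / total for c in classes}
    salience = {}
    for w, counts in word_class.items():
        n_w = sum(counts.values())
        sal = 0.0
        for c in classes:
            # Smoothed posterior P(class | word), compared to the prior.
            p = (counts[c] + alpha) / (n_w + alpha * len(classes))
            sal += p * math.log(p / prior[c])
        salience[w] = sal
    return salience

data = [("no no this is wrong".split(), "negative"),
        ("thank you that helps".split(), "non-negative"),
        ("this is terrible".split(), "negative")]
print(sorted(emotional_salience(data).items(), key=lambda kv: -kv[1])[:3])
```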


Proceedings Article
26 Jul 2005
TL;DR: A learning algorithm is described that takes as input a training set of sentences labeled with expressions in the lambda calculus and induces a grammar for the problem, along with a log-linear model that represents a distribution over syntactic and semantic analyses conditioned on the input sentence.
Abstract: This paper addresses the problem of mapping natural language sentences to lambda–calculus encodings of their meaning. We describe a learning algorithm that takes as input a training set of sentences labeled with expressions in the lambda calculus. The algorithm induces a grammar for the problem, along with a log-linear model that represents a distribution over syntactic and semantic analyses conditioned on the input sentence. We apply the method to the task of learning natural language interfaces to databases and show that the learned parsers outperform previous methods in two benchmark database domains.

865 citations
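
The log-linear component described above is easy to picture in isolation: each candidate (syntactic, semantic) analysis receives a weighted-feature score, and a softmax turns the scores into a conditional distribution given the sentence. A toy sketch under assumed features and weights; the paper's induced grammar and feature set are far richer.

```python
import math

def log_linear(candidates, features, weights):
    """P(analysis | sentence) over a candidate list, via weighted features."""
    scores = [sum(weights.get(k, 0.0) * v for k, v in features(a).items())
              for a in candidates]
    z = math.log(sum(math.exp(s) for s in scores))
    return [math.exp(s - z) for s in scores]

# Two toy lambda-calculus meanings for "flights to Boston".
candidates = ["lambda x. flight(x) & to(x, boston)", "lambda x. to(x, boston)"]
features = lambda a: {"mentions_flight": float("flight" in a),
                      "length": float(len(a.split()))}
weights = {"mentions_flight": 1.5, "length": -0.1}
print(log_linear(candidates, features, weights))  # higher mass on the first
```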


Journal ArticleDOI
TL;DR: The approach is sufficiently problematic that it cannot be used to support claims about the evolution of language, nor related arguments that language is not an adaptation, namely that it is "perfect," non-redundant, unusable in any partial form, and badly designed for communication.

850 citations


Journal ArticleDOI
TL;DR: Examining asymmetrical brain and cognitive functions provides a unique opportunity for understanding the neural basis of complex cognition.

777 citations


01 Jan 2005
TL;DR: It is argued that an event structure can provide a distinct and useful level of representation for linguistic analysis involving the aspectual properties of verbs, adverbial scope, the role of argument structure, and the mapping from the lexicon to syntax.
Abstract: In this paper we examine the role of events within a theory of lexical semantics. We propose a configurational theory of event structure and examine how it contributes to a lexical semantic theory for natural language. In particular, we argue that an event structure can provide a distinct and useful level of representation for linguistic analysis involving the aspectual properties of verbs, adverbial scope, the role of argument structure, and the mapping from the lexicon to syntax.

690 citations


Proceedings Article
05 Dec 2005
TL;DR: A new kernel method for extracting semantic relations between entities in natural language text, based on a generalization of subsequence kernels, is presented, which uses three types of subsequence patterns that are typically employed in natural language to assert relationships between two entities.
Abstract: We present a new kernel method for extracting semantic relations between entities in natural language text, based on a generalization of subsequence kernels. This kernel uses three types of subsequence patterns that are typically employed in natural language to assert relationships between two entities. Experiments on extracting protein interactions from biomedical corpora and top-level relations from newspaper corpora demonstrate the advantages of this approach.

546 citations
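
For intuition about the kernel itself: a gapped-subsequence kernel counts subsequences shared by two token sequences, discounting gaps with a decay factor. The sketch below implements the classic string-subsequence recursion (in the style of Lodhi et al.) rather than the paper's relation-specific generalization; the subsequence length n and decay lam are illustrative.

```python
from functools import lru_cache

def subsequence_kernel(s, t, n=2, lam=0.5):
    """Count (gap-decayed) common subsequences of length n between s and t."""
    s, t = tuple(s), tuple(t)

    @lru_cache(maxsize=None)
    def kp(i, j, q):  # auxiliary K'_q over prefixes s[:i], t[:j]
        if q == 0:
            return 1.0
        if i < q or j < q:
            return 0.0
        total = lam * kp(i - 1, j, q)
        for k in range(j):
            if t[k] == s[i - 1]:
                total += kp(i - 1, k, q - 1) * lam ** (j - k + 1)
        return total

    @lru_cache(maxsize=None)
    def kernel(i, j, q):  # K_q over prefixes s[:i], t[:j]
        if i < q or j < q:
            return 0.0
        total = kernel(i - 1, j, q)
        for k in range(j):
            if t[k] == s[i - 1]:
                total += kp(i - 1, k, q - 1) * lam ** 2
        return total

    return kernel(len(s), len(t), n)

a = "protein A interacts with protein B".split()
b = "protein X binds and interacts with protein Y".split()
print(subsequence_kernel(a, b, n=2))
```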


Journal ArticleDOI
TL;DR: This chapter presents the challenges of NLP, the progress so far made in this field, NLP applications, the components of NLP, and the grammar of the English language as a machine requires it.
Abstract: Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems - text summarization, information extraction, information retrieval, etc., including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.

543 citations


Book
01 Feb 2005
TL;DR: A comprehensive reference text that examines how the three aspects of language (genre, text and grammar) can be used as resources in teaching and assessing writing.
Abstract: A comprehensive reference text that examines how the three aspects of language (genre, text and grammar) can be used as resources in teaching and assessing writing. It provides an accessible account of current theories of language and language learning, together with practical ideas for teaching and assessing the genres and grammar of writing across the curriculum.

406 citations


Dissertation
01 Jan 2005
TL;DR: This thesis focuses on two segmentation tasks, named-entity recognition and Chinese word segmentation, and shows that features derived from unlabeled data substantially improve performance, both in terms of reducing the amount of labeled data needed to achieve a given performance level and in terms of reducing the error using a fixed amount of labeled data.
Abstract: Statistical supervised learning techniques have been successful for many natural language processing tasks, but they require labeled datasets, which can be expensive to obtain. On the other hand, unlabeled data (raw text) is often available "for free" in large quantities. Unlabeled data has shown promise in improving the performance of a number of tasks, e.g. word sense disambiguation, information extraction, and natural language parsing. In this thesis, we focus on two segmentation tasks, named-entity recognition and Chinese word segmentation. The goal of named-entity recognition is to detect and classify names of people, organizations, and locations in a sentence. The goal of Chinese word segmentation is to find the word boundaries in a sentence that has been written as a string of characters without spaces. Our approach is as follows: In a preprocessing step, we use raw text to cluster words and calculate mutual information statistics. The output of this step is then used as features in a supervised model, specifically a global linear model trained using the Perceptron algorithm. We also compare Markov and semi-Markov models on the two segmentation tasks. Our results show that features derived from unlabeled data substantially improve performance, both in terms of reducing the amount of labeled data needed to achieve a certain performance level and in terms of reducing the error using a fixed amount of labeled data. We find that sometimes semi-Markov models can also improve performance over Markov models.
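
A minimal, illustrative sketch of the core idea (not the thesis implementation): a perceptron-trained linear model for per-token entity tagging whose features include word-cluster identities of the kind precomputed from raw text. The cluster table, tag set, and toy data are assumptions for the example.

```python
from collections import defaultdict

# Stand-in for clusters induced from unlabeled text (e.g. Brown-style clustering).
clusters = {"John": "C_NAME", "Smith": "C_NAME", "Boston": "C_PLACE"}

def features(words, i, tag):
    tok = words[i]
    return [("word", tok, tag), ("cluster", clusters.get(tok, "C_OOV"), tag)]

def perceptron_train(data, tags, epochs=5):
    w = defaultdict(float)
    score = lambda ws, i, t: sum(w[f] for f in features(ws, i, t))
    for _ in range(epochs):
        for words, gold in data:
            for i, g in enumerate(gold):
                pred = max(tags, key=lambda t: score(words, i, t))
                if pred != g:  # standard perceptron update toward the gold tag
                    for f in features(words, i, g):
                        w[f] += 1.0
                    for f in features(words, i, pred):
                        w[f] -= 1.0
    return w

data = [(["John", "Smith", "visited", "Boston"], ["PER", "PER", "O", "LOC"])]
weights = perceptron_train(data, tags=["PER", "LOC", "O"])
```

The thesis uses a global (structured) perceptron over whole sequences; this local, per-token version only illustrates how cluster features derived from unlabeled text slot into a linear model.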

Journal ArticleDOI
TL;DR: A machine learning algorithm for semantic role parsing is proposed, extending the work of Gildea and Jurafsky (2002), Surdeanu et al. (2003), and others; it is based on Support Vector Machines, which are shown to give a large improvement in performance over earlier classifiers.
Abstract: The natural language processing community has recently experienced a growth of interest in domain independent shallow semantic parsing--the process of assigning a Who did What to Whom, When, Where, Why, How etc. structure to plain text. This process entails identifying groups of words in a sentence that represent these semantic arguments and assigning specific labels to them. It could play a key role in NLP tasks like Information Extraction, Question Answering and Summarization. We propose a machine learning algorithm for semantic role parsing, extending the work of Gildea and Jurafsky (2002), Surdeanu et al. (2003) and others. Our algorithm is based on Support Vector Machines, which we show give a large improvement in performance over earlier classifiers. We show performance improvements through a number of new features designed to improve generalization to unseen data, such as automatic clustering of verbs. We also report on various analytic studies examining which features are most important, comparing our classifier to other machine learning algorithms in the literature, and testing its generalization to a new test set from a different genre. On the task of assigning semantic labels to the PropBank (Kingsbury, Palmer, & Marcus, 2002) corpus, our final system has a precision of 84% and a recall of 75%, which are the best results currently reported for this task. Finally, we explore a completely different architecture which does not require a deep syntactic parse. We reformulate the task as a combined chunking and classification problem, thus allowing our algorithm to be applied to new languages or genres of text for which statistical syntactic parsers may not be available.
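
As a minimal picture of the classification setup (not the authors' system): each candidate constituent is reduced to features such as the predicate, phrase type, position, and head word, and an SVM assigns a role label. scikit-learn's LinearSVC stands in for the paper's SVM implementation, and all feature values and labels below are toy assumptions.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import LinearSVC

# Toy training examples: (features of a candidate constituent, PropBank-style role).
train = [({"pred": "give", "phrase": "NP", "pos": "before", "head": "boy"}, "ARG0"),
         ({"pred": "give", "phrase": "NP", "pos": "after", "head": "book"}, "ARG1"),
         ({"pred": "give", "phrase": "PP", "pos": "after", "head": "to"}, "ARG2")]

vec = DictVectorizer()
X = vec.fit_transform([feats for feats, _ in train])
y = [label for _, label in train]
clf = LinearSVC().fit(X, y)

test = {"pred": "give", "phrase": "NP", "pos": "before", "head": "girl"}
print(clf.predict(vec.transform([test])))  # likely ['ARG0']
```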

Proceedings Article
30 Jul 2005
TL;DR: Machine learning algorithms for text categorization are enhanced with generated features based on domain-specific and common-sense knowledge, addressing the two main problems of natural language processing: synonymy and polysemy.
Abstract: We enhance machine learning algorithms for text categorization with generated features based on domain-specific and common-sense knowledge. This knowledge is represented using publicly available ontologies that contain hundreds of thousands of concepts, such as the Open Directory; these ontologies are further enriched by several orders of magnitude through controlled Web crawling. Prior to text categorization, a feature generator analyzes the documents and maps them onto appropriate ontology concepts, which in turn induce a set of generated features that augment the standard bag of words. Feature generation is accomplished through contextual analysis of document text, implicitly performing word sense disambiguation. Coupled with the ability to generalize concepts using the ontology, this approach addresses the two main problems of natural language processing--synonymy and polysemy. Categorizing documents with the aid of knowledge-based features leverages information that cannot be deduced from the documents alone. Experimental results confirm improved performance, breaking through the plateau previously reached in the field.
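
A minimal sketch of the feature-generation step as described: map a document onto ontology concepts via overlapping cue words, and append the matched concepts to the bag of words. The tiny hand-written "ontology" and the two-cue threshold are assumptions; the paper uses Open Directory-scale ontologies and contextual analysis.

```python
ontology = {"cardiology": {"heart", "artery", "cardiac"},
            "finance": {"stock", "market", "dividend"}}

def generate_features(doc, min_cues=2):
    words = set(doc.lower().split())
    # Add a concept feature when enough of its cue words occur in the document.
    concepts = [c for c, cues in ontology.items() if len(cues & words) >= min_cues]
    return sorted(words) + ["CONCEPT=" + c for c in concepts]

print(generate_features("The cardiac artery study measured heart function"))
```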

Book
06 Apr 2005
TL;DR: This book introduces fundamental techniques for computing semantic representations for fragments of natural language and performing inference with the result by using first-order logic and lambda calculus.
Abstract: This book introduces fundamental techniques for computing semantic representations for fragments of natural language and performing inference with the result. The primary tools used are first-order logic and the lambda calculus. All the techniques introduced are implemented in Prolog. The book also shows how to use theorem provers and model builders in parallel to deal with natural language inference.
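
The book works in Prolog; as a language-neutral illustration of the core move (semantic construction by beta reduction), here is a toy Python version in which each lexical entry is a lambda term and function application composes them. The two-word lexicon is an assumption.

```python
# Each lexical entry is a lambda term; application performs beta reduction.
lexicon = {
    "John": lambda p: p("john"),           # type-raised proper name
    "walks": lambda x: f"walk({x})",       # intransitive verb
}

def interpret(subject, verb):
    return lexicon[subject](lexicon[verb])  # beta reduction via application

print(interpret("John", "walks"))  # -> walk(john)
```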

Proceedings Article
09 Jul 2005
TL;DR: This paper presents a method for inducing transformation rules that map natural-language sentences into a formal query or command language and shows that this method performs overall better and faster than previous approaches in both domains.
Abstract: This paper presents a method for inducing transformation rules that map natural-language sentences into a formal query or command language. The approach assumes a formal grammar for the target representation language and learns transformation rules that exploit the non-terminal symbols in this grammar. The learned transformation rules incrementally map a natural-language sentence or its syntactic parse tree into a parse tree for the target formal language. Experimental results are presented for two corpora: one which maps English instructions into an existing formal coaching language for simulated RoboCup soccer agents, and another which maps English U.S.-geography questions into a database query language. We show that our method performs overall better and faster than previous approaches in both domains.
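
A drastically simplified sketch of rule-based transformation, with two hand-written rules standing in for the learned ones: each rule rewrites part of the (lowercased) sentence into a fragment of a Geoquery-style target language, and rules apply in sequence until the string is fully formal.

```python
import re

rules = [  # (pattern over the working string, replacement using a non-terminal)
    (r"what is the capital of (\w+)", r"answer(capital(STATE:\1))"),
    (r"STATE:(\w+)", r"stateid('\1')"),
]

def transform(sentence):
    s = sentence.lower().rstrip("?")
    for pat, repl in rules:  # incremental rewriting toward the formal language
        s = re.sub(pat, repl, s)
    return s

print(transform("What is the capital of Texas?"))
# -> answer(capital(stateid('texas')))
```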

Journal ArticleDOI
TL;DR: An extension of Yager's classic approach that involves more sophisticated criteria of goodness, search methods, etc., and shows how fuzzy queries are related to linguistic summaries, which makes it possible to implement such linguistic data summaries.

Journal ArticleDOI
TL;DR: The authors found that phonological awareness developed in response to language exposure and instruction but, once established, transferred across languages for both bilinguals and 2nd-language learners, and decoding ability developed separately for each language as a function of proficiency and instruction in that language and did not transfer to the other language.
Abstract: Two hundred and four 5- and 6-year-olds who were monolingual English-, bilingual English-Chinese-, or Chinese-speaking children beginning to learn English (2nd-language learners) were compared on phonological awareness and word decoding tasks in English and Chinese. Phonological awareness developed in response to language exposure and instruction but, once established, transferred across languages for both bilinguals and 2nd-language learners. In contrast, decoding ability developed separately for each language as a function of proficiency and instruction in that language and did not transfer to the other language. Therefore, there was no overall effect of bilingualism on learning to read: Performance depended on the structure of the language, proficiency in that language, and instructional experiences with that writing system. These results point to the importance of evaluating the features of the languages and instructional context in which children become biliterate.

PatentDOI
Fuliang Weng, Qi Zhang
TL;DR: In this paper, an advanced model that includes new processes is provided for use as a component of an effective disfluency identifier, which tags edited words in transcribed speech; a speech recognition unit combined with a part-of-speech tagger, a disfluency identifier, and a parser forms a natural language system that helps machines properly interpret spoken utterances.
Abstract: An advanced model that includes new processes is provided for use as a component of an effective disfluency identifier. The disfluency identifier tags edited words in transcribed speech. A speech recognition unit in combination with a part-of-speech tagger, a disfluency identifier, and a parser form a natural language system that helps machines properly interpret spoken utterances.

Proceedings Article
05 Dec 2005
TL;DR: It is shown that taking a particular stochastic process - the Pitman-Yor process - as an adaptor justifies the appearance of type frequencies in formal analyses of natural language, and improves the performance of a model for unsupervised learning of morphology.
Abstract: Standard statistical models of language fail to capture one of the most striking properties of natural languages: the power-law distribution in the frequencies of word tokens. We present a framework for developing statistical models that generically produce power-laws, augmenting standard generative models with an adaptor that produces the appropriate pattern of token frequencies. We show that taking a particular stochastic process - the Pitman-Yor process - as an adaptor justifies the appearance of type frequencies in formal analyses of natural language, and improves the performance of a model for unsupervised learning of morphology.
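
The power-law behaviour is easy to reproduce in a few lines: draws from a Pitman-Yor process, here simulated with its Chinese-restaurant construction over an arbitrary base distribution, yield token frequencies with a heavy head and a long tail. The discount d, concentration theta, and base distribution are illustrative choices, not the paper's fitted values.

```python
import random
from collections import Counter

def pitman_yor_tokens(n, d=0.8, theta=1.0, base=lambda: random.randrange(10**6)):
    """Chinese-restaurant construction of a Pitman-Yor process."""
    labels, counts, tokens = [], [], []
    total = 0  # tokens generated so far
    for _ in range(n):
        r = random.random() * (total + theta)
        acc = 0.0
        for i, c in enumerate(counts):
            acc += c - d  # existing table i: probability (c_i - d) / (total + theta)
            if r < acc:
                counts[i] += 1
                tokens.append(labels[i])
                break
        else:  # new table: probability (theta + d * num_tables) / (total + theta)
            labels.append(base())
            counts.append(1)
            tokens.append(labels[-1])
        total += 1
    return tokens

freqs = sorted(Counter(pitman_yor_tokens(5000)).values(), reverse=True)
print(freqs[:10])  # a heavy head and a long tail: roughly power-law frequencies
```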

Journal ArticleDOI
TL;DR: It is proposed that the analytical power of the axial map in empirical studies derives from the efficient representation of key properties of the spatial configuration that it captures.
Abstract: The fewest-line axial map, often simply referred to as the 'axial map', is one of the primary tools of space syntax. Its natural language definition has allowed researchers to draw consistent maps that present a concise description of architectural space; it has been established that graph measures obtained from the map are useful for the analysis of pedestrian movement patterns and activities related to such movement: for example, the location of services or of crime. However, the definition has proved difficult to translate into formal language by mathematicians and algorithmic implementers alike. This has meant that space syntax has been criticised for a lack of rigour in the definition of one of its fundamental representations. Here we clarify the original definition of the fewest-line axial map and show that it can be implemented algorithmically. We show that the original definition leads to maps similar to those currently drawn by hand, and we demonstrate that the differences between the two may be accounted for in terms of the detail of the algorithm used. We propose that the analytical power of the axial map in empirical studies derives from the efficient representation of key properties of the spatial configuration that it captures.

01 Jan 2005
TL;DR: An algorithm for the unsupervised learning, or induction, of a simple morphology of a natural language, which builds hierarchical representations for a set of morphs, which are morpheme-like units discovered from unannotated text corpora.
Abstract: This work presents an algorithm for the unsupervised learning, or induction, of a simple morphology of a natural language. A probabilistic maximum a posteriori model is utilized, which builds hierarchical representations for a set of morphs, which are morpheme-like units discovered from unannotated text corpora. The induced morph lexicon stores parameters related to both the “meaning” and “form” of the morphs it contains. These parameters affect the role of the morphs in words. The model is implemented in a task of unsupervised morpheme segmentation of Finnish and English words. Very good results are obtained for Finnish and almost as good results are obtained in the English task.
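
A toy sketch in the spirit of the model: given a morph lexicon with counts, segment a word by a Viterbi search for the splitting that is cheapest under a unigram morph model. The miniature Finnish-flavoured lexicon and the cost function are assumptions; the paper's MAP model also learns the lexicon itself from unannotated text.

```python
import math

morph_counts = {"talo": 50, "talossa": 5, "ssa": 40, "kirja": 30}  # toy lexicon
total = sum(morph_counts.values())

def cost(m):
    # Negative log-probability under a unigram morph model; unknown morphs barred.
    return -math.log(morph_counts[m] / total) if m in morph_counts else float("inf")

def segment(word):
    # Viterbi over split points: best[i] = cheapest segmentation of word[:i].
    best = [(0.0, [])] + [(float("inf"), None)] * len(word)
    for i in range(1, len(word) + 1):
        for j in range(i):
            c = best[j][0] + cost(word[j:i])
            if c < best[i][0]:
                best[i] = (c, best[j][1] + [word[j:i]])
    return best[len(word)][1]

print(segment("talossa"))  # ['talo', 'ssa'] beats the whole-word analysis here
```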

01 Jul 2005
TL;DR: An understanding of the functions of switching between the native language and the foreign language and its underlying reasons will provide language teachers with a heightened awareness of its use in classroom discourse and will obviously lead to the betterment of instruction by either eliminating it or dominating its use during foreign language instruction.
Olcay Sert, Hacettepe University (Ankara, Turkey)
Abstract: Code switching is a widely observed phenomenon, especially in multilingual and multicultural communities. In ELT classrooms, code switching comes into use in either the teachers' or the students' discourse. Although it is not favoured by many educators, one should have at least an understanding of the functions of switching between the native language and the foreign language and its underlying reasons. This understanding will provide language teachers with a heightened awareness of its use in classroom discourse and will obviously lead to the betterment of instruction by either eliminating it or dominating its use during foreign language instruction.

Journal ArticleDOI
TL;DR: A system for automatic story generation that reuses existing stories to produce a new story that matches a given user query is presented.
Abstract: In this paper we present a system for automatic story generation that reuses existing stories to produce a new story that matches a given user query. The plot structure is obtained by a case-based reasoning (CBR) process over a case base of tales and an ontology of explicitly declared relevant knowledge. The resulting story is generated as a sketch of a plot described in natural language by means of natural language generation (NLG) techniques.

Journal ArticleDOI
TL;DR: This article proposes a method for automatically discovering inconsistencies in the requirements from multiple stakeholders, using both theorem-proving and model-checking techniques, and shows how to deal with them in a formal manner.
Abstract: The use of logic in identifying and analyzing inconsistency in requirements from multiple stakeholders has been found to be effective in a number of studies. Nonmonotonic logic is a theoretically well-founded formalism that is especially suited for supporting the evolution of requirements. However, direct use of logic for expressing requirements and discussing them with stakeholders poses serious usability problems, since in most cases stakeholders cannot be expected to be fluent with formal logic. In this article, we explore the integration of natural language parsing techniques with default reasoning to overcome these difficulties. We also propose a method for automatically discovering inconsistencies in the requirements from multiple stakeholders, using both theorem-proving and model-checking techniques, and show how to deal with them in a formal manner. These techniques were implemented and tested in a prototype tool called CARL. The effectiveness of the techniques and of the tool are illustrated by a classic example involving conflicting requirements from multiple stakeholders.

Journal ArticleDOI
01 Jun 2005 - Language
TL;DR: The model of language evolution exemplified in Ringe et al. 2002 is extended, which recovers phylogenetic trees optimized according to a criterion of weighted maximum compatibility, to include cases in which languages remain in contact and trade linguistic material as they evolve.
Abstract: In this article we extend the model of language evolution exemplified in Ringe et al. 2002, which recovers phylogenetic trees optimized according to a criterion of weighted maximum compatibility, to include cases in which languages remain in contact and trade linguistic material as they evolve. We describe our analysis of an Indo-European (IE) dataset (originally assembled by Ringe and Taylor) based on this new model. Our study shows that this new model fits the IE family well and suggests that the early evolution of IE involved only limited contact between distinct lineages. Furthermore, the candidate histories we obtain appear to be consistent with archaeological findings, which suggests that this method may be of practical use. The case at hand provides no opportunity to explore the problem of conflict between network optimization criteria; that problem must be left to future research.

Proceedings ArticleDOI
14 Jun 2005
TL;DR: It is shown that NaLIX, while far from being able to pass the Turing test, is perfectly usable in practice, and able to handle even quite complex queries in a variety of application domains.
Abstract: Database query languages can be intimidating to the non-expert, leading to the immense recent popularity for keyword based search in spite of its significant limitations. The holy grail has been the development of a natural language query interface. We present NaLIX, a generic interactive natural language query interface to an XML database. Our system can accept an arbitrary English language sentence as query input, which can include aggregation, nesting, and value joins, among other things. This query is translated, potentially after reformulation, into an XQuery expression that can be evaluated against an XML database. The translation is done through mapping grammatical proximity of natural language parsed tokens to proximity of corresponding elements in the result XML. In this demonstration, we show that NaLIX, while far from being able to pass the Turing test, is perfectly usable in practice, and able to handle even quite complex queries in a variety of application domains. In addition, we also demonstrate how carefully designed features in NaLIX facilitate the interactive query process and improve the usability of the interface.

01 Jan 2005
TL;DR: Several syntactic representations and associated probabilistic models are described which are designed to capture the basic character of natural language syntax as directly as possible and can inform the investigation of what biases are or are not needed in the human acquisition of language.
Abstract: There is precisely one complete language processing system to date: the human brain. Though there is debate on how much built-in bias human learners might have, we definitely acquire language in a primarily unsupervised fashion. On the other hand, computational approaches to language processing are almost exclusively supervised, relying on hand-labeled corpora for training. This reliance is largely due to unsupervised approaches having repeatedly exhibited discouraging performance. In particular, the problem of learning syntax (grammar) from completely unannotated text has received a great deal of attention for well over a decade, with little in the way of positive results. We argue that previous methods for this task have generally underperformed because of the representations they used. Overly complex models are easily distracted by non-syntactic correlations (such as topical associations), while overly simple models aren't rich enough to capture important first-order properties of language (such as directionality, adjacency, and valence). In this work, we describe several syntactic representations and associated probabilistic models which are designed to capture the basic character of natural language syntax as directly as possible. First, we examine a nested, distributional method which induces bracketed tree structures. Second, we examine a dependency model which induces word-to-word dependency structures. Finally, we demonstrate that these two models perform better in combination than they do alone. With these representations, high-quality analyses can be learned from surprisingly little text, with no labeled examples, in several languages (we show experiments with English, German, and Chinese). Our results show above-baseline performance in unsupervised parsing in each of these languages. Grammar induction methods are useful since parsed corpora exist for only a small number of languages. More generally, most high-level NLP tasks, such as machine translation and question-answering, lack richly annotated corpora, making unsupervised methods extremely appealing even for common languages like English. Finally, while the models in this work are not intended to be cognitively plausible, their effectiveness can inform the investigation of what biases are or are not needed in the human acquisition of language.

Journal ArticleDOI
01 May 2005
TL;DR: A language is presented, TimeML, which attempts to capture the richness of temporal and event related information in language, while demonstrating how it can play an important part in the development of more robust question answering systems.
Abstract: In this paper, we discuss the role that temporal information plays in natural language text, specifically in the context of question answering systems. We define a descriptive framework with which we can examine the temporally sensitive aspects of natural language queries. We then investigate broadly what properties a general specification language would need, in order to mark up temporal and event information in text. We present a language, TimeML, which attempts to capture the richness of temporal and event related information in language, while demonstrating how it can play an important part in the development of more robust question answering systems.
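
For a concrete picture of the markup, here is a small TimeML-style fragment (attributes simplified from the full specification) together with a minimal extraction of its annotated events and times, the kind of structure a question answering system can exploit.

```python
import re

# Illustrative TimeML-style annotation (attributes simplified from the spec).
fragment = """John <EVENT eid="e1" class="OCCURRENCE">taught</EVENT> on
<TIMEX3 tid="t1" type="DATE" value="2005-05-01">May 1, 2005</TIMEX3>.
<TLINK eventID="e1" relatedToTime="t1" relType="IS_INCLUDED"/>"""

events = re.findall(r'<EVENT[^>]*>([^<]+)</EVENT>', fragment)
times = re.findall(r'<TIMEX3[^>]*value="([^"]+)"[^>]*>', fragment)
print(events, times)  # ['taught'] ['2005-05-01']
```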

Patent
14 Nov 2005
TL;DR: In this paper, the authors present a system for managing business knowledge expressed as statements, preferably sentences using a vocabulary, where such statements may be automated by the generation of programming language source code or computer program instructions.
Abstract: The present invention is directed to a system for managing business knowledge expressed as statements, preferably sentences using a vocabulary, where such statements may be automated by the generation of programming language source code or computer program instructions. As such, the present invention also manages software design specifications that define, describe, or constrain the programming code it generates or programs with which it or the code it generates is to integrate. The present invention facilitates the creation of composite sentences. In one embodiment, the present invention also interprets a composite sentence as a logical formula in first order predicate calculus or similar logic formalism supporting conjunction, disjunction, and negation as well as existentially and universally quantified variables. The invention further interprets natural language, including singular common count noun phrases and connectives, as variables in formal logic. Further, the invention then implements the logical interpretations as rules.

Journal ArticleDOI
01 Jun 2005
TL;DR: An empirical study was conducted which shows that the categories of the hierarchical knowledge map generated by NewsMap are better than those generated by regular news readers, both in terms of recall and precision, on the sub- level categories but not on the top-level categories.
Abstract: Information technology has made possible the capture and accessing of a large number of data and knowledge bases, which in turn has brought about the problem of information overload. Text mining to turn textual information into knowledge has become a very active research area, but much of the research remains restricted to the English language. Due to the differences in linguistic characteristics and methods of natural language processing, many existing text analysis approaches have yet to be shown to be useful for the Chinese language. This research focuses on the automatic generation of a hierarchical knowledge map NewsMap, based on online Chinese news, particularly the finance and health sections. Whether in print or online, news still represents one important knowledge source that people produce and consume on a daily basis. The hierarchical knowledge map can be used as a tool for browsing business intelligence and medical knowledge hidden in news articles. In order to assess the quality of the map, an empirical study was conducted which shows that the categories of the hierarchical knowledge map generated by NewsMap are better than those generated by regular news readers, both in terms of recall and precision, on the sub-level categories but not on the top-level categories. NewsMap employs an improved interface combining a ID alphabetical hierarchical list and a 2D Self-Organizing Map (SOM) island display. Another empirical study compared the two visualization displays and found that users' performances can be improved by taking advantage of the visual cues of the 2D SOM display.