Book ChapterDOI

Computational models of language acquisition

21 Mar 2010, pp. 86-99
TL;DR: This paper surveys various works that address language learning from data, identifying the commonalities and differences between the various existing approaches to language learning, and specifying desiderata for future research that must be considered by any plausible solution to this puzzle.
Abstract: Child language acquisition, one of Nature's most fascinating phenomena, is to a large extent still a puzzle. Experimental evidence seems to support the view that early language is highly formulaic, consisting for the most part of frozen items with limited productivity. Fairly quickly, however, children find patterns in the ambient language and generalize them to larger structures, in a process that is not yet well understood. Computational models of language acquisition can shed interesting light on this process. This paper surveys various works that address language learning from data; such works are conducted in different fields, including psycholinguistics, cognitive science and computer science, and we maintain that knowledge from all these domains must be consolidated in order for a well-informed model to emerge. We identify the commonalities and differences between the various existing approaches to language learning, and specify desiderata for future research that must be considered by any plausible solution to this puzzle.

Citations
01 Jan 2011
TL;DR: This volume collects three decades of articles by distinguished linguist Joan Bybee, which essentially argue for the importance of frequency of use as a factor in the analysis and explanation of language structure.
Abstract: This volume collects three decades of articles by distinguished linguist Joan Bybee. Her articles essentially argue for the importance of frequency of use as a factor in the analysis and explanation of language structure. Her work has been very influential for a broad range of researchers in linguistics, particularly in discourse analysis, corpus linguistics, phonology, phonetics, and historical linguistics.

584 citations

Journal ArticleDOI
Okko Räsänen
TL;DR: This work reviews a number of existing computational studies that concentrate on how spoken language can be learned from continuous speech in the absence of linguistically or phonetically motivated background knowledge, the situation faced by human infants when they first attempt to learn their native language.
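The segmentation problem these studies address can be illustrated with the classic statistical-learning idea of tracking transitional probabilities between adjacent syllables and positing word boundaries where predictability drops. The following is a minimal sketch of that general idea, not a reimplementation of any model in the review; the toy stream, threshold, and names are illustrative assumptions.

```python
from collections import Counter

def segment_by_transitional_probability(syllables, threshold=0.9):
    """Insert a word boundary wherever P(next syllable | current syllable)
    falls below the threshold. A deliberately minimal sketch; the surveyed
    models operate on raw acoustics and are far more sophisticated."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    first_counts = Counter(syllables[:-1])

    words, current = [], [syllables[0]]
    for a, b in zip(syllables, syllables[1:]):
        tp = pair_counts[(a, b)] / first_counts[a]
        if tp < threshold:            # low predictability: likely word boundary
            words.append("".join(current))
            current = []
        current.append(b)
    words.append("".join(current))
    return words

# Toy stream of three nonce words with no pauses, Saffran-style:
# within-word transitions are fully predictable, between-word ones are not.
stream = ("golabu tupiro bidaku golabu bidaku tupiro golabu tupiro bidaku"
          .replace("golabu", "go la bu").replace("tupiro", "tu pi ro")
          .replace("bidaku", "bi da ku").split())
print(segment_by_transitional_probability(stream))
# ['golabu', 'tupiro', 'bidaku', 'golabu', 'bidaku', 'tupiro', ...]
```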

61 citations

Book ChapterDOI
24 Oct 2016
TL;DR: This article discusses the relationship between natural language processing (NLP) and cognitive science, giving an overview of the questions and problems the two fields address jointly.
Abstract: On the Relationships between Natural Language Processing and Cognitive Sciences. This introduction aims at giving an overview of the questions and problems addressed jointly in natural language processing and cognitive science. More precisely, the idea of this introduction, and more generally of this book, is to address how these fields can fertilize each other, bringing recent advances together to produce richer studies. Natural language processing fundamentally deals with semantics and, more generally, with knowledge. Cognitive science also deals mostly with knowledge: how knowledge is acquired and processed in the brain. The two domains have developed largely independently, as we discuss later in this Introduction, but there are obvious links between the two, and a large number of researchers have investigated problems involving the two fields, in either the data or the methods used. A Quick Historical Overview: The landscape of natural language processing (NLP) has changed dramatically in the last decades. Until recently, it was generally assumed that one first needs to adequately formalize an information context (for example, the information contained in a text) in order to be able to subsequently develop applications dealing with semantics (see, for example, Sowa 1991; Allen 1994; Nirenburg and Raskin 2004). This initial step involved manipulating large knowledge bases of hand-crafted rules, and resulted in the new field of “knowledge engineering” (Brachman and Levesque 2004). Knowledge can be seen as the result of the confrontation of our a priori ideas with the reality of the outside world. This leads to several difficulties: (1) the task is potentially infinite, since people constantly perceive a multiplicity of things; (2) perception interferes with information already registered in the brain, leading to complex inferences involving commonsense knowledge; (3) additionally, very little is known about how information is processed in the brain, which makes things even harder to formalize. To address some of these issues, a common assumption is that knowledge could be disconnected from perception, which led to projects aiming at developing large static databases of “common sense knowledge”, from CYC (Lenat 1995) to more recent general-domain ontologies like ConceptNet (Liu and Singh 2004). However, these projects have always led to databases that, despite their sizes, were never enough to completely and accurately formalize a given domain, and domain-independent applications were thus even more unattainable.

11 citations

26 Oct 2016
TL;DR: The results demonstrate that children’s comprehension of non-canonical sentences improved when the topic argument was realized as a personal pronoun, and that this improvement was independent of the grammatical role of the arguments.
Abstract: This dissertation examines the impact of the type of referring expression on the acquisition of word order variation in German-speaking preschoolers. A puzzle in the area of language acquisition concerns the production-comprehension asymmetry for non-canonical sentences like "Den Affen fängt die Kuh." (“The monkey, the cow catches.”): preschoolers usually have difficulties in accurately understanding non-canonical sentences until approximately age six (e.g., Dittmar et al., 2008), although they produce non-canonical sentences already around age three (e.g., Poeppel & Wexler, 1993; Weissenborn, 1990). This dissertation investigated the production and comprehension of non-canonical sentences to address this issue. Three corpus analyses were conducted to investigate the impact of givenness, topic status and the type of referring expression on word order in the spontaneous speech of two- to four-year-olds and in the child-directed speech produced by their mothers. The positioning of the direct object in ditransitive sentences was examined; in particular, sentences in which the direct object occurred before or after the indirect object in the sentence-medial positions, and sentences in which it occurred in the sentence-initial position. The results reveal similar ordering patterns for children and adults. Word order variation was to a large extent predictable from the type of referring expression, especially with respect to word order involving the sentence-medial positions. Information structure (e.g., topic status) had an additional impact only on word order variation that involved the sentence-initial position. Two comprehension experiments were conducted to investigate whether the type of referring expression and topic status influence the comprehension of non-canonical transitive sentences in four- and five-year-olds. In the first experiment, the topic status of one of the sentential arguments was established via a preceding context sentence; in the second experiment, the type of referring expression for the sentential arguments was additionally manipulated by using either a full lexical noun phrase (NP) or a personal pronoun. The results demonstrate that children’s comprehension of non-canonical sentences improved when the topic argument was realized as a personal pronoun, and that this improvement was independent of the grammatical role of the arguments. However, children’s comprehension did not improve when the topic argument was realized as a lexical NP. In sum, the results of both the production and comprehension studies support the view that referring expressions may be seen as a sentence-level cue to word order and to the information status of the sentential arguments. The results highlight the important role of the type of referring expression in the acquisition of word order variation and indicate that the production-comprehension asymmetry is reduced when the type of referring expression is considered.

8 citations


Cites background from "Computational models of language acquisition"

  • ...Bates & MacWhinney, 1987; MacWhinney, 1987; Tomasello, 2000, 2003; see Behrens, 2009 for a review), or whether innate principles are required that restrict the hypothesis space in language acquisition (e.g., Chomsky, 1965, 1995; Valian, 2009a, 2009b; Yang, 2014; see Eisenbeiß, 2009 for a review)....

  • ...While the input clearly plays a role, it is usually assumed that there may be certain learning biases (Yang, 2014) or aspects of language, i....

  • ...Language acquisition basically means setting the parameters to the values of the language that the child is learning (e.g., Eisenbeiß, 2009; Yang, 2014) on the basis of the input....

  • ...In this way, the Universal Grammar (UG) restricts the learning hypothesis space for the children acquiring their language (cf., Eisenbeiß, 2009; Yang, 2014)....

  • ...As mentioned in the introduction, nativist approaches assume that language cannot be acquired solely on the basis of the input (e.g., see Eisenbeiß, 2009; Valian, 2009a, 2009b; Yang, 2014)....

Journal ArticleDOI
TL;DR: The presented network is developmental, meaning that its internal representations are learned directly from the signals at the input and motor ports rather than being designed internally for a particular task; the same learning principles are therefore potentially suitable for other sensory modalities.
Abstract: Natural language acquisition is a crucial research domain of artificial intelligence. To enable a computer to acquire and understand human language and generate appropriate responses, we must study the principles by which humans acquire language. Most prior language acquisition methods use handcrafted internal representations, which are not sufficiently brain-based. An emergent developmental network (DN) is presented that acquires speech, represented by mel-frequency cepstral coefficients (MFCCs), from sensory and motor experience. This work is different in the sense that it focuses on mechanisms that enable a system to develop its emergent representations from its operational experience. In this work, internal unsupervised neurons of the DN are used to represent short contexts, and the competition among the internal neurons enables them to represent different short contexts. To demonstrate the acquisition effect, we study and analyze the influence of different network structures (i.e., different neuron numbers and weight thresholds) on the language acquisition rate. Four speech acquisition experiments demonstrate how such internal neurons come to represent short contexts even though they are not directly supervised by the external environment. The presented network is developmental, which means that the internal representations are learned directly from the signals of the input and motor ports, not designed internally for a particular task; hence, the same learning principles are potentially suitable for other sensory modalities.
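The competitive mechanism described in this abstract, where unsupervised internal units compete for input frames and the winner adapts toward the input, can be illustrated with a minimal winner-take-all sketch. This is a toy under stated assumptions, not the DN architecture: real MFCC frames (e.g., from an audio library) are replaced with synthetic 13-dimensional vectors, and all sizes and rates are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for 13-dimensional MFCC frames: three clusters mimic
# three recurring short acoustic contexts in the speech stream.
centers = rng.normal(size=(3, 13))
frames = np.vstack([c + 0.1 * rng.normal(size=(200, 13)) for c in centers])
rng.shuffle(frames)

n_units, lr = 8, 0.05
weights = rng.normal(size=(n_units, 13))  # internal neurons' weight vectors

for x in frames:
    # Competition: the unit closest to the input wins...
    winner = np.argmin(np.linalg.norm(weights - x, axis=1))
    # ...and only the winner adapts toward the input.
    weights[winner] += lr * (x - weights[winner])

# After training, different units respond to different contexts,
# without any external supervision of the internal layer.
wins = np.bincount(
    [int(np.argmin(np.linalg.norm(weights - x, axis=1))) for x in frames],
    minlength=n_units,
)
print(wins)  # a few units dominate, roughly one per cluster
```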

6 citations

References
Journal ArticleDOI
TL;DR: Methodological preliminaries of generative grammars as theories of linguistic competence; theory of performance; organization of a generative grammar; justification of grammar; descriptive and explanatory theories; evaluation procedures; linguistic theory and language learning.

12,586 citations


"Computational models of language ac..." refers background in this paper

  • ...One, the nativist approach, originating in Chomsky [21, 22, 23] and popularized by Pinker [50], claims that the linguistic capacity is innate, expressed as dedicated “language organs” in our brains; therefore, certain linguistic universals are given to language learners for free, requiring only the tuning of a set of parameters in order for language to be fully acquired....

Book
01 May 1965
TL;DR: Presents generative grammars as theories of linguistic competence, covering methodological preliminaries, the organization and justification of grammars, descriptive and explanatory theories, and categories and relations in syntactic theory.
Abstract: Contents: Methodological preliminaries: Generative grammars as theories of linguistic competence; theory of performance; organization of a generative grammar; justification of grammars; formal and substantive grammars; descriptive and explanatory theories; evaluation procedures; linguistic theory and language learning; generative capacity and its linguistic relevance. Categories and relations in syntactic theory: Scope of the base; aspects of deep structure; illustrative fragment of the base component; types of base rules. Deep structures and grammatical transformations. Residual problems: Boundaries of syntax and semantics; structure of the lexicon.

12,225 citations

ReportDOI
TL;DR: As a result of this grant, the researchers have now published on CD-ROM a corpus of over 4 million words of running text annotated with part-of-speech (POS) tags, which includes a fully hand-parsed version of the classic Brown corpus.
Abstract: As a result of this grant, the researchers have now published on CD-ROM a corpus of over 4 million words of running text annotated with part-of-speech (POS) tags, with over 3 million words of that material assigned skeletal grammatical structure. This material now includes a fully hand-parsed version of the classic Brown corpus. About one half of the papers at the ACL Workshop on Using Large Text Corpora this past summer were based on the materials generated by this grant.
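To get a feel for this resource, one can inspect the small sample of the Penn Treebank that ships with NLTK (roughly 10% of the Wall Street Journal material). A minimal sketch, assuming NLTK is installed and can download the sample:

```python
import nltk

nltk.download("treebank", quiet=True)  # fetch NLTK's bundled WSJ sample
from nltk.corpus import treebank

# Part-of-speech tagged tokens, e.g. ('Pierre', 'NNP'), ('Vinken', 'NNP'), ...
print(treebank.tagged_words()[:8])

# One of the hand-assigned skeletal parses mentioned in the abstract
print(treebank.parsed_sents()[0])
```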

8,377 citations


Additional excerpts

  • ...[48]), whose data are taken from the Wall Street Journal....

Book
01 Jan 1994
TL;DR: In this "extremely valuable book, very informative, and very well written" (Noam Chomsky), one of the greatest thinkers in the field of linguistics explains how language works -how people, ny making noises with their mouths, can cause ideas to arise in other people's minds as mentioned in this paper.
Abstract: In this "extremely valuable book, very informative, and very well written" (Noam Chomsky), one of the greatest thinkers in the field of linguistics explains how language works--how people, ny making noises with their mouths, can cause ideas to arise in other people's minds.

4,696 citations

Book
Roger Brown
01 Jan 1973
TL;DR: This article studied the early stages of grammatical constructions and the meanings they convey in pre-school children and found that the order of their acquisition is almost identical across children and is predicted by their relative semantic and grammatical complexity.
Abstract: For many years, Roger Brown and his colleagues have studied the developing language of pre-school children--the language that ultimately will permit them to understand themselves and the world around them. This longitudinal research project records the conversational performances of three children, studying both semantic and grammatical aspects of their language development. These core findings are related to recent work in psychology and linguistics--and especially to studies of the acquisition of languages other than English, including Finnish, German, Korean, and Samoan. Roger Brown has written the most exhaustive and searching analysis yet undertaken of the early stages of grammatical constructions and the meanings they convey. The five stages of linguistic development Brown establishes are measured not by chronological age--since children vary greatly in the speed at which their speech develops--but by mean length of utterance. This volume treats the first two stages. Stage I is the threshold of syntax, when children begin to combine words to make sentences. These sentences, Brown shows, are always limited to the same small set of semantic relations: nomination, recurrence, disappearance, attribution, possession, agency, and a few others. Stage II is concerned with the modulations of basic structural meanings--modulations for number, time, aspect, specificity--through the gradual acquisition of grammatical morphemes such as inflections, prepositions, articles, and case markers. Fourteen morphemes are studied in depth, and it is shown that the order of their acquisition is almost identical across children and is predicted by their relative semantic and grammatical complexity. It is, ultimately, the intent of this work to focus on the nature and development of knowledge: knowledge concerning grammar and the meanings coded by grammar; knowledge inferred from performance, from sentences and the settings in which they are spoken, and from signs of comprehension or incomprehension of sentences.
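Since Brown's stages are indexed by mean length of utterance (MLU) rather than by age, the measure itself is simple: total morphemes divided by total utterances. A minimal sketch, assuming utterances have already been segmented into morphemes (real MLU counts follow Brown's detailed morpheme-counting conventions):

```python
def mean_length_of_utterance(utterances):
    """MLU in morphemes: total morpheme count / number of utterances."""
    return sum(len(u) for u in utterances) / len(utterances)

# Each utterance is a list of morphemes, e.g. "doggies" = "doggie" + plural "-s".
sample = [
    ["more", "juice"],
    ["doggie", "-s", "run"],
    ["mommy", "go", "-ing"],
]
print(round(mean_length_of_utterance(sample), 2))  # 2.67
```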

4,302 citations


"Computational models of language ac..." refers background in this paper

  • ...Such a model must be rigorously defined, in a way that lends itself to computational implementation; formally, it should exhibit highly restricted computational expressivity; it should employ biases that correspond to established observations of child language research (such as item-based learning, rote learning [45], left-edge biases [30], adherence to stages of acquisition [19, 10], etc....
