
Showing papers on "Natural language published in 2007"


Proceedings Article
06 Jan 2007
TL;DR: This work proposes Explicit Semantic Analysis (ESA), a novel method that represents the meaning of texts in a high-dimensional space of concepts derived from Wikipedia, yielding substantial improvements in the correlation of computed relatedness scores with human judgments.
Abstract: Computing semantic relatedness of natural language texts requires access to vast amounts of common-sense and domain-specific world knowledge. We propose Explicit Semantic Analysis (ESA), a novel method that represents the meaning of texts in a high-dimensional space of concepts derived from Wikipedia. We use machine learning techniques to explicitly represent the meaning of any text as a weighted vector of Wikipedia-based concepts. Assessing the relatedness of texts in this space amounts to comparing the corresponding vectors using conventional metrics (e.g., cosine). Compared with the previous state of the art, using ESA results in substantial improvements in correlation of computed relatedness scores with human judgments: from r = 0.56 to 0.75 for individual words and from r = 0.60 to 0.72 for texts. Importantly, due to the use of natural concepts, the ESA model is easy to explain to human users.

2,285 citations
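The comparison step the abstract describes, cosine similarity between weighted concept vectors, can be sketched in a few lines. The concept names and weights below are invented stand-ins for Wikipedia-derived vectors, not values from the paper:

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors (dicts: concept -> weight)."""
    dot = sum(w * v.get(c, 0.0) for c, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

# invented Wikipedia-concept weights for three input words
cat = {"Cat": 0.9, "Pet": 0.7, "Mammal": 0.5}
dog = {"Dog": 0.9, "Pet": 0.8, "Mammal": 0.6}
moon = {"Moon": 0.9, "Astronomy": 0.7}

# the semantically related pair shares concepts, so it scores higher
assert cosine(cat, dog) > cosine(cat, moon)
```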


Book
01 Apr 2007
TL;DR: The authors describe algorithms in a C-like language, with examples drawn from automatic natural language processing, molecular sequence analysis, and textual database management.
Abstract: Describing algorithms in a C-like language, this text presents examples related to the automatic processing of natural language, to the analysis of molecular sequences and to the management of textual databases.

686 citations


Proceedings Article
01 Dec 2007
TL;DR: This paper defines the tasks of the different tracks, describes how the data sets were created from existing treebanks for ten languages, characterizes the approaches of the participating systems, reports the test results, and provides a first analysis of these results.
Abstract: The Conference on Computational Natural Language Learning features a shared task, in which participants train and test their learning systems on the same data sets. In 2007, as in 2006, the shared task has been devoted to dependency parsing, this year with both a multilingual track and a domain adaptation track. In this paper, we define the tasks of the different tracks and describe how the data sets were created from existing treebanks for ten languages. In addition, we characterize the different approaches of the participating systems, report the test results, and provide a first analysis of these results.

606 citations


Journal ArticleDOI
TL;DR: It is argued that the social brain 'gates' the computational mechanisms involved in human language learning.
Abstract: I advance the hypothesis that the earliest phases of language acquisition, the developmental transition from an initial universal state of language processing to one that is language-specific, require social interaction. Relating human language learning to a broader set of neurobiological cases of communicative development, I argue that the social brain 'gates' the computational mechanisms involved in human language learning.

574 citations


Journal ArticleDOI
TL;DR: The authors used simple convolution and superposition mechanisms to learn distributed holographic representations for words, which can be used for higher order models of language comprehension, relieving the complexity required at the higher level.
Abstract: The authors present a computational model that builds a holographic lexicon representing both word meaning and word order from unsupervised experience with natural language. The model uses simple convolution and superposition mechanisms (cf. B. B. Murdock, 1982) to learn distributed holographic representations for words. The structure of the resulting lexicon can account for empirical data from classic experiments studying semantic typicality, categorization, priming, and semantic constraint in sentence completions. Furthermore, order information can be retrieved from the holographic representations, allowing the model to account for limited word transitions without the need for built-in transition rules. The model demonstrates that a broad range of psychological data can be accounted for directly from the structure of lexical representations learned in this way, without the need for complexity to be built into either the processing mechanisms or the representations. The holographic representations are an appropriate knowledge representation to be used by higher order models of language comprehension, relieving the complexity required at the higher level.

549 citations
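The binding-by-convolution and superposition mechanisms described above can be sketched numerically. This is a minimal illustration of the general idea (cf. BEAGLE-style models), with invented words, an arbitrary dimensionality, and a placeholder vector assumed for the order trace:

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 512

def rand_vec():
    # random environmental vector with expected unit length
    return rng.normal(0.0, 1.0 / np.sqrt(dim), dim)

def cconv(a, b):
    # circular convolution (the binding operation), computed via FFT
    return np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))

def cos(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

dog, bark, phi = rand_vec(), rand_vec(), rand_vec()

# memory vector for "dog": its meaning superposed with an order trace
# binding a neighbouring word to a placeholder vector phi
mem_dog = dog + cconv(bark, phi)

# the superposed trace stays similar to its component vector...
assert cos(mem_dog, dog) > 0.5
# ...but is nearly orthogonal to an unrelated random vector
assert abs(cos(mem_dog, rand_vec())) < 0.3
```

The key property is that superposition preserves similarity to each component while bound pairs stay distinguishable from unrelated vectors, which is what lets both meaning and order live in one lexicon vector.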


Proceedings ArticleDOI
24 May 2007
TL;DR: This work investigates using natural language processing (NLP) techniques to identify duplicates in defect reports at Sony Ericsson mobile communications, and shows that about 2/3 of the duplicates can possibly be found using the NLP techniques.
Abstract: Defect reports are generated from various testing and development activities in software engineering. Sometimes two reports are submitted that describe the same problem, leading to duplicate reports. These reports are mostly written in structured natural language, and as such, it is hard to compare two reports for similarity with formal methods. In order to identify duplicates, we investigate using natural language processing (NLP) techniques to support the identification. A prototype tool is developed and evaluated in a case study analyzing defect reports at Sony Ericsson mobile communications. The evaluation shows that about 2/3 of the duplicates can possibly be found using the NLP techniques. Different variants of the techniques provide only minor result differences, indicating a robust technology. User testing shows that the overall attitude towards the technique is positive and that it has a growth potential.

535 citations
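The core of such duplicate detection is a textual-similarity measure over report pairs. A naive bag-of-words cosine, a much simpler stand-in for the paper's NLP pipeline, illustrates the idea; the report texts are invented:

```python
from collections import Counter
import math
import re

def vectorize(text):
    """Bag-of-words term counts for a report's free text."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def similarity(a, b):
    """Cosine similarity between two reports' term-count vectors."""
    va, vb = vectorize(a), vectorize(b)
    dot = sum(va[w] * vb[w] for w in va)
    na = math.sqrt(sum(c * c for c in va.values()))
    nb = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (na * nb) if na and nb else 0.0

r1 = "Phone crashes when opening the camera app"
r2 = "Camera app crash on opening the phone camera"
r3 = "Battery drains quickly in standby mode"

# the duplicate pair shares vocabulary, so it scores higher
assert similarity(r1, r2) > similarity(r1, r3)
```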


Book
03 May 2007
TL;DR: From Corpus to Classroom summarises and makes accessible recent work in corpus research, focusing particularly on spoken data, based on analysis of corpora such as CANCODE and Cambridge International Corpus.
Abstract: From Corpus to Classroom summarises and makes accessible recent work in corpus research, focusing particularly on spoken data. It is based on analysis of corpora such as CANCODE and the Cambridge International Corpus, and written with particular reference to the development of corpus-informed pedagogy. The book explains how corpora can be designed and used, and focuses on what they tell us about language teaching. It examines the relevance of corpora to materials writers, course designers and language teachers and considers the needs of the learner in relation to authentic data. It shows how the answers to key questions such as 'Is there a basic, everyday vocabulary for English?', 'How should idioms be taught?' and 'What are the most common spoken language chunks?' are best explored by means of a clearer understanding of the workings of language in context.

492 citations


Patent
11 Dec 2007
TL;DR: In this paper, a conversational, natural language voice user interface may provide an integrated voice navigation services environment, where the user can speak conversationally, using natural language, to issue queries, commands, or other requests relating to the navigation services provided in the environment.
Abstract: A conversational, natural language voice user interface may provide an integrated voice navigation services environment. The voice user interface may enable a user to make natural language requests relating to various navigation services, and further, may interact with the user in a cooperative, conversational dialogue to resolve the requests. Through dynamic awareness of context, available sources of information, domain knowledge, user behavior and preferences, and external systems and devices, among other things, the voice user interface may provide an integrated environment in which the user can speak conversationally, using natural language, to issue queries, commands, or other requests relating to the navigation services provided in the environment.

450 citations


Book Chapter
13 Dec 2007
TL;DR: This paper describes a general methodology for rapidly collecting, building, and aligning parallel corpora for medium density languages, illustrated on the case of Hungarian, Romanian, and Slovenian.
Abstract: The choice of natural language technology appropriate for a given language is greatly impacted by density (availability of digitally stored material). More than half of the world speaks medium density languages, yet many of the methods appropriate for high or low density languages yield suboptimal results when applied to the medium density case. In this paper we describe a general methodology for rapidly collecting, building, and aligning parallel corpora for medium density languages, illustrating our main points on the case of Hungarian, Romanian, and Slovenian. We also describe and evaluate the hybrid sentence alignment method we are using.
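Sentence alignment commonly starts from a length-based score of the kind hybrid methods combine with lexical cues. A toy length-ratio penalty sketches the idea; the sentence pair below is invented and the scoring function is illustrative, not the paper's actual method:

```python
def length_score(src, tgt):
    """Penalty for aligning two sentences, based on character-length ratio:
    0.0 for equal lengths, approaching 1.0 for very different lengths."""
    ls, lt = len(src), len(tgt)
    return abs(ls - lt) / max(ls, lt)

src = "A kutya ugat."          # Hungarian: "The dog barks."
good = "The dog barks."
bad = "This is a much longer, completely unrelated sentence."

# the true translation pair has similar length, so a lower penalty
assert length_score(src, good) < length_score(src, bad)
```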

447 citations


Journal ArticleDOI
TL;DR: The conclusions drawn are that automatic summarisation has made valuable progress, with useful applications, better evaluation, and more task understanding, but summarising systems are still poorly motivated in relation to the factors affecting them, and evaluation needs taking much further to engage with the purposes summaries are intended to serve.
Abstract: This paper reviews research on automatic summarising in the last decade. This work has grown, stimulated by technology and by evaluation programmes. The paper uses several frameworks to organise the review, for summarising itself, for the factors affecting summarising, for systems, and for evaluation. The review examines the evaluation strategies applied to summarising, the issues they raise, and the major programmes. It considers the input, purpose and output factors investigated in recent summarising research, and discusses the classes of strategy, extractive and non-extractive, that have been explored, illustrating the range of systems built. The conclusions drawn are that automatic summarisation has made valuable progress, with useful applications, better evaluation, and more task understanding. But summarising systems are still poorly motivated in relation to the factors affecting them, and evaluation needs taking much further to engage with the purposes summaries are intended to serve and the contexts in which they are used.

Journal ArticleDOI
01 Mar 2007-Lingua
TL;DR: This paper explores the possibility that the linguistic forms and structures employed by our earliest language-using ancestors might have been significantly different from those observed in the languages we are most familiar with today, not because of a biological difference between them and us, but because the communicative context in which they operated was fundamentally different from that of most modern humans.


Proceedings ArticleDOI
14 Mar 2007
TL;DR: A semi-automated concern location and comprehension tool designed to reduce the time developers spend on maintenance tasks and to increase their confidence in the results of these tasks, which is effective because it searches a unique natural language-based representation of source code.
Abstract: Most current software systems contain undocumented high-level ideas implemented across multiple files and modules. When developers perform program maintenance tasks, they often waste time and effort locating and understanding these scattered concerns. We have developed a semi-automated concern location and comprehension tool, Find-Concept, designed to reduce the time developers spend on maintenance tasks and to increase their confidence in the results of these tasks. Find-Concept is effective because it searches a unique natural language-based representation of source code, uses novel techniques to expand initial queries into more effective queries, and displays search results in an easy-to-comprehend format. We describe the Find-Concept tool, the underlying program analysis, and an experimental study comparing Find-Concept's search effectiveness with two state-of-the-art lexical and information retrieval-based search tools. Across nine action-oriented concern location tasks derived from open source bug reports, our Eclipse-based tool produced more effective queries more consistently than either competing search tool with similar user effort.

Journal ArticleDOI
TL;DR: In this article, the authors focus on international periodical publications where more than 75 percent of the articles in the social sciences and humanities and well over 90 percent in the natural sciences are written in English.
Abstract: Throughout the 20th century, international communication has shifted from a plural use of several languages to a clear pre-eminence of English, especially in the field of science. This paper focuses on international periodical publications where more than 75 percent of the articles in the social sciences and humanities and well over 90 percent in the natural sciences are written in English. The shift towards English implies that an increasing number of scientists whose mother tongue is not English have already moved to English for publication. Consequently, other international languages, namely French, German, Russian, Spanish and Japanese lose their attraction as languages of science. Many observers conclude that it has become inevitable to publish in English, even in English only. The central question is whether the actual hegemony of English will create a total monopoly, at least at an international level, or if changing global conditions and language policies may allow alternative solutions. The paper analyses how the conclusions of an inevitable monopoly of English are constructed, and what possible disadvantages such a process might entail. Finally, some perspectives of a new plurilingual approach in scientific production and communication are sketched.

Proceedings Article
01 Apr 2007
TL;DR: This work evaluates a system that uses interpolated predictions of reading difficulty based on both vocabulary and grammatical features, and indicates that grammatical features may play a more important role in second language readability than in first language readability.
Abstract: This work evaluates a system that uses interpolated predictions of reading difficulty that are based on both vocabulary and grammatical features. The combined approach is compared to individual grammar- and language modeling-based approaches. While the vocabulary-based language modeling approach outperformed the grammar-based approach, grammar-based predictions can be combined using confidence scores with the vocabulary-based predictions to produce more accurate predictions of reading difficulty for both first and second language texts. The results also indicate that grammatical features may play a more important role in second language readability than in first language readability.
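The confidence-weighted combination of the two predictors can be sketched as a simple interpolation. The prediction values and confidence scores below are hypothetical, chosen only to show the mechanics:

```python
def interpolate(vocab_pred, vocab_conf, grammar_pred, grammar_conf):
    """Combine two difficulty predictions, weighted by confidence scores."""
    total = vocab_conf + grammar_conf
    return (vocab_pred * vocab_conf + grammar_pred * grammar_conf) / total

# hypothetical grade-level predictions from the two component models
level = interpolate(vocab_pred=6.0, vocab_conf=0.8,
                    grammar_pred=8.0, grammar_conf=0.2)
assert abs(level - 6.4) < 1e-9
```

When the vocabulary model is confident, its prediction dominates; when it is uncertain, the grammar-based prediction pulls the estimate toward its own value.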

Journal ArticleDOI
TL;DR: A striking finding is reported: Infants are better able to extract rules from sequences of nonspeech—such as sequences of musical tones, animal sounds, or varying timbres—if they first hear those rules instantiated in sequences of speech.
Abstract: Sequences of speech sounds play a central role in human cognitive life, and the principles that govern such sequences are crucial in determining the syntax and semantics of natural languages. Infants are capable of extracting both simple transitional probabilities and simple algebraic rules from sequences of speech, as demonstrated by studies using ABB grammars (la ta ta, gai mu mu, etc.). Here, we report a striking finding: Infants are better able to extract rules from sequences of nonspeech—such as sequences of musical tones, animal sounds, or varying timbres—if they first hear those rules instantiated in sequences of speech.

Journal Article
TL;DR: This paper explores the implementation of SemEQUAL using OrdPath, a positional representation for nodes of a hierarchy that is used successfully for supporting XML documents in relational systems, and proposes the use of OrdPath to represent position within the WordNet hierarchy, leveraging its ability to compute transitive closures efficiently.
Abstract: The volume of information in natural languages in electronic format is increasing exponentially. The demographics of users of information management systems are becoming increasingly multilingual. Together these trends create a requirement for information management systems to support processing of information in multiple natural languages seamlessly. Database systems, the backbones of information management, should support this requirement effectively and efficiently. Earlier research in this area had proposed multilingual operators [7, 8] for relational database systems, and discussed their implementation using existing database features. In this paper, we specifically focus on the SemEQUAL operator [8], implementing a multilingual semantic matching predicate using WordNet [12]. We explore the implementation of SemEQUAL using OrdPath [10], a positional representation for nodes of a hierarchy that is used successfully for supporting XML documents in relational systems. We propose the use of OrdPath to represent position within the WordNet hierarchy, leveraging its ability to compute transitive closures efficiently. We show theoretically that an implementation using OrdPath will outperform those implementations proposed previously. Our initial experimental results confirm this analysis, and show that the OrdPath implementation performs significantly better. Further, since our technique is not specifically rooted to linguistic hierarchies, the same approach may benefit other applications that utilize alternative hierarchical ontologies.
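The property that makes positional labels attractive here is that ancestry reduces to a prefix test, so transitive closure needs no recursion. Real OrdPath uses a compressed binary encoding; this sketch keeps only that property, using tuple labels and an invented miniature hierarchy:

```python
# each node's label is its parent's label plus one component, so
# "is-a ancestor of" becomes a prefix test on the labels
def is_ancestor(anc, node):
    """True if anc is a strict ancestor of node in the hierarchy."""
    return node[:len(anc)] == anc and len(node) > len(anc)

# toy hierarchy: entity > animal > dog; entity > vehicle
entity = (1,)
animal = (1, 3)
dog = (1, 3, 5)
vehicle = (1, 7)

assert is_ancestor(entity, dog)       # transitive, no recursive query needed
assert is_ancestor(animal, dog)
assert not is_ancestor(vehicle, dog)
```

A SemEQUAL-style subsumption check then becomes a single range or prefix comparison per candidate pair, which is what makes the relational implementation fast.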

Book ChapterDOI
03 Jun 2007
TL;DR: PANTO is presented, a Portable nAtural laNguage inTerface to Ontologies, which accepts generic natural language queries and outputs SPARQL queries, and adopts a triple-based data model to interpret the parse trees output by an off-the-shelf parser.
Abstract: Providing a natural language interface to ontologies will not only offer ordinary users the convenience of acquiring needed information from ontologies, but also expand the influence of ontologies and the semantic web consequently. This paper presents PANTO, a Portable nAtural laNguage inTerface to Ontologies, which accepts generic natural language queries and outputs SPARQL queries. Based on a special consideration on nominal phrases, it adopts a triple-based data model to interpret the parse trees output by an off-the-shelf parser. Complex modifications in natural language queries such as negations, superlative and comparative are investigated. The experiments have shown that PANTO provides state-of-the-art results.
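The final step of a PANTO-style pipeline, serializing an interpreted triple as a SPARQL query, can be sketched directly. The variable name, predicate URI, and literal below are invented for illustration; the real system handles far richer structures (negation, superlatives, comparatives):

```python
def triple_to_sparql(subj_var, pred_uri, obj_literal):
    """Serialize one (subject-variable, predicate, object-literal) triple
    as a minimal SPARQL SELECT query."""
    return (
        f"SELECT ?{subj_var} WHERE {{ "
        f'?{subj_var} <{pred_uri}> "{obj_literal}" . }}'
    )

# e.g. for the question "Which cities are located in Germany?"
q = triple_to_sparql("city", "http://example.org/onto#locatedIn", "Germany")
assert q == 'SELECT ?city WHERE { ?city <http://example.org/onto#locatedIn> "Germany" . }'
```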

Journal ArticleDOI
TL;DR: Investigation of the development of early language and literacy skills among Spanish-speaking students in 2 large urban school districts, 1 middle-size urban district, and 1 border district suggests that pedagogical decisions for ELLs should not only consider effective instructional literacy strategies but also acknowledge that the language of instruction for Spanish-speaking ELLs may produce varying results for different students.
Abstract: Purpose The purpose of this study was to examine the effects of initial first and second language proficiencies as well as the language of instruction that a student receives on the relationship be...

Proceedings ArticleDOI
06 Jul 2007
TL;DR: This work describes an execution strategy based on translation to datalog with constraints, and table-based resolution that is sound, complete, and always terminates, despite recursion and negation, as long as simple syntactic conditions are met.
Abstract: We present a declarative authorization language that strikes a careful balance between syntactic and semantic simplicity, policy expressiveness, and execution efficiency. The syntax is close to natural language, and the semantics consists of just three deduction rules. The language can express many common policy idioms using constraints, controlled delegation, recursive predicates, and negated queries. We describe an execution strategy based on translation to datalog with constraints, and table-based resolution. We show that this execution strategy is sound, complete, and always terminates, despite recursion and negation, as long as simple syntactic conditions are met.
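Deduction over such policies amounts to computing a fixpoint of the rules over a fact base. The toy forward-chaining sketch below uses invented relations (it is not the paper's language or its datalog translation) to show how controlled delegation terminates at a fixpoint:

```python
# rule (invented for illustration): if P is trusted on "read" and P grants
# read to Q, then Q can read and may in turn delegate read
facts = {
    ("admin", "trusted_on", "read"),
    ("admin", "grants_read", "alice"),
    ("alice", "grants_read", "bob"),
}

def closure(facts):
    """Forward-chain to a fixpoint; terminates because the fact set over a
    finite vocabulary only grows."""
    derived = set(facts)
    while True:
        new = set()
        for (p, rel, q) in derived:
            if rel == "grants_read" and (p, "trusted_on", "read") in derived:
                new.add((q, "can", "read"))
                new.add((q, "trusted_on", "read"))  # controlled delegation
        if new <= derived:  # fixpoint reached
            return derived
        derived |= new

db = closure(facts)
# delegation chains through alice to bob
assert ("bob", "can", "read") in db
```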

Book ChapterDOI
11 Nov 2007
TL;DR: This paper introduces four interfaces, each allowing a different query language, and presents a usability study benchmarking them; the results confirm that NLIs are useful for querying Semantic Web data.
Abstract: Natural language interfaces offer end-users a familiar and convenient option for querying ontology-based knowledge bases. Several studies have shown that they can achieve high retrieval performance as well as domain independence. This paper focuses on usability and investigates if NLIs are useful from an end-user's point of view. To that end, we introduce four interfaces each allowing a different query language and present a usability study benchmarking these interfaces. The results of the study reveal a clear preference for full sentences as query language and confirm that NLIs are useful for querying Semantic Web data.

Journal ArticleDOI
TL;DR: A method to systematically bridge the disparate biology and engineering domains using natural language analysis is described, able to algorithmically generate several biologically meaningful keywords, including defend, that are not obviously related to the engineering problem.
Abstract: Biomimetic, or biologically inspired, design uses analogous biological phenomena to develop solutions for engineering problems. Several instances of biomimetic design result from personal observations of biological phenomena. However, many engineers' knowledge of biology may be limited, thus reducing the potential of biologically inspired solutions. Our approach to biomimetic design takes advantage of the large amount of biological knowledge already available in books, journals, and so forth, by performing keyword searches on these existing natural-language sources. Because of the ambiguity and imprecision of natural language, challenges inherent to natural language processing were encountered. One challenge of retrieving relevant cross-domain information involves differences in domain vocabularies, or lexicons. A keyword meaningful to biologists may not occur to engineers. For an example problem that involved cleaning, that is, removing dirt, a biochemist suggested the keyword "defend." Defend is not an obvious keyword to most engineers for this problem, nor are the words defend and "clean/remove" directly related within lexical references. However, previous work showed that biological phenomena retrieved by the keyword defend provided useful stimuli and produced successful concepts for the clean/remove problem. In this paper, we describe a method to systematically bridge the disparate biology and engineering domains using natural language analysis. For the clean/remove example, we were able to algorithmically generate several biologically meaningful keywords, including defend, that are not obviously related to the engineering problem. We developed a method to organize and rank the set of biologically meaningful keywords identified, and confirmed that we could achieve similar results for two other examples in encapsulation and microassembly.
Although we specifically address cross-domain information retrieval from biology, the bridging process presented in this paper is not limited to biology, and can be used for any other domain given the availability of appropriate domain-specific knowledge sources and references.
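One way to bridge lexicons is to search for chains of related words between the engineering term and candidate biology keywords. The breadth-first search below runs over a tiny hand-invented relation graph (toy data, not the paper's lexical references) purely to illustrate the bridging idea:

```python
from collections import deque

# hypothetical word-relation graph; edges are invented for illustration
related = {
    "clean": ["remove", "wash"],
    "remove": ["eliminate", "rid"],
    "rid": ["defend", "free"],
}

def bridge(start, goal):
    """Breadth-first search for a chain of related words from start to goal."""
    queue, seen = deque([[start]]), {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in related.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no chain found

assert bridge("clean", "defend") == ["clean", "remove", "rid", "defend"]
```

In this toy graph, the non-obvious biology keyword "defend" is reachable from the engineering term "clean" in three hops, mirroring the cross-domain connection the paper reports.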

Book
23 Mar 2007
TL;DR: The International Guide to Speech Acquisition.
Abstract: The International Guide to Speech Acquisition.

Book ChapterDOI
12 Sep 2007
TL;DR: The developed Affect Analysis Model was designed to handle not only correctly written text but also informal messages written in an abbreviated or expressive manner, and an avatar was created to reflect the detected affective information and social behaviour.
Abstract: In this paper, we address the tasks of recognition and interpretation of affect communicated through text messaging. The evolving nature of language in online conversations is a main issue in affect sensing from this media type, since sentence parsing might fail during syntactic structure analysis. The developed Affect Analysis Model was designed to handle not only correctly written text, but also informal messages written in an abbreviated or expressive manner. The proposed rule-based approach processes each sentence in sequential stages, including symbolic cue processing, detection and transformation of abbreviations, sentence parsing, and word/phrase/sentence-level analyses. In a study based on 160 sentences, the system result agrees with at least two out of three human annotators in 70% of the cases. In order to reflect the detected affective information and social behaviour, an avatar was created.

Reference EntryDOI
01 Jun 2007
TL;DR: Major theories of language acquisition are presented, along with the basic facts of language development, from children's acquisition of simple grammatical constructions to complex constructions and discourse.
Abstract: This chapter is about how young children master the use of a language, with a focus on grammatical constructions. Major theories of language acquisition are presented, along with the basic facts of language development, from children's acquisition of simple grammatical constructions to complex constructions and discourse. Also covered are the language children hear, the acquisition of morphology, individual differences, and atypical development. Keywords: analogy; constructions; distribution learning; grammar; language acquisition; learning

Book
19 Apr 2007
TL;DR: This book discusses language in the Wild, Gesture, Sign, and Speech, and the Ritualization of Language, as well as conceptual Spaces and Embodied Actions, and The Gesture-Language Interface.
Abstract: 1. Grasping Language: Sign and the Evolution of Language 2. Language in the Wild: Paleontological and Primatological Evidence for Gestural Origins 3. Gesture, Sign, and Speech 4. Gesture, Sign, and Grammar: The Ritualization of Language 5. Conceptual Spaces and Embodied Actions 6. The Gesture-Language Interface 7. Invention of Visual Languages

Journal Article
TL;DR: The authors describe three professional development contexts in the U.S., where teachers have engaged in language analysis based on functional linguistics that has given them new insights into both content and learning processes.
Abstract: Classrooms around the world are becoming more multilingual and teachers in all subject areas are faced with new challenges in enabling learners' academic language development without losing focus on content. These challenges require new ways of conceptualizing the relationship between language and content as well as new pedagogies that incorporate a dual focus on language and content in subject matter instruction. This article describes three professional development contexts in the U.S., where teachers have engaged in language analysis based on functional linguistics (for example, Halliday & Hasan, 1989; Christie, 1989) that has given them new insights into both content and learning processes. In these contexts, teachers in history classrooms with English Language Learners and teachers of languages other than English in classrooms with heritage speakers needed support to develop students' academic language development in a second language. The functional linguistics metalanguage and analysis skills they developed gave them new ways of approaching the texts read and written in their classrooms and enabled them to recognize how language constructs the content they are teaching, to critically assess how the content is presented in their teaching materials, and to engage students in richer conversation about content.

BookDOI
TL;DR: This paper introduces the novel notion of Formal Classification, a graph structure whose labels are written in a propositional concept language, which makes it possible to reason about classifications and to reduce document classification and query answering to reasoning about subsumption.
Abstract: Classifications have been used for centuries with the goal of cataloguing and searching large sets of objects. In the early days it was mainly books; lately it has also become Web pages, pictures and any kind of digital resources. Classifications describe their contents using natural language labels, an approach which has proved very effective in manual classification. However natural language labels show their limitations when one tries to automate the process, as they make it very hard to reason about classifications and their contents. In this paper we introduce the novel notion of Formal Classification, as a graph structure where labels are written in a propositional concept language. Formal Classifications turn out to be some form of lightweight ontologies. This, in turn, allows us to reason about them, to associate to each node a normal form formula which univocally describes its contents, and to reduce document classification and query answering to reasoning about subsumption.
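For the special case where a node's label normalizes to a conjunction of atomic concepts, subsumption reduces to set containment. This is a deliberately lightweight sketch (the full propositional language in the paper is richer), with invented concept atoms:

```python
def subsumes(general, specific):
    """True if the general concept subsumes the specific one, i.e. every
    atom of the general conjunction also occurs in the specific one."""
    return set(general) <= set(specific)

# a node labelled {science, physics} is subsumed by one labelled {science},
# so a query for "science" documents should reach the physics node
assert subsumes({"science"}, {"science", "physics"})
assert not subsumes({"art"}, {"science", "physics"})
```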

Patent
15 May 2007
TL;DR: In this article, a method for recognizing a named entity included in natural language, comprising the steps of: performing gradual parsing model training with the natural language to obtain a classification model, performing gradually parsing and recognition according to the obtained classification model to obtain information on positions and types of candidate named entities; performing a refusal recognition process for the candidate named entity; and generating a candidate named Entity lattice from the refusal-recognition-processed candidate namedEntity, and searching for a optimal path.
Abstract: The present invention provides a method for recognizing a named entity included in natural language, comprising the steps of: performing gradual parsing model training with the natural language to obtain a classification model; performing gradual parsing and recognition according to the obtained classification model to obtain information on positions and types of candidate named entities; performing a refusal recognition process for the candidate named entities; and generating a candidate named entity lattice from the refusal-recognition-processed candidate named entities, and searching for a optimal path. The present invention uses a one-class classifier to score or evaluate these results to obtain the most reliable beginning and end borders of the named entities on the basis of the forward and backward parsing and recognizing results obtained only by using the local features.
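The "searching for an optimal path" step over a candidate lattice can be sketched as a simple dynamic program. The lattice edges, labels, and scores below are invented; the patent's actual scoring comes from its one-class classifier:

```python
# lattice[i] = list of (end_position, label, score) candidate spans starting
# at token position i; scores are hypothetical classifier outputs
lattice = {
    0: [(1, "O", 0.9), (2, "PER", 0.6)],
    1: [(2, "O", 0.5)],
    2: [(3, "LOC", 0.8)],
}

def best_path(lattice, n):
    """Dynamic program over positions 0..n, maximizing the summed edge scores."""
    best = {0: (0.0, [])}  # position -> (best score, path of (start, end, label))
    for i in range(n):
        if i not in best:
            continue
        score, path = best[i]
        for end, label, s in lattice.get(i, []):
            cand = (score + s, path + [(i, end, label)])
            if end not in best or cand[0] > best[end][0]:
                best[end] = cand
    return best[n][1]

path = best_path(lattice, 3)
# the higher-scoring segmentation wins over the single "PER" span
assert path == [(0, 1, "O"), (1, 2, "O"), (2, 3, "LOC")]
```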