
Showing papers on "Phrase" published in 2010


Journal ArticleDOI
TL;DR: This article proposes a framework for representing the meaning of word combinations in vector space in terms of additive and multiplicative functions, and introduces a wide range of composition models that are evaluated empirically on a phrase similarity task.

981 citations
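To make the composition idea concrete, here is a minimal sketch of the additive and multiplicative composition functions this framework evaluates; the toy vectors, their dimensionality, and the cosine check are illustrative assumptions, not the paper's data.

```python
import numpy as np

def compose_additive(u, v):
    """p = u + v: sum the constituents dimension-wise."""
    return u + v

def compose_multiplicative(u, v):
    """p = u * v: component-wise product, stressing shared dimensions."""
    return u * v

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy distributional vectors for a phrase similarity check.
vec = {"practical":  np.array([0.8, 0.1, 0.3]),
       "difficulty": np.array([0.2, 0.9, 0.4]),
       "problem":    np.array([0.3, 0.8, 0.5])}

for name, f in [("additive", compose_additive),
                ("multiplicative", compose_multiplicative)]:
    phrase = f(vec["practical"], vec["difficulty"])
    print(name, round(cosine(phrase, vec["problem"]), 3))
```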


Journal ArticleDOI
TL;DR: The authors showed that comprehenders are sensitive to the frequencies of compositional four-word phrases (e.g. don't have to worry) and that more frequent phrases are processed faster.

492 citations


Proceedings Article
09 Oct 2010
TL;DR: A new approach to SMT adaptation is described that weights out-of-domain phrase pairs according to their relevance to the target domain, determined both by how similar to the target domain they appear to be and by whether they belong to general language or not.
Abstract: We describe a new approach to SMT adaptation that weights out-of-domain phrase pairs according to their relevance to the target domain, determined by both how similar to it they appear to be, and whether they belong to general language or not. This extends previous work on discriminative weighting by using a finer granularity, focusing on the properties of instances rather than corpus components, and using a simpler training procedure. We incorporate instance weighting into a mixture-model framework, and find that it yields consistent improvements over a wide range of baselines.

232 citations
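As a rough illustration of instance weighting (not the paper's trained discriminative model), the sketch below down-weights out-of-domain phrase-pair counts by a relevance score before relative-frequency estimation; the relevance() stub and the counts are invented.

```python
from collections import defaultdict

def relevance(src, tgt):
    # Stand-in for the learned per-instance weight (domain similarity
    # plus general-language features in the paper); fixed toy value here.
    return 0.7

ood_counts = {("maison", "house"): 40, ("maison", "home"): 10}

weighted, totals = defaultdict(float), defaultdict(float)
for (src, tgt), c in ood_counts.items():
    w = relevance(src, tgt) * c          # weighted count
    weighted[(src, tgt)] += w
    totals[src] += w

# Relative-frequency translation probabilities over weighted counts.
p = {pair: weighted[pair] / totals[pair[0]] for pair in weighted}
print(p)
```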


Journal ArticleDOI
TL;DR: The Corpus of Contemporary American English is the first large, genre-balanced corpus of any language, which has been designed and constructed from the ground up as a 'monitor corpus', and which can be used to accurately track and study recent changes in the language.
Abstract: The Corpus of Contemporary American English is the first large, genre-balanced corpus of any language, which has been designed and constructed from the ground up as a 'monitor corpus', and which can be used to accurately track and study recent changes in the language. The 400 million word corpus is evenly divided between spoken, fiction, popular magazines, newspapers, and academic journals. Most importantly, the genre balance stays almost exactly the same from year to year, which allows it to accurately model changes in the 'real world'. After discussing the corpus design, we provide a number of concrete examples of how the corpus can be used to look at recent changes in English, including morphology (new suffixes -friendly and -gate), syntax (including prescriptive rules, quotative like, so not ADJ, the get passive, resultatives, and verb complementation), semantics (such as changes in meaning with web, green, or gay), and lexis, including word and phrase frequency by year, and using the corpus architecture to produce lists of all words that have had large shifts in frequency between specific historical periods.

221 citations
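The kind of frequency-shift query described at the end of the abstract can be mimicked in a few lines; the counts below are invented, and COCA itself is queried through its web interface rather than code like this.

```python
counts = {"web":   {"1990s": 120, "2000s": 940},
          "gay":   {"1990s": 300, "2000s": 310},
          "green": {"1990s": 410, "2000s": 700}}
totals = {"1990s": 200_000, "2000s": 210_000}   # corpus size per period

def shift(word):
    f_old = counts[word]["1990s"] / totals["1990s"]
    f_new = counts[word]["2000s"] / totals["2000s"]
    return f_new / f_old               # ratio > 1 means rising use

for w in sorted(counts, key=shift, reverse=True):
    print(w, round(shift(w), 2))
```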


Patent
Jung-Eun Kim, Jeong-mi Cho
30 Sep 2010
TL;DR: In this paper, an apparatus and system for analyzing intention are presented; the apparatus applies a context-free grammar to each of one or more sentences, in units of one or more phrases, to perform phrase spotting on each sentence, thereby extending the recognition range for out-of-grammar (OOG) expressions.
Abstract: An apparatus and system for analyzing intention are provided. The apparatus for analyzing an intention applies a context-free grammar to each of one or more sentences in units of one or more phrases to perform phrase spotting on each sentence, thereby extending a recognition range for an out-of-grammar (OOG) expression. Meanwhile, the apparatus for analyzing an intention determines whether sentences that have undergone phrase spotting are grammatically valid by applying a dependency grammar to the sentences to filter out invalid sentences, and generates the intention analysis result of a valid sentence, thereby grammatically and/or semantically verifying a sentence that has undergone speech recognition while extending the speech recognition range.

206 citations


Proceedings Article
02 Jun 2010
TL;DR: An algorithm is developed that takes a trending phrase or any phrase specified by a user, collects a large number of posts containing the phrase, and provides an automatically created summary of the posts related to the term.
Abstract: In this paper, we focus on a recent Web trend called microblogging, and in particular a site called Twitter. The content of such a site is an extraordinarily large number of small textual messages, posted by millions of users, at random or in response to perceived events or situations. We have developed an algorithm that takes a trending phrase or any phrase specified by a user, collects a large number of posts containing the phrase, and provides an automatically created summary of the posts related to the term. We present examples of summaries we produce along with initial evaluation.

203 citations
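The sketch below is not the paper's algorithm, but a naive stand-in showing the overall shape: collect posts containing the phrase, then return the post that best covers the collection's frequent vocabulary.

```python
from collections import Counter

def summarize(posts, phrase):
    hits = [p for p in posts if phrase in p.lower()]
    freq = Counter(w for p in hits for w in p.lower().split())
    def coverage(post):
        words = set(post.lower().split())
        return sum(freq[w] for w in words) / len(words)
    return max(hits, key=coverage, default=None)

posts = ["Oil spill reaches the coast today",
         "So sad about the oil spill",
         "The oil spill reaches the coast, cleanup crews deployed"]
print(summarize(posts, "oil spill"))
```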


Journal ArticleDOI
TL;DR: A new approach to phrase-level sentiment analysis is presented that first determines whether an expression is neutral or polar and then disambiguates the polarity of the polar expressions, achieving results that are significantly better than baseline.
Abstract: There has been a recent swell of interest in the automatic identification and extraction of opinions, emotions, and sentiments in text. Motivation for this task comes from the desire to provide tools for information analysts in government, commercial, and political domains, who want to automatically track attitudes and feelings in the news and on-line forums. How do people feel about recent events in the Middle East? Is the rhetoric from a particular opposition group intensifying? What is the range of opinions being expressed in the world press about the best course of action in Iraq? A system that could automatically identify opinions and emotions from text would be an enormous help to someone trying to answer these kinds of questions. Researchers from many subareas of Artificial Intelligence and Natural Language Processing have been working on the automatic identification of opinions and related tasks. To date, most such work has focused on sentiment or subjectivity classification at the document or sentence level. Document classification tasks include, for example, distinguishing editorials from news articles and classifying reviews as positive or negative. A common sentence-level task is to classify sentences as subjective or objective. This paper presents a new approach to phrase-level sentiment analysis that first determines whether an expression is neutral or polar and then disambiguates the polarity of the polar expressions. With this approach, the system is able to automatically identify the contextual polarity for a large subset of sentiment expressions, achieving results that are significantly better than baseline.

202 citations
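A minimal sketch of the two-step design follows: first decide whether an expression is neutral or polar, then disambiguate the contextual polarity of polar expressions. The tiny prior-polarity lexicon and the single negation rule are illustrative assumptions, far simpler than the paper's feature set.

```python
PRIOR = {"brilliant": "positive", "terrible": "negative", "succeed": "positive"}
NEGATORS = {"not", "never", "hardly"}

def classify_phrase(tokens):
    polar = [w for w in tokens if w in PRIOR]
    if not polar:                        # step 1: neutral vs. polar
        return "neutral"
    word = polar[0]                      # step 2: contextual polarity
    if any(t in NEGATORS for t in tokens):
        return "negative" if PRIOR[word] == "positive" else "positive"
    return PRIOR[word]

print(classify_phrase("did not succeed".split()))   # -> negative
```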


Journal Article
TL;DR: This article used a Bayesian framework for grammar induction and showed that an ideal learner could recognize the hierarchical phrase structure of language without having this knowledge innately specified as part of the language faculty.

191 citations


Journal ArticleDOI
TL;DR: The results suggest that the violation of expectations and the difficulty of memory retrieval both contribute to the difficulties of object relative clauses, but that these two sources of difficulty have qualitatively distinct behavioral consequences in normal reading.

176 citations


Patent
25 Oct 2010
TL;DR: In this paper, a method for transliteration includes receiving input such as a word, a sentence, a phrase, and a paragraph, in a source language, creating source language sub-phonetic units for the word, and converting the source language sub-phonetic units to target language sub-phonetic units.
Abstract: A method for transliteration includes receiving input such as a word, a sentence, a phrase, and a paragraph, in a source language, creating source language sub-phonetic units for the word and converting the source language sub-phonetic units for the word to target language sub-phonetic units, retrieving ranking for each of the target language sub-phonetic units from a database and creating target language words for the word in the source language based on the target language sub-phonetic units and ranking of each of the target language sub-phonetic units. The method further includes identifying candidate target language words based on predefined criteria, and displaying the candidate target language words.

164 citations
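A hedged sketch of the pipeline's shape: split the source word into sub-phonetic units, map each unit to ranked target-language units, and compose candidate words from the rankings. The unit inventory and scores are invented for illustration.

```python
SUB_PHONETIC = {"ka": [("क", 0.9), ("का", 0.6)],
                "mal": [("मल", 0.8), ("माल", 0.7)]}   # unit -> ranked targets

def transliterate(units):
    candidates = [("", 1.0)]
    for u in units:
        candidates = [(word + t, score * r)
                      for word, score in candidates
                      for t, r in SUB_PHONETIC.get(u, [(u, 0.1)])]
    return sorted(candidates, key=lambda c: -c[1])[:3]

print(transliterate(["ka", "mal"]))   # top-ranked candidate words
```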


Proceedings Article
11 Jul 2010
TL;DR: Bagel is presented, a statistical language generator which uses dynamic Bayesian networks to learn from semantically-aligned data produced by 42 untrained annotators, and can generate natural and informative utterances from unseen inputs in the information presentation domain.
Abstract: Most previous work on trainable language generation has focused on two paradigms: (a) using a statistical model to rank a set of generated utterances, or (b) using statistics to inform the generation decision process. Both approaches rely on the existence of a handcrafted generator, which limits their scalability to new domains. This paper presents Bagel, a statistical language generator which uses dynamic Bayesian networks to learn from semantically-aligned data produced by 42 untrained annotators. A human evaluation shows that Bagel can generate natural and informative utterances from unseen inputs in the information presentation domain. Additionally, generation performance on sparse datasets is improved significantly by using certainty-based active learning, yielding ratings close to the human gold standard with a fraction of the data.

Journal ArticleDOI
TL;DR: A new concept-based mining model that analyzes terms on the sentence, document, and corpus levels rather than the traditional analysis of the document only is introduced and can efficiently find significant matching concepts between documents, according to the semantics of their sentences.
Abstract: Most of the common techniques in text mining are based on the statistical analysis of a term, either word or phrase. Statistical analysis of a term frequency captures the importance of the term within a document only. However, two terms can have the same frequency in their documents, but one term contributes more to the meaning of its sentences than the other term. Thus, the underlying text mining model should indicate terms that capture the semantics of text. In this case, the mining model can capture terms that present the concepts of the sentence, which leads to discovery of the topic of the document. A new concept-based mining model that analyzes terms on the sentence, document, and corpus levels is introduced. The concept-based mining model can effectively discriminate between nonimportant terms with respect to sentence semantics and terms which hold the concepts that represent the sentence meaning. The proposed mining model consists of sentence-based concept analysis, document-based concept analysis, corpus-based concept analysis, and concept-based similarity measure. The term which contributes to the sentence semantics is analyzed on the sentence, document, and corpus levels rather than the traditional analysis of the document only. The proposed model can efficiently find significant matching concepts between documents, according to the semantics of their sentences. The similarity between documents is calculated based on a new concept-based similarity measure. The proposed similarity measure takes full advantage of using the concept analysis measures on the sentence, document, and corpus levels in calculating the similarity between documents. Large sets of experiments using the proposed concept-based mining model on different data sets in text clustering are conducted. The experiments demonstrate extensive comparison between the concept-based analysis and the traditional analysis. Experimental results demonstrate the substantial enhancement of the clustering quality using the sentence-based, document-based, corpus-based, and combined approach concept analysis.
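As a toy rendering of the idea (an illustrative assumption, not the paper's exact measures), the sketch below combines a document-level term frequency with a sentence-level frequency so that terms carrying sentence meaning outrank merely frequent ones.

```python
def concept_weights(doc):
    sentences = [s.split() for s in doc.lower().split(".") if s.strip()]
    words = [w for s in sentences for w in s]
    tf = {w: words.count(w) / len(words) for w in set(words)}
    # Sentence-level "conceptual" frequency: share of sentences containing w.
    ctf = {w: sum(w in s for s in sentences) / len(sentences) for w in set(words)}
    return {w: tf[w] * ctf[w] for w in set(words)}

weights = concept_weights("The engine failed. The engine was repaired. Costs rose.")
for w, v in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(w, round(v, 3))
```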

Journal ArticleDOI
TL;DR: It is shown that progressive sentences about hand motion facilitate manual action in the same direction, while perfect sentences that are identical in every way except their aspect do not.

Proceedings Article
11 Jul 2010
TL;DR: Experimental results show that the model's output is comparable to human-written highlights in terms of both grammaticality and content.
Abstract: In this paper we present a joint content selection and compression model for single-document summarization. The model operates over a phrase-based representation of the source document which we obtain by merging information from PCFG parse trees and dependency graphs. Using an integer linear programming formulation, the model learns to select and combine phrases subject to length, coverage and grammar constraints. We evaluate the approach on the task of generating "story highlights"---a small number of brief, self-contained sentences that allow readers to quickly gather information on news stories. Experimental results show that the model's output is comparable to human-written highlights in terms of both grammaticality and content.
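A toy instance of the ILP formulation is sketched below with the PuLP package: select phrases to maximize salience under a length budget. The scores, lengths, and budget are invented, and the real model adds coverage and grammar constraints linking dependent phrases.

```python
from pulp import LpProblem, LpMaximize, LpVariable, lpSum, LpBinary

# phrase -> (salience score, length in words); invented values.
phrases = {"oil spill reaches coast": (3.0, 4),
           "cleanup crews deployed": (2.0, 3),
           "officials met yesterday": (0.5, 3)}

prob = LpProblem("highlight_selection", LpMaximize)
x = {p: LpVariable(f"x{i}", cat=LpBinary) for i, p in enumerate(phrases)}
prob += lpSum(sal * x[p] for p, (sal, _) in phrases.items())    # objective: salience
prob += lpSum(n * x[p] for p, (_, n) in phrases.items()) <= 7   # length budget
prob.solve()
print([p for p in phrases if x[p].value() == 1])
```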

Journal ArticleDOI
TL;DR: The data suggest that the scope of advance planning during grammatical encoding in sentence production is flexible, rather than structurally fixed.
Abstract: Three picture-word interference experiments addressed the question of whether the scope of grammatical advance planning in sentence production corresponds to some fixed unit or rather is flexible. Subjects produced sentences of different formats under varying amounts of cognitive load. When speakers described 2-object displays with simple sentences of the form "the frog is next to the mug," the 2 nouns were found to be lexically-semantically activated to similar degrees at speech onset, as indexed by similarly sized interference effects from semantic distractors related to either the first or the second noun. When speakers used more complex sentences (including prenominal color adjectives; e.g., "the blue frog is next to the blue mug") much larger interference effects were observed for the first than the second noun, suggesting that the second noun was lexically-semantically activated before speech onset on only a subset of trials. With increased cognitive load, introduced by an additional conceptual decision task and variable utterance formats, the interference effect for the first noun was increased and the interference effect for the second noun disappeared, suggesting that the scope of advance planning had been narrowed. By contrast, if cognitive load was induced by a secondary working memory task to be performed during speech planning, the interference effect for both nouns was increased, suggesting that the scope of advance planning had not been affected. In all, the data suggest that the scope of advance planning during grammatical encoding in sentence production is flexible, rather than structurally fixed.

Journal ArticleDOI
TL;DR: An eye-tracking study explored Korean-speaking adults' and 4- and 5-year-olds' ability to recover from misinterpretations of temporarily ambiguous phrases during spoken language comprehension, finding that children, but not adults, had difficulty in recovering from these misinterpretations despite strong disambiguating evidence at the end of the sentence.

Journal ArticleDOI
TL;DR: This work uses blogs as the object and data source for Chinese emotional expression analysis; based on the proposed model, a relatively fine-grained annotation scheme is put forward for the manual annotation of an emotion corpus.

Proceedings Article
Dmitriy Genzel
23 Aug 2010
TL;DR: An approach is developed to automatically learn reordering rules applied as a preprocessing step in phrase-based machine translation; rules learned for 8 different language pairs show BLEU improvements for all of them, demonstrating that many important order transformations can be captured.
Abstract: We describe an approach to automatically learn reordering rules to be applied as a preprocessing step in phrase-based machine translation. We learn rules for 8 different language pairs, showing BLEU improvements for all of them, and demonstrate that many important order transformations (SVO to SOV or VSO, head-modifier, verb movement) can be captured by this approach.
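For intuition, here is a single hand-written stand-in for one such rule (the paper learns its rules from data): reorder a toy POS-tagged SVO clause into SOV before handing it to a phrase-based system.

```python
def svo_to_sov(tagged):
    """Move the first verb to clause-final position (SVO -> SOV)."""
    verbs = [i for i, (_, tag) in enumerate(tagged) if tag == "V"]
    if not verbs:
        return tagged
    i = verbs[0]
    return tagged[:i] + tagged[i + 1:] + [tagged[i]]

sent = [("she", "PRN"), ("reads", "V"), ("the", "DET"), ("book", "N")]
print([w for w, _ in svo_to_sov(sent)])   # ['she', 'the', 'book', 'reads']
```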

Patent
20 May 2010
TL;DR: In this article, a system and a method for phrase-based translation are disclosed, which includes receiving source language text to be translated into target language text, and a translation, based on the hypothesis scores, is then output.
Abstract: A system and a method for phrase-based translation are disclosed. The method includes receiving source language text to be translated into target language text. One or more dynamic bi-phrases are generated, based on the source text and the application of one or more rules, which may be based on user descriptions. A dynamic feature value is associated with each of the dynamic bi-phrases. For a sentence of the source text, static bi-phrases are retrieved from a bi-phrase table, each of the static bi-phrases being associated with one or more values of static features. Any of the dynamic bi-phrases which each cover at least one word of the source text are also retrieved, which together form a set of active bi-phrases. Translation hypotheses are generated using active bi-phrases from the set and scored with a translation scoring model which takes into account the static and dynamic feature values of the bi-phrases used in the respective hypothesis. A translation, based on the hypothesis scores, is then output.
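The scoring step can be pictured as below: a hypothesis built from static and dynamic bi-phrases is scored by a weighted sum over the feature values attached to each bi-phrase it uses. The feature names, weights, and values are illustrative, not taken from the patent.

```python
WEIGHTS = {"p_translation": 1.0, "dynamic_rule": 0.5}   # tuned log-linear weights

def score_hypothesis(biphrases):
    return sum(WEIGHTS[f] * v
               for bp in biphrases
               for f, v in bp["features"].items())

hypothesis = [
    {"src": "trois euros", "tgt": "three euros",
     "features": {"p_translation": -0.2}},    # static bi-phrase from the table
    {"src": "3,50", "tgt": "3.50",
     "features": {"dynamic_rule": 1.0}},      # dynamic bi-phrase from a rule
]
print(score_hypothesis(hypothesis))
```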

Journal ArticleDOI
TL;DR: This paper investigated the role of shared word order and alignment with a dialogue partner in the production of code-switched sentences and found that participants had a clear preference for using the shared order when they switched languages, but also aligned their word order choices and code switching patterns with the confederate.

Proceedings Article
Jianfeng Gao, Xiaolong Li, Daniel Micol, Chris Quirk, Xu Sun
23 Aug 2010
TL;DR: The noisy channel model is subsumed by a more general ranker, which allows a variety of features to be easily incorporated, and a distributed infrastructure is proposed for training and applying Web-scale n-gram language models.
Abstract: This paper makes three significant extensions to a noisy channel speller designed for standard written text to target the challenging domain of search queries. First, the noisy channel model is subsumed by a more general ranker, which allows a variety of features to be easily incorporated. Second, a distributed infrastructure is proposed for training and applying Web scale n-gram language models. Third, a new phrase-based error model is presented. This model places a probability distribution over transformations between multi-word phrases, and is estimated using large amounts of query-correction pairs derived from search logs. Experiments show that each of these extensions leads to significant improvements over the state-of-the-art baseline methods.
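For reference, the noisy-channel backbone that the paper generalizes ranks corrections c for a query q by P(c) * P(q | c): a language model times an error model. The toy log-probability tables below stand in for the Web-scale n-gram language model and the phrase-based error model trained from search logs.

```python
LM = {"britney spears": -2.0, "britny spears": -9.0}            # log P(c)
ERR = {("britny", "britney"): -1.0, ("britny", "britny"): -0.1,
       ("spears", "spears"): -0.05}                              # log P(q_i | c_i)

def channel_score(query, candidate):
    lm = LM.get(candidate, -20.0)
    err = sum(ERR.get((q, c), -15.0)
              for q, c in zip(query.split(), candidate.split()))
    return lm + err

query = "britny spears"
print(max(LM, key=lambda c: channel_score(query, c)))   # -> britney spears
```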

Journal ArticleDOI
TL;DR: This paper investigated the relationship between syntactic and prosodic phrase structures in the production and perception of spontaneous speech and found that syntax influences prosody production, listeners' perception of prosodic boundaries is sensitive to acoustic duration, and syntax directly influences boundary perception.
Abstract: The relationship between syntactic and prosodic phrase structures is investigated in the production and perception of spontaneous speech. Three hypotheses are tested: (1) syntax influences prosody production; (2) listeners' perception of prosodic boundaries is sensitive to acoustic duration; and (3) syntax directly influences boundary perception, (partly) independent of the acoustic evidence for boundaries. Data are from the Buckeye corpus of conversational speech, and the real-time prosodic transcription of those data by 97 untrained listeners. Inter-transcriber agreement codes boundary strength at word junctures, and boundary scores are shown to be correlated with both the syntactic context and vowel duration of a word. Vowel duration is also correlated with syntactic context, but the effect of syntactic context on boundary perception is not fully explained by vowel duration. Regression analyses show that syntactic clause boundaries and vowel duration are the first and second strongest predictors of boundary perception.

Proceedings ArticleDOI
26 Oct 2010
TL;DR: This paper provides a quantitative analysis of the language discrepancy issue, and explores the use of clickthrough data to bridge documents and queries, and demonstrates that standard statistical machine translation techniques can be adapted for building a better Web document retrieval system.
Abstract: Web search is challenging partly due to the fact that search queries and Web documents use different language styles and vocabularies. This paper provides a quantitative analysis of the language discrepancy issue, and explores the use of clickthrough data to bridge documents and queries. We assume that a query is parallel to the titles of documents clicked on for that query. Two translation models are trained and integrated into retrieval models: A word-based translation model that learns the translation probability between single words, and a phrase-based translation model that learns the translation probability between multi-term phrases. Experiments are carried out on a real world data set. The results show that the retrieval systems that use the translation models outperform significantly the systems that do not. The paper also demonstrates that standard statistical machine translation techniques such as word alignment, bilingual phrase extraction, and phrase-based decoding, can be adapted for building a better Web document retrieval system.
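A sketch of the word-based translation retrieval model (the simpler of the two models; the probabilities here are toy values): score a document title d for query q as P(q | d) = prod_i sum_w t(q_i | w) P(w | d), with self-translation entries t(w | w) so exact matches still count.

```python
import math

T = {("car", "automobile"): 0.4, ("car", "car"): 0.6,
     ("cheap", "affordable"): 0.5, ("cheap", "cheap"): 0.5}   # t(q | w)

def log_p_query(query, doc_words):
    p_w = 1.0 / len(doc_words)                    # uniform P(w | d)
    score = 0.0
    for q in query:
        mass = sum(T.get((q, w), 0.0) * p_w for w in doc_words)
        score += math.log(mass or 1e-9)           # crude smoothing of zero mass
    return score

print(log_p_query(["cheap", "car"], ["affordable", "automobile", "sale"]))
```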

Proceedings Article
15 Jul 2010
TL;DR: A novel reordering model for the hierarchical phrase-based approach is introduced which further enhances translation performance, and the effect some recent extended lexicon models have on the performance of the system is analyzed.
Abstract: We present Jane, RWTH's hierarchical phrase-based translation system, which has been open sourced for the scientific community. This system has been in development at RWTH for the last two years and has been successfully applied in different machine translation evaluations. It includes extensions to the hierarchical approach developed by RWTH as well as other research institutions. In this paper we give an overview of its main features. We also introduce a novel reordering model for the hierarchical phrase-based approach which further enhances translation performance, and analyze the effect some recent extended lexicon models have on the performance of the system.

Book
02 May 2010
TL;DR: This book discusses the architecture of the Linguistic-Spatial Interface, as well as Morphological and Semantic Regularities in the Lexicon, and the Ecology of English Noun-Noun Compounds.
Abstract: Contents: 1. Prologue: The Parallel Architecture and its Components; 2. Morphological and Semantic Regularities in the Lexicon; 3. On Beyond Zebra: The Relation of Linguistics and Visual Information; 4. The Architecture of the Linguistic-Spatial Interface; 5. Parts and Boundaries; 6. The Proper Treatment of Measuring Out, Telicity, and Perhaps Even Quantification in English; 7. English Particle Constructions, the Lexicon, and the Autonomy of Syntax; 8. Twistin' the Night Away; 9. The English Resultative as a Family of Constructions; 10. On the Phrase The Phrase 'the phrase'; 11. Contrastive Focus Reduplication in English (the salad-salad paper); 12. Construction After Construction and its Theoretical Challenges; 13. The Ecology of English Noun-Noun Compounds.

Proceedings Article
11 Jul 2010
TL;DR: A novel leaving-one-out approach to prevent over-fitting is described that allows us to train phrase models that show improved translation performance on the WMT08 Europarl German-English task.
Abstract: Several attempts have been made to learn phrase translation probabilities for phrase-based statistical machine translation that go beyond pure counting of phrases in word-aligned training data. Most approaches report problems with over-fitting. We describe a novel leaving-one-out approach to prevent over-fitting that allows us to train phrase models that show improved translation performance on the WMT08 Europarl German-English task. In contrast to most previous work where phrase models were trained separately from other models used in translation, we include all components such as single word lexica and reordering models in training. Using this consistent training of phrase models we are able to achieve improvements of up to 1.4 points in BLEU. As a side effect, the phrase table size is reduced by more than 80%.
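The leave-one-out device can be illustrated as follows (toy counts, not the paper's full training loop): when force-aligning a training sentence, its own contribution is first subtracted from the phrase-pair counts, so long singleton phrases memorized from that very sentence cannot dominate.

```python
from collections import Counter

corpus = Counter({("das haus", "the house"): 50,
                  ("das haus ist", "the house is"): 1})
sentence = Counter({("das haus", "the house"): 1,
                    ("das haus ist", "the house is"): 1})   # counts from one sentence

def loo_prob(pair, floor=1e-4):
    c = corpus[pair] - sentence[pair]             # leave this sentence out
    total = sum(corpus.values()) - sum(sentence.values())
    return max(c, floor) / total

for pair in corpus:
    print(pair, round(loo_prob(pair), 4))   # the singleton pair is floored
```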

Proceedings Article
01 Jan 2010
TL;DR: Kirsh, as mentioned in this paper, explored how dancers and choreographers use their bodies to think about dance phrases and found that the body in motion can serve as an anchor and vehicle for thought.
Abstract: To explore the question of physical thinking – using the body as an instrument of cognition – we collected extensive video and interview data on the creative process of a noted choreographer and his company as they made a new dance. A striking case of physical thinking is found in the phenomenon of marking. Marking refers to dancing a phrase in a less than complete manner. Dancers mark to save energy. But they also mark to explore the tempo of a phrase, or its movement sequence, or the intention behind it. Because of its representational nature, marking can serve as a vehicle for thought. Importantly, this vehicle is less complex than the version of the same phrase danced 'full-out'. After providing evidence for distinguishing different types of marking, three ways of understanding marking as a form of thought are considered: marking as a gestural language for encoding aspects of a target movement, marking as a method of priming neural systems involved in the target movement, and marking as a method for improving the precision of mentally projecting aspects of the target. Keywords: marking; multimodality; thinking; embodied cognition; ethnography.
This paper explores how dancers and choreographers use their bodies to think about dance phrases. My specific focus is a technique called 'marking'. Marking refers to dancing a phrase in a less than complete manner. See Fig. 1 for an example of hand marking, a form that is far smaller than the more typical method of marking that involves modeling a phrase with the whole body. Marking is part of the practice of dance, pervasive in all phases of creation, practice, rehearsal, and reflection. Virtually all English-speaking dancers know the term, though few, if any, scholarly articles exist that describe the process or give instructions on how to do it. When dancers mark a phrase, they use their body's movement and form as a representational vehicle. They do not recreate the full dance phrase they normally perform; instead, they create a simplified or abstracted version – a model. Dancers mark to save energy, to avoid strenuous movement such as jumps, and sometimes to review or explore specific aspects of a phrase, such as tempo, movement sequence, or underlying intention, without the mental complexity involved in creating the phrase 'full-out'. Marking is not the only way dancers 'mentally' explore phrases. Many imagine themselves performing a phrase. Some of the professional dancers we studied reported visualizing their phrase in bed before going to sleep; others reported mentally reviewing their phrases while traveling on the tube on their way home. Our evidence suggests that marking, however, gives more insight than mental rehearsal: by physically executing a synoptic version of the whole phrase – by creating a simplified version externally – dancers are able to understand the shape, dynamics, emotion, and spatial elements of a phrase better than through imagination alone. They use marking as an anchor and vehicle for thought. It is this idea – that a body in motion can serve as an anchor and vehicle of thought – that is explored in this paper. It is a highly general claim.
It has been said that gesture can facilitate thought [Goldin-Meadow 05]; that physically simulating a process can help a thinker understand that process [Collins et al 91]; and that mental rehearsal is improved by overt physical movement [Coffman 90]. Why? What extra can physical action or physical structure offer to imagination? The answer, I suggest, is that creating an external structure connected to a thought – whether that external structure be a gesture, dance form, or linguistic structure – is part of an interactive strategy of bootstrapping thought by providing an anchor for mental projection [Hutchins 05; Kirsh 09, 10]. Marking a phrase provides the scaffold to mentally project more detailed structure than could otherwise be held in mind. It is part of an interactive strategy for augmenting cognition. By marking, dancers harness their bodies to drive thought deeper than through mental simulation and unaided thinking alone.
[Fig. 1, hand marking: in Fig. 1a an Irish river dancer is caught in mid-move; in Fig. 1b the same move is marked using just the hands. River dancing is a type of step dancing in which the arms are kept still. Typically, river dancers mark steps and positions using one hand for the movement and the other for the floor. Most marking involves modeling phrases with the whole body, not just the hands.]
(A search by professional librarians of dance in the UK and US has yet to turn up scholarly articles on the practice of marking.)

Patent
23 Dec 2010
TL;DR: The authors describe word-dependent language models, as well as their creation and use, which can be useful in many contexts, including those where one or more letters of the expected phrase are known to the speaker.
Abstract: This document describes word-dependent language models, as well as their creation and use. A word-dependent language model can permit a speech-recognition engine to accurately verify that a speech utterance matches a multi-word phrase. This is useful in many contexts, including those where one or more letters of the expected phrase are known to the speaker.

Proceedings Article
02 Jun 2010
TL;DR: Manual evaluation of generated output shows that the random-walk-based approach to learning paraphrases from bilingual parallel corpora outperforms the state-of-the-art system of Callison-Burch (2008).
Abstract: We present a random-walk-based approach to learning paraphrases from bilingual parallel corpora. The corpora are represented as a graph in which a node corresponds to a phrase, and an edge exists between two nodes if their corresponding phrases are aligned in a phrase table. We sample random walks to compute the average number of steps it takes to reach a ranking of paraphrases with better ones being "closer" to a phrase of interest. This approach allows "feature" nodes that represent domain knowledge to be built into the graph, and incorporates truncation techniques to prevent the graph from growing too large for efficiency. Current approaches, by contrast, implicitly presuppose the graph to be bipartite, are limited to finding paraphrases that are of length two away from a phrase, and do not generally permit easy incorporation of domain knowledge. Manual evaluation of generated output shows that our approach outperforms the state-of-the-art system of Callison-Burch (2008).
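A small sketch of the graph view under invented data: nodes are phrases, edges link phrases aligned in a phrase table, and paraphrase candidates are ranked by the estimated average number of random-walk steps (hitting time) to the phrase of interest, shorter being better.

```python
import random

# Adjacency from a toy phrase table: English phrases linked via a German pivot.
GRAPH = {"under control": ["unter kontrolle"],
         "unter kontrolle": ["under control", "in check"],
         "in check": ["unter kontrolle"]}

def hitting_time(start, target, walks=2000, max_steps=20):
    total = reached = 0
    for _ in range(walks):
        node = start
        for step in range(1, max_steps + 1):
            node = random.choice(GRAPH[node])
            if node == target:
                total += step
                reached += 1
                break
    return total / reached if reached else float("inf")

# Average steps from "in check" to "under control"; smaller is a better candidate.
print(hitting_time("in check", "under control"))
```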

Proceedings Article
15 Jul 2010
TL;DR: The SemEval-2010 Cross-Lingual Lexical Substitution task is described, in which, given an English target word in context, participating systems had to find an alternative substitute word or phrase in Spanish.
Abstract: In this paper we describe the SemEval-2010 Cross-Lingual Lexical Substitution task, where given an English target word in context, participating systems had to find an alternative substitute word or phrase in Spanish. The task is based on the English Lexical Substitution task run at SemEval-2007. In this paper we provide background and motivation for the task, we describe the data annotation process and the scoring system, and present the results of the participating systems.