scispace - formally typeset
Topic: Phrase

About: Phrase is a research topic. Over its lifetime, 12,580 publications have been published on this topic, receiving 317,823 citations. The topic is also known as: syntagma and phrases.


Papers
Journal ArticleDOI
Steven Franks
TL;DR: The proposed system allows for a more explanatory analysis of GEN-Q assignment and accounts for several distinctions between QP and NP subjects within Russian, while also motivating the absence of these distinctions in Serbo-Croatian.
Abstract: Numeral phrases in Russian display many unusual morphosyntactic properties, e.g., (i) the numeral sometimes assigns genitive (GEN-Q) to the following noun and sometimes agrees with it and (ii) the numeral phrase sometimes induces subject-verb agreement and sometimes does not. In this paper existing analyses of these properties are parametrized to accommodate related phenomena in other Slavic languages. First, Babby's (1987) proposal that GEN-Q is structural in Russian is shown not to extend to Serbo-Croatian, where it must be analyzed as inherent. Second, Pesetsky's (1982) idea that Russian numeral phrases may be either QPs or NPs also does not extend to Serbo-Croatian, where these are only NPs. This set of assumptions explains a range of seemingly unrelated facts about the behavior of numeral phrases in the two languages. Pesetsky's analysis is recast in terms of more recent hypotheses about phrase structure: (i) NPs are actually embedded in DPs and (ii) subjects are D-Structure VP-specifiers. Proposal (i) allows for a more explanatory analysis of GEN-Q assignment and proposal (ii) accounts for several distinctions between QP and NP subjects within Russian, also motivating the absence of these distinctions in Serbo-Croatian. Finally, it is shown that Polish can be assimilated to the proposed system.

104 citations

Book
08 Sep 2014
TL;DR: This thesis addresses the technical and linguistic aspects of discourse-level processing in phrase-based statistical machine translation (SMT) with a focus on connected texts.
Abstract: This thesis addresses the technical and linguistic aspects of discourse-level processing in phrase-based statistical machine translation (SMT). Connected texts can have complex text-level linguistic ...

104 citations

Proceedings Article
27 Jul 2011
TL;DR: This paper presents three kinds of caches to store relevant document-level information: a dynamic cache, which stores bilingual phrase pairs from the best translation hypotheses of previous sentences in the test document; a static cache, which stores relevant bilingual phrase pairs extracted from similar bilingual document pairs in the training parallel corpus; and a topic cache, which stores target-side topic words related to the source side of the test document.
Abstract: Statistical machine translation systems are usually trained on a large amount of bilingual sentence pairs and translate one sentence at a time, ignoring document-level information. In this paper, we propose a cache-based approach to document-level translation. Since caches mainly depend on relevant data to supervise subsequent decisions, it is critical to fill the caches with highly relevant data of a reasonable size. In this paper, we present three kinds of caches to store relevant document-level information: 1) a dynamic cache, which stores bilingual phrase pairs from the best translation hypotheses of previous sentences in the test document; 2) a static cache, which stores relevant bilingual phrase pairs extracted from similar bilingual document pairs (i.e., source documents similar to the test document and their corresponding target documents) in the training parallel corpus; 3) a topic cache, which stores the target-side topic words related to the source side of the test document. In particular, three new features are designed to exploit the various kinds of document-level information in the above three kinds of caches. Evaluation shows the effectiveness of our cache-based approach to document-level translation, with a performance improvement of 0.81 in BLEU score over Moses. In addition, detailed analysis and discussion are presented to give new insights into document-level translation.

104 citations
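The dynamic cache described in the abstract above can be pictured as a bounded store of phrase pairs that grows as each sentence of a document is translated. The following is a minimal Python sketch under assumed details — the class name, size limit, eviction policy, and binary scoring feature are all illustrative; the paper's actual cache policy and feature design are richer:

```python
from collections import OrderedDict

class DynamicPhraseCache:
    """Bounded cache of bilingual phrase pairs, updated from the best
    translation hypothesis of each previously translated sentence.
    (Hypothetical sketch; names and policy are illustrative.)"""

    def __init__(self, max_size=1000):
        self.max_size = max_size
        self.pairs = OrderedDict()  # (src_phrase, tgt_phrase) -> hit count

    def update(self, best_hypothesis_pairs):
        # Insert phrase pairs from the best hypothesis of the sentence
        # just translated; refresh recency for pairs seen again.
        for src, tgt in best_hypothesis_pairs:
            key = (src, tgt)
            self.pairs[key] = self.pairs.get(key, 0) + 1
            self.pairs.move_to_end(key)            # most recent last
        while len(self.pairs) > self.max_size:     # evict oldest first
            self.pairs.popitem(last=False)

    def score(self, src, tgt):
        # A crude cache feature: 1 if the pair has been seen in this
        # document so far, else 0 (real features are more graded).
        return 1.0 if (src, tgt) in self.pairs else 0.0

cache = DynamicPhraseCache(max_size=2)
cache.update([("maison", "house"), ("chat", "cat")])
cache.update([("chien", "dog")])  # exceeds max_size, evicts oldest pair
print(cache.score("chien", "dog"))     # 1.0
print(cache.score("maison", "house"))  # 0.0, evicted
```

The static and topic caches would be filled once per document (from similar training documents and from topic words, respectively) rather than updated sentence by sentence.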

Journal ArticleDOI
TL;DR: The results suggest that readers encode focused information more carefully, either upon first encountering it or during a second-pass reading of it, and that the enhanced memory representations for focused information found in previous studies may be due in part to differences in reading patterns.
Abstract: In two experiments, we explored how readers encode information that is linguistically focused. Subjects read sentences in which a word or phrase was focused by a syntactic manipulation (Experiment 1) or by a preceding context (Experiment 2) while their eye movements were monitored. Readers had longer reading times while reading a region of the sentence that was focused than when the same region was not focused. The results suggest that readers encode focused information more carefully, either upon first encountering it or during a second-pass reading of it. We conclude that the enhanced memory representations for focused information found in previous studies may be due in part to differences in reading patterns for focused information.

104 citations

Proceedings ArticleDOI
26 Oct 2010
TL;DR: This paper provides a quantitative analysis of the language discrepancy issue, and explores the use of clickthrough data to bridge documents and queries, and demonstrates that standard statistical machine translation techniques can be adapted for building a better Web document retrieval system.
Abstract: Web search is challenging partly because search queries and Web documents use different language styles and vocabularies. This paper provides a quantitative analysis of the language discrepancy issue, and explores the use of clickthrough data to bridge documents and queries. We assume that a query is parallel to the titles of documents clicked on for that query. Two translation models are trained and integrated into retrieval models: a word-based translation model that learns the translation probability between single words, and a phrase-based translation model that learns the translation probability between multi-term phrases. Experiments are carried out on a real-world data set. The results show that the retrieval systems that use the translation models significantly outperform the systems that do not. The paper also demonstrates that standard statistical machine translation techniques, such as word alignment, bilingual phrase extraction, and phrase-based decoding, can be adapted for building a better Web document retrieval system.

104 citations
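The word-based translation model in the abstract above learns the probability of a query word given a document-title word from clicked (query, title) pairs. As a rough illustration, the sketch below estimates such probabilities by simple co-occurrence counting — a crude stand-in for the EM-trained alignment models the paper actually uses, with illustrative function names and toy data:

```python
from collections import defaultdict

def train_word_translation_model(pairs):
    """Estimate p(query_word | title_word) from (query, clicked_title)
    pairs by co-occurrence counting. (Illustrative sketch; the paper
    trains IBM-style alignment models with EM instead.)"""
    cooc = defaultdict(lambda: defaultdict(float))
    for query, title in pairs:
        q_words, t_words = query.split(), title.split()
        for t in t_words:
            for q in q_words:
                # Spread each title word's mass uniformly over the
                # query words it co-occurs with.
                cooc[t][q] += 1.0 / len(q_words)
    # Normalise counts into conditional probabilities per title word.
    model = {}
    for t, counts in cooc.items():
        total = sum(counts.values())
        model[t] = {q: c / total for q, c in counts.items()}
    return model

pairs = [
    ("cheap flights", "low cost airline tickets"),
    ("flights to rome", "airline tickets rome"),
]
model = train_word_translation_model(pairs)
print(model["airline"])  # query-word distribution for title word "airline"
```

A retrieval model can then score a document title against a query by combining these per-word translation probabilities, letting a title match query words it never contains verbatim.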


Network Information
Related Topics (5)
- Sentence: 41.2K papers, 929.6K citations (92% related)
- Vocabulary: 44.6K papers, 941.5K citations (88% related)
- Natural language: 31.1K papers, 806.8K citations (84% related)
- Grammar: 33.8K papers, 767.6K citations (83% related)
- Perception: 27.6K papers, 937.2K citations (79% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    467
2022    1,079
2021    360
2020    470
2019    525
2018    535