Topic

Phrase

About: Phrase is a research topic. Over its lifetime, 12,580 publications have been published within this topic, receiving 317,823 citations. The topic is also known as: syntagma & phrases.


Papers
Proceedings Article
19 Jun 2011
TL;DR: An unsupervised model for joint phrase alignment and extraction using non-parametric Bayesian methods and inversion transduction grammars (ITGs) is presented, which matches the accuracy of the traditional two-step word alignment/phrase extraction approach while reducing the phrase table to a fraction of the original size.
Abstract: We present an unsupervised model for joint phrase alignment and extraction using non-parametric Bayesian methods and inversion transduction grammars (ITGs). The key contribution is that phrases of many granularities are included directly in the model through the use of a novel formulation that memorizes phrases generated not only by terminal, but also non-terminal symbols. This allows for a completely probabilistic model that is able to create a phrase table that achieves competitive accuracy on phrase-based machine translation tasks directly from unaligned sentence pairs. Experiments on several language pairs demonstrate that the proposed model matches the accuracy of the traditional two-step word alignment/phrase extraction approach while reducing the phrase table to a fraction of the original size.

90 citations

Patent
13 Sep 1996
TL;DR: A system, method, and program enable construction of statements, including queries, programs, and commands, by using drag and drop templates; a predefined phrase template, which is generated and displayed to a user, imposes syntactic and semantic constraints on the statement.
Abstract: The system, method, and program of this invention enables construction of statements, including queries, programs, and commands, by using drag and drop templates. A predefined phrase template, which is generated and displayed to a user, imposes syntactic and semantic constraints in constructing the statement. Objects representing entities and objects representing subphrases can be dragged and dropped onto phrase receptacles within the phrase and subphrase templates. Complex statements can be constructed from a nesting of subphrases using the drag and drop technique. The constructed statement is displayed to the user along with the subphrase structure of the statement through nested panels.

90 citations

Proceedings Article
01 Apr 2007
TL;DR: Unlike existing methods for phrase classification, the proposed method can classify phrases consisting of unseen words; the authors also propose using unlabeled data for a seed set of probability computation.
Abstract: We propose a method for extracting semantic orientations of phrases (pairs of an adjective and a noun): positive, negative, or neutral. Given an adjective, the semantic orientation classification of phrases can be reduced to the classification of words. We construct a lexical network by connecting similar/related words. In the network, each node has one of the three orientation values and the neighboring nodes tend to have the same value. We adopt the Potts model for the probability model of the lexical network. For each adjective, we estimate the states of the nodes, which indicate the semantic orientations of the adjective-noun pairs. Unlike existing methods for phrase classification, the proposed method can classify phrases consisting of unseen words. We also propose to use unlabeled data for a seed set of probability computation. Empirical evaluation shows the effectiveness of the proposed method.

90 citations
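
The Potts-model idea in the abstract above (neighboring nodes tend to share an orientation) can be illustrated with a small sketch. This is not the authors' implementation: it assumes the standard Potts form, in which a labeling's probability is proportional to exp(beta times the summed weights of edges whose endpoints agree), and it shows only the resulting conditional distribution for one node given its neighbors' labels. The words, edge weights, `beta` value, and the function name `node_conditional` are all hypothetical.

```python
import math

# The three orientation values named in the abstract.
LABELS = ("positive", "neutral", "negative")


def node_conditional(node, labels, edges, beta=1.0):
    """Distribution over LABELS for `node`, given its neighbors' labels.

    Assumes a standard Potts model: each candidate label k is scored by
    exp(beta * sum of weights of edges to neighbors currently labeled k).
    """
    scores = {}
    for k in LABELS:
        # Total weight of neighbors that agree with candidate label k.
        agree = sum(w for nb, w in edges.get(node, []) if labels.get(nb) == k)
        scores[k] = math.exp(beta * agree)
    z = sum(scores.values())  # normalizing constant
    return {k: s / z for k, s in scores.items()}


# Toy lexical network (hypothetical words and weights).
edges = {"benefit": [("profit", 1.0), ("gain", 1.0)]}
labels = {"profit": "positive", "gain": "positive"}

dist = node_conditional("benefit", labels, edges, beta=1.0)
best_label = max(dist, key=dist.get)
```

Because both neighbors are labeled positive, the unlabeled node receives most of its probability mass on the positive label, which is the "neighbors tend to agree" behavior the abstract describes.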

Journal Article
TL;DR: Experiments show that tweet segmentation quality is significantly improved by learning both global and local contexts compared with using global context alone, and that high accuracy is achieved in named entity recognition by applying segment-based part-of-speech (POS) tagging.
Abstract: Twitter has attracted millions of users to share and disseminate the most up-to-date information, resulting in large volumes of data produced every day. However, many applications in Information Retrieval (IR) and Natural Language Processing (NLP) suffer severely from the noisy and short nature of tweets. In this paper, we propose a novel framework for tweet segmentation in a batch mode, called HybridSeg. By splitting tweets into meaningful segments, the semantic or context information is well preserved and easily extracted by the downstream applications. HybridSeg finds the optimal segmentation of a tweet by maximizing the sum of the stickiness scores of its candidate segments. The stickiness score considers the probability of a segment being a phrase in English (i.e., global context) and the probability of a segment being a phrase within the batch of tweets (i.e., local context). For the latter, we propose and evaluate two models to derive local context by considering the linguistic features and term-dependency in a batch of tweets, respectively. HybridSeg is also designed to iteratively learn from confident segments as pseudo feedback. Experiments on two tweet data sets show that tweet segmentation quality is significantly improved by learning both global and local contexts compared with using global context alone. Through analysis and comparison, we show that local linguistic features are more reliable for learning local context compared with term-dependency. As an application, we show that high accuracy is achieved in named entity recognition by applying segment-based part-of-speech (POS) tagging.

89 citations
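
The objective stated in the abstract above, choosing the segmentation that maximizes the sum of per-segment stickiness scores, can be sketched with dynamic programming. The abstract does not say which search procedure HybridSeg uses, so the DP below is one standard way to solve such an objective, not the authors' method. The scoring function `toy_stickiness`, the phrase table, and the `max_len` limit are hypothetical stand-ins for the global and local context models.

```python
from typing import Callable, List, Tuple

Segment = Tuple[str, ...]


def segment(tokens: List[str],
            stickiness: Callable[[Segment], float],
            max_len: int = 3) -> List[Segment]:
    """Return the segmentation maximizing the sum of stickiness scores."""
    n = len(tokens)
    best = [float("-inf")] * (n + 1)  # best[i]: top score for tokens[:i]
    back = [0] * (n + 1)              # back[i]: start of the last segment
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            score = best[start] + stickiness(tuple(tokens[start:end]))
            if score > best[end]:
                best[end] = score
                back[end] = start
    # Walk the back-pointers to recover the chosen segments.
    segments: List[Segment] = []
    end = n
    while end > 0:
        start = back[end]
        segments.append(tuple(tokens[start:end]))
        end = start
    segments.reverse()
    return segments


# Hypothetical scores: known phrases score high, single words score a
# little, and unknown multi-word spans score nothing.
PHRASES = {("new", "york"): 2.0}


def toy_stickiness(seg: Segment) -> float:
    if seg in PHRASES:
        return PHRASES[seg]
    return 0.5 if len(seg) == 1 else 0.0


result = segment(["i", "love", "new", "york"], toy_stickiness)
```

With these toy scores, keeping "new york" together gives a total of 3.0, which beats splitting every word (2.0), so the two words end up in one segment.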

Journal Article
TL;DR: It is argued that this type of allomorphy is not conditioned by a syntactic adjacency condition, and is found when the head and phrase in question are contained in the same prosodic phrase at the interface that connects syntax and phonology (PF).
Abstract: This paper deals with a class of morphological alternations that seem to involve syntactic adjacency. More specifically, it deals with alternative realizations of syntactic terminals that occur when a particular phrase immediately follows a particular head. We argue that this type of allomorphy is not conditioned by a syntactic adjacency condition. Instead, it is found when the head and phrase in question are contained in the same prosodic phrase at the interface that connects syntax and phonology (PF). We illustrate our approach with six case studies, concerning agreement weakening in Dutch and Arabic, pronoun weakening in Middle Dutch and Celtic, and pro-drop in Old French and Arabic.

89 citations


Network Information
Related Topics (5)
Sentence: 41.2K papers, 929.6K citations, 92% related
Vocabulary: 44.6K papers, 941.5K citations, 88% related
Natural language: 31.1K papers, 806.8K citations, 84% related
Grammar: 33.8K papers, 767.6K citations, 83% related
Perception: 27.6K papers, 937.2K citations, 79% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    467
2022    1,079
2021    360
2020    470
2019    525
2018    535