Topic

Word order

About: Word order is a research topic. Over its lifetime, 5,051 publications have been published within this topic, receiving 130,611 citations.


Papers
Proceedings Article
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, Jeffrey Dean
05 Dec 2013
TL;DR: This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
Abstract: The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of the frequent words we obtain significant speedup and also learn more regular word representations. We also describe a simple alternative to the hierarchical softmax called negative sampling. An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases. For example, the meanings of "Canada" and "Air" cannot be easily combined to obtain "Air Canada". Motivated by this example, we present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.

24,012 citations
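The extensions described in this abstract (Skip-gram training with negative sampling and subsampling of frequent words) are available in off-the-shelf libraries. Below is a minimal sketch using the gensim library; the toy corpus and all hyperparameter values are illustrative placeholders rather than the settings used in the paper.

```python
# Minimal sketch: Skip-gram with negative sampling and frequent-word
# subsampling via gensim. Corpus and hyperparameters are illustrative.
from gensim.models import Word2Vec

# Toy corpus of pre-tokenized, lowercased sentences (placeholder data).
sentences = [
    ["air", "canada", "announced", "new", "flights", "to", "toronto"],
    ["the", "airline", "air", "canada", "is", "based", "in", "montreal"],
    ["word", "order", "differs", "across", "languages"],
]

model = Word2Vec(
    sentences=sentences,
    vector_size=100,  # dimensionality of the word vectors
    sg=1,             # 1 = Skip-gram (0 = CBOW)
    negative=5,       # negative samples drawn per positive pair
    sample=1e-5,      # subsampling threshold for frequent words
    window=5,         # context window size
    min_count=1,      # keep all words in this toy corpus
    epochs=20,
)

# After training, each word has a learned vector.
print(model.wv["canada"][:5])
print(model.wv.most_similar("air", topn=3))
```

On a corpus this small the vectors are meaningless; the snippet only shows where the negative-sampling and subsampling options mentioned in the abstract appear in a typical training setup.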

Posted Content
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, Jeffrey Dean
TL;DR: This preprint presents several extensions to the continuous Skip-gram model, which learns high-quality distributed vector representations capturing a large number of precise syntactic and semantic word relationships, that improve both the quality of the vectors and the training speed.
Abstract: The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several extensions that improve both the quality of the vectors and the training speed. By subsampling of the frequent words we obtain significant speedup and also learn more regular word representations. We also describe a simple alternative to the hierarchical softmax called negative sampling. An inherent limitation of word representations is their indifference to word order and their inability to represent idiomatic phrases. For example, the meanings of "Canada" and "Air" cannot be easily combined to obtain "Air Canada". Motivated by this example, we present a simple method for finding phrases in text, and show that learning good vector representations for millions of phrases is possible.

11,343 citations
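Both versions of the paper describe a simple phrase-finding method: adjacent word pairs are scored, and high-scoring pairs (such as "air canada") are merged into single tokens before training. The sketch below is a rough, self-contained illustration of that kind of bigram-scoring heuristic; the discount delta and the threshold are placeholder values, and the paper applies its procedure in several passes over the corpus.

```python
# Rough sketch of a bigram-scoring heuristic for phrase detection, as
# described in the abstract: frequent collocations like "air canada" are
# promoted to single phrase tokens. delta and threshold are illustrative
# placeholders, not values from the paper.
from collections import Counter

def find_phrases(sentences, delta=1.0, threshold=0.1):
    unigram_counts = Counter()
    bigram_counts = Counter()
    for sent in sentences:
        unigram_counts.update(sent)
        bigram_counts.update(zip(sent, sent[1:]))

    phrases = set()
    for (w1, w2), count in bigram_counts.items():
        # The discount delta suppresses very infrequent pairs.
        score = (count - delta) / (unigram_counts[w1] * unigram_counts[w2])
        if score > threshold:
            phrases.add((w1, w2))
    return phrases

def merge_phrases(sentences, phrases):
    # Rewrite each sentence, joining detected pairs with an underscore.
    merged = []
    for sent in sentences:
        out, i = [], 0
        while i < len(sent):
            if i + 1 < len(sent) and (sent[i], sent[i + 1]) in phrases:
                out.append(sent[i] + "_" + sent[i + 1])
                i += 2
            else:
                out.append(sent[i])
                i += 1
        merged.append(out)
    return merged
```

Running find_phrases over a tokenized corpus and then merge_phrases on its output yields a retokenized corpus in which detected phrases act as single vocabulary items for subsequent Skip-gram training.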

Book
01 Jan 1984
Abstract: Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Linguistics and Philosophy, 1984.

1,086 citations

Journal Article
TL;DR: The paper investigates asymmetries in the behavior of subjects in Germanic, Celtic/Arabic, Romance, and Greek, arguing that the languages split into move/merge XP languages (Germanic) and move/merge X0 languages (Celtic/Arabic, Romance, and Greek) according to how the Extended Projection Principle is checked.
Abstract: The paper investigates a number of asymmetries in the behavior of subjects in Germanic, Celtic/Arabic, Romance, and Greek. The languages under investigation divide into two main groups with respect to a cluster of properties, including the availability of pro-drop with referential subjects, the possibility of VSO/VOS orders, the A/A′ status of subjects in SVO orders, the presence/absence of Definiteness Restriction (DR)-effects in unaccusative constructions, the existence of verb-raising independently of V-2, and others. We argue that the key factor in this split is a parametrization in the way the Extended Projection Principle (EPP) is checked: move/merge XP vs. move/merge X0. The first option is taken in Germanic, the second in Celtic, Greek, and Romance. According to our proposal, the EPP relates to checking of a nominal feature of AGR (cf. Chomsky 1995), and move/merge X0 languages satisfy the EPP via V-raising, as their verbal agreement morphology includes the requisite nominal feature (cf. Taraldsen 1978). Moreover, we demonstrate that the further differences that exist between Celtic/Arabic on the one hand and Romance/Greek on the other are related to the parametric availability of Spec,TP for subjects (cf. Jonas and Bobaljik 1993, Bobaljik and Jonas 1996). In Celtic and Arabic, Spec,TP for subjects is licensed, resulting in VSO orders with VP external subjects. In Greek and Romance, Spec,TP is not licensed, resulting in 'subject inverted' orders with VP internal subjects. In other words, we show that within the class of move/merge X0 languages, a further partition emerges which is due to the same parameter dividing Germanic languages into two major classes. We demonstrate that combining the proposed EPP/AGR parameter with the Spec,TP parameter gives four language-types with distinct properties.

769 citations

Journal Article
01 Jun 1983, Language
TL;DR: This second edition has been revised and updated to take full account of new research in universals and typology in the past decade, and more generally to consider how the approach advocated here relates to recent advances in generative grammatical theory.
Abstract: Since its first publication, "Language Universals and Linguistic Typology" has become established as the leading introductory account of one of the most productive areas of linguistics-the analysis, comparison, and classification of the common features and forms of the organization of languages. Adopting an approach to the subject pioneered by Greenberg and others, Bernard Comrie is particularly concerned with syntactico-semantic universals, devoting chapters to word order, case marking, relative clauses, and causative constructions. His book is informed throughout by the conviction that an exemplary account of universal properties of human language cannot restrict itself to purely formal aspects, nor focus on analysis of a single language. Rather, it must also consider language use, relate formal properties to testable claims about cognition and cognitive development, and treat data from a wide range of languages. This second edition has been revised and updated to take full account of new research in universals and typology in the past decade, and more generally to consider how the approach advocated here relates to recent advances in generative grammatical theory.

739 citations


Network Information
Related Topics (5)
Grammar: 33.8K papers, 767.6K citations, 91% related
Sentence: 41.2K papers, 929.6K citations, 86% related
Vocabulary: 44.6K papers, 941.5K citations, 86% related
Second-language acquisition: 15.2K papers, 598.6K citations, 86% related
Language acquisition: 33.9K papers, 957.2K citations, 85% related
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    122
2022    263
2021    148
2020    173
2019    200
2018    189