Author

Nathan Schneider

Bio: Nathan Schneider is an academic researcher at Georgetown University. His research focuses on parsing and annotation. He has an h-index of 25 and has co-authored 130 publications receiving 5,138 citations. His previous affiliations include the University of Washington and Carnegie Mellon University.


Papers
Proceedings Article
01 Aug 2013
TL;DR: A sembank of simple, whole-sentence semantic structures will spur new work in statistical natural language understanding and generation, like the Penn Treebank encouraged work on statistical parsing.
Abstract: We describe Abstract Meaning Representation (AMR), a semantic representation language in which we are writing down the meanings of thousands of English sentences. We hope that a sembank of simple, whole-sentence semantic structures will spur new work in statistical natural language understanding and generation, like the Penn Treebank encouraged work on statistical parsing. This paper gives an overview of AMR and tools associated with it.
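For illustration, AMRs are written in PENMAN notation; below is the classic example "The boy wants to go", read here with the third-party penman Python library (an assumption for illustration: the paper predates this library, which is one common way to work with AMR today).

    # A minimal sketch, assuming the `penman` package (pip install penman).
    import penman

    amr = """
    (w / want-01
       :ARG0 (b / boy)
       :ARG1 (g / go-01
                :ARG0 b))
    """

    graph = penman.decode(amr)
    for source, role, target in graph.triples:
        print(source, role, target)
    # :instance triples name concepts (want-01, boy, go-01); reusing the
    # variable b says the wanter and the goer are the same boy.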

1,197 citations

Proceedings ArticleDOI
19 Jun 2011
TL;DR: For part-of-speech tagging of English data from the popular micro-blogging service Twitter, a tagset is developed, data is annotated, features are engineered, and tagging accuracy nearing 90% is reported.
Abstract: We address the problem of part-of-speech tagging for English data from the popular micro-blogging service Twitter. We develop a tagset, annotate data, develop features, and report tagging results nearing 90% accuracy. The data and tools have been made available to the research community with the goal of enabling richer text analysis of Twitter and related social media data sets.
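For a feel for the tagset, here is a hypothetical tweet tagged in this style (the tweet and token/TAG pairs are invented for illustration; the tags are drawn from the paper's inventory: @ at-mention, ! interjection, D determiner, A adjective, N common noun, V verb, # hashtag, U URL, E emoticon):

    @bestie/@ omg/! the/D new/A album/N slaps/V #music/# http://t.co/abc123/U :)/E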

1,053 citations

Proceedings Article
01 Jun 2013
TL;DR: This work systematically evaluates the use of large-scale unsupervised word clustering and new lexical features to improve tagging accuracy on Twitter and achieves state-of-the-art tagging results on both Twitter and IRC POS tagging tasks.
Abstract: We consider the problem of part-of-speech tagging for informal, online conversational text. We systematically evaluate the use of large-scale unsupervised word clustering and new lexical features to improve tagging accuracy. With these features, our system achieves state-of-the-art tagging results on both Twitter and IRC POS tagging tasks; Twitter tagging is improved from 90% to 93% accuracy (more than 3% absolute). Qualitative analysis of these word clusters yields insights about NLP and linguistic phenomena in this genre. Additionally, we contribute the first POS annotation guidelines for such text and release a new dataset of English language tweets annotated using these guidelines. Tagging software, annotation guidelines, and large-scale word clusters are available at http://www.ark.cs.cmu.edu/TweetNLP. This paper describes release 0.3 of the “CMU Twitter Part-of-Speech Tagger” and annotated data. [This paper is forthcoming in Proceedings of NAACL 2013; Atlanta, GA, USA.]
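The released clusters are hierarchical, so each word gets a bit-string path, and prefixes of that path act as coarse-to-fine features, which is the standard way to use such clusters in a tagger. A minimal sketch, assuming a hypothetical clusters dict mapping a lowercased token to its bit-string path:

    # Minimal sketch: bit-string prefix features from hierarchical word
    # clusters. `clusters` is a hypothetical dict, e.g. {"lol": "11100110"};
    # the prefix lengths are illustrative, not the paper's exact choices.
    def cluster_features(token, clusters, prefix_lengths=(2, 4, 6, 8, 12, 16)):
        path = clusters.get(token.lower())
        if path is None:
            return ["cluster=UNK"]
        return ["cluster_prefix%d=%s" % (n, path[:n])
                for n in prefix_lengths if len(path) >= n]

    print(cluster_features("LOL", {"lol": "11100110"}))
    # ['cluster_prefix2=11', 'cluster_prefix4=1110',
    #  'cluster_prefix6=111001', 'cluster_prefix8=11100110']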

780 citations

Journal ArticleDOI
TL;DR: A two-stage statistical model takes lexical targets in their sentential contexts and predicts frame-semantic structures, yielding qualitatively better structures than naïve local predictors and outperforming the prior state of the art by significant margins.
Abstract: Frame semantics is a linguistic theory that has been instantiated for English in the FrameNet lexicon. We solve the problem of frame-semantic parsing using a two-stage statistical model that takes lexical targets (i.e., content words and phrases) in their sentential contexts and predicts frame-semantic structures. Given a target in context, the first stage disambiguates it to a semantic frame. This model uses latent variables and semi-supervised learning to improve frame disambiguation for targets unseen at training time. The second stage finds the target's locally expressed semantic arguments. At inference time, a fast exact dual decomposition algorithm collectively predicts all the arguments of a frame at once in order to respect declaratively stated linguistic constraints, resulting in qualitatively better structures than naïve local predictors. Both components are feature-based and discriminatively trained on a small set of annotated frame-semantic parses. On the SemEval 2007 benchmark data set, the approach, along with a heuristic identifier of frame-evoking targets, outperforms the prior state of the art by significant margins. Additionally, we present experiments on the much larger FrameNet 1.5 data set. We have released our frame-semantic parser as open-source software.
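To make the two-stage structure concrete, here is a minimal sketch of stage 1 as a lexicon lookup with a pluggable scorer. Everything here (lexicon contents, function names) is hypothetical; the paper's actual stage 1 is a discriminatively trained, latent-variable classifier, not a lookup.

    # Minimal sketch of frame identification (stage 1). FRAME_LEXICON maps
    # target lemmas to candidate frames; a real system scores candidates
    # with a trained model rather than taking the first one.
    FRAME_LEXICON = {
        "buy": ["Commerce_buy"],
        "purchase": ["Commerce_buy"],
        "say": ["Statement", "Text_creation"],
    }

    def identify_frame(target_lemma, scorer=None):
        candidates = FRAME_LEXICON.get(target_lemma.lower(), [])
        if not candidates:
            return None  # unseen target: the paper handles this with semi-supervised learning
        if scorer is None:
            return candidates[0]
        return max(candidates, key=scorer)

    print(identify_frame("buy"))  # Commerce_buy

Stage 2 then fills each role of the chosen frame with a span from the sentence, with the dual decomposition step enforcing constraints (e.g., non-overlapping spans) across all arguments jointly.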

257 citations

Proceedings ArticleDOI
01 Oct 2014
TL;DR: A new dependency parser for English tweets, TWEEBOPARSER, builds on several contributions: new syntactic annotations for a corpus of tweets, with conventions informed by the domain; adaptations to a statistical parsing algorithm; and a new approach to exploiting out-of-domain Penn Treebank data.
Abstract: We describe a new dependency parser for English tweets, TWEEBOPARSER. The parser builds on several contributions: new syntactic annotations for a corpus of tweets (TWEEBANK), with conventions informed by the domain; adaptations to a statistical parsing algorithm; and a new approach to exploiting out-of-domain Penn Treebank data. Our experiments show that the parser achieves over 80% unlabeled attachment accuracy on our new, high-quality test set and measure the benefit of our contributions. Our dataset and parser can be found at http://www.ark.cs.cmu.edu/TweetNLP.
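For reference, the unlabeled attachment accuracy reported above is simply the fraction of tokens whose predicted head matches the gold head. A minimal sketch, assuming heads are encoded as integer indices with 0 for the root (one common convention):

    # Minimal sketch of unlabeled attachment score (UAS).
    def uas(gold_heads, pred_heads):
        assert len(gold_heads) == len(pred_heads) and gold_heads
        correct = sum(g == p for g, p in zip(gold_heads, pred_heads))
        return correct / len(gold_heads)

    # 4-token sentence: three heads agree, one differs.
    print(uas([2, 0, 2, 3], [2, 0, 2, 2]))  # 0.75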

227 citations


Cited by
Proceedings Article
07 Dec 2015
TL;DR: Defines a new methodology that resolves this bottleneck by providing large-scale supervised reading-comprehension data, enabling the development of attention-based deep neural networks that learn to read real documents and answer complex questions with minimal prior knowledge of language structure.
Abstract: Teaching machines to read natural language documents remains an elusive challenge. Machine reading systems can be tested on their ability to answer questions posed on the contents of documents that they have seen, but until now large scale training and test datasets have been missing for this type of evaluation. In this work we define a new methodology that resolves this bottleneck and provides large scale supervised reading comprehension data. This allows us to develop a class of attention based deep neural networks that learn to read real documents and answer complex questions with minimal prior knowledge of language structure.
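The core mechanism in these readers is attention: score each document token's state against a query representation, normalize with a softmax, and read out a weighted sum. A minimal sketch with dot-product scoring (the paper's readers learn the scoring function; shapes and values here are illustrative):

    # Minimal sketch of attention over document token states.
    import numpy as np

    def attend(doc_states, query_vec):
        # doc_states: (num_tokens, hidden); query_vec: (hidden,)
        scores = doc_states @ query_vec           # one score per token
        weights = np.exp(scores - scores.max())   # numerically stable softmax
        weights /= weights.sum()
        return weights @ doc_states               # (hidden,) context vector

    rng = np.random.default_rng(0)
    print(attend(rng.normal(size=(7, 16)), rng.normal(size=16)).shape)  # (16,)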

2,951 citations

Posted Content
TL;DR: Proposes the GELU activation and empirically evaluates it against the ReLU and ELU activations, finding performance improvements across all considered computer vision, natural language processing, and speech tasks.
Abstract: We propose the Gaussian Error Linear Unit (GELU), a high-performing neural network activation function. The GELU activation function is $x\Phi(x)$, where $\Phi(x)$ is the standard Gaussian cumulative distribution function. The GELU nonlinearity weights inputs by their value, rather than gating inputs by their sign as in ReLUs ($x\mathbf{1}_{x>0}$). We perform an empirical evaluation of the GELU nonlinearity against the ReLU and ELU activations and find performance improvements across all considered computer vision, natural language processing, and speech tasks.
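The definition translates directly into code: the exact form needs only the error function, and the tanh-based approximation below is the one the paper gives for fast evaluation.

    # Exact GELU, x * Phi(x), via the error function, plus the paper's
    # tanh-based approximation; standard library only.
    import math

    def gelu(x):
        return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

    def gelu_tanh(x):
        return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi)
                                          * (x + 0.044715 * x ** 3)))

    print(gelu(1.0), gelu_tanh(1.0))  # ≈ 0.8413 and ≈ 0.8412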

2,059 citations

01 Jan 2005
TL;DR: In “Constructing a Language,” Tomasello presents a contrasting theory of how the child acquires language: language development is enabled not by a universal grammar but by two sets of cognitive skills, resulting from biological/phylogenetic adaptations, that are fundamental to the ontogenetic origins of language.
Abstract: Child psychiatrists, pediatricians, and other child clinicians need to have a solid understanding of child language development. There are at least four important reasons that make this necessary. First, slowing, arrest, and deviation of language development are highly associated with, and complicate the course of, child psychopathology. Second, language competence plays a crucial role in emotional and mood regulation, evaluation, and therapy. Third, language deficits are the most frequent underpinning of the learning disorders, ubiquitous in our clinical populations. Fourth, clinicians should not confuse the rich linguistic and dialectal diversity of our clinical populations with abnormalities in child language development. The challenge for the clinician becomes, then, how to get immersed in the captivating field of child language acquisition without getting overwhelmed by its conceptual and empirical complexity.

In the past 50 years and since the seminal works of Roger Brown, Jerome Bruner, and Catherine Snow, child language researchers (often known as developmental psycholinguists) have produced a remarkable body of knowledge. Linguists such as Chomsky and philosophers such as Grice have strongly influenced the science of child language. One of the major tenets of Chomskian linguistics (known as generative grammar) is that children's capacity to acquire language is “hardwired” with “universal grammar”—an innate language acquisition device (LAD), a language “instinct”—at its core. This view is in part supported by the assertion that the linguistic input that children receive is relatively dismal and of poor quality relative to the high quantity and quality of output that they manage to produce after age 2, and that only an advanced, innate capacity to decode and organize linguistic input can enable them to “get from here (prelinguistic infant) to there (linguistic child).”

In “Constructing a Language,” Tomasello presents a contrasting theory of how the child acquires language: It is not a universal grammar that allows for language development. Rather, human cognition universals of communicative needs and vocal-auditory processing result in some language universals, such as nouns and verbs as expressions of reference and predication (p. 19). The author proposes that two sets of cognitive skills resulting from biological/phylogenetic adaptations are fundamental to the ontogenetic origins of language. These sets of inherited cognitive skills are intention-reading on the one hand and pattern-finding on the other. Intention-reading skills encompass the prelinguistic infant's capacities to share attention to outside events with other persons, establishing joint attentional frames, to understand other people's communicative intentions, and to imitate the adult's communicative intentions (an intersubjective form of imitation that requires symbolic understanding and perspective-taking). Pattern-finding skills include the ability of infants as young as 7 months old to analyze concepts and percepts (most relevant here, auditory or speech percepts) and create concrete or abstract categories that contain analogous items. Tomasello, a most prominent developmental scientist with research foci on child language acquisition and on social cognition and social learning in children and primates, succinctly and clearly introduces the major points of his theory and his views on the origins of language in the initial chapters.

In subsequent chapters, he delves into the details by covering most language acquisition domains, namely, word (lexical) learning, syntax, and morphology and conversation, narrative, and extended discourse. Although one of the remaining domains (pragmatics) is at the core of his theory and permeates the text throughout, the relative paucity of passages explicitly devoted to discussing acquisition and pro…

1,757 citations

Journal ArticleDOI
01 Jan 2003

1,739 citations

Proceedings Article
27 Jul 2011
TL;DR: The novel T-NER system doubles the F1 score compared with the Stanford NER system, leveraging the redundancy inherent in tweets and using LabeledLDA to exploit Freebase dictionaries as a source of distant supervision.
Abstract: People tweet more than 100 million times daily, yielding a noisy, informal, but sometimes informative corpus of 140-character messages that mirrors the zeitgeist in an unprecedented manner. The performance of standard NLP tools is severely degraded on tweets. This paper addresses this issue by re-building the NLP pipeline beginning with part-of-speech tagging, through chunking, to named-entity recognition. Our novel T-NER system doubles F1 score compared with the Stanford NER system. T-NER leverages the redundancy inherent in tweets to achieve this performance, using LabeledLDA to exploit Freebase dictionaries as a source of distant supervision. LabeledLDA outperforms co-training, increasing F1 by 25% over ten common entity types. Our NLP tools are available at: http://github.com/aritter/twitter_nlp
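A minimal sketch of the dictionary-based distant supervision idea: weakly label tokens by matching them against type-specific name lists. The gazetteers and matching scheme below are hypothetical stand-ins; the paper instead models each entity's type distribution with LabeledLDA over Freebase dictionaries.

    # Minimal sketch: weak BIO labels from hypothetical gazetteers.
    GAZETTEERS = {
        "band": {("arcade", "fire"), ("the", "beatles")},
        "company": {("apple",), ("google",)},
    }

    def weak_labels(tokens):
        labels = ["O"] * len(tokens)
        lowered = [t.lower() for t in tokens]
        for etype, names in GAZETTEERS.items():
            for name in names:
                n = len(name)
                for i in range(len(lowered) - n + 1):
                    if tuple(lowered[i:i + n]) == name:
                        labels[i] = "B-" + etype
                        for j in range(i + 1, i + n):
                            labels[j] = "I-" + etype
        return labels

    print(weak_labels("so hyped for Arcade Fire tonight".split()))
    # ['O', 'O', 'O', 'B-band', 'I-band', 'O']

One reason plain matching underperforms LabeledLDA: ambiguous names (e.g., "Apple" the company vs. the fruit) need context-sensitive type inference rather than a hard dictionary hit.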

1,351 citations