scispace - formally typeset
Search or ask a question
Topic

Computational linguistics

About: Computational linguistics is a research topic. Over the lifetime, 5980 publications have been published within this topic receiving 172130 citations.


Papers
More filters
Book
28 May 1999
TL;DR: This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear and provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations.
Abstract: Statistical approaches to processing natural language text have become dominant in recent years This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear The book contains all the theory and algorithms needed for building NLP tools It provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations The book covers collocation finding, word sense disambiguation, probabilistic parsing, information retrieval, and other applications

9,295 citations

Book
01 Jan 2000
TL;DR: This book takes an empirical approach to language processing, based on applying statistical and other machine-learning algorithms to large corpora, to demonstrate how the same algorithm can be used for speech recognition and word-sense disambiguation.
Abstract: From the Publisher: This book takes an empirical approach to language processing, based on applying statistical and other machine-learning algorithms to large corpora.Methodology boxes are included in each chapter. Each chapter is built around one or more worked examples to demonstrate the main idea of the chapter. Covers the fundamental algorithms of various fields, whether originally proposed for spoken or written language to demonstrate how the same algorithm can be used for speech recognition and word-sense disambiguation. Emphasis on web and other practical applications. Emphasis on scientific evaluation. Useful as a reference for professionals in any of the areas of speech and language processing.

3,794 citations

Posted Content
TL;DR: NLTK, the Natural Language Toolkit, is a suite of open source program modules, tutorials and problem sets, providing ready-to-use computational linguistics courseware that covers symbolic and statistical natural language processing.
Abstract: NLTK, the Natural Language Toolkit, is a suite of open source program modules, tutorials and problem sets, providing ready-to-use computational linguistics courseware. NLTK covers symbolic and statistical natural language processing, and is interfaced to annotated corpora. Students augment and replace existing components, learn structured programming by example, and manipulate sophisticated models from the outset.

3,345 citations

Proceedings ArticleDOI
17 Jul 2006
TL;DR: The Natural Language Toolkit has been rewritten, simplifying many linguistic data structures and taking advantage of recent enhancements in the Python language.
Abstract: The Natural Language Toolkit is a suite of program modules, data sets and tutorials supporting research and teaching in computational linguistics and natural language processing. NLTK is written in Python and distributed under the GPL open source license. Over the past year the toolkit has been rewritten, simplifying many linguistic data structures and taking advantage of recent enhancements in the Python language. This paper reports on the simplified toolkit and explains how it is used in teaching NLP.

2,835 citations

01 May 1986
TL;DR: A new theory of discourse structure that stresses the role of purpose and processing in discourse is explored and various properties of discourse are described, and explanations for the behavior of cue phrases, referring expressions, and interruptions are explored.
Abstract: In this paper we explore a new theory of discourse structure that stresses the role of purpose and processing in discourse. In this theory, discourse structure is composed of three separate but interrelated components: the structure of the sequence of utterances (called the linguistic structure), a structure of purposes (called the intentional structure), and the state of focus of attention (called the attentional state). The linguistic structure consists of segments of the discourse into which the utterances naturally aggregate. The intentional structure captures the discourse-relevant purposes, expressed in each of the linguistic segments as well as relationships among them. The attentional state is an abstraction of the focus of attention of the participants as the discourse unfolds. The attentional state, being dynamic, records the objects, properties, and relations that are salient at each point of the discourse. The distinction among these components is essential to provide an adequate explanation of such discourse phenomena as cue phrases, referring expressions, and interruptions.The theory of attention, intention, and aggregation of utterances is illustrated in the paper with a number of example discourses. Various properties of discourse are described, and explanations for the behavior of cue phrases, referring expressions, and interruptions are explored.This theory provides a framework for describing the processing of utterances in a discourse. Discourse processing requires recognizing how the utterances of the discourse aggregate into segments, recognizing the intentions expressed in the discourse and the relationships among intentions, and tracking the discourse through the operation of the mechanisms associated with attentional state. This processing description specifies in these recognition tasks the role of information from the discourse and from the participants' knowledge of the domain.

2,748 citations


Network Information
Related Topics (5)
Natural language
31.1K papers, 806.8K citations
90% related
Machine translation
22.1K papers, 574.4K citations
87% related
Parsing
21.5K papers, 545.4K citations
86% related
Language model
17.5K papers, 545K citations
84% related
Semantics
24.9K papers, 653K citations
84% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
202335
2022165
2021183
2020237
2019243
2018226