scispace - formally typeset
Search or ask a question
Author

Dan Jurafsky

Bio: Dan Jurafsky is an academic researcher from Stanford University. The author has contributed to research in topics: Language model & Parsing. The author has an hindex of 93, co-authored 344 publications receiving 44536 citations. Previous affiliations of Dan Jurafsky include Carnegie Mellon University & University of Colorado Boulder.


Papers
More filters
Book
01 Jan 2000
TL;DR: This book takes an empirical approach to language processing, based on applying statistical and other machine-learning algorithms to large corpora, to demonstrate how the same algorithm can be used for speech recognition and word-sense disambiguation.
Abstract: From the Publisher: This book takes an empirical approach to language processing, based on applying statistical and other machine-learning algorithms to large corpora.Methodology boxes are included in each chapter. Each chapter is built around one or more worked examples to demonstrate the main idea of the chapter. Covers the fundamental algorithms of various fields, whether originally proposed for spoken or written language to demonstrate how the same algorithm can be used for speech recognition and word-sense disambiguation. Emphasis on web and other practical applications. Emphasis on scientific evaluation. Useful as a reference for professionals in any of the areas of speech and language processing.

3,794 citations

Book
01 Dec 1999
TL;DR: It is now clear that HAL's creator, Arthur C. Clarke, was a little optimistic in predicting when an artificial agent such as HAL would be avail-able as discussed by the authors.
Abstract: is one of the most recognizablecharacters in 20th century cinema. HAL is an artificial agent capable of such advancedlanguage behavior as speaking and understanding English, and at a crucial moment inthe plot, even reading lips. It is now clear that HAL’s creator, Arthur C. Clarke, wasa little optimistic in predicting when an artificial agent such as HAL would be avail-able. But just how far off was he? What would it take to create at least the language-relatedpartsofHAL?WecallprogramslikeHALthatconversewithhumansinnatural

3,077 citations

Proceedings ArticleDOI
02 Aug 2009
TL;DR: This work investigates an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size.
Abstract: Modern models of relation extraction for tasks like ACE are based on supervised learning of relations from small hand-labeled corpora. We investigate an alternative paradigm that does not require labeled corpora, avoiding the domain dependence of ACE-style algorithms, and allowing the use of corpora of any size. Our experiments use Freebase, a large semantic database of several thousand relations, to provide distant supervision. For each pair of entities that appears in some Freebase relation, we find all sentences containing those entities in a large unlabeled corpus and extract textual features to train a relation classifier. Our algorithm combines the advantages of supervised IE (combining 400,000 noisy pattern features in a probabilistic classifier) and unsupervised IE (extracting large numbers of relations from large corpora of any domain). Our model is able to extract 10,000 instances of 102 relations at a precision of 67.6%. We also analyze feature performance, showing that syntactic parse features are particularly helpful for relations that are ambiguous or lexically distant in their expression.

2,965 citations

Proceedings ArticleDOI
25 Oct 2008
TL;DR: This work explores the use of Amazon's Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web, and proposes a technique for bias correction that significantly improves annotation quality on two tasks.
Abstract: Human linguistic annotation is crucial for many natural language processing tasks but can be expensive and time-consuming. We explore the use of Amazon's Mechanical Turk system, a significantly cheaper and faster method for collecting annotations from a broad base of paid non-expert contributors over the Web. We investigate five tasks: affect recognition, word similarity, recognizing textual entailment, event temporal ordering, and word sense disambiguation. For all five, we show high agreement between Mechanical Turk non-expert annotations and existing gold standard labels provided by expert labelers. For the task of affect recognition, we also show that using non-expert labels for training machine learning algorithms can be as effective as using gold standard annotations from experts. We propose a technique for bias correction that significantly improves annotation quality on two tasks. We conclude that many large labeling tasks can be effectively designed and carried out in this method at a fraction of the usual expense.

2,237 citations

Journal ArticleDOI
TL;DR: A system for identifying the semantic relationships, or semantic roles, filled by constituents of a sentence within a semantic frame, based on statistical classifiers trained on roughly 50,000 sentences that were hand-annotated with semantic roles by the FrameNet semantic labeling project.
Abstract: We present a system for identifying the semantic relationships, or semantic roles, filled by constituents of a sentence within a semantic frame. Given an input sentence and a target word and frame, the system labels constituents with either abstract semantic roles, such as AGENT or PATIENT, or more domain-specific semantic roles, such as SPEAKER, MESSAGE, and TOPIC.The system is based on statistical classifiers trained on roughly 50,000 sentences that were hand-annotated with semantic roles by the FrameNet semantic labeling project. We then parsed each training sentence into a syntactic tree and extracted various lexical and syntactic features, including the phrase type of each constituent, its grammatical function, and its position in the sentence. These features were combined with knowledge of the predicate verb, noun, or adjective, as well as information such as the prior probabilities of various combinations of semantic roles. We used various lexical clustering algorithms to generalize across possible fillers of roles. Test sentences were parsed, were annotated with these features, and were then passed through the classifiers.Our system achieves 82% accuracy in identifying the semantic role of presegmented constituents. At the more difficult task of simultaneously segmenting constituents and identifying their semantic role, the system achieved 65% precision and 61% recall.Our study also allowed us to compare the usefulness of different features and feature combination methods in the semantic role labeling task. We also explore the integration of role labeling with statistical syntactic parsing and attempt to generalize to predicates unseen in the training data.

1,666 citations


Cited by
More filters
Book
18 Nov 2016
TL;DR: Deep learning as mentioned in this paper is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts, and it is used in many applications such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames.
Abstract: Deep learning is a form of machine learning that enables computers to learn from experience and understand the world in terms of a hierarchy of concepts. Because the computer gathers knowledge from experience, there is no need for a human computer operator to formally specify all the knowledge that the computer needs. The hierarchy of concepts allows the computer to learn complicated concepts by building them out of simpler ones; a graph of these hierarchies would be many layers deep. This book introduces a broad range of topics in deep learning. The text offers mathematical and conceptual background, covering relevant concepts in linear algebra, probability theory and information theory, numerical computation, and machine learning. It describes deep learning techniques used by practitioners in industry, including deep feedforward networks, regularization, optimization algorithms, convolutional networks, sequence modeling, and practical methodology; and it surveys such applications as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and videogames. Finally, the book offers research perspectives, covering such theoretical topics as linear factor models, autoencoders, representation learning, structured probabilistic models, Monte Carlo methods, the partition function, approximate inference, and deep generative models. Deep Learning can be used by undergraduate or graduate students planning careers in either industry or research, and by software engineers who want to begin using deep learning in their products or platforms. A website offers supplementary material for both readers and instructors.

38,208 citations

Journal Article
TL;DR: This book by a teacher of statistics (as well as a consultant for "experimenters") is a comprehensive study of the philosophical background for the statistical design of experiment.
Abstract: THE DESIGN AND ANALYSIS OF EXPERIMENTS. By Oscar Kempthorne. New York, John Wiley and Sons, Inc., 1952. 631 pp. $8.50. This book by a teacher of statistics (as well as a consultant for \"experimenters\") is a comprehensive study of the philosophical background for the statistical design of experiment. It is necessary to have some facility with algebraic notation and manipulation to be able to use the volume intelligently. The problems are presented from the theoretical point of view, without such practical examples as would be helpful for those not acquainted with mathematics. The mathematical justification for the techniques is given. As a somewhat advanced treatment of the design and analysis of experiments, this volume will be interesting and helpful for many who approach statistics theoretically as well as practically. With emphasis on the \"why,\" and with description given broadly, the author relates the subject matter to the general theory of statistics and to the general problem of experimental inference. MARGARET J. ROBERTSON

13,333 citations

Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

01 Jan 2002

9,314 citations

Book
28 May 1999
TL;DR: This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear and provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations.
Abstract: Statistical approaches to processing natural language text have become dominant in recent years This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear The book contains all the theory and algorithms needed for building NLP tools It provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations The book covers collocation finding, word sense disambiguation, probabilistic parsing, information retrieval, and other applications

9,295 citations