Open Access Journal Article

Natural Language Processing (Almost) from Scratch

TL;DR
A unified neural network architecture and learning algorithm is proposed that can be applied to various natural language processing tasks, including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling.
Abstract
We propose a unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling. This versatility is achieved by trying to avoid task-specific engineering and therefore disregarding a lot of prior knowledge. Instead of exploiting man-made input features carefully optimized for each task, our system learns internal representations on the basis of vast amounts of mostly unlabeled training data. This work is then used as a basis for building a freely available tagging system with good performance and minimal computational requirements.
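To make the architecture concrete, here is a minimal sketch of a window-based neural tagger in the spirit of the one described above: a window of words is mapped through a learned lookup table, the embeddings are concatenated, and a small feed-forward network produces one score per tag. All sizes and names below are illustrative assumptions, not the authors' exact configuration.

import numpy as np

# Minimal sketch of a window-based neural tagger. Vocabulary size,
# embedding width, window size, hidden size, and tag count are all
# illustrative assumptions, not the paper's configuration.
rng = np.random.default_rng(0)
vocab_size, emb_dim, window, hidden, n_tags = 100, 50, 5, 300, 4

E  = rng.normal(scale=0.1, size=(vocab_size, emb_dim))       # word lookup table
W1 = rng.normal(scale=0.1, size=(hidden, window * emb_dim))  # hidden layer
b1 = np.zeros(hidden)
W2 = rng.normal(scale=0.1, size=(n_tags, hidden))            # output layer
b2 = np.zeros(n_tags)

def tag_scores(word_ids):
    """Score each tag for the word at the center of a window of word ids."""
    x = E[word_ids].reshape(-1)   # concatenate the window's embeddings
    h = np.tanh(W1 @ x + b1)      # nonlinearity (the paper uses a hard tanh)
    return W2 @ h + b2            # one unnormalized score per tag

scores = tag_scores(np.array([3, 17, 42, 8, 99]))  # a 5-word window
print(scores.argmax())  # predicted tag index for the center word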

Citations
Book

Deep Learning

TL;DR: Deep learning is a form of machine learning that enables computers to learn from experience and to understand the world in terms of a hierarchy of concepts; it is used in many applications, such as natural language processing, speech recognition, computer vision, online recommendation systems, bioinformatics, and video games.
Proceedings Article

Glove: Global Vectors for Word Representation

TL;DR: A new global log-bilinear regression model is proposed that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.
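As a rough illustration of the log-bilinear objective the summary above refers to, the sketch below evaluates a GloVe-style weighted least-squares loss, in which the dot product of word and context vectors (plus biases) is fit to the log co-occurrence count; the toy sizes and constants are assumptions for illustration only.

import numpy as np

def glove_loss(W, W_ctx, b, b_ctx, X, x_max=100.0, alpha=0.75):
    """Weighted least-squares GloVe-style objective over co-occurrence matrix X.

    J = sum_ij f(X_ij) * (w_i . w~_j + b_i + b~_j - log X_ij)^2
    Only nonzero co-occurrence counts contribute.
    """
    i, j = np.nonzero(X)
    f = np.minimum((X[i, j] / x_max) ** alpha, 1.0)  # caps very frequent pairs
    err = (W[i] * W_ctx[j]).sum(axis=1) + b[i] + b_ctx[j] - np.log(X[i, j])
    return np.sum(f * err ** 2)

rng = np.random.default_rng(0)
V, d = 10, 8                                      # toy vocabulary and vector size
X = rng.poisson(1.0, size=(V, V)).astype(float)   # fake co-occurrence counts
loss = glove_loss(rng.normal(size=(V, d)), rng.normal(size=(V, d)),
                  np.zeros(V), np.zeros(V), X)
print(loss)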
Posted Content

Efficient Estimation of Word Representations in Vector Space

TL;DR: Two novel model architectures are proposed for computing continuous vector representations of words from very large data sets; the quality of these representations is measured in a word similarity task, and the results are compared to the previously best-performing techniques based on different types of neural networks.
Journal Article

Machine learning

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Journal Article

Representation Learning: A Review and New Perspectives

TL;DR: Recent work in the area of unsupervised feature learning and deep learning is reviewed, covering advances in probabilistic models, autoencoders, manifold learning, and deep networks.
References
Proceedings Article

Early results for named entity recognition with conditional random fields, feature induction and web-enhanced lexicons

TL;DR: This work has shown that conditionally trained models, such as conditional maximum entropy models, handle the inter-dependent features of greedy sequence modeling in NLP well.
Book Chapter

Probabilistic Interpretation of Feedforward Classification Network Outputs, with Relationships to Statistical Pattern Recognition

TL;DR: In this article, the outputs of the network are treated as probabilities of alternatives (e.g. pattern classes), conditioned on the inputs, and two modifications are proposed: probability scoring, an alternative to squared-error minimisation, and a normalised exponential (softmax), a multi-input generalisation of the logistic nonlinearity.
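The normalised exponential (softmax) mentioned above is simple to state in code: each output is exponentiated and the results are normalised to sum to one, yielding class probabilities conditioned on the input. A minimal, numerically stable version:

import numpy as np

def softmax(z):
    """Normalised exponential: map raw network outputs to class probabilities."""
    z = z - z.max()   # shift for numerical stability; the result is unchanged
    e = np.exp(z)
    return e / e.sum()

p = softmax(np.array([2.0, 1.0, 0.1]))
print(p, p.sum())  # probabilities over three classes, summing to 1.0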
Journal Article

Continuous speech recognition by statistical methods

TL;DR: Experimental results are presented that indicate the power of the statistical methods; they concern the modeling of a speaker and of an acoustic processor, the extraction of the models' statistical parameters, and the hypothesis-search procedures and likelihood computations of linguistic decoding.
Proceedings Article

Automatic labeling of semantic roles

TL;DR: This work presents a system for identifying the semantic relationships, or semantic roles, filled by constituents of a sentence within a semantic frame, using information derived from parse trees and hand-annotated training data.
Proceedings Article

Dynamic conditional random fields: factorized probabilistic models for labeling and segmenting sequence data

TL;DR: In this paper, dynamic conditional random fields (DCRFs) are proposed as a generalization of linear-chain CRFs in which each time slice contains a set of state variables and edges, and parameters are tied across slices.
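Linear-chain models of the kind DCRFs generalize are commonly decoded with the Viterbi algorithm. The sketch below, with made-up emission and transition scores, recovers the highest-scoring tag sequence; it illustrates generic linear-chain decoding, not the DCRF inference procedure itself.

import numpy as np

def viterbi(emissions, transitions):
    """Best tag sequence for a linear-chain model.

    emissions:   (T, K) per-position tag scores (illustrative, not learned)
    transitions: (K, K) score of moving from tag a to tag b
    """
    T, K = emissions.shape
    score = emissions[0].copy()
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        # cand[a, b] = best score ending in tag a, then moving to tag b
        cand = score[:, None] + transitions + emissions[t]
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0)
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):   # trace back pointers from the end
        path.append(int(back[t, path[-1]]))
    return path[::-1]

rng = np.random.default_rng(0)
print(viterbi(rng.normal(size=(6, 3)), rng.normal(size=(3, 3))))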