Open Access Proceedings Article

Unsupervised POS Induction with Word Embeddings

TL;DR: This paper showed that word embeddings can also add value to the problem of unsupervised POS induction, replacing multinomial distributions over the vocabulary with multivariate Gaussian distributions over word embeddings and observing consistent improvements in eight languages.
Abstract
Unsupervised word embeddings have been shown to be valuable as features in supervised learning problems; however, their role in unsupervised problems has been less thoroughly explored. In this paper, we show that embeddings can likewise add value to the problem of unsupervised POS induction. In two representative models of POS induction, we replace multinomial distributions over the vocabulary with multivariate Gaussian distributions over word embeddings and observe consistent improvements in eight languages. We also analyze the effect of various choices while inducing word embeddings on “downstream” POS induction results.
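To make the abstract's core modeling change concrete, here is a minimal sketch (an illustration under assumed toy dimensions, not the authors' code): in an HMM-style POS inducer, each tag's emission term switches from a multinomial over the vocabulary to a multivariate Gaussian density over a pre-trained word embedding, while the rest of the model is unchanged.

```python
# Minimal sketch of the emission-distribution swap described in the
# abstract. All names and dimensions here are hypothetical.
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
num_tags, vocab_size, dim = 12, 1000, 50

# Hypothetical pre-trained embeddings: one row per word type.
embeddings = rng.normal(size=(vocab_size, dim))

# Baseline: multinomial emission, P(word | tag), a categorical over the vocabulary.
multinomial = rng.dirichlet(np.ones(vocab_size), size=num_tags)

def multinomial_logprob(tag, word_id):
    return np.log(multinomial[tag, word_id])

# Replacement: Gaussian emission, p(embedding(word) | tag) = N(mu_tag, Sigma_tag).
means = rng.normal(size=(num_tags, dim))
covs = np.stack([np.eye(dim) for _ in range(num_tags)])

def gaussian_logprob(tag, word_id):
    return multivariate_normal.logpdf(embeddings[word_id],
                                      mean=means[tag], cov=covs[tag])

# Either log-probability plugs into the same forward-backward or Viterbi
# machinery; only the emission term changes.
print(multinomial_logprob(3, 42), gaussian_logprob(3, 42))
```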


Citations
Book Chapter

Redefining Context Windows for Word Embedding Models: An Experimental Study

TL;DR: This paper presented a systematic analysis of context windows based on a set of four distinct hyperparameters, trained continuous Skip-gram models on two English-language corpora for various combinations of these hyperparameters, and evaluated them on both lexical similarity and analogy tasks.
Proceedings Article

Mutual Information Maximization for Simple and Accurate Part-Of-Speech Induction

Karl Stratos
TL;DR: This work addresses part-of-speech (POS) induction by maximizing the mutual information between the induced label and its context, and shows that the variational lower bound is robust whereas the generalized Brown objective is vulnerable.
Journal Article

Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling

TL;DR: A sequence labeling framework which solely utilizes sparse indicator features derived from dense distributed word representations and obtains (near) state-of-the-art performance for both part-of-speech tagging and named entity recognition in a variety of languages.

Normalization and parsing algorithms for uncertain input

TL;DR: This research attempts to improve the automatic analysis of spontaneous language by translating it to 'normal' language, developing a modular normalization model, MoNoise, which reaches new state-of-the-art performance on a variety of languages.
Journal Article

Learning With Annotation of Various Degrees

TL;DR: A novel deep conditional random field model is proposed which is trained in an end-to-end manner to smoothly handle fully, partially, and unlabeled sequences within a unified framework, and could be one of the first works to utilize partially labeled instances for sequence labeling.
References
Posted Content

Efficient Estimation of Word Representations in Vector Space

TL;DR: This paper proposed two novel model architectures for computing continuous vector representations of words from very large data sets; the quality of these representations is measured in a word similarity task, and the results are compared to the previously best-performing techniques based on different types of neural networks.
Journal Article

Indexing by Latent Semantic Analysis

TL;DR: A new method for automatic indexing and retrieval is described that takes advantage of implicit higher-order structure in the association of terms with documents (“semantic structure”) in order to improve the detection of relevant documents on the basis of terms found in queries.
Journal Article

Natural Language Processing (Almost) from Scratch

TL;DR: A unified neural network architecture and learning algorithm that can be applied to various natural language processing tasks including part-of-speech tagging, chunking, named entity recognition, and semantic role labeling is proposed.
Proceedings Article

A unified architecture for natural language processing: deep neural networks with multitask learning

TL;DR: This work describes a single convolutional neural network architecture that, given a sentence, outputs a host of language processing predictions: part-of-speech tags, chunks, named entity tags, semantic roles, semantically similar words and the likelihood that the sentence makes sense using a language model.
Journal Article

Class-based n-gram models of natural language

TL;DR: This work addresses the problem of predicting a word from previous words in a sample of text and discusses n-gram models based on classes of words, finding that these models are able to extract classes that have the flavor of either syntactically based groupings or semantically based groupings, depending on the nature of the underlying statistics.
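For intuition, a class-based bigram model of the kind this reference describes factors the word bigram probability through word classes: P(w_i | w_{i-1}) = P(w_i | c(w_i)) · P(c(w_i) | c(w_{i-1})). The toy sketch below (an assumption for illustration; the word-to-class mapping and corpus are invented) estimates both factors by counting.

```python
# Toy sketch of a class-based bigram model: the word-to-class mapping
# and corpus are hypothetical, chosen only to illustrate the factoring
# P(w_i | w_{i-1}) = P(w_i | c(w_i)) * P(c(w_i) | c(w_{i-1})).
from collections import Counter

word2class = {"the": "DET", "a": "DET", "dog": "NOUN", "cat": "NOUN",
              "runs": "VERB", "sleeps": "VERB"}
corpus = "the dog runs the cat sleeps a dog sleeps".split()

classes = [word2class[w] for w in corpus]
class_bigrams = Counter(zip(classes, classes[1:]))
class_prev_counts = Counter(classes[:-1])  # denominator for P(c | c_prev)
class_counts = Counter(classes)            # denominator for P(w | c)
word_counts = Counter(corpus)

def bigram_prob(prev_word, word):
    c_prev, c = word2class[prev_word], word2class[word]
    p_trans = class_bigrams[(c_prev, c)] / class_prev_counts[c_prev]
    p_emit = word_counts[word] / class_counts[c]
    return p_trans * p_emit

print(bigram_prob("the", "dog"))  # 1.0 * 2/3
```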