We demonstrate that an unlexicalized PCFG can parse much more accurately than previously shown, by making use of simple, linguistically motivated state splits, which break down false independence assumptions latent in a vanilla treebank grammar. Indeed, its performance of 86.36% (LP/LR F1) is better than that of early lexicalized PCFG models, and surprisingly close to the current state-of-the-art. This result has potential uses beyond establishing a strong lower bound on the maximum possible accuracy of unlexicalized models: an unlexicalized PCFG is much more compact, easier to replicate, and easier to interpret than more complex lexical models, and the parsing algorithms are simpler, more widely understood, of lower asymptotic complexity, and easier to optimize.
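To make the idea of a state split concrete, here is a minimal sketch of parent annotation, one well-known kind of split in which each phrasal category is subdivided by the label of its parent. The tuple-based tree encoding, the `^` separator, and the function names `parent_annotate` and `rules` are assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
def parent_annotate(tree, parent=None):
    """Return a copy of `tree` with each phrasal label split by its parent label.

    A tree is either a word (str) or a tuple (label, child, child, ...).
    """
    if isinstance(tree, str):  # a word: leave untouched
        return tree
    label, *children = tree
    # In this sketch, preterminals (a tag over a single word) are not split.
    is_preterminal = len(children) == 1 and isinstance(children[0], str)
    if parent is None or is_preterminal:
        new_label = label
    else:
        new_label = f"{label}^{parent}"
    # Children are annotated with the ORIGINAL label of this node.
    return (new_label, *(parent_annotate(child, label) for child in children))


def rules(tree):
    """Yield the (lhs, rhs) rewrite rules read off a tree."""
    if isinstance(tree, str):
        return
    label, *children = tree
    yield label, tuple(c if isinstance(c, str) else c[0] for c in children)
    for child in children:
        yield from rules(child)


tree = ("S",
        ("NP", ("PRP", "She")),
        ("VP", ("VBD", "saw"),
               ("NP", ("DT", "the"), ("NN", "dog"))))

annotated = parent_annotate(tree)
for lhs, rhs in rules(annotated):
    print(lhs, "->", " ".join(rhs))
```

After annotation the subject NP and the object NP carry different labels (`NP^S` and `NP^VP`), so a grammar read off such trees can give them different expansion probabilities instead of assuming that every NP expands the same way.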

/pdf/accurate-unlexicalized-parsing-5azxkuoluw.pdf

Accurate Unlexicalized Parsing

The cultural origins of human cognition

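In the People's Education Press new-curriculum senior high school English textbooks, Using Language is an indispensable part of every unit. It provides integrated listening, speaking, reading, and writing exercises built around the unit's central topic, and it continues and deepens that topic. How to design the teaching of the Using Language section so that one's teaching model avoids convention while truly reflecting the teaching philosophy advocated by the new curriculum standards is a question that front-line English teachers have long been working to answer.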

Refining Using Language and Advocating New Ideas

where A represents the magnetic vector potential, is an integral of the hydromagnetic equations. This integral made it possible to formulate a variational principle for the force-free magnetic fields. The integral expresses the fact that motions cannot transform a given field into an entirely arbitrary different field if the conductivity of the medium is considered infinite. In this paper we shall show that the full set of hydromagnetic equations admits five more integrals, besides the energy integral, if dissipative processes are absent. These integrals, as we shall presently verify, are $I_2 = \int \mathbf{H} \cdot \mathbf{v} \, dV$, (2)

Proceedings of the National Academy of Sciences

In this paper, we train a semantic parser that scales up to Freebase. Instead of relying on annotated logical forms, which are especially expensive to obtain at large scale, we learn from question-answer pairs. The main challenge in this setting is narrowing down the huge number of possible logical predicates for a given question. We tackle this problem in two ways: First, we build a coarse mapping from phrases to predicates using a knowledge base and a large text corpus. Second, we use a bridging operation to generate additional predicates based on neighboring predicates. On the dataset of Cai and Yates (2013), despite not having annotated logical forms, our system outperforms their state-of-the-art parser. Additionally, we collect a more realistic and challenging dataset of question-answer pairs and improve over a natural baseline.

/pdf/semantic-parsing-on-freebase-from-question-answer-pairs-y7fve0pp42.pdf

Semantic Parsing on Freebase from Question-Answer Pairs

https://homepages.inf.ed.ac.uk/sgwater/papers/cognition-hdp.pdf

A Bayesian framework for word segmentation: Exploring the effects of context

Unsupervised learning of linguistic structure is a difficult problem. A common approach is to define a generative model and maximize the probability of the hidden structure given the observed data. Typically, this is done using maximum-likelihood estimation (MLE) of the model parameters. We show using part-of-speech tagging that a fully Bayesian approach can greatly improve performance. Rather than estimating a single set of parameters, the Bayesian approach integrates over all possible parameter values. This difference ensures that the learned structure will have high probability over a range of possible parameters, and permits the use of priors favoring the sparse distributions that are typical of natural language. Our model has the structure of a standard trigram HMM, yet its accuracy is closer to that of a state-of-the-art discriminative model (Smith and Eisner, 2005), up to 14 percentage points better than MLE. We find improvements both when training from data alone, and using a tagging dictionary.
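The contrast between a single point estimate and integrating over parameters can be sketched on one multinomial distribution. With a symmetric Dirichlet prior, integrating over all parameter values gives the closed-form predictive probability (n_k + alpha) / (N + K * alpha). The function names, the toy tag counts, and the value of `alpha` below are assumptions chosen for illustration; this is not the paper's HMM or its inference procedure.

```python
from collections import Counter


def mle_estimate(counts, outcome):
    """Maximum-likelihood estimate: relative frequency of `outcome`."""
    total = sum(counts.values())
    return counts[outcome] / total


def dirichlet_predictive(counts, outcome, num_outcomes, alpha):
    """Predictive probability of `outcome` after integrating out the
    multinomial parameters under a symmetric Dirichlet(alpha) prior."""
    total = sum(counts.values())
    return (counts[outcome] + alpha) / (total + num_outcomes * alpha)


# Hypothetical counts of tags observed in some context (toy numbers).
counts = Counter({"NN": 3, "VB": 1})
tags = ["NN", "VB", "JJ"]

for tag in tags:
    mle = mle_estimate(counts, tag)
    bayes = dirichlet_predictive(counts, tag, num_outcomes=len(tags), alpha=0.1)
    print(f"{tag}: MLE={mle:.3f}  Bayesian={bayes:.3f}")
```

A small `alpha` (below 1) corresponds to a prior that favors sparse distributions: unseen outcomes keep a little probability mass, while most of the mass stays on the outcomes already observed.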

/pdf/a-fully-bayesian-approach-to-unsupervised-part-of-speech-5cw52ycwu4.pdf

A fully Bayesian approach to unsupervised part-of-speech tagging

Developing better methods for segmenting continuous text into words is important for improving the processing of Asian languages, and may shed light on how humans learn to segment speech. We propose two new Bayesian word segmentation methods that assume unigram and bigram models of word dependencies respectively. The bigram model greatly outperforms the unigram model (and previous probabilistic models), demonstrating the importance of such dependencies for word segmentation. We also show that previous probabilistic models rely crucially on sub-optimal search procedures.

/pdf/contextual-dependencies-in-unsupervised-word-segmentation-3z9p25wpi9.pdf

Contextual Dependencies in Unsupervised Word Segmentation

This paper addresses the problem of learning to map sentences to logical form, given training data consisting of natural language sentences paired with logical representations of their meaning. Previous approaches have been designed for particular natural languages or specific meaning representations; here we present a more general method. The approach induces a probabilistic CCG grammar that represents the meaning of individual words and defines how these meanings can be combined to analyze complete sentences. We use higher-order unification to define a hypothesis space containing all grammars consistent with the training data, and develop an online learning algorithm that efficiently searches this space while simultaneously estimating the parameters of a log-linear parsing model. Experiments demonstrate high accuracy on benchmark data sets in four languages with two different meaning representations.

Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification

A weakness of standard Optimality Theory is its inability to account for grammars with free variation. We describe here the Maximum Entropy model, a general statistical model, and show how it can be applied in a constraint-based linguistic framework to model and learn grammars with free variation, as well as categorical grammars. We report the results of using the MaxEnt model for learning two different grammars: one with variation, and one without. Our results are as good as those of a previous probabilistic version of OT, the Gradual Learning Algorithm (Boersma, 1997), and we argue that our model is more general and mathematically well-motivated.
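As a rough illustration of how a maximum entropy (log-linear) model can assign probabilities to competing output candidates, the sketch below scores each candidate by its weighted constraint violations and normalizes over the candidate set. The candidate names, violation counts, weights, and the sign convention (higher weighted violation means lower probability) are assumptions made for this example, not values or notation taken from the paper.

```python
import math


def maxent_probs(violations, weights):
    """Map each candidate to P(candidate) proportional to
    exp(-sum_i weights[i] * violations[candidate][i])."""
    scores = {
        cand: -sum(w * v for w, v in zip(weights, viols))
        for cand, viols in violations.items()
    }
    # Subtract the max score before exponentiating for numerical stability.
    best = max(scores.values())
    exps = {cand: math.exp(score - best) for cand, score in scores.items()}
    z = sum(exps.values())
    return {cand: e / z for cand, e in exps.items()}


# Hypothetical tableau: violation counts for three constraints per candidate.
violations = {
    "cand_a": (1, 0, 0),
    "cand_b": (0, 1, 0),
    "cand_c": (0, 0, 2),
}

# Equal weights on the first two constraints: cand_a and cand_b tie,
# which models free variation between them.
variable = maxent_probs(violations, weights=(2.0, 2.0, 3.0))

# A much larger weight on the first constraint: cand_b wins almost
# categorically over cand_a.
near_categorical = maxent_probs(violations, weights=(20.0, 2.0, 3.0))

print(variable)
print(near_categorical)
```

The same model form covers both cases: tied weighted violations yield variation between candidates, while widely separated weights approximate a strict ranking.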

/pdf/learning-ot-constraint-rankings-using-a-maximum-entropy-2of2napc1y.pdf

Sharon Goldwater

Papers

A Bayesian framework for word segmentation: Exploring the effects of context

A fully Bayesian approach to unsupervised part-of-speech tagging

Contextual Dependencies in Unsupervised Word Segmentation

Inducing Probabilistic CCG Grammars from Logical Form with Higher-Order Unification

Learning OT constraint rankings using a maximum entropy model