Top 7 papers published by Hanxiao Liu from Google in 2016

Posted Content

Gated-Attention Readers for Text Comprehension

Bhuwan Dhingra¹, Hanxiao Liu², Zhilin Yang², William W. Cohen², Ruslan Salakhutdinov²

Microsoft¹, Carnegie Mellon University²

05 Jun 2016-arXiv: Computation and Language

TL;DR: The Gated-Attention (GA) Reader integrates a multi-hop architecture with a novel attention mechanism based on multiplicative interactions between the query embedding and the intermediate states of a recurrent neural network document reader, enabling the reader to build query-specific representations of tokens in the document for accurate answer selection.

Abstract: In this paper we study the problem of answering cloze-style questions over documents. Our model, the Gated-Attention (GA) Reader, integrates a multi-hop architecture with a novel attention mechanism, which is based on multiplicative interactions between the query embedding and the intermediate states of a recurrent neural network document reader. This enables the reader to build query-specific representations of tokens in the document for accurate answer selection. The GA Reader obtains state-of-the-art results on three benchmarks for this task: the CNN & Daily Mail news stories and the Who Did What dataset. The effectiveness of multiplicative interaction is demonstrated by an ablation study, and by comparing to alternative compositional operators for implementing the gated-attention. The code is available at this https URL.

89 citations
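
The multiplicative interaction described in the GA Reader abstract above can be illustrated with a minimal sketch. It is not the authors' code: the abstract states only that query embeddings interact multiplicatively with the document reader's intermediate states, so the per-token soft attention over query words, the element-wise product, and the names `softmax` and `gated_attention` are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gated_attention(doc_states, query_states):
    """Gate each document token state by a token-specific query summary.

    doc_states:   list of token vectors (intermediate reader states)
    query_states: list of query word vectors with the same dimension
    Both the attention form and the element-wise gate are assumptions.
    """
    gated = []
    for d in doc_states:
        # Dot-product score between this token and every query word.
        scores = [sum(qk * dk for qk, dk in zip(q, d)) for q in query_states]
        weights = softmax(scores)
        # Token-specific query summary: attention-weighted sum of query words.
        q_tilde = [
            sum(w * q[k] for w, q in zip(weights, query_states))
            for k in range(len(d))
        ]
        # Multiplicative interaction: element-wise product of token and summary.
        gated.append([dk * qk for dk, qk in zip(d, q_tilde)])
    return gated
```

In a multi-hop reader, the gated states from one layer would presumably feed the next recurrent layer, so that each hop refines the query-specific token representations.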

Journal Article

Learning concept graphs from online educational data

Hanxiao Liu¹, Wanli Ma¹, Yiming Yang¹, Jaime G. Carbonell¹

Carnegie Mellon University¹

01 Jan 2016-Journal of Artificial Intelligence Research

TL;DR: This paper addresses an open challenge in educational data mining, i.e., the problem of automatically mapping online courses from different providers onto a universal space of concepts, and predicting latent prerequisite dependencies among both concepts and courses, with a novel approach for inference within and across course-level and concept-level directed graphs.

Abstract: This paper addresses an open challenge in educational data mining, i.e., the problem of automatically mapping online courses from different providers (universities, MOOCs, etc.) onto a universal space of concepts, and predicting latent prerequisite dependencies (directed links) among both concepts and courses. We propose a novel approach for inference within and across course-level and concept-level directed graphs. In the training phase, our system projects partially observed course-level prerequisite links onto directed concept-level links; in the testing phase, the induced concept-level links are used to infer the unknown course-level prerequisite links. Whereas courses may be specific to one institution, concepts are shared across different providers. The bi-directional mappings enable our system to perform interlingua-style transfer learning, e.g. treating the concept graph as the interlingua and transferring the prerequisite relations across universities via the interlingua. Experiments on our newly collected datasets of courses from MIT, Caltech, Princeton and CMU show promising results.

54 citations
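
The testing phase described in the concept-graph abstract above, where induced concept-level links are used to infer course-level prerequisite links, can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: it assumes each course is represented as a vector of concept weights and that a course pair is scored with a bilinear form over a concept-link matrix. The function name `course_link_score`, the toy concepts, and the link direction are hypothetical.

```python
def course_link_score(x_i, x_j, concept_links):
    """Score a directed course pair (i, j) as x_i^T A x_j.

    x_i, x_j:      concept-weight vectors for the two courses
    concept_links: matrix A, where A[p][q] is the strength of the
                   directed link from concept p to concept q
    The bilinear form is an assumption made for this sketch.
    """
    return sum(
        x_i[p] * concept_links[p][q] * x_j[q]
        for p in range(len(x_i))
        for q in range(len(x_j))
    )


# Toy concept space (hypothetical): index 0 = calculus, 1 = probability.
# A[0][1] = 1.0 encodes "calculus is a prerequisite of probability".
concept_links = [[0.0, 1.0],
                 [0.0, 0.0]]

calculus_course = [1.0, 0.0]    # covers only calculus
statistics_course = [0.0, 1.0]  # covers only probability

forward = course_link_score(calculus_course, statistics_course, concept_links)
backward = course_link_score(statistics_course, calculus_course, concept_links)
```

Because the concept-link matrix is shared, the same scoring applies to courses from a different provider once they are mapped into the concept space, which is the transfer the abstract describes.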

Proceedings Article

Adaptive Smoothed Online Multi-Task Learning

Keerthiram Murugesan¹, Hanxiao Liu¹, Jaime G. Carbonell¹, Yiming Yang¹

Carnegie Mellon University¹

01 Jan 2016

TL;DR: This paper addresses the challenge of jointly learning both the per-task model parameters and the inter-task relationships in a multi-task online learning setting, with an algorithm that features a probabilistic interpretation, efficient update rules, and flexible modulation of whether learners focus on their specific task or jointly address all tasks.

Abstract: This paper addresses the challenge of jointly learning both the per-task model parameters and the inter-task relationships in a multi-task online learning setting. The proposed algorithm features a probabilistic interpretation, efficient update rules, and flexible modulation of whether learners focus on their specific task or jointly address all tasks. The paper also proves a sub-linear regret bound as compared to the best linear predictor in hindsight. Experiments over three multi-task learning benchmark datasets show advantageous performance of the proposed approach over several state-of-the-art online multi-task learning baselines.

36 citations

Proceedings Article

Cross-lingual Text Classification via Model Translation with Limited Dictionaries

Ruochen Xu¹, Yiming Yang¹, Hanxiao Liu¹, Andrew Hsi¹

Carnegie Mellon University¹

24 Oct 2016

TL;DR: Two new approaches are proposed that combine unsupervised word embedding in different languages, supervised mapping of embedded words across languages, and probabilistic translation of classification models; they show significant performance improvement in CLTC.

Abstract: Cross-lingual text classification (CLTC) refers to the task of classifying documents in different languages into the same taxonomy of categories. An open challenge in CLTC is to classify documents for the languages where labeled training data are not available. Existing approaches rely on the availability of either high-quality machine translation of documents (to the languages where massive training data are available), or rich bilingual dictionaries for effective translation of trained classification models (to the languages where labeled training data are lacking). This paper studies the CLTC challenge under the assumption that neither condition is met. That is, we focus on the problem of translating classification models with highly incomplete bilingual dictionaries. Specifically, we propose two new approaches that combine unsupervised word embedding in different languages, supervised mapping of embedded words across languages, and probabilistic translation of classification models. The approaches show significant performance improvement in CLTC on a benchmark corpus of Reuters news stories (RCV1/RCV2) in English, Spanish, German, French and Chinese and an internal dataset in Uzbek, compared to representative baseline methods using conventional bilingual dictionaries or highly incomplete ones.

23 citations

Posted Content

Cross-Graph Learning of Multi-Relational Associations

Hanxiao Liu¹, Yiming Yang¹

Carnegie Mellon University¹

06 May 2016-arXiv: Learning

TL;DR: In this article, a convex optimization framework for cross-graph relational learning is proposed, which enables transductive learning using both labeled and unlabeled tuples, and offers a scalable algorithm that guarantees the optimal solution and enjoys a linear time complexity.

Abstract: Cross-graph Relational Learning (CGRL) refers to the problem of predicting the strengths or labels of multi-relational tuples of heterogeneous object types, through the joint inference over multiple graphs which specify the internal connections among each type of object. CGRL is an open challenge in machine learning due to the daunting number of all possible tuples to deal with when the numbers of nodes in multiple graphs are large, and because the labeled training instances are typically extremely sparse. Existing methods such as tensor factorization or tensor-kernel machines do not work well because of the lack of convex formulation for the optimization of CGRL models, the poor scalability of the algorithms in handling combinatorial numbers of tuples, and/or the non-transductive nature of the learning methods which limits their ability to leverage unlabeled data in training. This paper proposes a novel framework which formulates CGRL as a convex optimization problem, enables transductive learning using both labeled and unlabeled tuples, and offers a scalable algorithm that guarantees the optimal solution and enjoys a linear time complexity with respect to the sizes of input graphs. In our experiments with a subset of DBLP publication records and an Enzyme multi-source dataset, the proposed method successfully scaled to the large cross-graph inference problem, and outperformed other representative approaches significantly.

8 citations

Proceedings Article

Cross-graph learning of multi-relational associations

Hanxiao Liu¹, Yiming Yang¹

Carnegie Mellon University¹

19 Jun 2016

TL;DR: A novel framework is proposed which formulates CGRL as a convex optimization problem, enables transductive learning using both labeled and unlabeled tuples, and offers a scalable algorithm that guarantees the optimal solution and enjoys a linear time complexity with respect to the sizes of input graphs.

Abstract: Cross-graph Relational Learning (CGRL) refers to the problem of predicting the strengths or labels of multi-relational tuples of heterogeneous object types, through the joint inference over multiple graphs which specify the internal connections among each type of object. CGRL is an open challenge in machine learning due to the daunting number of all possible tuples to deal with when the numbers of nodes in multiple graphs are large, and because the labeled training instances are typically extremely sparse. Existing methods such as tensor factorization or tensor-kernel machines do not work well because of the lack of convex formulation for the optimization of CGRL models, the poor scalability of the algorithms in handling combinatorial numbers of tuples, and/or the non-transductive nature of the learning methods which limits their ability to leverage unlabeled data in training. This paper proposes a novel framework which formulates CGRL as a convex optimization problem, enables transductive learning using both labeled and unlabeled tuples, and offers a scalable algorithm that guarantees the optimal solution and enjoys a linear time complexity with respect to the sizes of input graphs. In our experiments with a subset of DBLP publication records and an Enzyme multi-source dataset, the proposed method successfully scaled to the large cross-graph inference problem, and outperformed other representative approaches significantly.

7 citations

Proceedings Article

Semi-Supervised Learning with Adaptive Spectral Transform

Hanxiao Liu¹, Yiming Yang¹

Carnegie Mellon University¹

02 May 2016

TL;DR: A novel nonparametric framework is proposed for semi-supervised learning and for optimizing the Laplacian spectrum of the data manifold simultaneously; it can be interpreted as asymptotically minimizing the generalization error bound of semi-supervised learning with respect to the graph spectrum.

Abstract: This paper proposes a novel nonparametric framework for semi-supervised learning and for optimizing the Laplacian spectrum of the data manifold simultaneously. Our formulation leads to a convex optimization problem that can be efficiently solved via the bundle method, and can be interpreted as asymptotically minimizing the generalization error bound of semi-supervised learning with respect to the graph spectrum. Experiments over benchmark datasets in various domains show advantageous performance of the proposed method over strong baselines.

1 citation
