
Zhilin Yang

Researcher at Carnegie Mellon University

Publications - 69
Citations - 16,355

Zhilin Yang is an academic researcher from Carnegie Mellon University. He has contributed to research in the topics of computer science and language models. He has an h-index of 31 and has co-authored 57 publications receiving 11,112 citations. Previous affiliations of Zhilin Yang include Tsinghua University.

Papers
Proceedings ArticleDOI

Controllable Generation from Pre-trained Language Models via Inverse Prompting

TL;DR: This paper proposes inverse prompting, a method for better controlling text generation: the generated text is used to inversely predict the prompt during beam search, which strengthens the relevance between the prompt and the generated text and thus improves controllability. The proposed method substantially outperforms the baselines.
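
A minimal sketch of the inverse-prompting idea summarized above, assuming an abstract language model object with a `log_likelihood(context, target)` method; the method name, the combination scheme, and the `alpha` weight are illustrative assumptions, not the paper's actual interface or hyperparameters.

```python
def inverse_prompt_score(lm, prompt, candidate, alpha=0.5):
    """Score a beam candidate by how well it predicts the prompt back."""
    # Forward score: likelihood of the generated candidate given the prompt.
    forward = lm.log_likelihood(context=prompt, target=candidate)
    # Inverse score: likelihood of recovering the prompt from the candidate.
    inverse = lm.log_likelihood(context=candidate, target=prompt)
    # Candidates that "remember" the prompt rank higher, improving relevance.
    return forward + alpha * inverse

def beam_step(lm, prompt, beams, expand, beam_width=4):
    """One beam-search step ranked by the combined forward/inverse score."""
    scored = [
        (inverse_prompt_score(lm, prompt, beam + ext), beam + ext)
        for beam in beams
        for ext in expand(beam)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [cand for _, cand in scored[:beam_width]]
```
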
Posted Content

Differentiable Learning of Logical Rules for Knowledge Base Completion.

TL;DR: This paper describes a neural controller system that learns to sequentially compose primitive differentiable operations to solve reasoning tasks, and in particular to perform knowledge base completion.
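
An illustrative sketch of the differentiable rule-composition idea, not the paper's implementation: each relation is represented as an adjacency matrix over entities, and a chain rule such as grandparent(X, Z) <- parent(X, Y), parent(Y, Z) becomes a product of those matrices, softly mixed by controller attention.

```python
import numpy as np

def soft_rule_inference(entity_vec, relation_mats, attention, steps=2):
    """Apply `steps` soft relation hops, mixing relations by attention weights.

    entity_vec:    one-hot vector for the query entity, shape (n_entities,)
    relation_mats: list of (n_entities, n_entities) adjacency matrices
    attention:     (steps, n_relations) softmax weights from a controller
    """
    state = entity_vec
    for t in range(steps):
        # Attention-weighted mixture of relation operators at this step.
        mixed = sum(a * M for a, M in zip(attention[t], relation_mats))
        state = mixed.T @ state  # propagate along the soft relation
    return state  # scores over candidate tail entities

# Toy usage: 3 entities, 2 relations; picking `parent` twice recovers
# the grandparent of entity 0.
n = 3
parent = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]], dtype=float)
spouse = np.zeros((n, n))
attn = np.array([[1.0, 0.0], [1.0, 0.0]])
print(soft_rule_inference(np.eye(n)[0], [parent, spouse], attn))  # [0. 0. 1.]
```
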
Proceedings Article

Mastering the Dungeon: Grounded Language Learning by Mechanical Turker Descent.

TL;DR: This work proposes an interactive learning procedure called Mechanical Turker Descent (MTD) and uses it to train agents to execute natural language commands grounded in a fantasy text adventure game.
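
A schematic reconstruction of one MTD round, under the assumption that the protocol trains one agent per worker and cross-evaluates it on the other workers' data; the function names and round structure here are hypothetical placeholders, not the authors' code.

```python
def mtd_round(worker_examples, base_agent, train, evaluate):
    """Run one competitive round of Mechanical Turker Descent.

    worker_examples: dict mapping worker_id -> list of (command, demo) pairs
    train:           callable(agent, examples) -> trained agent
    evaluate:        callable(agent, examples) -> score
    """
    # Each worker's submitted examples train a separate copy of the agent.
    agents = {w: train(base_agent, ex) for w, ex in worker_examples.items()}
    scores = {}
    for w, agent in agents.items():
        # Evaluate on the *other* workers' data, so workers are rewarded for
        # examples that generalize rather than ones that overfit their agent.
        held_out = [e for w2, ex in worker_examples.items() if w2 != w
                    for e in ex]
        scores[w] = evaluate(agent, held_out)
    # High scorers are rewarded; the collected data seeds the next round.
    return scores
```
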
Proceedings Article

GLoMo: Unsupervised Learning of Transferable Relational Graphs

TL;DR: This work explores learning generic latent relational graphs that capture dependencies between pairs of data units from large-scale unlabeled data, and transferring those graphs to downstream tasks. It shows that the learned graphs are generic enough to transfer to embeddings other than those on which they were trained.
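
An illustrative GLoMo-style transfer sketch, a toy under stated assumptions rather than the paper's released code: a stand-in graph predictor produces a row-normalized affinity matrix over token positions, and because the graph is indexed by position, it can mix any downstream embeddings, including ones it was never trained with.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def predict_graph(token_vectors):
    """Stand-in for a pretrained graph predictor: row-normalized affinities."""
    scores = token_vectors @ token_vectors.T  # pairwise token affinities
    return softmax(scores, axis=-1)           # each row sums to 1

def transfer_graph(graph, downstream_embeddings, mix=0.5):
    """Propagate arbitrary downstream embeddings through the learned graph."""
    propagated = graph @ downstream_embeddings
    return mix * downstream_embeddings + (1 - mix) * propagated

# Toy usage: 4 tokens; the graph comes from one embedding space but is
# applied to a different downstream embedding of the same tokens.
rng = np.random.default_rng(0)
G = predict_graph(rng.normal(size=(4, 8)))
features = transfer_graph(G, rng.normal(size=(4, 8)))
print(features.shape)  # (4, 8)
```
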