
Showing papers by "Zhilin Yang" published in 2023


Journal ArticleDOI
TL;DR: CodeGeeX is a 13-billion-parameter multilingual code generation model pre-trained on 850 billion tokens of 23 programming languages, including C++, Java, JavaScript, and Go; it outperforms multilingual code models of similar scale on both code generation and translation.
Abstract: Large pre-trained code generation models, such as OpenAI Codex, can generate syntactically and functionally correct code, making programmers more productive and bringing us closer to artificial general intelligence. In this paper, we introduce CodeGeeX, a multilingual model with 13 billion parameters for code generation. CodeGeeX is pre-trained on 850 billion tokens of 23 programming languages as of June 2022. Our extensive experiments suggest that CodeGeeX outperforms multilingual code models of similar scale on both code generation and translation on HumanEval-X. Building upon HumanEval (Python only), we develop the HumanEval-X benchmark for evaluating multilingual models by hand-writing the solutions in C++, Java, JavaScript, and Go. In addition, we build CodeGeeX-based extensions for Visual Studio Code, JetBrains, and Cloud Studio, generating 4.7 billion tokens for tens of thousands of active users per week. Our user study demonstrates that CodeGeeX helps 83.4% of its users code more efficiently. Finally, CodeGeeX is publicly accessible; in Sep. 2022 we open-sourced its code, model weights (the 850B-token version), API, extensions, and HumanEval-X at https://github.com/THUDM/CodeGeeX.

16 citations
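For context on how a HumanEval-style benchmark such as HumanEval-X is typically scored: generated solutions are executed against hand-written test cases and aggregated with the unbiased pass@k estimator introduced with the original HumanEval. The sketch below is a minimal illustration of that estimator, not the authors' released evaluation harness; the `run_tests` callback and `candidates` list are hypothetical stand-ins for a sandboxed test runner and the model's samples.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (as introduced with HumanEval):
    n = number of samples generated per problem,
    c = number of samples that pass all hand-written tests."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

def score_problem(candidates, run_tests, k: int = 1) -> float:
    """Hypothetical scoring loop: run_tests(code) -> bool is assumed to
    execute one candidate solution against the problem's test cases in a
    sandbox and report whether all of them pass."""
    n = len(candidates)
    c = sum(1 for code in candidates if run_tests(code))
    return pass_at_k(n, c, k)
```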



TL;DR: Prior work viewed prompts as extracting knowledge from language models for cross-task generalization, a perspective that led to numerous studies on improving prompts; the authors instead introduce a compositional generalization perspective to improve the cross-task performance of language models.
Abstract: Large language models have shown a remarkable cross-task generalization ability. Most prior works assumed that prompts effectively extract knowledge from language models to facilitate generalization to new tasks. This perspective led to numerous studies on improving prompts. In contrast, we introduce a new perspective: compositional generalization.


TL;DR: This article shows that training on a small number of key tasks outperforms using all the training tasks, while removing these key tasks substantially hurts performance; most of these key tasks turn out to be question answering (QA) tasks.
Abstract: Recent work has achieved remarkable zero-shot performance with multi-task prompted pretraining, but little is understood about why it works. For the first time, we show that training on a small number of key tasks outperforms using all the training tasks, while removing these key tasks substantially hurts performance. We also find that these key tasks are mostly question answering (QA) tasks. Combined, these findings deepen our understanding of zero-shot generalization: training on certain tasks, such as QA, encodes general knowledge that transfers to a wide range of tasks. In addition, to automate this procedure, we devise a method that (1) identifies key training tasks without observing the test tasks, by examining the pairwise generalization results, and (2) resamples the training tasks for a better data distribution. Empirically, our approach achieves improved results across various model scales and tasks.
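The abstract names the two steps of the method without further detail. As a loose illustration of what step (1) could look like, the sketch below ranks training tasks by their mean pairwise transfer to the other training tasks and keeps the top scorers; this is an assumption about the general idea, not the paper's actual selection procedure, and the matrix `gen`, the helper `select_key_tasks`, and the toy numbers are hypothetical.

```python
import numpy as np

def select_key_tasks(gen: np.ndarray, k: int) -> list:
    """Hypothetical sketch: gen[i, j] is the zero-shot score on training
    task j after training on task i alone (no test tasks are observed).
    Each task is scored by its mean transfer to all other training tasks,
    and the top-k scorers are kept as candidate "key" tasks."""
    n = gen.shape[0]
    off_diag = gen * (1.0 - np.eye(n))           # drop self-transfer scores
    transfer = off_diag.sum(axis=1) / (n - 1)    # mean generalization per task
    return [int(i) for i in np.argsort(transfer)[::-1][:k]]

# Toy 4x4 generalization matrix with made-up numbers, for illustration only.
toy = np.array([[1.0, 0.6, 0.7, 0.5],
                [0.4, 1.0, 0.3, 0.2],
                [0.6, 0.5, 1.0, 0.6],
                [0.3, 0.2, 0.4, 1.0]])
print(select_key_tasks(toy, k=2))  # -> [0, 2]: tasks with highest mean transfer
```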