Dongju Park
Publications - 4
Citations - 21
Dongju Park is an academic researcher. The author has contributed to research in the topics of Language model and Tokenization (data security). The author has an h-index of 2 and has co-authored 4 publications receiving 12 citations.
Papers
Posted Content
GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation
TL;DR: This paper proposes a data augmentation technique that leverages large-scale language models to generate realistic text samples from a mixture of real samples, effectively distilling knowledge from the language models and creating textual perturbations simultaneously.
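The core recipe described in the TL;DR can be pictured with a short sketch. The snippet below is a minimal, illustrative reconstruction of a GPT3Mix-style augmentation loop, not the paper's actual implementation: the prompt template, label names, and the call_language_model stand-in for a large-LM completion API are all assumptions.

```python
# Minimal, illustrative sketch of a GPT3Mix-style augmentation loop.
# `call_language_model` is a hypothetical stand-in for any large-LM completion API;
# the prompt template only paraphrases the idea of mixing real labeled samples.
import random

def call_language_model(prompt: str) -> str:
    """Placeholder completion call; returns a dummy sample so the sketch runs."""
    return " the plot drags but the cast saves it (Sentiment: positive)"

def build_prompt(examples, task="movie review", labels=("positive", "negative")):
    """Embed a few real labeled samples and ask the model to continue the list."""
    header = (f"Each item below is a {task} and its sentiment, "
              f"one of {' or '.join(labels)}.\n")
    body = "".join(f"Review: {text} (Sentiment: {label})\n" for text, label in examples)
    return header + body + "Review:"  # the model completes a new (text, label) pair

def augment(dataset, k=2, n_synthetic=10):
    """Draw k real examples per prompt and collect pseudo-labeled completions."""
    synthetic = []
    for _ in range(n_synthetic):
        completion = call_language_model(build_prompt(random.sample(dataset, k)))
        if "(Sentiment:" in completion:
            text, label = completion.split("(Sentiment:", 1)
            synthetic.append((text.strip(), label.strip(" )\n")))
    return synthetic

seed = [("a delightful surprise", "positive"), ("painfully dull", "negative")]
print(augment(seed, k=2, n_synthetic=3))
```

The synthetic (text, label) pairs produced this way would be added to the training set alongside the real data; in the paper's framing, the label emitted by the large model also acts as a soft, distilled supervision signal.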
Posted Content
What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers
Boseop Kim, HyoungSeok Kim, Sang Woo Lee, Gichang Lee, Dong-Hyun Kwak, Dong Hyeon Jeon, Sunghyun Park, Sungju Kim, Seonhoon Kim, Dongpil Seo, Heungsub Lee, Minyoung Jeong, Sungjae Lee, Minsub Kim, Suk Hyun Ko, Seokhun Kim, Taeyong Park, Jinuk Kim, Soyoung Kang, Na-Hyeon Ryu, Kang Min Yoo, Minsuk Chang, Soobin Suh, Sookyo In, Jin-Seong Park, Kyungduk Kim, Hiun Kim, Jisu Jeong, Yong Goo Yeo, Donghoon Ham, Dongju Park, Min Young Lee, Jae-Wook Kang, Inho Kang, Jung-Woo Ha, Woo-Myoung Park, Nako Sung +36 more
TL;DR: HyperCLOVA is a Korean variant of the 82B GPT-3 trained on a Korean-centric corpus of 560B tokens; it shows state-of-the-art zero-shot and few-shot learning performance on various downstream tasks in Korean.
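The zero-shot and few-shot evaluation style mentioned in the TL;DR amounts to in-context prompting. The sketch below is a hedged illustration only: the Korean examples are made up and complete() is a placeholder for a HyperCLOVA-scale completion call, not the paper's benchmark setup.

```python
# Illustrative few-shot (in-context) prompting sketch; `complete` is a placeholder
# for a HyperCLOVA-scale completion call, and the Korean examples are made up.
FEW_SHOT = [
    ("이 영화 정말 재미있었어요.", "긍정"),  # "This movie was really fun." -> positive
    ("시간 낭비였습니다.", "부정"),          # "It was a waste of time." -> negative
]

def complete(prompt: str) -> str:
    """Dummy completion so the sketch runs end to end."""
    return " 긍정"

def classify(sentence: str) -> str:
    """Build a k-shot prompt from labeled examples and let the model emit the label."""
    shots = "".join(f"문장: {s}\n감정: {l}\n" for s, l in FEW_SHOT)
    return complete(shots + f"문장: {sentence}\n감정:").strip()

print(classify("배우들의 연기가 인상적이었다."))  # expected: 긍정
```

In the zero-shot case the FEW_SHOT list would simply be empty, leaving only the task description and the query sentence in the prompt.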
Proceedings Article
GPT3Mix: Leveraging Large-scale Language Models for Text Augmentation
TL;DR: The authors propose a data augmentation technique that leverages large-scale language models to generate realistic text samples from a mixture of real samples, effectively distilling knowledge from the language models and creating textual perturbations simultaneously.
Proceedings Article
What Changes Can Large-scale Language Models Bring? Intensive Study on HyperCLOVA: Billions-scale Korean Generative Pretrained Transformers
Boseop Kim, HyoungSeok Kim, Sang Woo Lee, Gichang Lee, Dong-Hyun Kwak, Dong Hyeon Jeon, Sunghyun Park, Sungju Kim, Seonhoon Kim, Dongpil Seo, Heungsub Lee, Minyoung Jeong, Sungjae Lee, Minsub Kim, Suk Hyun Ko, Seokhun Kim, Taeyong Park, Jinuk Kim, Soyoung Kang, Na-Hyeon Ryu, Kang Min Yoo, Minsuk Chang, Soobin Suh, Sookyo In, Jin-Seong Park, Kyungduk Kim, Hiun Kim, Jisu Jeong, Yong Goo Yeo, Donghoon Ham, Dongju Park, Min Young Lee, Jae-Wook Kang, Inho Kang, Jung-Woo Ha, Woo-Myoung Park, Nako Sung +36 more
TL;DR: HyperCLOVA is a Korean variant of the 82B GPT-3 trained on a Korean-centric corpus of 560B tokens; it shows state-of-the-art zero-shot and few-shot learning performance on various downstream tasks in Korean.