Zhao Cao
Researcher at Huawei
Publications - 39
Citations - 156
Zhao Cao is an academic researcher at Huawei. The author has contributed to research on the topics of computer science and language models. The author has an h-index of 1 and has co-authored 7 publications receiving 9 citations.
Papers
Proceedings Article
Pre-training for Ad-hoc Retrieval: Hyperlink is Also You Need
TL;DR: This paper proposes leveraging large-scale hyperlinks and anchor texts to pre-train a language model for ad-hoc retrieval, which can help build more accurate and reliable pre-training samples than those produced by a specific algorithm.
Posted Content
Early Exiting with Ensemble Internal Classifiers.
Tianxiang Sun, Yunhua Zhou, Xiangyang Liu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu
TL;DR: It is shown that a novel objective function for training the ensemble internal classifiers can be naturally induced from the perspective of ensemble learning and information theory, and a simple voting-based strategy is proposed that can achieve better accuracy-speed trade-offs.
Proceedings Article
Hyperlink-induced Pre-training for Passage Retrieval in Open-domain Question Answering
Jiawei Zhou, Xiaoguang Li, Lifeng Shang, Lan Luo, Ke Zhan, Enrui Hu, Hao Jiang, Zhao Cao, Fan Yu, Xin Jiang, Qun Liu, Lei Chen
TL;DR: It is demonstrated that the hyperlink-based structures of dual-link and co-mention can provide effective relevance signals for large-scale pre-training that better facilitate downstream passage retrieval.
Proceedings Article
Webformer: Pre-training with Web Pages for Information Retrieval
TL;DR: This paper proposes leveraging large-scale web pages and their DOM (Document Object Model) tree structures to pre-train models for information retrieval, using the hierarchical structure of web pages to obtain richer contextual information for training better language models.