Hai Zhao

Other affiliations: Microsoft, City University of Hong Kong, Northeastern University

Bio: Hai Zhao is an academic researcher at Shanghai Jiao Tong University. The author has contributed to research on machine translation and parsing, has an h-index of 53, and has co-authored 555 publications that have received 8,915 citations. Previous affiliations of Hai Zhao include Microsoft and City University of Hong Kong.

Topics: Machine translation, Parsing, Language model, Semantic role labeling, Sentence

Papers published on a yearly basis (chart omitted; years shown: 1998, 2001, 2003 to 2023)

Papers

Journal Article

Semantics-Aware BERT for Language Understanding

Zhuosheng Zhang¹, Yuwei Wu¹, Hai Zhao¹, Zuchao Li¹, Shuailiang Zhang¹, Xi Zhou, Xiang Zhou

Shanghai Jiao Tong University¹

03 Apr 2020

TL;DR: SemBERT incorporates explicit contextual semantics from pre-trained semantic role labeling into an improved language representation model, Semantics-aware BERT, which explicitly absorbs contextual semantics over a BERT backbone.

Abstract: The latest work on language representations carefully integrates contextualized features into language model training, which enables a series of success especially in various machine reading comprehension and natural language inference tasks. However, the existing language representation models including ELMo, GPT and BERT only exploit plain context-sensitive features such as character or word embeddings. They rarely consider incorporating structured semantic information which can provide rich semantics for language representation. To promote natural language understanding, we propose to incorporate explicit contextual semantics from pre-trained semantic role labeling, and introduce an improved language representation model, Semantics-aware BERT (SemBERT), which is capable of explicitly absorbing contextual semantics over a BERT backbone. SemBERT keeps the convenient usability of its BERT precursor in a light fine-tuning way without substantial task-specific modifications. Compared with BERT, semantics-aware BERT is as simple in concept but more powerful. It obtains new state-of-the-art or substantially improves results on ten reading comprehension and language inference tasks.

199 citations

Proceedings Article

Modeling Multi-turn Conversation with Deep Utterance Aggregation

Zhuosheng Zhang¹, Jiangtong Li¹, Pengfei Zhu², Hai Zhao³, Gongshen Liu

Shanghai Jiao Tong University¹, East China Normal University², Chinese Academy of Sciences³

24 Jun 2018

TL;DR: This paper formulates previous utterances into context using a proposed deep utterance aggregation model to form a fine-grained context representation, and shows that the model outperforms state-of-the-art methods on three multi-turn conversation benchmarks, including a newly introduced e-commerce dialogue corpus.

Abstract: Multi-turn conversation understanding is a major challenge for building intelligent dialogue systems. This work focuses on retrieval-based response matching for multi-turn conversation whose related work simply concatenates the conversation utterances, ignoring the interactions among previous utterances for context modeling. In this paper, we formulate previous utterances into context using a proposed deep utterance aggregation model to form a fine-grained context representation. In detail, a self-matching attention is first introduced to route the vital information in each utterance. Then the model matches a response with each refined utterance and the final matching score is obtained after attentive turns aggregation. Experimental results show our model outperforms the state-of-the-art methods on three multi-turn conversation benchmarks, including a newly introduced e-commerce dialogue corpus.

198 citations

Posted Content

Semantics-aware BERT for Language Understanding

Zhuosheng Zhang¹, Yuwei Wu¹, Hai Zhao¹, Zuchao Li¹, Shuailiang Zhang¹, Xi Zhou, Xiang Zhou

Shanghai Jiao Tong University¹

05 Sep 2019-arXiv: Computation and Language

TL;DR: This work proposes to incorporate explicit contextual semantics from pre-trained semantic role labeling, and introduces an improved language representation model, Semantics-aware BERT (SemBERT), which is capable of explicitly absorbing contextual semantics over a BERT backbone.

197 citations

Proceedings Article

An Improved Chinese Word Segmentation System with Conditional Random Field

Hai Zhao, Changning Huang, Mu Li

01 Jul 2006

TL;DR: It is found that the use of a 6-tag set, tone feature of Chinese character and assistant segmenters trained on other corpora further improve Chinese word segmentation performance.

Abstract: In this paper, we describe a Chinese word segmentation system that we developed for the Third SIGHAN Chinese Language Processing Bakeoff (Bakeoff2006). We took part in six tracks, namely the closed and open track on three corpora, Academia Sinica (CKIP), City University of Hong Kong (CityU), and University of Pennsylvania/University of Colorado (UPUC). Based on a conditional random field based approach, our word segmenter achieved the highest F measures in four tracks, and the third highest in the other two tracks. We found that the use of a 6-tag set, tone feature of Chinese character and assistant segmenters trained on other corpora further improve Chinese word segmentation performance.
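As a rough illustration of what a 6-tag set means for character-based segmentation, the sketch below converts segmented words into per-character labels that a sequence labeler such as a CRF could be trained on. The tag names (B, B2, B3, M, E, S) and the function name `words_to_tags` are assumptions made for this example; the abstract does not spell out the tag inventory, and this is not the authors' code.

```python
def words_to_tags(words):
    """Map a list of segmented words to one tag per character.

    Assumed 6-tag scheme (hypothetical naming, not taken from the abstract):
      S  = single-character word
      B  = first character, B2 = second, B3 = third (of a multi-character word)
      M  = any further middle character
      E  = last character of a multi-character word
    """
    tags = []
    for word in words:
        n = len(word)
        if n == 1:
            tags.append("S")
            continue
        # Leading positions get distinct tags, but the last character is always E.
        head = ["B", "B2", "B3"][: min(n - 1, 3)]
        middle = ["M"] * (n - 1 - len(head))
        tags.extend(head + middle + ["E"])
    return tags


if __name__ == "__main__":
    # Words of length 1, 2, and 6 characters.
    print(words_to_tags(["a", "bc", "defghi"]))
```

Compared with a 4-tag scheme (B, M, E, S), the extra B2 and B3 labels let the model distinguish positions near the start of longer words.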

192 citations

Posted Content

Retrospective Reader for Machine Reading Comprehension

Zhuosheng Zhang¹, Junjie Yang¹, Hai Zhao¹

Shanghai Jiao Tong University¹

27 Jan 2020-arXiv: Computation and Language

TL;DR: Inspired by how humans solve reading comprehension questions, a retrospective reader (Retro-Reader) is proposed that integrates two stages of reading and verification strategies: 1) sketchy reading that briefly investigates the overall interactions of passage and question and yields an initial judgment; 2) intensive reading that verifies the answer and gives the final prediction.

Abstract: Machine reading comprehension (MRC) is an AI challenge that requires machine to determine the correct answers to questions based on a given passage. MRC systems must not only answer question when necessary but also distinguish when no answer is available according to the given passage and then tactfully abstain from answering. When unanswerable questions are involved in the MRC task, an essential verification module called verifier is especially required in addition to the encoder, though the latest practice on MRC modeling still most benefits from adopting well pre-trained language models as the encoder block by only focusing on the "reading". This paper devotes itself to exploring better verifier design for the MRC task with unanswerable questions. Inspired by how humans solve reading comprehension questions, we proposed a retrospective reader (Retro-Reader) that integrates two stages of reading and verification strategies: 1) sketchy reading that briefly investigates the overall interactions of passage and question, and yield an initial judgment; 2) intensive reading that verifies the answer and gives the final prediction. The proposed reader is evaluated on two benchmark MRC challenge datasets SQuAD2.0 and NewsQA, achieving new state-of-the-art results. Significance tests show that our model is significantly better than the strong ELECTRA and ALBERT baselines. A series of analysis is also conducted to interpret the effectiveness of the proposed reader.

170 citations

Cited by

Journal Article

Early Results of Bypass Surgery Using Only Bilateral Internal Thoracic Arteries with Y-Anastomosis in Patients with Multivessel Coronary Artery Disease

성기익, 이영탁, 박계현, 전태국, 박표원, 한일용, 장윤희

01 Mar 2003-The Korean Journal of Thoracic and Cardiovascular Surgery

28,685 citations

Proceedings Article

Random graphs

Alan Frieze¹

Carnegie Mellon University¹

22 Jan 2006

TL;DR: Some of the major results in random graphs and some of the more challenging open problems are reviewed, including those related to the WWW.

Abstract: We will review some of the major results in random graphs and some of the more challenging open problems. We will cover algorithmic and structural questions. We will touch on newer models, including those related to the WWW.

7,116 citations

Journal Article

Thinking fast and slow.

Neil McGlynn

01 Dec 2014-Australian Veterinary Journal

TL;DR: Prospect Theory led cognitive psychology in a new direction that began to uncover other human biases in thinking that are probably not learned but are part of the brain's wiring.

Abstract: In 1974 an article appeared in Science magazine with the dry-sounding title “Judgment Under Uncertainty: Heuristics and Biases” by a pair of psychologists who were not well known outside their discipline of decision theory. In it Amos Tversky and Daniel Kahneman introduced the world to Prospect Theory, which mapped out how humans actually behave when faced with decisions about gains and losses, in contrast to how economists assumed that people behave. Prospect Theory turned Economics on its head by demonstrating through a series of ingenious experiments that people are much more concerned with losses than they are with gains, and that framing a choice from one perspective or the other will result in decisions that are exactly the opposite of each other, even if the outcomes are monetarily the same. Prospect Theory led cognitive psychology in a new direction that began to uncover other human biases in thinking that are probably not learned but are part of our brain’s wiring.

4,351 citations

Proceedings Article

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Zhilin Yang¹, Zihang Dai¹, Yiming Yang¹, Jaime G. Carbonell¹, Ruslan Salakhutdinov¹, Quoc V. Le²

Carnegie Mellon University¹, Google²

19 Jun 2019

TL;DR: The authors propose XLNet, a generalized autoregressive pretraining method that enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and overcomes the limitations of BERT thanks to its autoregressive formulation.

Abstract: With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation. Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model, into pretraining. Empirically, under comparable experiment setting, XLNet outperforms BERT on 20 tasks, often by a large margin, including question answering, natural language inference, sentiment analysis, and document ranking.
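The phrase "maximizing the expected likelihood over all permutations of the factorization order" can be written out as an objective. The notation below is introduced here for illustration and does not appear in the listing: x is a sequence of length T, Z_T is the set of all permutations of the indices 1..T, z is one sampled permutation, z_t is its t-th element, and z_{<t} its first t-1 elements.

```latex
\max_{\theta} \;
\mathbb{E}_{\mathbf{z} \sim \mathcal{Z}_T}
\left[ \sum_{t=1}^{T} \log p_{\theta}\!\left( x_{z_t} \mid \mathbf{x}_{\mathbf{z}_{<t}} \right) \right]
```

Because each position is predicted under many different orderings, it can in expectation condition on context from both sides, while every individual term remains an ordinary autoregressive factor.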

3,863 citations

Posted Content

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Zhilin Yang¹, Zihang Dai¹, Yiming Yang¹, Jaime G. Carbonell¹, Ruslan Salakhutdinov¹, Quoc V. Le²

Carnegie Mellon University¹, Google²

19 Jun 2019-arXiv: Computation and Language

TL;DR: XLNet is proposed, a generalized autoregressive pretraining method that enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and overcomes the limitations of BERT thanks to its autoregressive formulation.

Abstract: With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling. However, relying on corrupting the input with masks, BERT neglects dependency between the masked positions and suffers from a pretrain-finetune discrepancy. In light of these pros and cons, we propose XLNet, a generalized autoregressive pretraining method that (1) enables learning bidirectional contexts by maximizing the expected likelihood over all permutations of the factorization order and (2) overcomes the limitations of BERT thanks to its autoregressive formulation. Furthermore, XLNet integrates ideas from Transformer-XL, the state-of-the-art autoregressive model, into pretraining. Empirically, under comparable experiment settings, XLNet outperforms BERT on 20 tasks, often by a large margin, including question answering, natural language inference, sentiment analysis, and document ranking.

3,009 citations
