Author

Xikun Zhang

Other affiliations: Stanford University
Bio: Xikun Zhang is an academic researcher from the University of Illinois at Urbana–Champaign. The author has contributed to research on topics including Physics and Embedding, has an h-index of 7, and has co-authored 9 publications receiving 307 citations. Previous affiliations of Xikun Zhang include Stanford University.
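For context, an h-index of 7 means the author has 7 papers with at least 7 citations each. The snippet below is a generic illustration of how the metric is computed from a list of citation counts, not SciSpace's own code.

```python
def h_index(citations):
    """Largest h such that h papers each have at least h citations."""
    h = 0
    for rank, c in enumerate(sorted(citations, reverse=True), start=1):
        if c >= rank:
            h = rank
        else:
            break
    return h

# Example: six papers with these citation counts give an h-index of 4.
print(h_index([10, 8, 5, 4, 2, 1]))  # -> 4
```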

Papers
Proceedings ArticleDOI
TL;DR: JODIE is a coupled recurrent neural network model that uses two RNNs to update the embeddings of a user and an item at every interaction and projects these embeddings forward in time to predict future user-item interactions.
Abstract: Modeling sequential interactions between users and items/products is crucial in domains such as e-commerce, social networking, and education. Representation learning presents an attractive opportunity to model the dynamic evolution of users and items, where each user/item can be embedded in a Euclidean space and its evolution can be modeled by an embedding trajectory in this space. However, existing dynamic embedding methods generate embeddings only when users take actions and do not explicitly model the future trajectory of the user/item in the embedding space. Here we propose JODIE, a coupled recurrent neural network model that learns the embedding trajectories of users and items. JODIE employs two recurrent neural networks to update the embedding of a user and an item at every interaction. Crucially, JODIE also models the future embedding trajectory of a user/item. To this end, it introduces a novel projection operator that learns to estimate the embedding of the user at any time in the future. These estimated embeddings are then used to predict future user-item interactions. To make the method scalable, we develop a t-Batch algorithm that creates time-consistent batches and leads to 9x faster training. We conduct six experiments to validate JODIE on two prediction tasks---future interaction prediction and state change prediction---using four real-world datasets. We show that JODIE outperforms six state-of-the-art algorithms in these tasks by at least 20% in predicting future interactions and 12% in state change prediction.

297 citations
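To make the projection idea from the abstract concrete, here is a minimal sketch assuming the future embedding is obtained by scaling the current embedding element-wise with a learned function of the elapsed time; the class name, shapes, and usage are illustrative, not the authors' released code.

```python
import torch
import torch.nn as nn

class ProjectionOperator(nn.Module):
    """Sketch of a JODIE-style projection (illustrative, not the reference code):
    the current embedding is scaled element-wise by a learned linear
    function of the time elapsed since the last interaction."""

    def __init__(self, embedding_dim):
        super().__init__()
        self.time_to_drift = nn.Linear(1, embedding_dim)  # scalar gap -> per-dim drift

    def forward(self, embedding, delta_t):
        # embedding: (batch, embedding_dim); delta_t: (batch, 1) elapsed time
        drift = self.time_to_drift(delta_t)
        return (1.0 + drift) * embedding  # estimated future embedding

op = ProjectionOperator(embedding_dim=128)
future = op(torch.randn(4, 128), torch.rand(4, 1))  # usage example
```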

Posted Content
Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ B. Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri S. Chatterji, Annie Chen, Kathleen Creel, Jared Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah D. Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Ahmad Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf H. Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Yang Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang
TL;DR: The authors provide a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications.
Abstract: AI is undergoing a paradigm shift with the rise of models (e.g., BERT, DALL-E, GPT-3) that are trained on broad data at scale and are adaptable to a wide range of downstream tasks. We call these models foundation models to underscore their critically central yet incomplete character. This report provides a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications (e.g., law, healthcare, education) and societal impact (e.g., inequity, misuse, economic and environmental impact, legal and ethical considerations). Though foundation models are based on standard deep learning and transfer learning, their scale results in new emergent capabilities, and their effectiveness across so many tasks incentivizes homogenization. Homogenization provides powerful leverage but demands caution, as the defects of the foundation model are inherited by all the adapted models downstream. Despite the impending widespread deployment of foundation models, we currently lack a clear understanding of how they work, when they fail, and what they are even capable of due to their emergent properties. To tackle these questions, we believe much of the critical research on foundation models will require deep interdisciplinary collaboration commensurate with their fundamentally sociotechnical nature.

76 citations

Journal ArticleDOI
TL;DR: In this article, a hole-based double quantum dot in a germanium hut wire (GHW) is shown to achieve a Rabi frequency exceeding 540 MHz at a magnetic field of 100 mT, setting a record for ultrafast spin qubit control in semiconductor systems.
Abstract: Operation speed and coherence time are two core measures for the viability of a qubit. Strong spin-orbit interaction (SOI) and relatively weak hyperfine interaction make holes in germanium (Ge) intriguing candidates for spin qubits with rapid, all-electrical coherent control. Here we report ultrafast single-spin manipulation in a hole-based double quantum dot in a germanium hut wire (GHW). Mediated by the strong SOI, a Rabi frequency exceeding 540 MHz is observed at a magnetic field of 100 mT, setting a record for ultrafast spin qubit control in semiconductor systems. We demonstrate that the strong SOI of heavy holes (HHs) in our GHW, characterized by a very short spin-orbit length of 1.5 nm, enables the rapid gate operations we accomplish. Our results demonstrate the potential of ultrafast coherent control of hole spin qubits to meet the requirement of DiVincenzo’s criteria for a scalable quantum information processor.

39 citations
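For a rough sense of scale (a back-of-the-envelope estimate, not a number reported in the paper), a Rabi frequency of 540 MHz corresponds to a spin-flip (π-rotation) time of about a nanosecond, since a π rotation takes half a Rabi period:

```latex
t_{\pi} = \frac{1}{2 f_{\mathrm{Rabi}}} = \frac{1}{2 \times 540\,\mathrm{MHz}} \approx 0.93\,\mathrm{ns}
```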

Posted Content
TL;DR: This work identifies contextual information in pre-training and numeracy as two key factors affecting how well pretrained language models capture scalar magnitudes, and shows that a simple method of canonicalizing numbers can have a significant effect on the results.
Abstract: Pretrained Language Models (LMs) have been shown to possess significant linguistic, common sense, and factual knowledge. One form of knowledge that has not been studied yet in this context is information about the scalar magnitudes of objects. We show that pretrained language models capture a significant amount of this information but are short of the capability required for general common-sense reasoning. We identify contextual information in pre-training and numeracy as two key factors affecting their performance and show that a simple method of canonicalizing numbers can have a significant effect on the results.

34 citations
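The abstract does not spell out the canonicalization scheme; as one plausible illustration, the sketch below rewrites numerals in text into scientific-notation tokens so that magnitudes surface explicitly. The exact format used in the paper may differ.

```python
import re

def canonicalize_numbers(text):
    """Illustrative canonicalization (assumed scheme, not necessarily the
    paper's): replace each numeral with a scientific-notation token that
    exposes its order of magnitude, e.g. '7500' -> '7.5e+03'."""
    def repl(match):
        value = float(match.group(0))
        return f"{value:.1e}"
    return re.sub(r"\d+(?:\.\d+)?", repl, text)

print(canonicalize_numbers("An elephant weighs about 7500 kg."))
# -> "An elephant weighs about 7.5e+03 kg."
```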


Cited by
Journal ArticleDOI
TL;DR: This survey serves to facilitate collaborative efforts among researchers in tackling the challenge of hallucinated texts in NLG by providing a broad overview of the research progress and challenges in the hallucination problem in NLG.
Abstract: Natural Language Generation (NLG) has improved exponentially in recent years thanks to the development of sequence-to-sequence deep learning technologies such as Transformer-based language models. This advancement has led to more fluent and coherent NLG, leading to improved development in downstream tasks such as abstractive summarization, dialogue generation, and data-to-text generation. However, it is also apparent that deep learning based generation is prone to hallucinate unintended text, which degrades the system performance and fails to meet user expectations in many real-world scenarios. To address this issue, many studies have been presented in measuring and mitigating hallucinated texts, but these have never been reviewed in a comprehensive manner before. In this survey, we thus provide a broad overview of the research progress and challenges in the hallucination problem in NLG. The survey is organized into two parts: (1) a general overview of metrics, mitigation methods, and future directions, and (2) an overview of task-specific research progress on hallucinations in the following downstream tasks, namely abstractive summarization, dialogue generation, generative question answering, data-to-text generation, and machine translation. This survey serves to facilitate collaborative efforts among researchers in tackling the challenge of hallucinated texts in NLG.

314 citations

Journal ArticleDOI
30 Jan 2020
TL;DR: The authors examined and analyzed the linguistic and psychological features of political discourse using a computer-based Linguistic Inquiry and Word Count (LIWC) content analysis program to explore the relationship between political discourse and the personality of politicians.
Abstract: The article examines and analyzes the linguistic and psychological features of political discourse using a computer-based Linguistic Inquiry and Word Count (LIWC) content analysis program to explore the relationship between political discourse and the personality of politicians. As for political discourse, it is perhaps the communicator, the linguistic personality, who plays the most important role in the communication. The linguistic personality of a politician is of particular interest in political discourse content analysis, since it has the greatest influence on the public consciousness via mass media. Using text as a source of psychological and cognitive information has been gaining popularity. Researchers use a variety of methods to analyze texts, but Linguistic Inquiry and Word Count (LIWC) has proved to be the most common technique. The analysis of linguistic patterns of political discourse shows that in the context of political speech events such as media interviews, politicians make a unique choice of lexical units, which can be interpreted as a manifestation of certain personality traits. However, despite the significance of the results, there are clear limitations to the use of computerized methodologies for political discourse content analysis, such as the limited interpretive capacity of software to understand pragmatic and contextual use of lexical units.

286 citations
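LIWC itself is a proprietary program, but the core technique it implements is dictionary-based word counting per psychological category. The sketch below uses a made-up mini-dictionary (the categories and word lists are toy examples, not LIWC's) to show the kind of per-category percentages such a content analysis produces.

```python
# Minimal dictionary-based word counting in the spirit of LIWC.
# Categories and lexicons here are illustrative toys, not LIWC's own.
CATEGORIES = {
    "positive_emotion": {"good", "great", "hope", "proud"},
    "negative_emotion": {"bad", "fear", "crisis", "failed"},
    "first_person": {"i", "me", "my", "we", "our"},
}

def analyze(text):
    words = [w.strip(".,!?").lower() for w in text.split()]
    total = len(words) or 1
    # Percentage of words in the text that fall into each category.
    return {cat: 100.0 * sum(w in lexicon for w in words) / total
            for cat, lexicon in CATEGORIES.items()}

print(analyze("We have great hope, and we will not let fear win."))
```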

Posted Content
TL;DR: This paper presents Temporal Graph Networks (TGNs), a generic, efficient framework for deep learning on dynamic graphs represented as sequences of timed events that significantly outperform previous approaches being at the same time more computationally efficient.
Abstract: Graph Neural Networks (GNNs) have recently become increasingly popular due to their ability to learn complex systems of relations or interactions arising in a broad spectrum of problems ranging from biology and particle physics to social networks and recommendation systems. Despite the plethora of different models for deep learning on graphs, few approaches have been proposed thus far for dealing with graphs that present some sort of dynamic nature (e.g., evolving features or connectivity over time). In this paper, we present Temporal Graph Networks (TGNs), a generic, efficient framework for deep learning on dynamic graphs represented as sequences of timed events. Thanks to a novel combination of memory modules and graph-based operators, TGNs are able to significantly outperform previous approaches while being at the same time more computationally efficient. We furthermore show that several previous models for learning on dynamic graphs can be cast as specific instances of our framework. We perform a detailed ablation study of different components of our framework and devise the best configuration that achieves state-of-the-art performance on several transductive and inductive prediction tasks for dynamic graphs.

238 citations
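As a rough sketch of the memory-module idea (simplified, not the reference TGN implementation), each node keeps a memory vector that a recurrent cell updates whenever the node takes part in a timed interaction event; the class, dimensions, and message construction below are illustrative assumptions.

```python
import torch
import torch.nn as nn

class NodeMemory(nn.Module):
    """Simplified TGN-style memory module (illustrative only): each node keeps
    a memory vector that a GRU cell updates from a message describing the
    interaction event the node just took part in."""

    def __init__(self, num_nodes, memory_dim, message_dim):
        super().__init__()
        self.register_buffer("memory", torch.zeros(num_nodes, memory_dim))
        self.updater = nn.GRUCell(message_dim, memory_dim)

    @torch.no_grad()
    def update(self, nodes, messages):
        # nodes: (batch,) node indices; messages: (batch, message_dim)
        # A full implementation would also encode event times into the message.
        self.memory[nodes] = self.updater(messages, self.memory[nodes])

mem = NodeMemory(num_nodes=1000, memory_dim=64, message_dim=32)
mem.update(torch.tensor([3, 7]), torch.randn(2, 32))  # two interaction events
```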

Posted Content
TL;DR: This paper provides a model-agnostic vector representation for time, called Time2Vec, that can be easily imported into many existing and future architectures and improve their performances.
Abstract: Time is an important feature in many applications involving events that occur synchronously and/or asynchronously. To effectively consume time information, recent studies have focused on designing new architectures. In this paper, we take an orthogonal but complementary approach by providing a model-agnostic vector representation for time, called Time2Vec, that can be easily imported into many existing and future architectures and improve their performances. We show on a range of models and problems that replacing the notion of time with its Time2Vec representation improves the performance of the final model.

147 citations
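The construction, as commonly summarized, pairs one linear component with k periodic components of the scalar time (sine being the periodic function reported to work well). The NumPy sketch below is an illustrative rendering of that idea with made-up parameter values; in practice the frequencies and phases are learned.

```python
import numpy as np

def time2vec(tau, omega, phi):
    """Time2Vec-style encoding of a scalar time tau: index 0 is a linear
    term, the remaining indices are periodic (sine). Parameters omega and
    phi are learned in practice; here they are random placeholders."""
    linear = omega[0] * tau + phi[0]
    periodic = np.sin(omega[1:] * tau + phi[1:])
    return np.concatenate([[linear], periodic])

rng = np.random.default_rng(0)
k = 8  # number of periodic components
print(time2vec(3.5, rng.standard_normal(k + 1), rng.standard_normal(k + 1)))
```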

Journal ArticleDOI
TL;DR: This work establishes a foundation of dynamic networks with consistent, detailed terminology and notation and presents a comprehensive survey of dynamic graph neural network models using the proposed terminology.
Abstract: Dynamic networks are used in a wide range of fields, including social network analysis, recommender systems and epidemiology. Representing complex networks as structures changing over time allows network models to leverage not only structural but also temporal patterns. However, as dynamic network literature stems from diverse fields and makes use of inconsistent terminology, it is challenging to navigate. Meanwhile, graph neural networks (GNNs) have gained a lot of attention in recent years for their ability to perform well on a range of network science tasks, such as link prediction and node classification. Despite the popularity of graph neural networks and the proven benefits of dynamic network models, there has been little focus on graph neural networks for dynamic networks. To address the challenges resulting from the fact that this research crosses diverse fields as well as to survey dynamic graph neural networks, this work is split into two main parts. First, to address the ambiguity of the dynamic network terminology we establish a foundation of dynamic networks with consistent, detailed terminology and notation. Second, we present a comprehensive survey of dynamic graph neural network models using the proposed terminology.

144 citations
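Two representations that surveys of this area typically distinguish are discrete-time snapshot sequences and continuous-time event streams. The toy structures below illustrate the difference; the names and fields are illustrative, not the survey's own notation.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Discrete-time view: the dynamic graph is a sequence of snapshots.
Snapshot = List[Tuple[int, int]]          # edge list at one time step
snapshots: List[Snapshot] = [
    [(0, 1)],                             # t = 0
    [(0, 1), (1, 2)],                     # t = 1
]

# Continuous-time view: the dynamic graph is a stream of timestamped events.
@dataclass
class Event:
    src: int
    dst: int
    timestamp: float

events = [Event(0, 1, 0.3), Event(1, 2, 1.7), Event(0, 2, 2.4)]
```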