Showing papers by "Xi Chen" published in 2020

Open Access

Journal Article

NeuroCard: one cardinality estimator for all tables

Zongheng Yang¹, Amog Kamsetty¹, Sifei Luan¹, Eric Liang¹, Yan Duan, Xi Chen, Ion Stoica¹

University of California, Berkeley¹

01 Sep 2020

TL;DR: This work shows that it is possible to learn the correlations across all tables in a database without any independence assumptions, and presents NeuroCard, a join cardinality estimator that builds a single neural density estimator over an entire database.

Abstract: Query optimizers rely on accurate cardinality estimates to produce good execution plans. Despite decades of research, existing cardinality estimators are inaccurate for complex queries, due to making lossy modeling assumptions and not capturing inter-table correlations. In this work, we show that it is possible to learn the correlations across all tables in a database without any independence assumptions. We present NeuroCard, a join cardinality estimator that builds a single neural density estimator over an entire database. Leveraging join sampling and modern deep autoregressive models, NeuroCard makes no inter-table or inter-column independence assumptions in its probabilistic modeling. NeuroCard achieves orders of magnitude higher accuracy than the best prior methods (a new state-of-the-art result of 8.5x maximum error on JOB-light), scales to dozens of tables, while being compact in space (several MBs) and efficient to construct or update (seconds to minutes).
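
The abstract reports accuracy as a "maximum error" multiple but does not define the metric. A minimal sketch, assuming it refers to the multiplicative Q-error commonly used in cardinality estimation; the function name `q_error` and the sample pairs are hypothetical and are not taken from the paper:

```python
def q_error(estimate: float, actual: float) -> float:
    """Multiplicative error factor, always >= 1.

    Both counts are floored at one row so that empty results
    do not cause a division by zero (an assumption of this sketch).
    """
    est = max(estimate, 1.0)
    act = max(actual, 1.0)
    return max(est / act, act / est)


# Hypothetical (estimated, true) cardinalities, not results from the paper.
pairs = [(120.0, 100.0), (10.0, 85.0), (5000.0, 5000.0)]
errors = [q_error(est, act) for est, act in pairs]

# The "maximum error" over a workload is the worst factor across its queries.
max_error = max(errors)
```

Under this reading, a maximum error of 8.5x would mean that the worst estimate in the workload was off by a factor of 8.5, in either direction.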

75 citations

Posted Content

NeuroCard: One Cardinality Estimator for All Tables

Zongheng Yang¹, Amog Kamsetty¹, Sifei Luan¹, Eric Liang¹, Yan Duan, Xi Chen, Ion Stoica¹

University of California, Berkeley¹

15 Jun 2020-arXiv: Databases

TL;DR: This paper presents NeuroCard, a join cardinality estimator that builds a single neural density estimator over an entire database and makes no inter-table or inter-column independence assumptions in its probabilistic modeling.

Abstract: Query optimizers rely on accurate cardinality estimates to produce good execution plans. Despite decades of research, existing cardinality estimators are inaccurate for complex queries, due to making lossy modeling assumptions and not capturing inter-table correlations. In this work, we show that it is possible to learn the correlations across all tables in a database without any independence assumptions. We present NeuroCard, a join cardinality estimator that builds a single neural density estimator over an entire database. Leveraging join sampling and modern deep autoregressive models, NeuroCard makes no inter-table or inter-column independence assumptions in its probabilistic modeling. NeuroCard achieves orders of magnitude higher accuracy than the best prior methods (a new state-of-the-art result of 8.5× maximum error on JOB-light), scales to dozens of tables, while being compact in space (several MBs) and efficient to construct or update (seconds to minutes).

12 citations

Posted Content

Variable Skipping for Autoregressive Range Density Estimation

Eric Liang, Zongheng Yang, Ion Stoica, Pieter Abbeel, Yan Duan, Xi Chen

10 Jul 2020-arXiv: Learning

TL;DR: This paper shows that variable skipping provides 10-100× efficiency improvements when targeting challenging high-quantile error metrics, enables complex applications such as text pattern matching, and can be realized via a simple data augmentation procedure without changing the usual maximum likelihood objective.

Abstract: Deep autoregressive models compute point likelihood estimates of individual data points. However, many applications (e.g., database cardinality estimation) require estimating range densities, a capability that is under-explored in the current neural density estimation literature. In these applications, fast and accurate range density estimates over high-dimensional data directly impact user-perceived performance. In this paper, we explore a technique, variable skipping, for accelerating range density estimation over deep autoregressive models. This technique exploits the sparse structure of range density queries to avoid sampling unnecessary variables during approximate inference. We show that variable skipping provides 10-100× efficiency improvements when targeting challenging high-quantile error metrics, enables complex applications such as text pattern matching, and can be realized via a simple data augmentation procedure without changing the usual maximum likelihood objective.
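
The difference between a point likelihood and a range density can be illustrated with a toy model. This is a minimal sketch only, not the paper's method: the two-variable model, its probability tables, and the names `point_density` and `range_density` are hypothetical, and the range density is computed by exact enumeration rather than by approximate inference.

```python
from itertools import product

# Hypothetical toy autoregressive model over two binary variables,
# factorized by the chain rule as p(x1) * p(x2 | x1).
P_X1 = {0: 0.6, 1: 0.4}
P_X2_GIVEN_X1 = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}


def point_density(x1: int, x2: int) -> float:
    """Likelihood of a single data point under the chain-rule factorization."""
    return P_X1[x1] * P_X2_GIVEN_X1[x1][x2]


def range_density(r1, r2) -> float:
    """Total probability mass of all points with x1 in r1 and x2 in r2."""
    return sum(point_density(a, b) for a, b in product(r1, r2))


# A query that constrains only x1 leaves x2 free to take any value.
constrained_x1_only = range_density([1], [0, 1])
```

In this toy case the unconstrained variable sums out to one, so `constrained_x1_only` equals `P_X1[1]` and `x2` never needed to be visited. This only hints at the sparse query structure the abstract mentions; the procedure described in the paper operates on sampled estimates over deep models and is not reproduced here.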

5 citations

Posted Content

Alzheimer's pathology is associated with dedifferentiation of functional memory networks in aging

Kaitlin Cassady¹, Jenna N. Adams², Xi Chen², Anne Maass, Theresa M. Harrison², Susan M. Landau², Suzanne L. Baker², William J. Jagust²

Lawrence Berkeley National Laboratory¹, University of California, Berkeley²

15 Oct 2020-bioRxiv

TL;DR: It was found that AT and PM networks were less segregated in older than younger adults and this reduced specialization was associated with more tau and Aβ in the same regions, suggesting a compensation phase followed by a degenerative phase in the early, preclinical phase of AD.

Abstract: In presymptomatic Alzheimer's disease (AD), beta-amyloid plaques and tau tangles accumulate in distinct spatiotemporal patterns within the brain, tracking closely with episodic memory decline. Here, we tested whether age-related changes in the segregation of the brain's functional episodic memory networks - anterior-temporal (AT) and posterior-medial (PM) networks - are associated with the accumulation of beta-amyloid, tau and memory decline using fMRI and PET. We found that AT and PM networks were less segregated in older than younger adults and this reduced specialization was associated with more tau and beta-amyloid in the same regions. The effect of network dedifferentiation on memory depended on the amount of beta-amyloid and tau, with low segregation and pathology associated with better performance at baseline and low segregation and high pathology related to worse performance over time. This pattern suggests a compensation phase followed by a degenerative phase in the early, preclinical phase of AD.
