
Showing papers by "Xi Chen" published in 2020


Journal Article
01 Sep 2020
TL;DR: This work shows that it is possible to learn the correlations across all tables in a database without any independence assumptions, and presents NeuroCard, a join cardinality estimator that builds a single neural density estimator over an entire database.
Abstract: Query optimizers rely on accurate cardinality estimates to produce good execution plans. Despite decades of research, existing cardinality estimators are inaccurate for complex queries, due to making lossy modeling assumptions and not capturing inter-table correlations. In this work, we show that it is possible to learn the correlations across all tables in a database without any independence assumptions. We present NeuroCard, a join cardinality estimator that builds a single neural density estimator over an entire database. Leveraging join sampling and modern deep autoregressive models, NeuroCard makes no inter-table or inter-column independence assumptions in its probabilistic modeling. NeuroCard achieves orders of magnitude higher accuracy than the best prior methods (a new state-of-the-art result of 8.5x maximum error on JOB-light), scales to dozens of tables, while being compact in space (several MBs) and efficient to construct or update (seconds to minutes).

75 citations
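
The "8.5x maximum error" quoted in the abstract above is the kind of figure usually reported as Q-error, the multiplicative factor by which an estimate misses the true cardinality in either direction. Assuming that is the metric meant here, a minimal sketch follows; the function names and the clamp to a minimum cardinality of 1 are illustrative assumptions, not taken from the paper.

```python
def q_error(estimated: float, actual: float) -> float:
    """Multiplicative error between an estimated and a true cardinality.

    Both values are clamped to at least 1 (an assumed convention) so that
    empty results do not cause a division by zero.
    """
    est = max(estimated, 1.0)
    act = max(actual, 1.0)
    return max(est / act, act / est)


def max_q_error(pairs):
    """Worst-case Q-error over (estimated, actual) pairs from a workload."""
    return max(q_error(est, act) for est, act in pairs)


# Hypothetical workload: over- and under-estimates are penalized symmetrically.
workload = [(850.0, 100.0), (40.0, 50.0), (1000.0, 1000.0)]
worst = max_q_error(workload)
```

Because the metric is symmetric, an estimate of 850 for a true count of 100 and an estimate of 100 for a true count of 850 score the same.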


Posted Content
TL;DR: NeuroCard is a join cardinality estimator that builds a single neural density estimator over an entire database and makes no inter-table or inter-column independence assumptions in its probabilistic modeling.
Abstract: Query optimizers rely on accurate cardinality estimates to produce good execution plans. Despite decades of research, existing cardinality estimators are inaccurate for complex queries, due to making lossy modeling assumptions and not capturing inter-table correlations. In this work, we show that it is possible to learn the correlations across all tables in a database without any independence assumptions. We present NeuroCard, a join cardinality estimator that builds a single neural density estimator over an entire database. Leveraging join sampling and modern deep autoregressive models, NeuroCard makes no inter-table or inter-column independence assumptions in its probabilistic modeling. NeuroCard achieves orders of magnitude higher accuracy than the best prior methods (a new state-of-the-art result of 8.5$\times$ maximum error on JOB-light), scales to dozens of tables, while being compact in space (several MBs) and efficient to construct or update (seconds to minutes).

12 citations


Posted Content
TL;DR: This paper shows that variable skipping provides 10-100$\times$ efficiency improvements when targeting challenging high-quantile error metrics, enables complex applications such as text pattern matching, and can be realized via a simple data augmentation procedure without changing the usual maximum likelihood objective.
Abstract: Deep autoregressive models compute point likelihood estimates of individual data points. However, many applications (e.g., database cardinality estimation) require estimating range densities, a capability that is under-explored by current neural density estimation literature. In these applications, fast and accurate range density estimates over high-dimensional data directly impact user-perceived performance. In this paper, we explore a technique, variable skipping, for accelerating range density estimation over deep autoregressive models. This technique exploits the sparse structure of range density queries to avoid sampling unnecessary variables during approximate inference. We show that variable skipping provides 10-100$\times$ efficiency improvements when targeting challenging high-quantile error metrics, enables complex applications such as text pattern matching, and can be realized via a simple data augmentation procedure without changing the usual maximum likelihood objective.

5 citations
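
The idea in the abstract above, avoiding sampling of variables that a range query leaves unconstrained, can be illustrated with a toy sketch. This is not the paper's implementation: a small explicit joint probability table stands in for the neural autoregressive model, and exact marginalization stands in for whatever the trained model does with a skipped input. The names `conditional` and `range_density` and the query encoding (a boolean mask per column, `None` for an unconstrained column) are assumptions made for illustration.

```python
import numpy as np


def conditional(joint, fixed, i):
    """Return p(x_i | fixed), summing out every column not in `fixed`.

    `fixed` maps a column index to its sampled value. Axes are processed
    from last to first so that removing one does not shift earlier ones.
    """
    sub = joint
    for ax in reversed(range(joint.ndim)):
        if ax == i:
            continue
        if ax in fixed:
            sub = np.take(sub, fixed[ax], axis=ax)
        else:
            sub = sub.sum(axis=ax)
    return sub / sub.sum()


def range_density(joint, regions, n_samples, rng):
    """Sampling-based estimate of P(x_i in regions[i] for every column i)."""
    total = 0.0
    for _ in range(n_samples):
        fixed, prob = {}, 1.0
        for i, region in enumerate(regions):
            if region is None:
                continue  # skipped: an unconstrained column is never sampled
            cond = conditional(joint, fixed, i)
            in_range = cond * region  # zero out values outside the range
            mass = in_range.sum()
            prob *= mass
            if mass == 0.0:
                break
            # Sample the column from its conditional restricted to the range.
            fixed[i] = rng.choice(len(cond), p=in_range / mass)
        total += prob
    return total / n_samples


rng = np.random.default_rng(0)
joint = rng.random((4, 5, 6))
joint /= joint.sum()  # toy joint distribution over three discrete columns

# Query: x0 >= 2 AND x2 < 3, with x1 left unconstrained.
regions = [np.arange(4) >= 2, None, np.arange(6) < 3]
estimate = range_density(joint, regions, 2000, rng)
exact = joint[2:, :, :3].sum()
```

In this toy setting the saving is one skipped sampling step per unconstrained column; the estimate can be checked against the exact range mass because the full joint table is available.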


Posted Content
15 Oct 2020 - bioRxiv
TL;DR: It was found that AT and PM networks were less segregated in older than younger adults and this reduced specialization was associated with more tau and Aβ in the same regions, suggesting a compensation phase followed by a degenerative phase in the early, preclinical phase of AD.
Abstract: In presymptomatic Alzheimer's disease (AD), beta-amyloid plaques and tau tangles accumulate in distinct spatiotemporal patterns within the brain, tracking closely with episodic memory decline. Here, we tested whether age-related changes in the segregation of the brain's functional episodic memory networks - anterior-temporal (AT) and posterior-medial (PM) networks - are associated with the accumulation of beta-amyloid, tau and memory decline using fMRI and PET. We found that AT and PM networks were less segregated in older than younger adults and this reduced specialization was associated with more tau and beta-amyloid in the same regions. The effect of network dedifferentiation on memory depended on the amount of beta-amyloid and tau, with low segregation and pathology associated with better performance at baseline and low segregation and high pathology related to worse performance over time. This pattern suggests a compensation phase followed by a degenerative phase in the early, preclinical phase of AD.