Author

Babak Shahbaba

Bio: Babak Shahbaba is an academic researcher from the University of California, Irvine. The author has contributed to research in topics including Hybrid Monte Carlo & Markov chain Monte Carlo. The author has an h-index of 24 and has co-authored 98 publications receiving 2,271 citations. Previous affiliations of Babak Shahbaba include the University of Toronto & Stanford University.


Papers
Journal ArticleDOI
TL;DR: Higher maternal cortisol levels in early gestation were associated with more affective problems in girls, and this association was mediated, in part, by amygdala volume; no association between maternal cortisol in pregnancy and child hippocampus volume was observed in either sex.
Abstract: Stress-related variation in the intrauterine milieu may impact brain development and emergent function, with long-term implications in terms of susceptibility for affective disorders. Studies in animals suggest limbic regions in the developing brain are particularly sensitive to exposure to the stress hormone cortisol. However, the nature, magnitude, and time course of these effects have not yet been adequately characterized in humans. A prospective, longitudinal study was conducted in 65 normal, healthy mother–child dyads to examine the association of maternal cortisol in early, mid-, and late gestation with subsequent measures at approximately 7 years of age of child amygdala and hippocampus volume and affective problems. After accounting for the effects of potential confounding pre- and postnatal factors, higher maternal cortisol levels in earlier but not later gestation were associated with a larger right amygdala volume in girls (a 1 SD increase in cortisol was associated with a 6.4% increase in right amygdala volume), but not in boys. Moreover, higher maternal cortisol levels in early gestation were associated with more affective problems in girls, and this association was mediated, in part, by amygdala volume. No association between maternal cortisol in pregnancy and child hippocampus volume was observed in either sex. The current findings represent, to the best of our knowledge, the first report linking maternal stress hormone levels in human pregnancy with subsequent child amygdala volume and affect. The results underscore the importance of the intrauterine environment and suggest the origins of neuropsychiatric disorders may have their foundations early in life.

503 citations

Journal Article
TL;DR: In this article, Dirichlet process mixtures are used to model the joint distribution of the response variable, y, and the covariates, x, non-parametrically.
Abstract: We introduce a new nonlinear model for classification, in which we model the joint distribution of response variable, y, and covariates, x, non-parametrically using Dirichlet process mixtures. We keep the relationship between y and x linear within each component of the mixture. The overall relationship becomes nonlinear if the mixture contains more than one component, with different regression coefficients. We use simulated data to compare the performance of this new approach to alternative methods such as multinomial logit (MNL) models, decision trees, and support vector machines. We also evaluate our approach on two classification problems: identifying the folding class of protein sequences and detecting Parkinson's disease. Our model can sometimes improve predictive accuracy. Moreover, by grouping observations into sub-populations (i.e., mixture components), our model can sometimes provide insight into hidden structure in the data.
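The key idea, a nonlinear overall fit built from locally linear components, can be sketched with a finite Gaussian mixture standing in for the paper's nonparametric Dirichlet process prior. The data and the fixed two-component setup below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Illustrative piecewise-linear data: two sub-populations with different slopes.
x = rng.uniform(-3, 3, size=400)
y = np.where(x < 0, 2.0 * x + 1.0, -1.5 * x + 1.0) + 0.1 * rng.standard_normal(400)

# Model the JOINT distribution of (x, y) with a two-component Gaussian mixture;
# within each component the conditional mean E[y | x] is linear in x.
gm = GaussianMixture(n_components=2, random_state=0).fit(np.column_stack([x, y]))

def predict(x_new):
    """Average the per-component linear predictors, weighted by each
    component's responsibility for x_new (its marginal density in x)."""
    preds, weights = [], []
    for k in range(gm.n_components):
        mu, cov = gm.means_[k], gm.covariances_[k]
        preds.append(mu[1] + (cov[0, 1] / cov[0, 0]) * (x_new - mu[0]))
        weights.append(gm.weights_[k]
                       * np.exp(-0.5 * (x_new - mu[0]) ** 2 / cov[0, 0])
                       / np.sqrt(cov[0, 0]))
    weights = np.array(weights)
    return (weights * np.array(preds)).sum(axis=0) / weights.sum(axis=0)

print(predict(np.array([-2.0, 2.0])))  # close to the true values (-3.0, -2.0)
```

Each component contributes a linear predictor, so the blended prediction is nonlinear across components, which is the mechanism the abstract describes; the Dirichlet process version additionally lets the number of components grow with the data.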

261 citations

Journal ArticleDOI
TL;DR: The finding provides the first preliminary evidence in human beings that maternal psychological stress during pregnancy may exert a "programming" effect on the developing telomere biology system that is already apparent at birth, as reflected by the setting of newborn leukocyte telomere length (LTL).

198 citations

Journal ArticleDOI
TL;DR: It is hypothesized that a multivariate approach incorporating measures of neural function, neural injury, and clinical status would have the greatest predictive value when treating human subjects with restorative therapies poststroke.
Abstract: Objective This study was undertaken to better understand the high variability in response seen when treating human subjects with restorative therapies poststroke. Preclinical studies suggest that neural function, neural injury, and clinical status each influence treatment gains; therefore, the current study hypothesized that a multivariate approach incorporating these 3 measures would have the greatest predictive value. Methods Patients 3 to 6 months poststroke underwent a battery of assessments before receiving 3 weeks of standardized upper extremity robotic therapy. Candidate predictors included measures of brain injury (including to gray and white matter), neural function (cortical function and cortical connectivity), and clinical status (demographics/medical history, cognitive/mood, and impairment). Results Among all 29 patients, predictors of treatment gains identified measures of brain injury (smaller corticospinal tract [CST] injury), cortical function (greater ipsilesional motor cortex [M1] activation), and cortical connectivity (greater interhemispheric M1–M1 connectivity). Multivariate modeling found that best prediction was achieved using both CST injury and M1–M1 connectivity (r2 = 0.44, p = 0.002), a result confirmed using Lasso regression. A threshold was defined whereby no subject with >63% CST injury achieved clinically significant gains. Results differed according to stroke subtype; gains in patients with lacunar stroke were best predicted by a measure of intrahemispheric connectivity. Interpretation Response to a restorative therapy after stroke is best predicted by a model that includes measures of both neural injury and function. Neuroimaging measures were the best predictors and may have an ascendant role in clinical decision making for poststroke rehabilitation, which remains largely reliant on behavioral assessments. Results differed across stroke subtypes, suggesting the utility of lesion-specific strategies. 
ANN NEUROL 2015;77:132–145
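The multivariate prediction step, which the study confirmed using Lasso regression, can be sketched as follows. The simulated predictors, effect sizes, and noise level are hypothetical stand-ins, not the study's data:

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(1)
n = 29  # cohort size reported in the abstract
# Hypothetical stand-ins for candidate predictors (not the study's measurements):
cst_injury = rng.uniform(0, 100, n)        # % corticospinal tract (CST) injury
m1_connectivity = rng.uniform(0, 1, n)     # interhemispheric M1-M1 connectivity
noise_feature = rng.standard_normal(n)     # an uninformative clinical covariate

# Simulated treatment gain: decreases with CST injury, increases with connectivity.
gain = -0.05 * cst_injury + 4.0 * m1_connectivity + 0.3 * rng.standard_normal(n)

X = np.column_stack([cst_injury, m1_connectivity, noise_feature])
model = LassoCV(cv=5).fit(X, gain)

# The L1 penalty shrinks uninformative coefficients toward zero, retaining the
# injury and connectivity measures as joint predictors of treatment gains.
print(model.coef_)
```

With a strong simulated signal, the fitted coefficients keep the expected signs (negative for injury, positive for connectivity), mirroring the direction of the predictors reported in the abstract.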

175 citations

Proceedings Article
21 Jun 2014
TL;DR: This work argues that stochastic gradient MCMC algorithms are particularly suited for distributed inference because individual chains can draw mini-batches from their local pool of data for a flexible amount of time before jumping to or syncing with other chains, which greatly reduces communication overhead and allows adaptive load balancing.
Abstract: Probabilistic inference on a big data scale is becoming increasingly relevant to both the machine learning and statistics communities. Here we introduce the first fully distributed MCMC algorithm based on stochastic gradients. We argue that stochastic gradient MCMC algorithms are particularly suited for distributed inference because individual chains can draw mini-batches from their local pool of data for a flexible amount of time before jumping to or syncing with other chains. This greatly reduces communication overhead and allows adaptive load balancing. Our experiments for LDA on Wikipedia and Pubmed show that relative to the state of the art in distributed MCMC we reduce compute time from 27 hours to half an hour in order to reach the same perplexity level.
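The stochastic gradient MCMC family this paper builds on can be illustrated with a minimal single-chain stochastic gradient Langevin dynamics (SGLD) sampler for a Gaussian mean. The data, step size, and batch size are illustrative choices; the paper's contribution, distributing such chains with local mini-batches and asynchronous syncing, is not reproduced here:

```python
import numpy as np

rng = np.random.default_rng(2)
# Data: draws from N(true_mu, 1); we sample the posterior over mu (flat prior)
# using SGLD, which needs only mini-batch gradient estimates per step.
true_mu = 3.0
data = rng.normal(true_mu, 1.0, size=10_000)

N, batch = len(data), 100
eps = 1e-4          # step size
mu = 0.0            # initial state of the chain
samples = []
for t in range(5_000):
    mb = rng.choice(data, size=batch, replace=False)   # local mini-batch
    # Unbiased estimate of the full-data log-posterior gradient.
    grad = (N / batch) * np.sum(mb - mu)
    # Langevin update: half a gradient step plus injected Gaussian noise.
    mu += 0.5 * eps * grad + np.sqrt(eps) * rng.standard_normal()
    samples.append(mu)

posterior_mean = np.mean(samples[1_000:])  # discard burn-in
print(posterior_mean)
```

Because each update touches only a mini-batch, a chain can run on its local shard of data for as long as it likes before communicating, which is exactly the property the abstract argues makes this family suited to distributed inference.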

109 citations


Cited by
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, handwriting recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules.
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).
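The mail-filtering example can be sketched as a tiny naive Bayes word model learned from messages a hypothetical user kept or rejected. All messages below are made up for illustration:

```python
import math
from collections import Counter

# Toy training data: messages a hypothetical user kept or rejected.
kept = ["project meeting notes", "lunch tomorrow", "quarterly report draft"]
rejected = ["win money now", "cheap money offer", "win a free offer"]

def word_counts(msgs):
    c = Counter()
    for m in msgs:
        c.update(m.split())
    return c

kept_c, rej_c = word_counts(kept), word_counts(rejected)
kept_n, rej_n = sum(kept_c.values()), sum(rej_c.values())
vocab = set(kept_c) | set(rej_c)

def spam_score(msg):
    """Log-odds of 'rejected' vs 'kept' under a naive Bayes word model
    with add-one smoothing; positive means filter the message."""
    score = 0.0
    for w in msg.split():
        p_rej = (rej_c[w] + 1) / (rej_n + len(vocab))
        p_kept = (kept_c[w] + 1) / (kept_n + len(vocab))
        score += math.log(p_rej / p_kept)
    return score

print(spam_score("win free money"))   # positive: resembles rejected mail
print(spam_score("meeting notes"))    # negative: resembles kept mail
```

As the passage notes, the filter personalizes itself: retraining the counts on a different user's kept/rejected messages yields different rules with no reprogramming.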

13,246 citations

Journal ArticleDOI

6,278 citations

Journal Article
TL;DR: The No-U-Turn Sampler (NUTS) is introduced, an extension to HMC that eliminates the need to set a number of steps L; a method is also derived for adapting the step size parameter ε on the fly based on primal-dual averaging.
Abstract: Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo (MCMC) algorithm that avoids the random walk behavior and sensitivity to correlated parameters that plague many MCMC methods by taking a series of steps informed by first-order gradient information. These features allow it to converge to high-dimensional target distributions much more quickly than simpler methods such as random walk Metropolis or Gibbs sampling. However, HMC's performance is highly sensitive to two user-specified parameters: a step size ε and a desired number of steps L. In particular, if L is too small then the algorithm exhibits undesirable random walk behavior, while if L is too large the algorithm wastes computation. We introduce the No-U-Turn Sampler (NUTS), an extension to HMC that eliminates the need to set a number of steps L. NUTS uses a recursive algorithm to build a set of likely candidate points that spans a wide swath of the target distribution, stopping automatically when it starts to double back and retrace its steps. Empirically, NUTS performs at least as efficiently as (and sometimes more efficiently than) a well tuned standard HMC method, without requiring user intervention or costly tuning runs. We also derive a method for adapting the step size parameter ε on the fly based on primal-dual averaging. NUTS can thus be used with no hand-tuning at all, making it suitable for applications such as BUGS-style automatic inference engines that require efficient "turnkey" samplers.
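The two tuning parameters that NUTS removes can be seen in a minimal HMC sampler for a standard normal target. This is plain HMC with hand-picked ε (eps) and L, not the NUTS recursion itself; the target and tuning values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

def log_prob_grad(q):
    """Standard-normal target: log p(q) = -q^2/2 (up to a constant), grad = -q."""
    return -0.5 * q**2, -q

def hmc_step(q, eps, L):
    """One HMC transition: L leapfrog steps of size eps, then accept/reject.
    eps and L are exactly the parameters a user of plain HMC must hand-tune."""
    p = rng.standard_normal()                 # resample momentum
    logp, grad = log_prob_grad(q)
    H0 = -logp + 0.5 * p**2                   # initial Hamiltonian (energy)
    q_new, p_new = q, p
    p_new += 0.5 * eps * grad                 # half-step for momentum
    for _ in range(L):
        q_new += eps * p_new                  # full position step
        logp, grad = log_prob_grad(q_new)
        p_new += eps * grad                   # full momentum step
    p_new -= 0.5 * eps * grad                 # undo the extra half-step
    H1 = -logp + 0.5 * p_new**2
    if rng.random() < np.exp(H0 - H1):        # Metropolis correction
        return q_new
    return q

samples, q = [], 0.0
for _ in range(5_000):
    q = hmc_step(q, eps=0.3, L=10)
    samples.append(q)
print(np.mean(samples), np.std(samples))      # near 0 and 1 for this target
```

Too few leapfrog steps would make successive samples barely move (random walk behavior); too many would waste gradient evaluations retracing the trajectory, which is the tradeoff NUTS's automatic stopping rule resolves.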

1,988 citations

Journal ArticleDOI

1,484 citations