
Showing papers on "Latent variable model published in 2020"


Proceedings Article
30 Apr 2020
TL;DR: Dreamer is presented, a reinforcement learning agent that solves long-horizon tasks purely by latent imagination and efficiently learns behaviors by backpropagating analytic gradients of learned state values through trajectories imagined in the compact state space of a learned world model.
Abstract: To select effective actions in complex environments, intelligent agents need to generalize from past experience. World models can represent knowledge about the environment to facilitate such generalization. While learning world models from high-dimensional sensory inputs is becoming feasible through deep learning, there are many potential ways for deriving behaviors from them. We present Dreamer, a reinforcement learning agent that solves long-horizon tasks purely by latent imagination. We efficiently learn behaviors by backpropagating analytic gradients of learned state values through trajectories imagined in the compact state space of a learned world model. On 20 challenging visual control tasks, Dreamer exceeds existing approaches in data-efficiency, computation time, and final performance.

604 citations
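
The mechanism described above is compact enough to sketch. Below is a minimal, hypothetical illustration in PyTorch, not the authors' implementation: every module, size, and hyperparameter is an invented stand-in (including the untrained "world model"); the point is only the gradient path, in which the actor is optimized entirely through differentiable imagined rollouts.

import torch
import torch.nn as nn

latent_dim, action_dim, horizon, gamma = 32, 4, 15, 0.99

dynamics = nn.GRUCell(action_dim, latent_dim)   # stand-in for the learned latent transition model
reward_model = nn.Linear(latent_dim, 1)         # stand-in for the learned reward head
actor = nn.Sequential(nn.Linear(latent_dim, 64), nn.ELU(),
                      nn.Linear(64, action_dim), nn.Tanh())
value = nn.Sequential(nn.Linear(latent_dim, 64), nn.ELU(), nn.Linear(64, 1))

actor_opt = torch.optim.Adam(actor.parameters(), lr=3e-4)
value_opt = torch.optim.Adam(value.parameters(), lr=3e-4)

start = torch.randn(16, latent_dim)             # batch of starting latent states

# Imagine a trajectory purely in latent space and accumulate discounted reward.
ret, s = 0.0, start
for t in range(horizon):
    a = actor(s)
    s = dynamics(a, s)                          # differentiable latent transition
    ret = ret + (gamma ** t) * reward_model(s)
ret = ret + (gamma ** horizon) * value(s)       # bootstrap with the learned value

actor_loss = -ret.mean()                        # analytic gradients flow back through the rollout
actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

value_loss = (value(start) - ret.detach()).pow(2).mean()
value_opt.zero_grad(); value_loss.backward(); value_opt.step()

Note that the environment never appears in the actor update: all learning signal comes from the imagined latent trajectory, which is the idea the abstract describes.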


Journal ArticleDOI
TL;DR: In this paper, a practical guide to conducting latent profile analysis (LPA) in the Mplus software system is presented, intended for researchers familiar with some latent variable modeling.
Abstract: The present article provides a practical guide to conducting latent profile analysis (LPA) in the Mplus software system. This guide is intended for researchers familiar with some latent variable mode...

205 citations
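
For readers without access to Mplus: with continuous indicators, a latent profile model is formally a finite Gaussian mixture, so open-source tools can approximate the workflow such a guide describes. The sketch below uses scikit-learn as a hedged stand-in; the data, candidate profile counts, and diagonal-covariance choice (local independence) are all illustrative assumptions.

import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 3)), rng.normal(3, 1, (100, 3))])  # two toy profiles

models = {k: GaussianMixture(n_components=k, covariance_type="diag", random_state=0).fit(X)
          for k in (1, 2, 3)}
for k, m in models.items():
    print(k, "BIC:", round(m.bic(X), 1))          # compare profile solutions, as in LPA

best = models[2]
posterior = best.predict_proba(X)                 # per-person profile probabilities
assignment = posterior.argmax(axis=1)             # modal profile assignment

Comparing BIC across the candidate number of profiles mirrors the usual class-enumeration step, and predict_proba plays the role of the posterior profile probabilities reported by Mplus.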


Journal ArticleDOI
TL;DR: The goal is to encourage researchers to more critically evaluate how they obtain, justify, and use multiple-item scale scores and to raise awareness that sum scoring requires rather strict constraints.
Abstract: A common way to form scores from multiple-item scales is to sum responses of all items. Though sum scoring is often contrasted with factor analysis as a competing method, we review how factor analysis and sum scoring both fall under the larger umbrella of latent variable models, with sum scoring being a constrained version of a factor analysis. Despite similarities, reporting of psychometric properties for sum-scored or factor-analyzed scales is quite different. Further, if researchers use factor analysis to validate a scale but subsequently sum score the scale, this employs a model that differs from the validation model. By framing sum scoring within a latent variable framework, our goal is to raise awareness that (a) sum scoring requires rather strict constraints, (b) imposing these constraints requires the same type of justification as any other latent variable model, and (c) sum scoring corresponds to a statistical model and is not a model-free arithmetic calculation. We discuss how unjustified sum scoring can have adverse effects on validity, reliability, and qualitative classification from sum score cut-offs. We also discuss considerations for how to use scale scores in subsequent analyses and how these choices can alter conclusions. The general goal is to encourage researchers to more critically evaluate how they obtain, justify, and use multiple-item scale scores.

185 citations
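
The paper's central observation can be restated in two display lines (the notation below is ours, not the authors'):

x_{ij} = \tau_j + \lambda_j \eta_i + \epsilon_{ij}   % congeneric factor model, item j, person i
S_i = \sum_{j=1}^{J} x_{ij}                          % the sum score over J items

Treating S_i as a stand-in for the latent score \eta_i corresponds to constraining \lambda_1 = \cdots = \lambda_J (and, for the usual reliability claims, equal residual variances as well, i.e., a parallel model). These are the "rather strict constraints" the abstract refers to: they are testable, and they require the same justification as any other latent variable model.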


Proceedings Article
30 Apr 2020
TL;DR: The proposed sequential latent variable model can keep track of the prior and posterior distributions over knowledge, and can not only reduce the ambiguity caused by the diversity of knowledge selection in conversation but also better leverage the response information for a proper choice of knowledge.
Abstract: Knowledge-grounded dialogue is a task of generating an informative response based on both discourse context and external knowledge. As we focus on better modeling the knowledge selection in multi-turn knowledge-grounded dialogue, we propose a sequential latent variable model as the first approach to this matter. The model, named sequential knowledge transformer (SKT), can keep track of the prior and posterior distributions over knowledge; as a result, it can not only reduce the ambiguity caused by the diversity of knowledge selection in conversation but also better leverage the response information for a proper choice of knowledge. Our experimental results show that the proposed model improves the knowledge selection accuracy and subsequently the performance of utterance generation. We achieve new state-of-the-art performance on Wizard of Wikipedia (Dinan et al., 2019), one of the largest and most challenging benchmarks. We further validate the effectiveness of our model over existing conversation methods on another knowledge-grounded dialogue dataset, Holl-E (Moghe et al., 2018).

117 citations


Proceedings Article
03 Jun 2020
TL;DR: This work proposes a new deep sequential latent variable model for dimensionality reduction and data imputation of multivariate time series from the domains of computer vision and healthcare, and demonstrates that this approach outperforms several classical and deep learning-based data imputation methods on high-dimensional data.
Abstract: Multivariate time series with missing values are common in areas such as healthcare and finance, and have grown in number and complexity over the years. This raises the question whether deep learning methodologies can outperform classical data imputation methods in this domain. However, naive applications of deep learning fall short in giving reliable confidence estimates and lack interpretability. We propose a new deep sequential latent variable model for dimensionality reduction and data imputation. Our modeling assumption is simple and interpretable: the high-dimensional time series has a lower-dimensional representation which evolves smoothly in time according to a Gaussian process. The non-linear dimensionality reduction in the presence of missing data is achieved using a VAE approach with a novel structured variational approximation. We demonstrate that our approach outperforms several classical and deep learning-based data imputation methods on high-dimensional data from the domains of computer vision and healthcare, while additionally improving the smoothness of the imputations and providing interpretable uncertainty estimates.

105 citations
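
The modeling assumption is concrete enough to simulate directly. The numpy sketch below is ours (a random linear map stands in for the paper's neural decoder, and all sizes are invented): it generates data exactly as the model assumes, with a low-dimensional latent path drawn from a smooth Gaussian process, lifted to high dimension, then partially masked. A model like the one described above inverts this process, inferring the smooth latents from the incomplete series to impute the gaps with uncertainty.

import numpy as np

rng = np.random.default_rng(1)
T, latent_dim, obs_dim = 100, 2, 20
t = np.linspace(0, 1, T)

K = np.exp(-0.5 * (t[:, None] - t[None, :]) ** 2 / 0.05 ** 2)   # RBF kernel: smooth in time
L = np.linalg.cholesky(K + 1e-6 * np.eye(T))
z = L @ rng.normal(size=(T, latent_dim))                        # z ~ GP, one path per latent dim

W = rng.normal(size=(latent_dim, obs_dim))
x = z @ W + 0.1 * rng.normal(size=(T, obs_dim))                 # high-dimensional observations

mask = rng.random(x.shape) < 0.3                                # 30% missing at random
x_missing = np.where(mask, np.nan, x)
# Inference would run the other way: encode x_missing to a smooth latent path,
# then decode to fill the NaNs with calibrated uncertainty.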


Proceedings Article
01 Jan 2020
TL;DR: The stochastic latent actor-critic (SLAC) algorithm, as discussed by the authors, is a sample-efficient and high-performing RL algorithm for learning policies for complex continuous control tasks directly from high-dimensional image inputs.
Abstract: Deep reinforcement learning (RL) algorithms can use high-capacity deep networks to learn directly from image observations. However, these high-dimensional observation spaces present a number of challenges in practice, since the policy must now solve two problems: representation learning and task learning. In this work, we tackle these two problems separately, by explicitly learning latent representations that can accelerate reinforcement learning from images. We propose the stochastic latent actor-critic (SLAC) algorithm: a sample-efficient and high-performing RL algorithm for learning policies for complex continuous control tasks directly from high-dimensional image inputs. SLAC provides a novel and principled approach for unifying stochastic sequential models and RL into a single method, by learning a compact latent representation and then performing RL in the model's learned latent space. Our experimental evaluation demonstrates that our method outperforms both model-free and model-based alternatives in terms of final performance and sample efficiency, on a range of difficult image-based control tasks. Our code and videos of our results are available at our website.

86 citations


Journal ArticleDOI
TL;DR: This work formulates the road traffic forecasting problem as a latent variable model, assuming that traffic data are generated not randomly but from a latent space with fewer dimensions containing the underlying characteristics of traffic, and proposes a variational autoencoder (VAE) model to learn how traffic data are generated and inferred.
Abstract: Efforts devoted to mitigating the effects of road traffic congestion have been conducted since the 1970s. Nowadays, there is a need for prominent solutions capable of mining information from messy and multidimensional road traffic data sets with few modeling constraints. In that sense, we propose a unique and versatile model to address different major challenges of traffic forecasting in an unsupervised manner. We formulate the road traffic forecasting problem as a latent variable model, assuming that traffic data are not generated randomly but from a latent space with fewer dimensions containing the underlying characteristics of traffic. We solve the problem by proposing a variational autoencoder (VAE) model to learn how traffic data are generated and inferred, while validating it against three different real-world traffic data sets. Under this framework, we propose an online unsupervised imputation method for unobserved traffic data with missing values. Additionally, taking advantage of the low-dimensional latent space learned, we compress the traffic data before applying a prediction model, obtaining improvements in the forecasting accuracy. Finally, given that the model learns not only useful forecasting features but also meaningful characteristics, we explore the latent space as a tool for model and data selection and traffic anomaly detection from the point of view of traffic modelers.

82 citations


Posted ContentDOI
21 Oct 2020-bioRxiv
TL;DR: Analysis of data from two mouse decision-making experiments found that choice behavior relies on an interplay between multiple interleaved strategies, characterized by states in a hidden Markov model, which persist for tens to hundreds of trials before switching, and may alternate multiple times within a session.
Abstract: Classical models of perceptual decision-making assume that animals use a single, consistent strategy to integrate sensory evidence and form decisions during an experiment. Here we provide analyses showing that this common view is incorrect. We use a latent variable modeling framework to show that decision-making behavior in mice reflects an interplay between different strategies that alternate on a timescale of tens to hundreds of trials. This model provides a powerful alternate explanation for “lapses” commonly observed during psychophysical experiments. Formally, our approach consists of a Hidden Markov Model (HMM) with states corresponding to different decision-making strategies, each parameterized by a distinct Bernoulli generalized linear model (GLM). We fit the resulting model (GLM-HMM) to choice data from two large cohorts of mice in different perceptual decision-making tasks. For both datasets, we found that mouse decision-making was far better described by a GLM-HMM with 3 or 4 states than by a traditional psychophysical model with lapses. The identified states were highly consistent across animals, consisting of a single “engaged” state, in which the strategy relied heavily on the sensory stimulus, and multiple biased or disengaged states in which accuracy was low. These states persisted for many trials, suggesting that lapses were not independent, but reflected state dynamics in which animals were relatively engaged or disengaged for extended periods of time. We found that for most animals, response times and violation rates were positively correlated with disengagement, providing independent correlates of the identified changes in strategy. The GLM-HMM framework thus provides a powerful lens for the analysis of decision-making, and suggests that standard measures of psychophysical performance mask the presence of slow but dramatic alternations in strategy across trials.

66 citations
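
Because the model composes two standard pieces, its likelihood is short to write down. Below is a toy numpy implementation of the scaled forward algorithm for an HMM whose per-state emissions are Bernoulli GLMs over the binary choice; the weights, transition matrix, and data are invented for illustration (one stimulus-driven "engaged" state and one "biased" state, with sticky transitions so states persist for many trials, as in the paper).

import numpy as np

def glm_hmm_loglik(X, y, W, A, pi0):
    """X: (T, D) design matrix, y: (T,) choices in {0,1},
    W: (K, D) per-state GLM weights, A: (K, K) transitions, pi0: (K,) initial."""
    T, K = len(y), len(pi0)
    p = 1.0 / (1.0 + np.exp(-X @ W.T))          # (T, K): P(y=1 | x_t, state k)
    lik = np.where(y[:, None] == 1, p, 1 - p)   # per-trial emission likelihoods
    alpha, logZ = pi0 * lik[0], 0.0
    for t in range(1, T):                        # forward algorithm with scaling
        c = alpha.sum(); alpha /= c; logZ += np.log(c)
        alpha = (alpha @ A) * lik[t]
    return logZ + np.log(alpha.sum())

rng = np.random.default_rng(0)
X = np.hstack([rng.normal(size=(500, 1)), np.ones((500, 1))])   # stimulus + bias column
W = np.array([[5.0, 0.0], [0.1, 2.0]])                          # "engaged" vs "biased" states
A = np.array([[0.98, 0.02], [0.02, 0.98]])                      # sticky: states persist
y = (rng.random(500) < 0.5).astype(int)                         # placeholder choices
print(glm_hmm_loglik(X, y, W, A, np.array([0.5, 0.5])))

Fitting would wrap this likelihood in EM or gradient ascent over W and A; the forward-backward posteriors then give the per-trial state estimates the paper analyzes.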


Proceedings ArticleDOI
01 Nov 2020
TL;DR: The prior selection module is enhanced with the necessary posterior information obtained from a specially designed Posterior Information Prediction Module (PIPM), and a Knowledge Distillation Based Training Strategy (KDBTS) is proposed to train the decoder with the knowledge selected from the prior distribution, removing the exposure bias of knowledge selection.
Abstract: Knowledge selection plays an important role in knowledge-grounded dialogue, which is a challenging task to generate more informative responses by leveraging external knowledge. Recently, latent variable models have been proposed to deal with the diversity of knowledge selection by using both prior and posterior distributions over knowledge, and achieve promising performance. However, these models suffer from a huge gap between prior and posterior knowledge selection. Firstly, the prior selection module may not learn to select knowledge properly because it lacks the necessary posterior information. Secondly, latent variable models suffer from exposure bias: dialogue generation is based on the knowledge selected from the posterior distribution at training time but from the prior distribution at inference time. Here, we address these issues in two ways: (1) we enhance the prior selection module with the necessary posterior information obtained from the specially designed Posterior Information Prediction Module (PIPM); (2) we propose a Knowledge Distillation Based Training Strategy (KDBTS) to train the decoder with the knowledge selected from the prior distribution, removing the exposure bias of knowledge selection. Experimental results on two knowledge-grounded dialogue datasets show that both PIPM and KDBTS achieve performance improvements over the state-of-the-art latent variable model, and that their combination shows further improvement.

63 citations


Posted Content
TL;DR: FlyingSquid, a weak supervision framework that runs orders of magnitude faster than previous weak supervision approaches and requires fewer assumptions, is built, and bounds on generalization error are proved without assuming that the latent variable model can exactly parameterize the underlying data distribution.
Abstract: Weak supervision is a popular method for building machine learning models without relying on ground truth annotations. Instead, it generates probabilistic training labels by estimating the accuracies of multiple noisy labeling sources (e.g., heuristics, crowd workers). Existing approaches use latent variable estimation to model the noisy sources, but these methods can be computationally expensive, scaling superlinearly in the data. In this work, we show that, for a class of latent variable models highly applicable to weak supervision, we can find a closed-form solution to model parameters, obviating the need for iterative solutions like stochastic gradient descent (SGD). We use this insight to build FlyingSquid, a weak supervision framework that runs orders of magnitude faster than previous weak supervision approaches and requires fewer assumptions. In particular, we prove bounds on generalization error without assuming that the latent variable model can exactly parameterize the underlying data distribution. Empirically, we validate FlyingSquid on benchmark weak supervision datasets and find that it achieves the same or higher quality compared to previous approaches without the need to tune an SGD procedure, recovers model parameters 170 times faster on average, and enables new video analysis and online learning applications.

60 citations
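
The closed-form solution rests on a simple moment identity: if binary sources l_i in {-1,+1} are conditionally independent given the label y, then E[l_i l_j] = E[l_i y] E[l_j y], so triplets of observable agreement rates identify each source's accuracy up to sign, with no SGD required. A hedged numpy illustration on synthetic data (our notation, not the FlyingSquid API):

import numpy as np

rng = np.random.default_rng(0)
n = 100_000
y = rng.choice([-1, 1], size=n)                          # unobserved true label
acc = np.array([0.9, 0.75, 0.6])                         # P(source agrees with y)
L = np.where(rng.random((n, 3)) < acc, y[:, None], -y[:, None])  # three noisy sources

M = (L.T @ L) / n                                        # observable moments E[l_i l_j]
a = np.array([np.sqrt(M[0, 1] * M[0, 2] / M[1, 2]),      # closed-form E[l_i y], up to sign
              np.sqrt(M[0, 1] * M[1, 2] / M[0, 2]),
              np.sqrt(M[0, 2] * M[1, 2] / M[0, 1])])
print("recovered accuracies:", (1 + a) / 2)              # approximately [0.9, 0.75, 0.6]

FlyingSquid generalizes this triplet idea to many sources and richer dependency structures, which is what removes the iterative latent variable estimation step.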


Book ChapterDOI
Sergio Casas, Cole Gulino, Simon Suo, Katie Luo, Renjie Liao, Raquel Urtasun
23 Aug 2020
TL;DR: In this article, the authors use graph neural networks to learn a distributed latent representation of the scene and obtain trajectory samples that are consistent across traffic participants, achieving state-of-the-art results in motion forecasting and interaction understanding.
Abstract: In order to plan a safe maneuver, an autonomous vehicle must accurately perceive its environment and understand the interactions among traffic participants. In this paper, we aim to learn scene-consistent motion forecasts of complex urban traffic directly from sensor data. In particular, we propose to characterize the joint distribution over future trajectories via an implicit latent variable model. We model the scene as an interaction graph and employ powerful graph neural networks to learn a distributed latent representation of the scene. Coupled with a deterministic decoder, we obtain trajectory samples that are consistent across traffic participants, achieving state-of-the-art results in motion forecasting and interaction understanding. Last but not least, we demonstrate that our motion forecasts result in safer and more comfortable motion planning.
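
The design choice doing the work here, a single scene-level latent decoded deterministically, is easy to illustrate. The toy sketch below is our own stand-in (plain MLPs instead of the paper's graph neural networks, invented sizes): because every actor's future is a deterministic function of one shared latent, each sample is a jointly consistent scene rather than a set of independent per-actor guesses.

import torch
import torch.nn as nn

n_actors, feat_dim, latent_dim, horizon = 5, 16, 8, 10
decoder = nn.Sequential(nn.Linear(feat_dim + latent_dim, 64), nn.ReLU(),
                        nn.Linear(64, horizon * 2))       # (x, y) waypoints per actor

actor_feats = torch.randn(n_actors, feat_dim)             # stand-in for a GNN scene encoding

def sample_scene():
    z = torch.randn(1, latent_dim).expand(n_actors, -1)   # ONE latent shared by the whole scene
    traj = decoder(torch.cat([actor_feats, z], dim=-1))   # deterministic given z
    return traj.view(n_actors, horizon, 2)                # jointly consistent trajectories

samples = [sample_scene() for _ in range(10)]             # diverse futures, each coherent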

Journal ArticleDOI
TL;DR: In this article, the authors investigated the relationship between electronic service quality (e-SQ) and e-satisfaction, as well as that between e-satisfaction and e-loyalty, within Spanish fashion brand e-retailers.

Proceedings ArticleDOI
Guangzhi Sun, Yu Zhang, Ron Weiss, Yuan Cao, Heiga Zen, Yonghui Wu
04 May 2020
TL;DR: This paper proposes a hierarchical, fine-grained and interpretable latent variable model for prosody based on the Tacotron 2 text-to-speech model, which achieves multi-resolution modeling of prosody by conditioning finer level representations on coarser level ones.
Abstract: This paper proposes a hierarchical, fine-grained and interpretable latent variable model for prosody based on the Tacotron 2 text-to-speech model. It achieves multi-resolution modeling of prosody by conditioning finer level representations on coarser level ones. Additionally, it imposes hierarchical conditioning across all latent dimensions using a conditional variational auto-encoder (VAE) with an auto-regressive structure. Evaluation of reconstruction performance illustrates that the new structure does not degrade the model while allowing better interpretability. Interpretations of prosody attributes are provided together with the comparison between word-level and phone-level prosody representations. Moreover, both qualitative and quantitative evaluations are used to demonstrate the improvement in the disentanglement of the latent dimensions.
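
The main structural idea, multi-resolution conditioning, can be sketched compactly. The toy modules below are our own invention (not the Tacotron 2 architecture): the phone-level posterior is conditioned on the sampled word-level latent, so fine-grained prosody is modeled relative to the coarser level, which is exactly the conditioning pattern the abstract describes.

import torch
import torch.nn as nn

word_dim, phone_dim, enc_dim = 8, 4, 32
word_post = nn.Linear(enc_dim, 2 * word_dim)               # q(z_word | x): mean and log-variance
phone_post = nn.Linear(enc_dim + word_dim, 2 * phone_dim)  # q(z_phone | x, z_word)

def sample(mu_logvar):
    mu, logvar = mu_logvar.chunk(2, dim=-1)
    return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization

x = torch.randn(1, enc_dim)                                # stand-in reference encoding
z_word = sample(word_post(x))                              # coarse prosody latent
z_phone = sample(phone_post(torch.cat([x, z_word], -1)))   # fine latent, conditioned on coarse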

Proceedings ArticleDOI
01 Nov 2020
TL;DR: This paper proposes a probabilistic dialog model, called the LAtent BElief State (LABES) model, where belief states are represented as discrete latent variables and jointly modeled with system responses given user inputs to develop semi-supervised learning under the principled variational learning framework.
Abstract: Structured belief states are crucial for user goal tracking and database query in task-oriented dialog systems. However, training belief trackers often requires expensive turn-level annotations of every user utterance. In this paper we aim at alleviating the reliance on belief state labels in building end-to-end dialog systems, by leveraging unlabeled dialog data towards semi-supervised learning. We propose a probabilistic dialog model, called the LAtent BElief State (LABES) model, where belief states are represented as discrete latent variables and jointly modeled with system responses given user inputs. Such latent variable modeling enables us to develop semi-supervised learning under the principled variational learning framework. Furthermore, we introduce LABES-S2S, which is a copy-augmented Seq2Seq model instantiation of LABES. In supervised experiments, LABES-S2S obtains strong results on three benchmark datasets of different scales. In utilizing unlabeled dialog data, semi-supervised LABES-S2S significantly outperforms both supervised-only and semi-supervised baselines. Remarkably, we can reduce the annotation demands to 50% without performance loss on MultiWOZ.

Journal ArticleDOI
TL;DR: A supervised nonlinear dynamic system (NDS) based on a variational auto-encoder (VAE) is introduced for processes with dynamic behaviors and nonlinear characteristics, and can extract effective nonlinear features for latent variable regression.
Abstract: Dynamic data modeling has been attracting much attention from researchers and has been introduced into the probabilistic latent variable model in the process industry. It is a huge challenge to extend these dynamic probabilistic latent variable models to nonlinear forms. In this article, a supervised nonlinear dynamic system (NDS) based on variational auto-encoder (VAE) is introduced for processes with dynamic behaviors and nonlinear characteristics. Based on the framework of VAE, which has a probabilistic data representation and a high fitting ability, the supervised NDS can extract effective nonlinear features for latent variable regression. The feasibility of the proposed supervised NDS is tested on two numerical examples and an industrial case. Detailed comparisons verify the effectiveness and superiority of the proposed model.
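
The supervised objective can be sketched in a few lines. This is a hedged illustration (our notation and sizes, with the temporal/dynamic component of the NDS omitted for brevity): a VAE reconstruction and KL term, plus a regression head that predicts the quality variable from the same latent code, which is what makes the extracted features useful for latent variable regression.

import torch
import torch.nn as nn

x_dim, y_dim, z_dim = 10, 1, 3
enc = nn.Linear(x_dim, 2 * z_dim)                     # q(z|x): mean and log-variance
dec = nn.Sequential(nn.Linear(z_dim, 32), nn.ReLU(), nn.Linear(32, x_dim))
reg = nn.Linear(z_dim, y_dim)                         # supervised head: y from latent z

x, y = torch.randn(64, x_dim), torch.randn(64, y_dim) # stand-in process and quality data
mu, logvar = enc(x).chunk(2, dim=-1)
z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)   # reparameterization trick

recon = (dec(z) - x).pow(2).mean()                    # reconstruction term
kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
supervised = (reg(z) - y).pow(2).mean()               # regression loss on the latent code
loss = recon + kl + supervised
loss.backward()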

Journal ArticleDOI
TL;DR: It is concluded that standard SEM software is not suitable for the comparison of psychometric networks with latent variable models, and the penta-factor model of intelligence is only of limited value, as it is nonidentified.
Abstract: In memory of Dr. Dennis John McFarland, who passed away recently, our objective is to continue his efforts to compare psychometric networks and latent variable models statistically. We do so by providing a commentary on his latest work, which he encouraged us to write, shortly before his death. We first discuss the statistical procedure McFarland used, which involved structural equation modeling (SEM) in standard SEM software. Next, we evaluate the penta-factor model of intelligence. We conclude that (1) standard SEM software is not suitable for the comparison of psychometric networks with latent variable models, and (2) the penta-factor model of intelligence is only of limited value, as it is nonidentified. We conclude with a reanalysis of the Wechsler Adult Intelligence Scale data McFarland discussed and illustrate how network and latent variable models can be compared using the recently developed R package Psychonetrics. Of substantive theoretical interest, the results support a network interpretation of general intelligence. A novel empirical finding is that networks of intelligence replicate over standardization samples.

Journal ArticleDOI
02 Mar 2020
TL;DR: In this paper, the authors present a framework to map multi-modal data collected in the wild to meaningful feature representations of health-related behaviors, uncover latent patterns comprising combinations of behaviors that best predict health and well-being, and use these learned patterns to make evidence-based recommendations that may improve health.
Abstract: Multiple behaviors typically work together to influence health, making it hard to understand how one behavior might compensate for another. Rich multi-modal datasets from mobile sensors and advances in machine learning are today enabling new kinds of associations to be made between combinations of behaviors objectively assessed from daily life and self-reported levels of stress, mood, and health. In this article, we present a framework to (1) map multi-modal messy data collected in the “wild” to meaningful feature representations of health-related behaviors, (2) uncover latent patterns comprising combinations of behaviors that best predict health and well-being, and (3) use these learned patterns to make evidence-based recommendations that may improve health and well-being. We show how to use supervised latent Dirichlet allocation to model the observed behaviors, and we apply variational inference to uncover the latent patterns. Implementing and evaluating the model on 5,397 days of data from a group of 244 college students, we find that these latent patterns are indeed predictive of daily self-reported levels of stressed-calm, sad-happy, and sick-healthy states. We investigate the patterns of modifiable behaviors present on different days and uncover several ways in which they relate to stress, mood, and health. This work contributes a new method using objective data analysis to help advance understanding of how combinations of modifiable human behaviors may promote human health and well-being.

Proceedings ArticleDOI
14 Jun 2020
TL;DR: In this paper, a modality-consistent embedding network (MCEN) is proposed to learn modalityinvariant representations by projecting images and texts to the same embedding space.
Abstract: Nowadays, driven by the increasing concern on diet and health, food computing has attracted enormous attention from both industry and research community. One of the most popular research topics in this domain is Food Retrieval, due to its profound influence on health-oriented applications. In this paper, we focus on the task of cross-modal retrieval between food images and cooking recipes. We present Modality-Consistent Embedding Network (MCEN) that learns modality-invariant representations by projecting images and texts to the same embedding space. To capture the latent alignments between modalities, we incorporate stochastic latent variables to explicitly exploit the interactions between textual and visual features. Importantly, our method learns the cross-modal alignments during training but computes embeddings of different modalities independently at inference time for the sake of efficiency. Extensive experimental results clearly demonstrate that the proposed MCEN outperforms all existing approaches on the benchmark Recipe1M dataset and requires less computational cost.

Journal ArticleDOI
TL;DR: It is argued that the multilayer network approach may contribute to an understanding of personality as a complex system comprised of interrelated psychological and neural features, potentially allowing the discernment of more complete descriptions of individual differences, and psychiatric and neurological changes that accompany disease.
Abstract: It has long been understood that a multitude of biological systems, from genetics, to brain networks, to psychological factors, all play a role in personality. Understanding how these systems interact with each other to form both relatively stable patterns of behaviour, cognition and emotion, but also vast individual differences and psychiatric disorders, however, requires new methodological insight. This article explores a way to integrate multiple levels of personality simultaneously, with particular focus on its neural and psychological constituents. It does so first by reviewing the current methodology of studies used to relate the two levels, where psychological traits, often defined with a latent variable model, are used as higher-level concepts to identify the neural correlates of personality (NCPs). This is known as a top-down approach, which, though useful in revealing correlations, is not able to include the fine-grained interactions that occur at both levels. As an alternative, we discuss the use of a novel complex-system approach known as a multilayer network, a technique that has recently proved successful in revealing veracious interactions between networks at more than one level. The benefits of the multilayer approach to the study of personality neuroscience follow from its well-founded theoretical basis in network science. Its predictive and descriptive power may surpass that of statistical top-down and latent variable models alone, potentially allowing the discernment of more complete descriptions of individual differences, and psychiatric and neurological changes that accompany disease. Though in its infancy, and subject to a number of methodological unknowns, we argue that the multilayer network approach may contribute to an understanding of personality as a complex system comprised of interrelated psychological and neural features.

Journal ArticleDOI
TL;DR: An integrated choice and latent variable model is utilized to capture individuals' likelihood of adopting level 4 CAVs based on the social values in their peer network, using an institutional survey dataset; the results suggest that households with high income and frequent car buyers are more likely to adopt CAVs.
Abstract: Adoption of connected and autonomous vehicles (CAVs) is viewed as one of the vital factors by public and private agencies, as benefits are slowly being quantified with further advancement in technology. Despite a wide variety of CAV perception and demand estimation studies, the literature lacks an account of how adoption depends on an individual's social network and values. In this paper, we utilize an integrated choice and latent variable (ICLV) model to capture individuals' likelihood of adopting level 4 CAVs based on their social values in their peer network, using an institutional survey dataset. The model results suggest that households with high income and frequent car buyers are more likely to adopt CAVs. CAV adoption will have a positive influence on an individual's social values among their peers. The proposed framework can be used to provide useful insights for policymakers to quantify consumers' preferences about CAV adoption based on their social values.
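
For readers unfamiliar with ICLV models, the standard three-part structure such a model builds on can be written as follows (our notation; the paper's exact specification may differ):

\eta_i = \Gamma w_i + \zeta_i                                  % structural: latent social value from covariates
I_{ik} = \alpha_k + \lambda_k \eta_i + \nu_{ik}                % measurement: survey indicators of the latent
U_{ij} = \beta' x_{ij} + \theta \eta_i + \varepsilon_{ij}      % choice: utility of alternative j (e.g., adopt CAVs)

Alternative j is chosen when U_{ij} is maximal; the three parts are estimated jointly, so the latent social value simultaneously explains the survey indicators and shifts the adoption utility.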

Book ChapterDOI
20 Jul 2020
TL;DR: A latent variable model is introduced that builds on normalizing flows with affine coupling layers to generate 3D point clouds of an arbitrary size given a latent shape representation and offers a significant speedup in both training and inference times for similar or better performance.
Abstract: Generative models have proven effective at modeling 3D shapes and their statistical variations. In this paper we investigate their application to point clouds, a 3D shape representation widely used in computer vision for which, however, only a few generative models have yet been proposed. We introduce a latent variable model that builds on normalizing flows with affine coupling layers to generate 3D point clouds of an arbitrary size given a latent shape representation. To evaluate its benefits for shape modeling we apply this model for generation, autoencoding, and single-view shape reconstruction tasks. We improve over recent GAN-based models in terms of most metrics that assess generation and autoencoding. Compared to recent work based on continuous flows, our model offers a significant speedup in both training and inference times for similar or better performance. For single-view shape reconstruction we also obtain results on par with state-of-the-art voxel, point cloud, and mesh-based methods.
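
The building block named in the abstract, an affine coupling layer, is short enough to show in full. The toy implementation below is generic (ours, not the paper's architecture): the layer is invertible by construction, and its log-determinant is just the sum of the predicted scales, which is what keeps exact likelihood training of the flow tractable.

import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(nn.Linear(self.half, 64), nn.ReLU(),
                                 nn.Linear(64, 2 * (dim - self.half)))

    def forward(self, x):                       # x -> y, plus log|det Jacobian|
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.net(x1).chunk(2, dim=-1)
        s = torch.tanh(s)                       # keep scales well-behaved
        y2 = x2 * torch.exp(s) + t              # affine transform of the second half
        return torch.cat([x1, y2], dim=-1), s.sum(dim=-1)

    def inverse(self, y):                       # exact inverse, same cost
        y1, y2 = y[:, :self.half], y[:, self.half:]
        s, t = self.net(y1).chunk(2, dim=-1)
        s = torch.tanh(s)
        return torch.cat([y1, (y2 - t) * torch.exp(-s)], dim=-1)

layer = AffineCoupling(3)                       # a 3D point, as in a point cloud
y, logdet = layer(torch.randn(128, 3))          # one pass over a batch of points
x_back = layer.inverse(y)                       # recovers the input points

Stacking such layers (with permutations between them) and conditioning the coupling networks on a latent shape code gives a flow that maps noise to points of the cloud, which is the construction the paper describes.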

Posted Content
TL;DR: A noise-aware encoder-decoder framework to disentangle a clean saliency predictor from noisy training examples, where the noisy labels are generated by unsupervised handcrafted feature-based methods is proposed.
Abstract: In this paper, we propose a noise-aware encoder-decoder framework to disentangle a clean saliency predictor from noisy training examples, where the noisy labels are generated by unsupervised handcrafted feature-based methods. The proposed model consists of two sub-models parameterized by neural networks: (1) a saliency predictor that maps input images to clean saliency maps, and (2) a noise generator, which is a latent variable model that produces noises from Gaussian latent vectors. The whole model that represents noisy labels is a sum of the two sub-models. The goal of training the model is to estimate the parameters of both sub-models, and simultaneously infer the corresponding latent vector of each noisy label. We propose to train the model by using an alternating back-propagation (ABP) algorithm, which alternates the following two steps: (1) learning back-propagation for estimating the parameters of two sub-models by gradient ascent, and (2) inferential back-propagation for inferring the latent vectors of training noisy examples by Langevin Dynamics. To prevent the network from converging to trivial solutions, we utilize an edge-aware smoothness loss to regularize hidden saliency maps to have similar structures as their corresponding images. Experimental results on several benchmark datasets indicate the effectiveness of the proposed model.
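
The training procedure alternates two uses of back-propagation, which a toy generator makes concrete. The sketch below is a hedged, generic ABP loop (our notation; the step sizes, dimensions, and Gaussian observation model are illustrative) without the paper's saliency networks or edge-aware smoothness loss:

import torch
import torch.nn as nn

z_dim, x_dim, sigma, step = 8, 20, 0.3, 0.1
G = nn.Sequential(nn.Linear(z_dim, 32), nn.ReLU(), nn.Linear(32, x_dim))
opt = torch.optim.Adam(G.parameters(), lr=1e-3)

x = torch.randn(64, x_dim)                       # stand-in training examples
z = torch.zeros(64, z_dim, requires_grad=True)   # one latent vector per example

for it in range(100):
    # (1) Inferential back-propagation: Langevin steps on z under p(z) p(x|z).
    for _ in range(5):
        logp = -((x - G(z)) ** 2).sum() / (2 * sigma ** 2) - 0.5 * (z ** 2).sum()
        grad, = torch.autograd.grad(logp, z)
        with torch.no_grad():
            z += 0.5 * step ** 2 * grad + step * torch.randn_like(z)
    # (2) Learning back-propagation: update the generator given the inferred latents.
    loss = ((x - G(z.detach())) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()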

Posted Content
TL;DR: It is shown that "learning to fly" can be achieved with less than 30 minutes of experience with a single drone, and can be deployed solely using onboard computational resources and sensors, on a self-built drone.
Abstract: Learning to control robots without requiring engineered models has been a long-term goal, promising diverse and novel applications. Yet, reinforcement learning has only achieved limited impact on real-time robot control due to its high demand of real-world interactions. In this work, by leveraging a learnt probabilistic model of drone dynamics, we learn a thrust-attitude controller for a quadrotor through model-based reinforcement learning. No prior knowledge of the flight dynamics is assumed; instead, a sequential latent variable model, used generatively and as an online filter, is learnt from raw sensory input. The controller and value function are optimised entirely by propagating stochastic analytic gradients through generated latent trajectories. We show that "learning to fly" can be achieved with less than 30 minutes of experience with a single drone, and can be deployed solely using onboard computational resources and sensors, on a self-built drone.

Posted Content
TL;DR: It is shown how, given only a latent variable model for states and actions, policy value can be identified from off-policy data, and optimal balancing can be combined with such learned ratios to obtain policy value while avoiding direct modeling of reward functions.
Abstract: Off-policy evaluation (OPE) in reinforcement learning is an important problem in settings where experimentation is limited, such as education and healthcare. But, in these very same settings, observed actions are often confounded by unobserved variables making OPE even more difficult. We study an OPE problem in an infinite-horizon, ergodic Markov decision process with unobserved confounders, where states and actions can act as proxies for the unobserved confounders. We show how, given only a latent variable model for states and actions, policy value can be identified from off-policy data. Our method involves two stages. In the first, we show how to use proxies to estimate stationary distribution ratios, extending recent work on breaking the curse of horizon to the confounded setting. In the second, we show optimal balancing can be combined with such learned ratios to obtain policy value while avoiding direct modeling of reward functions. We establish theoretical guarantees of consistency, and benchmark our method empirically.

Journal ArticleDOI
TL;DR: Simulated data studies demonstrate that the LV-GIMME method can reliably detect relations among latent constructs, and that latent constructs provide more power to detect effects than using observed variables directly.
Abstract: Researchers across many domains of psychology increasingly wish to arrive at personalized and generalizable dynamic models of individuals' processes. This is seen in psychophysiological, behavioral, and emotional research paradigms, across a range of data types. Errors of measurement are inherent in most data. For this reason, researchers typically gather multiple indicators of the same latent construct and use methods, such as factor analysis, to arrive at scores from these indices. In addition to accurately measuring individuals, researchers also need to find the model that best describes the relations among the latent constructs. Most currently available data-driven searches do not include latent variables. We present an approach that builds from the strong foundations of group iterative multiple model estimation (GIMME), the idiographic filter, and model-implied instrumental variables with two-stage least squares estimation (MIIV-2SLS) to provide researchers with the option to include latent variables in their data-driven model searches. The resulting approach is called latent variable GIMME (LV-GIMME). GIMME is utilized for the data-driven search for relations that exist among latent variables. Unlike other approaches such as the idiographic filter, LV-GIMME does not require the latent variable model to be constant across individuals. This requirement is loosened by utilizing MIIV-2SLS for estimation. Simulated data studies demonstrate that the method can reliably detect relations among latent constructs, and that latent constructs provide more power to detect effects than using observed variables directly. We use empirical data examples drawn from functional MRI and daily self-report data.

Proceedings ArticleDOI
01 Nov 2020
TL;DR: A deep latent variable model is proposed that attempts to perform source separation on parallel sentences, isolating what they have in common in a latent semantic vector, and explaining what is left over with language-specific latent vectors.
Abstract: Semantic sentence embedding models encode natural language sentences into vectors, such that closeness in embedding space indicates closeness in the semantics between the sentences. Bilingual data offers a useful signal for learning such embeddings: properties shared by both sentences in a translation pair are likely semantic, while divergent properties are likely stylistic or language-specific. We propose a deep latent variable model that attempts to perform source separation on parallel sentences, isolating what they have in common in a latent semantic vector, and explaining what is left over with language-specific latent vectors. Our proposed approach differs from past work on semantic sentence encoding in two ways. First, by using a variational probabilistic framework, we introduce priors that encourage source separation, and can use our model's posterior to predict sentence embeddings for monolingual data at test time. Second, we use high-capacity transformers as both data generating distributions and inference networks -- contrasting with most past work on sentence embeddings. In experiments, our approach substantially outperforms the state-of-the-art on a standard suite of unsupervised semantic similarity evaluations. Further, we demonstrate that our approach yields the largest gains on more difficult subsets of these evaluations where simple word overlap is not a good indicator of similarity.