Author

Fredrik D. Johansson

Bio: Fredrik D. Johansson is an academic researcher at Chalmers University of Technology. His research focuses on topics including computer science and causal inference. He has an h-index of 20 and has co-authored 54 publications receiving 2,023 citations. Previous affiliations of Fredrik D. Johansson include the Technical University of Dortmund and the Massachusetts Institute of Technology.


Papers
Proceedings Article
17 Jul 2017
TL;DR: In this paper, the authors propose a new theoretical analysis and a family of algorithms for predicting individual treatment effect (ITE) from observational data under the assumption known as strong ignorability; the algorithms learn a "balanced" representation such that the induced treated and control distributions look similar.
Abstract: There is intense interest in applying machine learning to problems of causal inference in fields such as healthcare, economics and education. In particular, individual-level causal inference has important applications such as precision medicine. We give a new theoretical analysis and family of algorithms for predicting individual treatment effect (ITE) from observational data, under the assumption known as strong ignorability. The algorithms learn a "balanced" representation such that the induced treated and control distributions look similar. We give a novel, simple and intuitive generalization-error bound showing that the expected ITE estimation error of a representation is bounded by a sum of the standard generalization-error of that representation and the distance between the treated and control distributions induced by the representation. We use Integral Probability Metrics to measure distances between distributions, deriving explicit bounds for the Wasserstein and Maximum Mean Discrepancy (MMD) distances. Experiments on real and simulated data show the new algorithms match or outperform the state-of-the-art.
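A minimal numerical sketch of the balancing idea described above: fit the factual outcomes while penalizing the distance between treated and control units in representation space. The function names and the choice of squared error and a linear-kernel MMD below are illustrative assumptions, not the authors' released implementation.

    import numpy as np

    def linear_mmd(phi_t, phi_c):
        """Squared linear-kernel MMD between treated and control representations.

        phi_t, phi_c: arrays of shape (n_t, d) and (n_c, d) holding the learned
        representation Phi(x) for treated and control units respectively.
        """
        return np.sum((phi_t.mean(axis=0) - phi_c.mean(axis=0)) ** 2)

    def balanced_objective(y, y_hat, phi, t, alpha=1.0):
        """Factual prediction loss plus a representation-balance penalty.

        y, y_hat: observed and predicted factual outcomes, shape (n,)
        phi:      representation of each unit, shape (n, d)
        t:        binary treatment indicator, shape (n,)
        alpha:    weight on the distributional distance (a hyperparameter)
        """
        factual_loss = np.mean((y - y_hat) ** 2)
        imbalance = linear_mmd(phi[t == 1], phi[t == 0])
        return factual_loss + alpha * imbalance

Richer choices of distance (e.g., a kernelized MMD or a Wasserstein distance, as in the abstract) slot into the same objective in place of linear_mmd.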

300 citations

Posted Content
TL;DR: A new algorithmic framework for counterfactual inference is proposed which brings together ideas from domain adaptation and representation learning and significantly outperforms the previous state-of-the-art approaches.
Abstract: Observational studies are rising in importance due to the widespread accumulation of data in fields such as healthcare, education, employment and ecology. We consider the task of answering counterfactual questions such as, "Would this patient have lower blood sugar had she received a different medication?". We propose a new algorithmic framework for counterfactual inference which brings together ideas from domain adaptation and representation learning. In addition to a theoretical justification, we perform an empirical comparison with previous approaches to causal inference from observational data. Our deep learning algorithm significantly outperforms the previous state-of-the-art.
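One common way to realize such a representation-learning approach is to feed a shared representation Φ(x) into separate outcome predictors for treated and control units and read off the effect as the difference of their predictions. The sketch below (numpy, random untrained weights, made-up dimensions) only illustrates that wiring; it is not the architecture or code from this paper.

    import numpy as np

    rng = np.random.default_rng(0)

    def relu(z):
        return np.maximum(z, 0.0)

    # Illustrative dimensions: 10 covariates, 16-dimensional shared representation.
    d_in, d_rep = 10, 16
    W_rep = rng.normal(scale=0.1, size=(d_in, d_rep))   # shared representation layer
    w_t1 = rng.normal(scale=0.1, size=d_rep)            # outcome head for treated units
    w_t0 = rng.normal(scale=0.1, size=d_rep)            # outcome head for control units

    def predict_outcomes(x):
        """Return (y0_hat, y1_hat): predicted control and treated outcomes for x."""
        phi = relu(x @ W_rep)          # shared representation Phi(x)
        return phi @ w_t0, phi @ w_t1

    def predicted_effect(x):
        """Estimated effect of treatment for one unit: y1_hat - y0_hat."""
        y0_hat, y1_hat = predict_outcomes(x)
        return y1_hat - y0_hat

    x = rng.normal(size=d_in)          # one unit's covariates
    print(predicted_effect(x))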

299 citations

Journal ArticleDOI
TL;DR: Provides guidelines for applying reinforcement learning to decisions about patient treatment, which the authors hope will accelerate the rate at which observational cohorts can inform healthcare practice in a safe, risk-conscious manner.
Abstract: In this Comment, we provide guidelines for reinforcement learning for decisions about patient treatment that we hope will accelerate the rate at which observational cohorts can inform healthcare practice in a safe, risk-conscious manner.

285 citations

Posted Content
TL;DR: A novel, simple and intuitive generalization-error bound is given showing that the expected ITE estimation error of a representation is bounded by a sum of the standard generalization-error of that representation and the distance between the treated and control distributions induced by the representation.
Abstract: There is intense interest in applying machine learning to problems of causal inference in fields such as healthcare, economics and education. In particular, individual-level causal inference has important applications such as precision medicine. We give a new theoretical analysis and family of algorithms for predicting individual treatment effect (ITE) from observational data, under the assumption known as strong ignorability. The algorithms learn a "balanced" representation such that the induced treated and control distributions look similar. We give a novel, simple and intuitive generalization-error bound showing that the expected ITE estimation error of a representation is bounded by a sum of the standard generalization-error of that representation and the distance between the treated and control distributions induced by the representation. We use Integral Probability Metrics to measure distances between distributions, deriving explicit bounds for the Wasserstein and Maximum Mean Discrepancy (MMD) distances. Experiments on real and simulated data show the new algorithms match or outperform the state-of-the-art.
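In symbols, the bound described in this abstract has the following shape, writing ε_ITE for the expected ITE estimation error, ε_F^{t=1} and ε_F^{t=0} for the factual (standard) generalization errors on treated and control units, and IPM_G for the chosen integral probability metric between the induced representation distributions. Constants are suppressed (hence ≲), so this is a schematic rather than the paper's exact statement:

    \varepsilon_{\mathrm{ITE}}(\Phi, h) \;\lesssim\;
      \varepsilon_{F}^{t=1}(\Phi, h) + \varepsilon_{F}^{t=0}(\Phi, h)
      + B_{\Phi} \cdot \mathrm{IPM}_{G}\!\left( p_{\Phi}^{t=1},\, p_{\Phi}^{t=0} \right)

Choosing IPM_G to be the Wasserstein distance or the MMD gives the explicit bounds mentioned in the abstract.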

247 citations

Proceedings Article
30 May 2018
TL;DR: In this article, the authors argue that the fairness of predictions should be evaluated in the context of the data, and that unfairness induced by inadequate sample sizes or unmeasured predictive variables should be addressed through data collection, rather than by constraining the model.
Abstract: Recent attempts to achieve fairness in predictive models focus on the balance between fairness and accuracy. In sensitive applications such as healthcare or criminal justice, this trade-off is often undesirable as any increase in prediction error could have devastating consequences. In this work, we argue that the fairness of predictions should be evaluated in the context of the data, and that unfairness induced by inadequate sample sizes or unmeasured predictive variables should be addressed through data collection, rather than by constraining the model. We decompose cost-based metrics of discrimination into bias, variance, and noise, and propose actions aimed at estimating and reducing each term. Finally, we perform case studies on the prediction of income, mortality, and review ratings, confirming the value of this analysis. We find that data collection is often a means to reduce discrimination without sacrificing accuracy.
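The decomposition invoked above can be written, for squared loss and a group defined by a protected attribute A = a, in the textbook bias–variance–noise form below; the paper's exact cost metrics and notation may differ, so read this as a schematic:

    \mathbb{E}\!\left[ (y - \hat{f}(x))^2 \mid A = a \right]
      = \underbrace{\mathbb{E}\!\left[ (\bar{f}(x) - \mathbb{E}[y \mid x])^2 \mid A = a \right]}_{\text{bias}}
      + \underbrace{\mathbb{E}\!\left[ (\hat{f}(x) - \bar{f}(x))^2 \mid A = a \right]}_{\text{variance}}
      + \underbrace{\mathbb{E}\!\left[ (y - \mathbb{E}[y \mid x])^2 \mid A = a \right]}_{\text{noise}}

where \bar{f} denotes the expectation of the learned predictor \hat{f} over training samples. Discrimination is then a gap between such group-conditional costs; variance shrinks with more samples and noise with additional predictive variables, which is why the abstract points to data collection rather than model constraints.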

199 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: A textbook covering probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

Journal ArticleDOI
TL;DR: This article provides a comprehensive overview of graph neural networks (GNNs) in the data mining and machine learning fields and proposes a new taxonomy that divides the state-of-the-art GNNs into four categories, namely recurrent GNNs, convolutional GNNs, graph autoencoders, and spatial–temporal GNNs.
Abstract: Deep learning has revolutionized many machine learning tasks in recent years, ranging from image classification and video processing to speech recognition and natural language understanding. The data in these tasks are typically represented in the Euclidean space. However, there is an increasing number of applications, where data are generated from non-Euclidean domains and are represented as graphs with complex relationships and interdependency between objects. The complexity of graph data has imposed significant challenges on the existing machine learning algorithms. Recently, many studies on extending deep learning approaches for graph data have emerged. In this article, we provide a comprehensive overview of graph neural networks (GNNs) in data mining and machine learning fields. We propose a new taxonomy to divide the state-of-the-art GNNs into four categories, namely, recurrent GNNs, convolutional GNNs, graph autoencoders, and spatial–temporal GNNs. We further discuss the applications of GNNs across various domains and summarize the open-source codes, benchmark data sets, and model evaluation of GNNs. Finally, we propose potential research directions in this rapidly growing field.
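As a concrete instance of the "convolutional GNN" category in this taxonomy, the snippet below implements the widely used propagation rule H' = relu(D^{-1/2} (A + I) D^{-1/2} H W) with self-loops; it is a generic illustration, not code from the survey.

    import numpy as np

    def gcn_layer(A, H, W):
        """One graph-convolution layer: H' = relu(D^-1/2 (A + I) D^-1/2 H W).

        A: (n, n) adjacency matrix of the graph
        H: (n, d_in) node features
        W: (d_in, d_out) layer weights
        """
        A_hat = A + np.eye(A.shape[0])            # add self-loops
        d = A_hat.sum(axis=1)                     # node degrees (with self-loops)
        D_inv_sqrt = np.diag(1.0 / np.sqrt(d))    # symmetric normalization
        return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)

    # Toy example: 3 nodes on a path, 2 input features, 4 output features.
    A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
    H = np.random.default_rng(0).normal(size=(3, 2))
    W = np.random.default_rng(1).normal(size=(2, 4))
    print(gcn_layer(A, H, W).shape)  # (3, 4): one 4-dim vector per node

Stacking such layers and mixing in recurrence, pooling, or decoders yields the other categories in the survey's taxonomy.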

4,584 citations

01 Dec 1982
TL;DR: In this article, it was shown that any black hole will create and emit particles such as neutrinos or photons at just the rate that one would expect if the black hole was a body with a temperature of (κ/2π)(ħ/2k) ≈ 10⁻⁶ (M⊙/M) K, where κ is the surface gravity of the body.
Abstract: QUANTUM gravitational effects are usually ignored in calculations of the formation and evolution of black holes. The justification for this is that the radius of curvature of space-time outside the event horizon is very large compared to the Planck length (Għ/c³)^(1/2) ≈ 10⁻³³ cm, the length scale on which quantum fluctuations of the metric are expected to be of order unity. This means that the energy density of particles created by the gravitational field is small compared to the space-time curvature. Even though quantum effects may be small locally, they may still, however, add up to produce a significant effect over the lifetime of the Universe ≈ 10¹⁷ s, which is very long compared to the Planck time ≈ 10⁻⁴³ s. The purpose of this letter is to show that this indeed may be the case: it seems that any black hole will create and emit particles such as neutrinos or photons at just the rate that one would expect if the black hole was a body with a temperature of (κ/2π)(ħ/2k) ≈ 10⁻⁶ (M⊙/M) K, where κ is the surface gravity of the black hole [1]. As a black hole emits this thermal radiation one would expect it to lose mass. This in turn would increase the surface gravity and so increase the rate of emission. The black hole would therefore have a finite life of the order of 10⁷¹ (M⊙/M)⁻³ s. For a black hole of solar mass this is much longer than the age of the Universe. There might, however, be much smaller black holes which were formed by fluctuations in the early Universe [2]. Any such black hole of mass less than 10¹⁵ g would have evaporated by now. Near the end of its life the rate of emission would be very high and about 10³⁰ erg would be released in the last 0.1 s. This is a fairly small explosion by astronomical standards but it is equivalent to about 1 million 1-Mton hydrogen bombs. It is often said that nothing can escape from a black hole. But in 1974, Stephen Hawking realized that, owing to quantum effects, black holes should emit particles with a thermal distribution of energies — as if the black hole had a temperature inversely proportional to its mass. In addition to putting black-hole thermodynamics on a firmer footing, this discovery led Hawking to postulate 'black hole explosions', as primordial black holes end their lives in an accelerating release of energy.
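For orientation, the temperature quoted above is the paper's original order-of-magnitude estimate; with explicit constants, the now-standard expression for the surface gravity and temperature of a Schwarzschild black hole of mass M is textbook material rather than part of the abstract:

    \kappa = \frac{c^{4}}{4GM},
    \qquad
    T_{H} = \frac{\hbar \kappa}{2\pi c\, k_{B}}
          = \frac{\hbar c^{3}}{8\pi G M k_{B}}
          \approx 6 \times 10^{-8} \left( \frac{M_{\odot}}{M} \right) \mathrm{K}.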

2,947 citations

Posted Content
TL;DR: A list of five practical research problems related to accident risk is presented, categorized according to whether the problem originates from having the wrong objective function, an objective function that is too expensive to evaluate frequently, or undesirable behavior during the learning process.
Abstract: Rapid progress in machine learning and artificial intelligence (AI) has brought increasing attention to the potential impacts of AI technologies on society. In this paper we discuss one such potential impact: the problem of accidents in machine learning systems, defined as unintended and harmful behavior that may emerge from poor design of real-world AI systems. We present a list of five practical research problems related to accident risk, categorized according to whether the problem originates from having the wrong objective function ("avoiding side effects" and "avoiding reward hacking"), an objective function that is too expensive to evaluate frequently ("scalable supervision"), or undesirable behavior during the learning process ("safe exploration" and "distributional shift"). We review previous work in these areas as well as suggesting research directions with a focus on relevance to cutting-edge AI systems. Finally, we consider the high-level question of how to think most productively about the safety of forward-looking applications of AI.

1,569 citations

Journal ArticleDOI
TL;DR: This paper provides a comprehensive review of the literature on graph embedding, introducing the formal definition of graph embedding as well as the related concepts.
Abstract: Graphs are an important data representation appearing in a wide diversity of real-world scenarios. Effective graph analytics provides users with a deeper understanding of what is behind the data, and can thus benefit many useful applications such as node classification, node recommendation, link prediction, etc. However, most graph analytics methods suffer from high computation and space costs. Graph embedding is an effective yet efficient way to solve the graph analytics problem. It converts the graph data into a low-dimensional space in which the graph structural information and graph properties are maximally preserved. In this survey, we conduct a comprehensive review of the literature on graph embedding. We first introduce the formal definition of graph embedding as well as the related concepts. After that, we propose two taxonomies of graph embedding which correspond to the challenges that exist in different graph embedding problem settings and to how the existing work addresses these challenges in its solutions. Finally, we summarize the applications that graph embedding enables and suggest four promising future research directions in terms of computation efficiency, problem settings, techniques, and application scenarios.
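As a minimal, generic illustration of "converting the graph data into a low-dimensional space", the sketch below embeds nodes via a truncated SVD of the adjacency matrix, one classical matrix-factorization approach; it is not a method proposed in this survey.

    import numpy as np

    def embed_nodes(A, dim=2):
        """Embed each node into `dim` dimensions via truncated SVD of the
        adjacency matrix, so that structurally similar nodes get similar vectors."""
        U, S, _ = np.linalg.svd(A)
        return U[:, :dim] * np.sqrt(S[:dim])      # scale directions by singular values

    # Toy graph: two triangles joined by a single edge (6 nodes).
    A = np.zeros((6, 6))
    edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3)]
    for i, j in edges:
        A[i, j] = A[j, i] = 1.0

    Z = embed_nodes(A, dim=2)
    print(Z.round(2))   # one 2-D embedding per row, usable for downstream analytics

Random-walk, deep-learning, and edge-reconstruction methods surveyed in the paper replace this factorization step while keeping the same "nodes to low-dimensional vectors" interface.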

1,502 citations