
Showing papers by "Klaus-Robert Müller published in 2022"


Journal ArticleDOI
TL;DR: In this article, a scalable method is proposed for quantifying artifactual and poisoned classes on which the machine learning models under study exhibit Clever Hans behavior, and several approaches, collectively termed Class Artifact Compensation, are shown to effectively reduce a model's Clever Hans behavior.

23 citations


Journal ArticleDOI
TL;DR: This work develops an exact, iterative, parameter-free approach to train global symmetric gradient domain machine learning (sGDML) force fields for systems with up to several hundred atoms, without resorting to any localization of atomic interactions or other potentially uncontrolled approximations.
Abstract: Global machine learning force fields, with the capacity to capture collective interactions in molecular systems, now scale up to a few dozen atoms due to considerable growth of model complexity with system size. For larger molecules, locality assumptions are introduced, with the consequence that nonlocal interactions are not described. Here, we develop an exact iterative approach to train global symmetric gradient domain machine learning (sGDML) force fields (FFs) for several hundred atoms, without resorting to any potentially uncontrolled approximations. All atomic degrees of freedom remain correlated in the global sGDML FF, allowing the accurate description of complex molecules and materials that present phenomena with far-reaching characteristic correlation lengths. We assess the accuracy and efficiency of sGDML on a newly developed MD22 benchmark dataset containing molecules from 42 to 370 atoms. The robustness of our approach is demonstrated in nanosecond path-integral molecular dynamics simulations for supramolecular complexes in the MD22 dataset.

17 citations


Journal ArticleDOI
TL;DR: In this paper, a unified theoretical framework for deriving bounds on the maximal manipulability of a model is developed, and three different techniques to boost robustness against manipulation are presented: weight decay, smoothing activation functions, and minimizing the Hessian of the network.

16 citations


Journal ArticleDOI
TL;DR: Remarkable BCI advances were identified through the 2020 competition and indicated some trends of interest to BCI researchers.
Abstract: The brain-computer interface (BCI) has been investigated as a form of communication tool between the brain and external devices. BCIs have been extended beyond communication and control over the years. The 2020 international BCI competition aimed to provide high-quality neuroscientific data for open access that could be used to evaluate the current degree of technical advances in BCI. Although there are a variety of remaining challenges for future BCI advances, we discuss some of the more recent application directions: (i) few-shot EEG learning, (ii) micro-sleep detection, (iii) imagined speech decoding, (iv) cross-session classification, and (v) EEG(+ear-EEG) detection in an ambulatory environment. Not only did scientists from the BCI field compete, but scholars with a broad variety of backgrounds and nationalities participated in the competition to address these challenges. Each dataset was prepared and separated into three parts that were released to the competitors in the form of training and validation sets followed by a test set. Remarkable BCI advances were identified through the 2020 competition and indicated some trends of interest to BCI researchers.

14 citations


Proceedings Article
15 Feb 2022
TL;DR: This proposal, which can be seen as a proper extension of the well-established LRP method to Transformers, is shown both theoretically and empirically to overcome the deficiency of a simple gradient-based approach, and achieves state-of-the-art explanation performance on a broad range of Transformer models and datasets.
Abstract: Transformers have become an important workhorse of machine learning, with numerous applications. This necessitates the development of reliable methods for increasing their transparency. Multiple interpretability methods, often based on gradient information, have been proposed. We show that the gradient in a Transformer reflects the function only locally, and thus fails to reliably identify the contribution of input features to the prediction. We identify Attention Heads and LayerNorm as main reasons for such unreliable explanations and propose a more stable way for propagation through these layers. Our proposal, which can be seen as a proper extension of the well-established LRP method to Transformers, is shown both theoretically and empirically to overcome the deficiency of a simple gradient-based approach, and achieves state-of-the-art explanation performance on a broad range of Transformer models and datasets.
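To make the idea of a more stable propagation concrete, below is a minimal numpy sketch of an epsilon-LRP step through a LayerNorm whose normalization is treated as a constant ("detached") during attribution, which is the flavour of modification described in the abstract. The helper names, toy dimensions and values are illustrative and not taken from the paper's code; the paper proposes an analogous treatment of the attention heads.

```python
import numpy as np

def lrp_linear(x, W, b, R_out, eps=1e-6):
    """Epsilon-LRP for a linear map y = x @ W + b: redistribute the relevance
    R_out of the outputs onto the inputs x, proportionally to contributions."""
    z = x @ W + b
    s = R_out / (z + eps * np.sign(z))   # stabilized division
    return x * (s @ W.T)

def layernorm_as_linear(x, gamma, eps=1e-5):
    """LayerNorm viewed as a locally linear map, with the standard deviation
    treated as a constant ('detached') for the purpose of attribution."""
    d = x.shape[-1]
    std = np.sqrt(x.var() + eps)
    centering = np.eye(d) - np.ones((d, d)) / d
    return centering * (gamma / std)     # effective weight matrix

# toy token with 4 features and some relevance arriving from the layer above
x = np.array([1.0, -0.5, 2.0, 0.3])
W_ln = layernorm_as_linear(x, gamma=np.ones(4))
R_out = np.array([0.4, 0.1, 0.3, 0.2])
R_in = lrp_linear(x, W_ln, np.zeros(4), R_out)
print(R_in, R_in.sum())  # redistributed relevance (approximately conserved in total)
```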

12 citations


Journal ArticleDOI
TL;DR: This review clarifies the fundamental conceptual differences of XAI for regression and classification tasks, establishes novel theoretical insights and analysis for XAIR, provides demonstrations of XAIR on genuine practical regression problems, and discusses challenges remaining for the field.
Abstract: In addition to the impressive predictive power of machine learning (ML) models, more recently, explanation methods have emerged that enable an interpretation of complex nonlinear learning models, such as deep neural networks. Gaining a better understanding is especially important, e.g., for safety-critical ML applications or medical diagnostics and so on. Although such explainable artificial intelligence (XAI) techniques have reached significant popularity for classifiers, thus far, little attention has been devoted to XAI for regression models (XAIR). In this review, we clarify the fundamental conceptual differences of XAI for regression and classification tasks, establish novel theoretical insights and analysis for XAIR, provide demonstrations of XAIR on genuine practical regression problems, and finally, discuss challenges remaining for the field.
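One recurring point when explaining regressors, as opposed to classifiers, is that attributions are naturally defined relative to a meaningful reference value rather than a decision boundary. A minimal, hypothetical illustration of this (toy linear model and hand-picked reference point; not an example from the review):

```python
import numpy as np

# Toy linear regressor f(x) = w.x + b.  For such a model, Gradient x Input taken
# relative to a reference point x_ref decomposes the deviation f(x) - f(x_ref)
# exactly, which illustrates why the choice of reference matters in regression.
w, b = np.array([2.0, -1.0, 0.5]), 3.0
f = lambda x: w @ x + b

x = np.array([1.0, 2.0, 4.0])
x_ref = np.array([0.5, 2.0, 1.0])      # user-chosen reference input

R = (x - x_ref) * w                    # per-feature contributions
print(R, R.sum(), f(x) - f(x_ref))     # contributions sum to the explained deviation
```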

11 citations


Journal ArticleDOI
TL;DR: In this paper, the authors combined ensembles of convolutional neural networks with Layer-wise Relevance Propagation (LRP) to detect which brain features contribute to brain aging.

10 citations


Proceedings Article
28 May 2022
TL;DR: This work proposes a modified attention mechanism adapted to the underlying physics, which makes it possible to recover the relevant non-local quantum mechanical effects over arbitrary length scales, and introduces spherical harmonic coordinates (SPHCs) to reflect higher-order geometric information for each atom in a molecule, enabling a non-local formulation of attention in the SPHC space.
Abstract: The application of machine learning methods in quantum chemistry has enabled the study of numerous chemical phenomena, which are computationally intractable with traditional ab-initio methods. However, some quantum mechanical properties of molecules and materials depend on non-local electronic effects, which are often neglected due to the difficulty of modeling them efficiently. This work proposes a modified attention mechanism adapted to the underlying physics, which makes it possible to recover the relevant non-local effects. Namely, we introduce spherical harmonic coordinates (SPHCs) to reflect higher-order geometric information for each atom in a molecule, enabling a non-local formulation of attention in the SPHC space. Our proposed model So3krates - a self-attention based message passing neural network - uncouples geometric information from atomic features, making them independently amenable to attention mechanisms. Thereby we construct spherical filters, which extend the concept of continuous filters in Euclidean space to SPHC space and serve as the foundation for a spherical self-attention mechanism. We show that in contrast to other published methods, So3krates is able to describe non-local quantum mechanical effects over arbitrary length scales. Further, we find evidence that the inclusion of higher-order geometric correlations increases data efficiency and improves generalization. So3krates matches or exceeds state-of-the-art performance on popular benchmarks, notably, requiring a significantly lower number of parameters (0.25 - 0.4x) while at the same time giving a substantial speedup (6 - 14x for training and 2 - 11x for inference) compared to other models.
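As a rough illustration of what atom-wise "spherical harmonic coordinates" can look like, the sketch below builds only the degree l = 1 part, i.e. a sum of unit direction vectors to neighbouring atoms; the published model uses higher degrees, learned weights and its own cutoff scheme, so this is a toy stand-in rather than the actual construction.

```python
import numpy as np

def sphc_degree1(positions, cutoff=5.0):
    """Toy, degree l=1 only: per-atom coordinates as the sum of unit direction
    vectors to neighbours within a cutoff (degree-1 real spherical harmonics
    are proportional to (x, y, z)/r)."""
    n = len(positions)
    chi = np.zeros((n, 3))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            r = positions[j] - positions[i]
            d = np.linalg.norm(r)
            if d < cutoff:
                chi[i] += r / d
    return chi

pos = np.random.default_rng(0).normal(size=(6, 3)) * 2.0   # placeholder geometry
print(sphc_degree1(pos))
```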

7 citations


Journal ArticleDOI
TL;DR: In this article, a machine learning algorithm based on DNA methylation patterns was applied to classify sinonasal tumors with clinical-grade reliability, and the results showed that tumors with SNUC morphology are not as undifferentiated as their current terminology suggests but rather can be reassigned to four distinct molecular classes defined by epigenetic, mutational and proteomic profiles.
Abstract: The diagnosis of sinonasal tumors is challenging due to a heterogeneous spectrum of various differential diagnoses as well as poorly defined, disputed entities such as sinonasal undifferentiated carcinomas (SNUCs). In this study, we apply a machine learning algorithm based on DNA methylation patterns to classify sinonasal tumors with clinical-grade reliability. We further show that sinonasal tumors with SNUC morphology are not as undifferentiated as their current terminology suggests but rather can be reassigned to four distinct molecular classes defined by epigenetic, mutational and proteomic profiles. This includes two classes with neuroendocrine differentiation, characterized by IDH2 or SMARCA4/ARID1A mutations with an overall favorable clinical course, one class composed of highly aggressive SMARCB1-deficient carcinomas and another class with tumors that represent potentially previously misclassified adenoid cystic carcinomas. Our findings can aid in improving the diagnostic classification of sinonasal tumors and could help to change the current perception of SNUCs.

7 citations


Journal ArticleDOI
TL;DR: This paradigmatic approach enables not only a versatile usage of novel representations and the efficient computation of larger systems, all of high value to the FF community, but also the simple inclusion of further physical knowledge, such as higher-order information, even beyond the presented FF domain.
Abstract: Reconstructing force fields (FFs) from atomistic simulation data is a challenge since accurate data can be highly expensive. Here, machine learning (ML) models can help to stay data economic, as they can be successfully constrained using the underlying symmetry and conservation laws of physics. However, so far, every descriptor newly proposed for an ML model has required a cumbersome and mathematically tedious remodeling. We therefore propose using modern techniques from algorithmic differentiation within the ML modeling process, effectively enabling the usage of novel descriptors or models fully automatically at an order of magnitude higher computational efficiency. This paradigmatic approach enables not only a versatile usage of novel representations and the efficient computation of larger systems—all of high value to the FF community—but also the simple inclusion of further physical knowledge, such as higher-order information (e.g., Hessians, more complex partial differential equation constraints etc.), even beyond the presented FF domain.
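The core convenience described above is that forces follow from the energy model by differentiation, F = -dE/dx, without hand-derived gradients for every new descriptor. The sketch below illustrates this relation on a made-up pairwise energy, using central differences as a stand-in for the algorithmic differentiation (e.g., an autodiff framework) that the approach actually relies on.

```python
import numpy as np

def energy(x):
    """Toy Lennard-Jones-like energy of a configuration x with shape (n, 3)."""
    e, n = 0.0, len(x)
    for i in range(n):
        for j in range(i + 1, n):
            r = np.linalg.norm(x[i] - x[j])
            e += 4.0 * (r**-12 - r**-6)
    return e

def forces(x, h=1e-5):
    """Central-difference stand-in for autodiff: F = -dE/dx."""
    F = np.zeros_like(x)
    for idx in np.ndindex(*x.shape):
        xp, xm = x.copy(), x.copy()
        xp[idx] += h
        xm[idx] -= h
        F[idx] = -(energy(xp) - energy(xm)) / (2 * h)
    return F

x = np.array([[0.0, 0.0, 0.0], [1.1, 0.0, 0.0], [0.0, 1.2, 0.0]])  # toy geometry
print(forces(x))
```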

5 citations


Journal ArticleDOI
02 Jan 2022
TL;DR: This work explores machine learning methods, in particular bi-directional neural networks such as Bi-LSTMs, to increase the resolution of molecular dynamics trajectories on demand in a post-processing step, reaching interpolation accuracies of up to 10⁻⁴ Å on the MD17 dataset.
Abstract: Molecular dynamics (MD) simulations are a cornerstone in science, enabling the investigation of a system’s thermodynamics all the way to analyzing intricate molecular interactions. In general, creating extended molecular trajectories can be a computationally expensive process, for example, when running ab-initio simulations. Hence, repeating such calculations to either obtain more accurate thermodynamics or to get a higher resolution in the dynamics generated by a fine-grained quantum interaction can be time- and computational resource-consuming. In this work, we explore different machine learning methodologies to increase the resolution of MD trajectories on-demand within a post-processing step. As a proof of concept, we analyse the performance of bi-directional neural networks (NNs) such as neural ODEs, Hamiltonian networks, recurrent NNs and long short-term memories, as well as the uni-directional variants as a reference, for MD simulations (here: the MD17 dataset). We have found that Bi-LSTMs are the best performing models; by utilizing the local time-symmetry of thermostated trajectories they can even learn long-range correlations and display high robustness to noisy dynamics across molecular complexity. Our models can reach accuracies of up to 10⁻⁴ Å in trajectory interpolation, which leads to the faithful reconstruction of several unseen high-frequency molecular vibration cycles. This renders the learned and reference trajectories indistinguishable. The results reported in this work can serve (1) as a baseline for larger systems, as well as (2) for the construction of better MD integrators.

Journal ArticleDOI
TL;DR: This work introduces a method to automatically identify chemical moieties (molecular building blocks) from atomic representations, enabling a variety of applications beyond property prediction, which otherwise rely on expert knowledge.
Abstract: In recent years, the prediction of quantum mechanical observables with machine learning methods has become increasingly popular. Message-passing neural networks (MPNNs) solve this task by constructing atomic representations, from which the properties of interest are predicted. Here, we introduce a method to automatically identify chemical moieties (molecular building blocks) from such representations, enabling a variety of applications beyond property prediction, which otherwise rely on expert knowledge. The required representation can either be provided by a pretrained MPNN, or learned from scratch using only structural information. Beyond the data-driven design of molecular fingerprints, the versatility of our approach is demonstrated by enabling the selection of representative entries in chemical databases, the automatic construction of coarse-grained force fields, as well as the identification of reaction coordinates.
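The abstract does not spell out how atoms are grouped from their representations; purely as an illustration of the underlying idea, the sketch below clusters per-atom feature vectors with plain k-means and reads atoms sharing a cluster as belonging to the same moiety type. The feature vectors are random placeholders, and k-means is a stand-in, not the method's actual assignment step.

```python
import numpy as np

def kmeans(X, k, iters=50, seed=0):
    """Plain k-means on row vectors of X; returns a cluster label per row."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(0)
    return labels

# atom_features: per-atom representation vectors, e.g. taken from a pretrained
# MPNN; here random placeholders.  Shared labels are read as shared moiety type.
atom_features = np.random.default_rng(1).normal(size=(20, 8))
print(kmeans(atom_features, k=4))
```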

Journal ArticleDOI
TL;DR: In this paper, the authors propose to disentangle explanations by finding relevant subspaces in activation space that can be mapped to more abstract human-understandable concepts and enable a joint attribution on concepts and input features.
Abstract: Explainable AI transforms opaque decision strategies of ML models into explanations that are interpretable by the user, for example, identifying the contribution of each input feature to the prediction at hand. Such explanations, however, entangle the potentially multiple factors that enter into the overall complex decision strategy. We propose to disentangle explanations by finding relevant subspaces in activation space that can be mapped to more abstract human-understandable concepts and enable a joint attribution on concepts and input features. To automatically extract the desired representation, we propose new subspace analysis formulations that extend the principle of PCA and subspace analysis to explanations. These novel analyses, which we call principal relevant component analysis (PRCA) and disentangled relevant subspace analysis (DRSA), optimize relevance of projected activations rather than the more traditional variance or kurtosis. This enables a much stronger focus on subspaces that are truly relevant for the prediction and the explanation, in particular, ignoring activations or concepts to which the prediction model is invariant. Our approach is general enough to work alongside common attribution techniques such as Shapley Value, Integrated Gradients, or LRP. Our proposed methods prove to be practically useful and compare favorably to the state of the art as demonstrated on benchmarks and three use cases.
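To make the "relevance instead of variance" objective tangible, here is a small numpy sketch of a PRCA-like computation under the simplifying assumption that per-sample relevance can be written as an inner product between the activation and a context vector; the context vectors below are random placeholders, and the closed-form eigenvalue solution reflects our reading of the objective, not the authors' implementation.

```python
import numpy as np

def prca(A, C, k=1):
    """A: layer activations (n x d); C: per-sample context vectors such that the
    relevance of sample i is ~ <a_i, c_i>.  Projecting onto a subspace U gives
    sum_i <U U^T a_i, c_i>, maximized by the top eigenvectors of the symmetrized
    cross-covariance, so relevance rather than variance drives the subspace."""
    M = 0.5 * (A.T @ C + C.T @ A)
    w, V = np.linalg.eigh(M)
    return V[:, np.argsort(w)[::-1][:k]]     # top-k relevant directions

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 16))
C = rng.normal(size=(100, 16))
U = prca(A, C, k=2)
print(U.shape)                               # (16, 2): basis of the relevant subspace
```

Setting C equal to A recovers ordinary PCA directions (up to centering), which makes the contrast with variance maximization explicit.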

Journal ArticleDOI
TL;DR: This work proposes a modified attention mechanism adapted to the underlying physics, which makes it possible to recover the relevant non-local quantum mechanical effects, and introduces spherical harmonic coordinates (SPHCs) to reflect higher-order geometric information for each atom in a molecule, enabling a non-local formulation of attention in the SPHC space.
Abstract: The application of machine learning methods in quantum chemistry has enabled the study of numerous chemical phenomena, which are computationally intractable with traditional ab-initio methods. However, some quantum mechanical properties of molecules and materials depend on non-local electronic effects, which are often neglected due to the difficulty of modeling them efficiently. This work proposes a modified attention mechanism adapted to the underlying physics, which makes it possible to recover the relevant non-local effects. Namely, we introduce spherical harmonic coordinates (SPHCs) to reflect higher-order geometric information for each atom in a molecule, enabling a non-local formulation of attention in the SPHC space. Our proposed model So3krates – a self-attention based message passing neural network – uncouples geometric information from atomic features, making them independently amenable to attention mechanisms. We show that in contrast to other published methods, So3krates is able to describe non-local quantum mechanical effects over arbitrary length scales. Further, we find evidence that the inclusion of higher-order geometric correlations increases data efficiency and improves generalization. So3krates matches or exceeds state-of-the-art performance on popular benchmarks, notably, requiring a significantly lower number of parameters (0.25–0.4x) while at the same time giving a substantial speedup (6–14x for training and 2–11x for inference) compared to other models.


Journal ArticleDOI
TL;DR: The Sphere project as mentioned in this paper studied the evolution of knowledge in the early modern period by studying a collection of 359 textbook editions published between 1472 and 1650 which were used to teach geocentric cosmology and astronomy at European universities.
Abstract: The Sphere project stands at the intersection of the humanities and information sciences. The project aims to better understand the evolution of knowledge in the early modern period by studying a collection of 359 textbook editions published between 1472 and 1650 which were used to teach geocentric cosmology and astronomy at European universities. The relatively large size of the corpus at hand presents a challenge for traditional historical approaches, but provides a great opportunity to explore such a large collection of historical data using computational approaches. In this paper, we present a review of the different computational approaches, used in this project over the period of the last three years, that led to a better understanding of the dynamics of knowledge transfer and transformation in the early modern period.

Journal ArticleDOI
TL;DR: In this paper, the authors performed a dynamic analysis of human reactive lymphoid tissue using confocal fluorescent laser microscopy in combination with machine learning and identified correlations of follicular dendritic cell movement and the behavior of lymphocytes in the microenvironment.
Abstract: Histological sections of the lymphatic system are usually the basis of static (2D) morphological investigations. Here, we performed a dynamic (4D) analysis of human reactive lymphoid tissue using confocal fluorescent laser microscopy in combination with machine learning. Based on tracks for T-cells (CD3), B-cells (CD20), follicular T-helper cells (PD1) and optical flow of follicular dendritic cells (CD35), we put forward the first quantitative analysis of movement-related and morphological parameters within human lymphoid tissue. We identified correlations of follicular dendritic cell movement and the behavior of lymphocytes in the microenvironment. In addition, we investigated the value of movement and/or morphological parameters for a precise definition of cell types (CD clusters). CD clusters could be determined based on movement and/or morphology. Differentiating between CD3- and CD20-positive cells is most challenging, and long-term movement characteristics are indispensable. We propose morphological and movement-related prototypes of cell entities applying machine learning models. Finally, we define, beyond CD clusters, new subgroups within lymphocyte entities based on long-term movement characteristics. In conclusion, we showed that the combination of 4D imaging and machine learning is able to define characteristics of lymphocytes not visible in 2D histology.

Journal ArticleDOI
TL;DR: While some methods in molecular pathology would already be unthinkable without AI, it remains to be shown how AI will also be able to help with difficult histomorphological differential diagnoses in the future.


Journal ArticleDOI
TL;DR: In this article, the authors consider the broad class of Nyström-type methods to construct preconditioners based on successively more sophisticated low-rank approximations of the original kernel matrix, each of which provides a different set of computational trade-offs.
Abstract: Kernel machines have sustained continuous progress in the field of quantum chemistry. In particular, they have proven to be successful in the low-data regime of force field reconstruction. This is because many equivariances and invariances due to physical symmetries can be incorporated into the kernel function to compensate for much larger data sets. So far, the scalability of kernel machines has however been hindered by their quadratic memory and cubic runtime complexity in the number of training points. While it is known that iterative Krylov subspace solvers can overcome these burdens, their convergence crucially relies on effective preconditioners, which are elusive in practice. Effective preconditioners need to partially presolve the learning problem in a computationally cheap and numerically robust manner. Here, we consider the broad class of Nyström-type methods to construct preconditioners based on successively more sophisticated low-rank approximations of the original kernel matrix, each of which provides a different set of computational trade-offs. All considered methods aim to identify a representative subset of inducing (kernel) columns to approximate the dominant kernel spectrum.
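A generic sketch of the overall recipe, assuming the simplest possible column selection (uniformly random inducing columns) rather than the more sophisticated schemes the paper compares: build a Nyström approximation from m kernel columns, turn it into a preconditioner via the Woodbury identity, and use it inside preconditioned conjugate gradients to solve the regularized kernel system.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    d2 = ((X[:, None] - Y[None]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def nystrom_preconditioner(K_nm, K_mm, lam):
    """Return a function applying (lam*I + K_nm K_mm^{-1} K_nm^T)^{-1} (Woodbury)."""
    inner = lam * K_mm + K_nm.T @ K_nm
    def apply(r):
        return (r - K_nm @ np.linalg.solve(inner, K_nm.T @ r)) / lam
    return apply

def pcg(A, b, M_inv, tol=1e-8, maxit=500):
    """Preconditioned conjugate gradients for A x = b."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv(r)
    p = z.copy()
    rz = r @ z
    for _ in range(maxit):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol:
            break
        z = M_inv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x

rng = np.random.default_rng(0)
X, y = rng.normal(size=(300, 4)), rng.normal(size=300)   # placeholder training data
lam, m = 1e-3, 50
K = rbf_kernel(X, X)
idx = rng.choice(len(X), m, replace=False)               # random inducing columns
M_inv = nystrom_preconditioner(K[:, idx], K[np.ix_(idx, idx)], lam)
alpha = pcg(K + lam * np.eye(len(X)), y, M_inv)
print(np.linalg.norm((K + lam * np.eye(len(X))) @ alpha - y))   # residual of the solve
```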

Journal ArticleDOI
TL;DR: This work performs a suitable diffeomorphic coordinate transformation and then gradient ascent in these coordinates to generate counterfactuals which are classified with high confidence as a specified target class.
Abstract: Counterfactuals can explain classification decisions of neural networks in a human interpretable way. We propose a simple but effective method to generate such counterfactuals. More specifically, we perform a suitable diffeomorphic coordinate transformation and then perform gradient ascent in these coordinates to find counterfactuals which are classified with great confidence as a specified target class. We propose two methods to leverage generative models to construct such suitable coordinate systems that are either exactly or approximately diffeomorphic. We analyze the generation process theoretically using Riemannian differential geometry and validate the quality of the generated counterfactuals using various qualitative and quantitative measures.
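A toy sketch of the recipe, with an exactly diffeomorphic (linear, invertible) "generative" map and a logistic classifier standing in for the learned generative model and deep network of the actual method; all numbers are made up.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.normal(size=(2, 2)) + 2 * np.eye(2)   # invertible toy "decoder": x = A z
w, b = np.array([1.5, -2.0]), 0.1             # logistic classifier acting on x

def p_target(z):                              # probability of the target class
    return 1.0 / (1.0 + np.exp(-(w @ (A @ z) + b)))

def grad_z(z):                                # d p / d z via the chain rule
    p = p_target(z)
    return p * (1 - p) * (A.T @ w)

x0 = np.array([-1.0, 1.0])                    # original input, low target probability
z = np.linalg.solve(A, x0)                    # encode into generative coordinates
for _ in range(200):                          # gradient ascent in z
    z += 0.1 * grad_z(z)
    if p_target(z) > 0.99:
        break
x_cf = A @ z                                   # decode the counterfactual
print(x0, x_cf, p_target(z))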

Journal ArticleDOI
TL;DR: This work introduces DORA (Data-agnOstic Representation Analysis), the first automatic data-agnostic method for the detection of potentially contaminated representations in Deep Neural Networks, and shows that contaminated representations found by DORA can be used to detect infected samples in any given dataset.
Abstract: Deep Neural Networks (DNNs) excel at learning complex abstractions within their internal representations. However, the concepts they learn remain opaque, a problem that becomes particularly acute when models unintentionally learn spurious correlations. In this work, we present DORA (Data-agnOstic Representation Analysis), the first data-agnostic framework for analyzing the representational space of DNNs. Central to our framework is the proposed Extreme-Activation (EA) distance measure, which assesses similarities between representations by analyzing their activation patterns on data points that cause the highest level of activation. As spurious correlations often manifest in features of data that are anomalous to the desired task, such as watermarks or artifacts, we demonstrate that internal representations capable of detecting such artifactual concepts can be found by analyzing relationships within neural representations. We validate the EA metric quantitatively, demonstrating its effectiveness both in controlled scenarios and real-world applications. Finally, we provide practical examples from popular Computer Vision models to illustrate that representations identified as outliers using the EA metric often correspond to undesired and spurious concepts.
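A much-simplified sketch of the kind of analysis the abstract describes: characterize each neuron by the activation pattern that all neurons exhibit on its own most-activating inputs, compare these patterns, and flag representations that sit far from the rest. The distance below is a plain correlation distance on synthetic activations, not the actual Extreme-Activation metric.

```python
import numpy as np

def ea_style_distance(acts, top=10):
    """acts: (n_samples, n_neurons).  Each neuron j gets a signature: the mean
    response of all neurons on the samples that excite j most; signatures are
    then compared via correlation distance."""
    n = acts.shape[1]
    sig = np.zeros((n, n))
    for j in range(n):
        top_idx = np.argsort(acts[:, j])[-top:]
        sig[j] = acts[top_idx].mean(0)
    sig = (sig - sig.mean(1, keepdims=True)) / (sig.std(1, keepdims=True) + 1e-9)
    return 1.0 - sig @ sig.T / sig.shape[1]    # pairwise distance between neurons

rng = np.random.default_rng(0)
acts = rng.normal(size=(500, 32))
acts[:, 7] += 3 * (rng.random(500) < 0.02)     # a neuron reacting to a rare artifact
D = ea_style_distance(acts)
print(np.argmax(D.mean(1)))                    # index of the most atypical representation
```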


Journal ArticleDOI
18 Oct 2022 - PLOS ONE
TL;DR: This work compares the impact of different training procedures, including recently established self-supervised pretraining methods based on contrastive learning, and indicates that models initialized from ImageNet pretraining show a significant increase in performance, generalization and robustness to image distortions.
Abstract: There is an increasing number of medical use cases where classification algorithms based on deep neural networks reach performance levels that are competitive with human medical experts. To alleviate the challenges of small dataset sizes, these systems often rely on pretraining. In this work, we aim to assess the broader implications of these approaches in order to better understand what type of pretraining works reliably (with respect to performance, robustness, learned representation etc.) in practice and what type of pretraining dataset is best suited to achieve good performance in small target dataset size scenarios. Considering diabetic retinopathy grading as an exemplary use case, we compare the impact of different training procedures including recently established self-supervised pretraining methods based on contrastive learning. To this end, we investigate different aspects such as quantitative performance, statistics of the learned feature representations, interpretability and robustness to image distortions. Our results indicate that models initialized from ImageNet pretraining show a significant increase in performance, generalization and robustness to image distortions. In particular, self-supervised models show further benefits over supervised models. Self-supervised models initialized from ImageNet pretraining not only show higher performance, they also reduce overfitting to large lesions and better take into account minute lesions indicative of the progression of the disease. Understanding the effects of pretraining in a broader sense that goes beyond simple performance comparisons is of crucial importance for the broader medical imaging community beyond the use case considered in this work.

Journal ArticleDOI
TL;DR: This work introduces a new machine learning technique for quantifying the structure of responses to single-pulse intracranial electrical brain stimulation, which dramatically simplifies the study of CCEP shapes and may also be applied in a wide range of other settings involving event-triggered data.
Abstract: Single-pulse electrical stimulation in the nervous system, often called cortico-cortical evoked potential (CCEP) measurement, is an important technique to understand how brain regions interact with one another. Voltages are measured from implanted electrodes in one brain area while stimulating another with brief current impulses separated by several seconds. Historically, researchers have tried to understand the significance of evoked voltage polyphasic deflections by visual inspection, but no general-purpose tool has emerged to understand their shapes or describe them mathematically. We describe and illustrate a new technique to parameterize brain stimulation data, where voltage response traces are projected into one another using a semi-normalized dot product. The length of timepoints from stimulation included in the dot product is varied to obtain a temporal profile of structural significance, and the peak of the profile uniquely identifies the duration of the response. Using linear kernel PCA, a canonical response shape is obtained over this duration, and then single-trial traces are parameterized as a projection of this canonical shape with a residual term. Such parameterization allows for dissimilar trace shapes from different brain areas to be directly compared by quantifying cross-projection magnitudes, response duration, canonical shape projection amplitudes, signal-to-noise ratios, explained variance, and statistical significance. Artifactual trials are automatically identified by outliers in sub-distributions of cross-projection magnitude, and rejected. This technique, which we call “Canonical Response Parameterization” (CRP), dramatically simplifies the study of CCEP shapes, and may also be applied in a wide range of other settings involving event-triggered data. Author summary: We introduce a new machine learning technique for quantifying the structure of responses to single-pulse intracranial electrical brain stimulation. This approach allows voltage response traces of very different shape to be compared with one another. A tool like this has been needed to replace the status quo, where researchers may understand their data in terms of discovered structure rather than in terms of a pre-assigned, hand-picked feature. The method compares single-trial responses pairwise to understand if there is a reproducible shape and how long it lasts. When significant structure is identified, the shape underlying it is isolated and each trial is parameterized in terms of this shape. This simple parameterization enables quantification of statistical significance, signal-to-noise ratio, explained variance, and average voltage of the response. Differently-shaped voltage traces from any setting can be compared with any other in a succinct mathematical framework. This versatile tool to quantify single-pulse stimulation data should facilitate a blossoming in the study of brain connectivity using implanted electrodes.
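A compressed sketch of the parameterization idea on synthetic trial data: extract a canonical shape (here simply the first principal component of the trials, standing in for the linear-kernel-PCA step), project every trial onto it, and keep the residual. The duration estimation via cross-projection profiles and the statistical tests of the full method are omitted.

```python
import numpy as np

def parameterize_trials(V):
    """V: (n_trials, n_time) voltage traces.  Returns a unit-norm canonical
    shape, per-trial projection amplitudes and the overall explained variance."""
    _, _, Vt = np.linalg.svd(V - V.mean(0), full_matrices=False)
    canonical = Vt[0]                           # first principal component
    alpha = V @ canonical                       # single-trial amplitudes
    resid = V - np.outer(alpha, canonical)
    explained = 1.0 - (resid ** 2).sum() / (V ** 2).sum()
    return canonical, alpha, explained

# synthetic CCEP-like trials: one damped oscillation, variable amplitude, noise
t = np.linspace(0.0, 1.0, 400)
shape = np.exp(-5 * t) * np.sin(2 * np.pi * 12 * t)
rng = np.random.default_rng(0)
amps = rng.uniform(0.5, 1.5, size=(40, 1))
V = amps * shape + 0.05 * rng.normal(size=(40, 400))
canonical, alpha, explained = parameterize_trials(V)
print(alpha[:5], round(explained, 3))
```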

TL;DR: In this article, the authors proposed a generalized subgraph attribution method for explaining graph neural networks (GNNs), building on GNN-LRP (layer-wise relevance propagation for GNNs).
Abstract: Explaining graph neural networks (GNNs) has become more and more important recently. Higher-order interpretation schemes, such as GNN-LRP (layer-wise relevance propagation for GNN), emerged as powerful tools for unraveling how different features interact, thereby contributing to explaining GNNs. GNN-LRP gives a relevance attribution of walks between nodes at each layer, and the subgraph attribution is expressed as a sum over exponentially many such walks. In this work, we demonstrate that such exponential complexity can be avoided. In particular, we propose novel algorithms that enable attributing subgraphs with GNN-LRP in linear time (w.r.t. the network depth). Our algorithms are derived via message passing techniques that make use of the distributive property, thereby directly computing quantities for higher-order explanations. We further adapt our efficient algorithms to compute a generalization of subgraph attributions that also takes into account the neighboring graph features. Experimental results show the significant acceleration of the proposed algorithms and demonstrate the high usefulness and scalability of our novel generalized subgraph attribution method.
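The following toy example illustrates the distributive-property trick mentioned above, under the simplifying assumption that a walk's relevance factorizes into per-layer transition factors: summing over all walks confined to a subgraph then becomes a chain of masked matrix products rather than an exponential enumeration. Factor values are random placeholders; the real algorithm operates on the quantities produced by GNN-LRP itself.

```python
import numpy as np
from itertools import product

rng = np.random.default_rng(0)
n, L = 6, 4                                   # nodes, layers
T = rng.random(size=(L, n, n))                # per-layer walk factors (toy values)
s = rng.random(n)                             # source relevance at the input layer
S = [0, 2, 3]                                 # subgraph to attribute

def brute_force(T, s, S):
    """Sum relevance over all walks staying in S by explicit enumeration."""
    total = 0.0
    for walk in product(S, repeat=T.shape[0] + 1):
        r = s[walk[0]]
        for l in range(T.shape[0]):
            r *= T[l, walk[l], walk[l + 1]]
        total += r
    return total

def message_passing(T, s, S):
    """Same sum via masked matrix products (linear in the number of layers)."""
    mask = np.zeros(len(s))
    mask[S] = 1.0
    r = s * mask                              # walks must start inside S
    for l in range(T.shape[0]):
        r = (r @ T[l]) * mask                 # extend by one layer, stay inside S
    return r.sum()

print(brute_force(T, s, S), message_passing(T, s, S))   # identical up to float error
```

Both routines return the same number; only the masked message-passing variant scales linearly with depth.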

Journal ArticleDOI
TL;DR: Analysis of cerebrospinal fluid is essential for the diagnostic workup of patients with neurological diseases, but the current gold standard for differential cell typing, based on microscopic examination by specialised technicians and neuropathologists, is time-consuming, labour-intensive and subjective, as mentioned in this paper.
Abstract: Analysis of cerebrospinal fluid (CSF) is essential for diagnostic workup of patients with neurological diseases and includes differential cell typing. The current gold standard is based on microscopic examination by specialised technicians and neuropathologists, which is time‐consuming, labour‐intensive and subjective.

Proceedings ArticleDOI
21 Feb 2022
TL;DR: This brief note will touch upon selected recent directions of research where machine learning techniques help to analyze brain measurements from EEG, fNIRS and fMRI typically targeting BCI applications.
Abstract: Accurately decoding brain activities is both a challenge for machine learning and a potential vehicle for gaining insight into complex cognitive brain states and their dynamics. In this brief note, we will touch upon selected recent directions of our research where machine learning techniques help to analyze brain measurements from EEG, fNIRS and fMRI typically targeting BCI applications. The steps summarized here are owed mainly to the activities of members of the BBCI team and their collaborators. Clearly, unavoidably and intentionally, this abstract will have a high overlap with our own prior contributions, as it reports about and discusses these novel ideas and directions.

Journal Article
TL;DR: This work constructs a strongly simplified representation of turbulence by using the Gledzer-Ohkitani-Yamada shell model and proposes an approach that aims to reconstruct statistical properties of turbulence, such as the self-similar inertial-range scaling, for which encouraging experimental results could be achieved.
Abstract: The application of machine learning (ML) techniques, especially neural networks, has seen tremendous success at processing images and language. This is because we often lack formal models to understand visual and audio input, so here neural networks can unfold their abilities as they can model solely from data. In the field of physics we typically have models that describe natural processes reasonably well on a formal level. Nonetheless, in recent years, ML has also proven useful in these realms, be it by speeding up numerical simulations or by improving accuracy. One important and so far unsolved problem in classical physics is understanding turbulent fluid motion. In this work we construct a strongly simplified representation of turbulence by using the Gledzer-Ohkitani-Yamada (GOY) shell model. With this system we intend to investigate the potential of ML-supported and physics-constrained small-scale turbulence modelling. Instead of standard supervised learning we propose an approach that aims to reconstruct statistical properties of turbulence such as the self-similar inertial-range scaling, where we could achieve encouraging experimental results. Furthermore we discuss pitfalls when combining machine learning with differential equations.
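As a small, self-contained example of the kind of statistical target mentioned above (rather than of the GOY model or the learning setup itself), the snippet below recovers an inertial-range scaling exponent from synthetic shell amplitudes by a log-log fit; the expected Kolmogorov-like value is about -1/3.

```python
import numpy as np

# Synthetic shell amplitudes |u_n| ~ k_n^(-1/3) with multiplicative noise;
# the scaling exponent is estimated from a straight-line fit in log-log space,
# a statistic a learned surrogate should reproduce even if single trajectories differ.
k = 2.0 ** np.arange(4, 18)                            # shell wavenumbers
rng = np.random.default_rng(0)
u = k ** (-1.0 / 3.0) * np.exp(0.05 * rng.normal(size=k.size))
slope, _ = np.polyfit(np.log(k), np.log(np.abs(u)), 1)
print(slope)                                           # close to -1/3 here
```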