Machine Learning for Integrating Data in Biology and Medicine: Principles, Practice, and Opportunities
Marinka Zitnik, Francis Nguyen, Bo Wang, Jure Leskovec, Anna Goldenberg, Michael M. Hoffman
TLDR
The principles of data integration are described, current methods and available implementations are discussed, and examples of successful data integration in biology and medicine are provided.
Abstract:
New technologies have enabled the investigation of biology and human health at an unprecedented scale and in multiple dimensions. These dimensions include a myriad of properties describing genome, epigenome, transcriptome, microbiome, phenotype, and lifestyle. No single data type, however, can capture the complexity of all the factors relevant to understanding a phenomenon such as a disease. Integrative methods that combine data from multiple technologies have thus emerged as critical statistical and computational approaches. The key challenge in developing such approaches is the identification of effective models to provide a comprehensive and relevant systems view. An ideal method can answer a biological or medical question, identifying important features and predicting outcomes, by harnessing heterogeneous data across several dimensions of biological variation. In this Review, we describe the principles of data integration and discuss current methods and available implementations. We provide examples of successful data integration in biology and medicine. Finally, we discuss current challenges in biomedical integrative methods and our perspective on the future development of the field.
Citations
Journal ArticleDOI
Information fusion as an integrative cross-cutting enabler to achieve robust, explainable, and trustworthy medical artificial intelligence
TL;DR: In this paper, the authors describe three frontier research areas facilitating ethically responsible and legally compliant medical AI: complex networks and their inference, graph causal models and counterfactuals, and explainability methods.
Journal ArticleDOI
Graph representation learning in biomedicine and healthcare
Journal ArticleDOI
A Review of Integrative Imputation for Multi-Omics Datasets
Meng Song, Jonathan Greenbaum, Joseph Luttrell, Weihua Zhou, Chong Wu, Hui Shen, Ping Gong, Chaoyang Zhang, Hong-Wen Deng
TL;DR: An overview of the currently available imputation methods for handling missing values in bioinformatics data, with an emphasis on multi-omics imputation, is provided, along with a perspective on how deep learning methods might be developed for the integrative imputation of multi-omics datasets.
Journal ArticleDOI
Artificial intelligence for multimodal data integration in oncology.
Jana Lipkova, Richard Chen, Bowen Chen, Ming Y. Lu, Matteo Barbieri, Daniel Shao, Anurag J. Vaidya, Chengkuan Chen, Luoting Zhuang, Drew F. K. Williamson, Muhammad Shaban, Tiffany Y. Chen, Faisal Mahmood
TL;DR: In this article, the authors present a synopsis of AI methods and strategies for multimodal data fusion and association discovery, and outline approaches for AI interpretability and directions for AI-driven exploration through multimodal data interconnections.
Journal ArticleDOI
Drug-target interaction prediction by integrating multiview network data.
TL;DR: A clustering-based multiview method for drug-target interaction (DTI) prediction is proposed. It is formulated as an optimization problem that identifies clusters in both the drug similarity network and the target protein similarity network while connecting the clusters that share more known DTIs.
References
Journal ArticleDOI
Basic Local Alignment Search Tool
TL;DR: A new approach to rapid sequence comparison, basic local alignment search tool (BLAST), directly approximates alignments that optimize a measure of local similarity, the maximal segment pair (MSP) score.
Journal ArticleDOI
Gene Ontology: tool for the unification of biology
M Ashburner, Catherine A. Ball, Judith A. Blake, David Botstein, Heather Butler, J. M. Cherry, Allan Peter Davis, Kara Dolinski, Selina S. Dwight, J.T. Eppig, Midori A. Harris, David P. Hill, Laurie Issel-Tarver, Andrew Kasarskis, Suzanna E. Lewis, John C. Matese, Joel E. Richardson, M. Ringwald, Gerald M. Rubin, Gavin Sherlock
TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
Journal ArticleDOI
A tutorial on hidden Markov models and selected applications in speech recognition
TL;DR: In this paper, the authors provide an overview of the basic theory of hidden Markov models (HMMs) as originated by L.E. Baum and T. Petrie (1966) and give practical details on methods of implementation of the theory along with a description of selected applications of HMMs to distinct problems in speech recognition.
Journal ArticleDOI
Reducing the Dimensionality of Data with Neural Networks
TL;DR: In this article, an effective way of initializing the weights that allows deep autoencoder networks to learn low-dimensional codes that work much better than principal components analysis as a tool to reduce the dimensionality of data is described.
Journal ArticleDOI
An integrated encyclopedia of DNA elements in the human genome
TL;DR: The Encyclopedia of DNA Elements project provides new insights into the organization and regulation of human genes and the human genome, and is an expansive resource of functional annotations for biomedical research.