Machine learning applications in microbial ecology, human microbiome studies, and environmental monitoring.
TLDR: The authors review current trends in machine learning applications in microbial ecology, along with some of the important challenges and opportunities for broader application of machine learning to understanding microbial communities.
Abstract:
Advances in nucleic acid sequencing technology have expanded our ability to profile microbial diversity. The resulting large datasets of taxonomic and functional diversity are key to better understanding microbial ecology. Machine learning has proven to be a useful approach for analyzing microbial community data and making predictions about outcomes, including human and environmental health. Applied to microbial community profiles, it has been used to predict disease states in human health, to predict environmental quality and the presence of contamination, and as trace evidence in forensics. Machine learning is appealing as a powerful tool that can provide deep insights into microbial communities and identify patterns in microbial community data. However, machine learning models are often used as black boxes to predict a specific outcome, with little understanding of how they arrive at their predictions. Complex machine learning algorithms often favor accuracy and performance at the expense of interpretability. To bring machine learning into more translational microbiome research and to strengthen our ability to extract meaningful biological information, models must be interpretable. Here we review current trends in machine learning applications in microbial ecology, as well as some of the important challenges and opportunities for broader application of machine learning to understanding microbial communities.
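The contrast the abstract draws between black-box prediction and interpretability can be illustrated with a small sketch. The code below is not taken from the paper: it builds a purely synthetic relative-abundance table in which one hypothetical taxon is enriched in one class, fits a simple nearest-centroid classifier, and then applies permutation importance (one model-agnostic interpretation technique among many) to ask which taxon the predictions depend on. All names, sizes, and the enrichment value are assumptions chosen for illustration.

```python
import random


def make_samples(n_per_class, n_taxa, signal_taxon, rng):
    """Synthetic relative-abundance profiles (assumption: not real data).

    Class 1 samples have one taxon enriched before normalization.
    """
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            counts = [rng.uniform(1.0, 2.0) for _ in range(n_taxa)]
            if label == 1:
                counts[signal_taxon] += 3.0  # hypothetical enrichment
            total = sum(counts)
            X.append([c / total for c in counts])  # close to relative abundance
            y.append(label)
    return X, y


def fit_centroids(X, y):
    """Nearest-centroid model: store the mean profile of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict(centroids, x):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def sqdist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda lab: sqdist(centroids[lab]))


def accuracy(centroids, X, y):
    return sum(predict(centroids, x) == lab for x, lab in zip(X, y)) / len(y)


def permutation_importance(centroids, X, y, rng, n_repeats=10):
    """Mean drop in accuracy when one taxon's column is shuffled across samples."""
    base = accuracy(centroids, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            col = [x[j] for x in X]
            rng.shuffle(col)
            X_perm = [x[:j] + [v] + x[j + 1:] for x, v in zip(X, col)]
            drops.append(base - accuracy(centroids, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


rng = random.Random(0)  # fixed seed so the sketch is reproducible
X_train, y_train = make_samples(40, 6, signal_taxon=2, rng=rng)
X_test, y_test = make_samples(40, 6, signal_taxon=2, rng=rng)

model = fit_centroids(X_train, y_train)
test_accuracy = accuracy(model, X_test, y_test)
importances = permutation_importance(model, X_test, y_test, rng)
top_taxon = max(range(len(importances)), key=importances.__getitem__)

print("held-out accuracy:", test_accuracy)
for j, imp in enumerate(importances):
    print(f"taxon {j}: mean accuracy drop when permuted = {imp:.3f}")
```

The accuracy alone says only that the classes are separable; the per-taxon accuracy drops indicate which feature drives the prediction. Because relative abundances sum to one, enriching a single taxon also lowers the proportions of the others, so importance scores on compositional data should be read with some caution.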