Reply to: Examining microbe-metabolite correlations by linear methods
Summary
Introduction
- Lawrence Berkeley National Laboratory, Recent Work. Title: Reply to: Examining microbe-metabolite correlations by linear methods.
- The authors have found that MMvec is a powerful discovery tool, as demonstrated by the other real datasets.
- Matters Arising, Nature Methods: ... the authors evaluated in the original article.
- It is critical that the authors provide accurate guidance to the community so that scenarios where one method works better than others are well understood.
- While there may be scenarios where linear methods outperform neural networks, the authors show that there are scenarios where neural networks outperform linear methods.
Methods
- The simulations were created using the generative form of MMvec; the microbe and metabolite factor loadings were drawn at random from a normal distribution and used as the MMvec parameters.
- Microbial counts were then drawn from a multinomial logistic normal distribution and fed into MMvec to generate the metabolite counts.
- To identify scenarios where CLR correlations underperformed relative to MMvec, the authors used Bayesian optimization to tune the distributions used to generate the simulations.
- The CLR-transformed correlations suggested by Quinn and Erb were benchmarked on the desert biocrust soils dataset using the R scripts provided in ref.
- Further information on research design is available in the Nature Research Reporting Summary linked to this article.
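The simulation steps listed above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' code: the function name `simulate_mmvec_counts`, the unit-variance normal priors, the sequencing depths, and the choice to form each sample's metabolite distribution as a microbe-weighted mixture of softmax conditionals are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def simulate_mmvec_counts(n_samples=20, n_microbes=8, n_metabolites=10,
                          latent_dim=3, microbe_depth=1000,
                          metabolite_depth=2000, seed=0):
    """Hypothetical sketch of an MMvec-style generative simulation."""
    rng = np.random.default_rng(seed)

    # Factor loadings drawn from a normal distribution (unit variance is assumed).
    U = rng.normal(0.0, 1.0, size=(n_microbes, latent_dim))
    V = rng.normal(0.0, 1.0, size=(n_metabolites, latent_dim))

    # Row i is an assumed conditional distribution P(metabolite | microbe i).
    cond = softmax(U @ V.T, axis=1)

    # Multinomial logistic normal: normal logits -> softmax -> multinomial counts.
    logits = rng.normal(0.0, 1.0, size=(n_samples, n_microbes))
    microbe_props = softmax(logits, axis=1)
    microbes = np.stack([rng.multinomial(microbe_depth, p) for p in microbe_props])

    # Mix the conditionals by observed microbe proportions, then draw metabolite counts.
    observed = microbes / microbes.sum(axis=1, keepdims=True)
    mix = observed @ cond
    metabolites = np.stack([rng.multinomial(metabolite_depth, p) for p in mix])
    return microbes, metabolites, cond
```

Calling `simulate_mmvec_counts(seed=1)` returns a samples-by-microbes count table, a samples-by-metabolites count table, and the conditional probability matrix that can serve as ground truth when benchmarking a method.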
Author contributions
- J.T.M. performed all analyses and wrote the manuscript.
- All authors have contributed edits to the manuscript.
Additional information
- Supplementary information is available for this paper at https://doi.org/10.1038/.
- Life sciences study design, randomization: Randomization was not necessary, since the data was simulated, not collected.
Frequently Asked Questions (9)
Q2. How were the metabolite counts generated?
Microbial counts were then drawn from a multinomial logistic normal distribution and fed into MMvec to generate the metabolite counts.
Q3. Which taxa were removed from each study?
Taxa that appeared in fewer than 10 samples for each study were removed, since there are fewer samples than degrees of freedom in the model to infer these microbes' co-occurrence patterns.
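The prevalence filter described in this answer could look like the sketch below. It assumes a samples-by-taxa count matrix and treats any nonzero count as "appearing" in a sample; the function name `filter_rare_taxa` and that presence definition are illustrative assumptions, not taken from the authors' scripts.

```python
import numpy as np


def filter_rare_taxa(counts, min_samples=10):
    """Drop taxa (columns) observed in fewer than `min_samples` samples (rows)."""
    counts = np.asarray(counts)
    # Prevalence = number of samples in which the taxon has a nonzero count.
    prevalence = (counts > 0).sum(axis=0)
    keep = prevalence >= min_samples
    return counts[:, keep], keep
```

The returned boolean mask makes it easy to apply the same selection to taxon labels or to a matching metadata table.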
Q4. How were the simulations created?
The simulations were created using the generative form of MMvec; the microbe and metabolite factor loadings were drawn at random from a normal distribution and used as the MMvec parameters.
Q8. How were scenarios identified in which CLR correlations underperform MMvec?
To identify scenarios where CLR correlations underperformed relative to MMvec, the authors used Bayesian optimization to tune the distributions used to generate the simulations.
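For readers unfamiliar with the linear baseline, a CLR-transformed cross-correlation can be sketched as follows. This is an illustrative Python version, not the R scripts used in the benchmark: the pseudocount of 1, the use of Pearson correlation, and the choice to transform the microbe and metabolite tables separately are assumptions, and the function names are hypothetical.

```python
import numpy as np


def clr(counts, pseudocount=1.0):
    """Centered log-ratio transform of each row (sample); pseudocount is assumed."""
    x = np.asarray(counts, dtype=float) + pseudocount
    logx = np.log(x)
    # Subtracting the row mean of the logs divides by the geometric mean.
    return logx - logx.mean(axis=1, keepdims=True)


def clr_cross_correlation(microbes, metabolites, pseudocount=1.0):
    """Pearson correlation between every CLR microbe and every CLR metabolite."""
    a = clr(microbes, pseudocount)
    b = clr(metabolites, pseudocount)
    # Center each feature across samples.
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    a_norm = np.sqrt((a ** 2).sum(axis=0))
    b_norm = np.sqrt((b ** 2).sum(axis=0))
    # Features with zero variance would produce NaN here.
    return (a.T @ b) / np.outer(a_norm, b_norm)
```

The result is a microbes-by-metabolites matrix of correlations that can be ranked and compared against a known ground truth in simulated data.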