
Showing papers by "Helsinki Institute for Information Technology" published in 2020


Proceedings Article
06 Apr 2020
TL;DR: This paper describes a simple technique to analyze Generative Adversarial Networks and create interpretable controls for image synthesis, and shows that BigGAN can be controlled with layer-wise inputs in a StyleGAN-like manner.
Abstract: This paper describes a simple technique to analyze Generative Adversarial Networks (GANs) and create interpretable controls for image synthesis, such as change of viewpoint, aging, lighting, and time of day. We identify important latent directions based on Principal Components Analysis (PCA) applied either in latent space or feature space. Then, we show that a large number of interpretable controls can be defined by layer-wise perturbation along the principal directions. Moreover, we show that BigGAN can be controlled with layer-wise inputs in a StyleGAN-like manner. We show results on different GANs trained on various datasets, and demonstrate good qualitative matches to edit directions found through earlier supervised approaches.

482 citations
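The PCA-based procedure described above can be sketched in a few lines of NumPy. This is an illustrative stand-in (random vectors in place of real GAN latents, and no generator network), not the paper's implementation:

```python
import numpy as np

def principal_directions(latents, n_components=3):
    """PCA via SVD on sampled latent vectors; rows of vt are unit-norm
    principal directions sorted by explained variance."""
    mu = latents.mean(axis=0)
    _, _, vt = np.linalg.svd(latents - mu, full_matrices=False)
    return mu, vt[:n_components]

rng = np.random.default_rng(0)
z = rng.normal(size=(1000, 512))        # stand-in for sampled GAN latents
mu, dirs = principal_directions(z)

# An "edit" shifts a latent along a principal direction by strength sigma;
# in the paper, the shifted latent is then fed (layer-wise) to the generator.
def edit(latent, direction, sigma):
    return latent + sigma * direction

z_edit = edit(z[0], dirs[0], 3.0)
```

Feeding the edited latent to selected generator layers only, rather than all of them, is what localizes the control to a specific attribute.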


Journal ArticleDOI
TL;DR: The latest version of SynergyFinder, release 2.0, is described; it has been extensively upgraded with novel features supporting higher-order combination data analytics and exploratory visualization of multi-drug synergy patterns, along with an automated outlier detection procedure, extended curve-fitting functionality and statistical analysis of replicate measurements.
Abstract: SynergyFinder (https://synergyfinder.fimm.fi) is a stand-alone web application for interactive analysis and visualization of drug combination screening data. Since its first release in 2017, SynergyFinder has become a widely used web tool both for the discovery of novel synergistic drug combinations in pre-clinical model systems (e.g. cell lines or primary patient-derived cells), and for better understanding of mechanisms of combination treatment efficacy or resistance. Here, we describe the latest version of SynergyFinder (release 2.0), which has been extensively upgraded with novel features supporting especially higher-order combination data analytics and exploratory visualization of multi-drug synergy patterns, along with an automated outlier detection procedure, extended curve-fitting functionality and statistical analysis of replicate measurements. A number of additional improvements were also implemented based on user requests, including new visualization and export options, an updated user interface, and enhanced stability and performance of the web tool. With these improvements, SynergyFinder 2.0 is expected to greatly extend its potential applications in various areas of multi-drug combinatorial screening and precision medicine.

475 citations
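SynergyFinder supports several synergy reference models (HSA, Loewe, Bliss and ZIP). As a minimal illustration, the Bliss independence excess, the simplest of these, can be computed as:

```python
def bliss_excess(e_a, e_b, e_ab):
    """Bliss excess score: observed combination effect minus the Bliss
    independence expectation e_a + e_b - e_a * e_b, with single-agent
    effects expressed as inhibition fractions in [0, 1]."""
    expected = e_a + e_b - e_a * e_b
    return e_ab - expected

# Drugs inhibiting 30% and 40% alone, 70% combined: the independence
# expectation is 58%, so the pair shows a positive (synergistic) excess.
score = bliss_excess(0.3, 0.4, 0.7)
```

A positive excess indicates synergy, a negative one antagonism; SynergyFinder averages such scores over the full dose-response matrix.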


Journal ArticleDOI
TL;DR: Panaroo is introduced, a graph-based pangenome clustering tool that is able to account for many of the sources of error introduced during the annotation of prokaryotic genome assemblies.
Abstract: Population-level comparisons of prokaryotic genomes must take into account the substantial differences in gene content resulting from horizontal gene transfer, gene duplication and gene loss. However, the automated annotation of prokaryotic genomes is imperfect, and errors due to fragmented assemblies, contamination, diverse gene families and mis-assemblies accumulate over the population, leading to profound consequences when analysing the set of all genes found in a species. Here, we introduce Panaroo, a graph-based pangenome clustering tool that is able to account for many of the sources of error introduced during the annotation of prokaryotic genome assemblies. Panaroo is available at https://github.com/gtonkinhill/panaroo .

284 citations


Posted ContentDOI
28 Jan 2020-bioRxiv
TL;DR: Panaroo is introduced, a graph-based pangenome clustering tool that is able to account for many of the sources of error introduced during the annotation of prokaryotic genome assemblies, and its utility is shown by performing a pangenome-wide association study in Neisseria gonorrhoeae and by analysing gene gain and loss rates across 51 of the major global pneumococcal sequence clusters.
Abstract: Population-level comparisons of prokaryotic genomes must take into account the substantial differences in gene content, resulting from frequent horizontal gene transfer, gene duplication and gene loss. However, the automated annotation of prokaryotic genomes is imperfect, and errors due to fragmented assemblies, contamination, diverse gene families and mis-assemblies accumulate over the population, leading to profound consequences when analysing the set of all genes found in a species. Here we introduce Panaroo, a graph-based pangenome clustering tool that is able to account for many of the sources of error introduced during the annotation of prokaryotic genome assemblies. We verified our approach through extensive simulations of de novo assemblies using the infinitely many genes model and by analysing a number of publicly available large bacterial genome datasets. Using a highly clonal Mycobacterium tuberculosis dataset as a negative control case, we show that failing to account for annotation errors can lead to pangenome estimates that are dominated by error. We additionally demonstrate the utility of the improved graphical output provided by Panaroo by performing a pangenome-wide association study in Neisseria gonorrhoeae and by analysing gene gain and loss rates across 51 of the major global pneumococcal sequence clusters. Panaroo is freely available under an open source MIT licence at https://github.com/gtonkinhill/panaroo.

195 citations


Journal ArticleDOI
27 Feb 2020-Blood
TL;DR: The results implicate death receptor signaling as an important mediator of cancer cell sensitivity to CAR T cell cytotoxicity, with potential for pharmacological targeting to enhance cancer immunotherapy.

129 citations


Journal ArticleDOI
TL;DR: The opportunities for a comprehensive way of assessing genetic risk in the general population, in breast cancer patients, and in unaffected family members are demonstrated.
Abstract: Polygenic risk scores (PRS) for breast cancer have potential to improve risk prediction, but there is limited information on their utility in various clinical situations. Here we show that among 122,978 women in the FinnGen study with 8401 breast cancer cases, the PRS modifies the breast cancer risk of two high-impact frameshift risk variants. Similarly, we show that after the breast cancer diagnosis, individuals with elevated PRS have an elevated risk of developing contralateral breast cancer, and that the PRS can considerably improve risk assessment among their female first-degree relatives. In more detail, women with the c.1592delT variant in PALB2 (242-fold enrichment in Finland, 336 carriers) and an average PRS (10-90th percentile) have a lifetime risk of breast cancer at 55% (95% CI 49-61%), which increases to 84% (71-97%) with a high PRS ( > 90th percentile), and decreases to 49% (30-68%) with a low PRS ( < 10th percentile). Similarly, for c.1100delC in CHEK2 (3.7-fold enrichment; 1648 carriers), the respective lifetime risks are 29% (27-32%), 59% (52-66%), and 9% (5-14%). The PRS also refines the risk assessment of women with first-degree relatives diagnosed with breast cancer, particularly among women with positive family history of early-onset breast cancer. Here we demonstrate the opportunities for a comprehensive way of assessing genetic risk in the general population, in breast cancer patients, and in unaffected family members.

82 citations
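At its core, a polygenic risk score is a weighted sum of risk-allele dosages across variants. A toy sketch with made-up effect-size weights (not the FinnGen PRS, which aggregates hundreds of thousands of variants):

```python
import numpy as np

def polygenic_risk_score(dosages, weights):
    """PRS as the weighted sum of risk-allele dosages (0, 1 or 2 copies
    of the risk allele per variant)."""
    return float(np.asarray(dosages) @ np.asarray(weights))

# Toy example: three variants with illustrative effect-size weights.
weights = [0.12, -0.05, 0.30]
carrier = polygenic_risk_score([2, 0, 1], weights)      # higher-risk profile
noncarrier = polygenic_risk_score([0, 1, 0], weights)   # lower-risk profile
```

In practice the score is standardized against a population distribution, and percentile cut-offs (such as the 10th and 90th percentiles used above) define low-, average- and high-PRS groups.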


Posted Content
TL;DR: In this paper, a simple technique to analyze Generative Adversarial Networks (GANs) and create interpretable controls for image synthesis, such as change of viewpoint, aging, lighting, and time of day, is described.
Abstract: This paper describes a simple technique to analyze Generative Adversarial Networks (GANs) and create interpretable controls for image synthesis, such as change of viewpoint, aging, lighting, and time of day. We identify important latent directions based on Principal Components Analysis (PCA) applied either in latent space or feature space. Then, we show that a large number of interpretable controls can be defined by layer-wise perturbation along the principal directions. Moreover, we show that BigGAN can be controlled with layer-wise inputs in a StyleGAN-like manner. We show results on different GANs trained on various datasets, and demonstrate good qualitative matches to edit directions found through earlier supervised approaches.

71 citations


Journal ArticleDOI
TL;DR: The Breeze application provides a complete solution for data quality assessment, dose–response curve fitting and quantification of the drug responses along with interactive visualization of the results.
Abstract: Summary High-throughput screening (HTS) enables systematic testing of thousands of chemical compounds for potential use as investigational and therapeutic agents. HTS experiments are often conducted in multi-well plates that inherently bear technical and experimental sources of error. Thus, HTS data processing requires the use of robust quality control procedures before analysis and interpretation. Here, we have implemented an open-source analysis application, Breeze, an integrated quality control and data analysis application for HTS data. Furthermore, Breeze enables a reliable way to identify individual drug sensitivity and resistance patterns in cell lines or patient-derived samples for functional precision medicine applications. The Breeze application provides a complete solution for data quality assessment, dose-response curve fitting and quantification of the drug responses along with interactive visualization of the results. Availability and implementation The Breeze application with video tutorial and technical documentation is accessible at https://breeze.fimm.fi; the R source code is publicly available at https://github.com/potdarswapnil/Breeze under GNU General Public License v3.0. Contact swapnil.potdar@helsinki.fi. Supplementary information Supplementary data are available at Bioinformatics online.

56 citations
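Dose-response curve fitting of the kind Breeze performs is commonly based on a four-parameter logistic model. A minimal sketch with SciPy on noiseless synthetic data (illustrative only; Breeze itself is implemented in R):

```python
import numpy as np
from scipy.optimize import curve_fit

def four_pl(dose, bottom, top, ic50, hill):
    """Four-parameter logistic dose-response model."""
    return bottom + (top - bottom) / (1.0 + (dose / ic50) ** hill)

doses = np.logspace(-2, 2, 9)                     # 0.01 ... 100 (toy units)
responses = four_pl(doses, 5.0, 95.0, 2.0, 1.2)   # noiseless synthetic data

# Bounds keep ic50 and hill positive during optimization.
params, _ = curve_fit(four_pl, doses, responses,
                      p0=[0.0, 100.0, 1.0, 1.0],
                      bounds=([0.0, 0.0, 1e-3, 0.1],
                              [100.0, 200.0, 100.0, 5.0]))
bottom, top, ic50, hill = params
```

Summary metrics such as IC50 or a drug sensitivity score are then derived from the fitted parameters.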


Journal ArticleDOI
TL;DR: The presently observed variation in MOR availability may explain why some individuals are prone to develop MOR-linked pathological states, such as chronic pain or psychiatric disorders.

50 citations


Journal ArticleDOI
TL;DR: The approach enables comboFM to leverage information from previous experiments performed on similar drugs and cells when predicting responses of new combinations in as-yet-untested cells, and it achieves highly accurate predictions despite sparsely populated data tensors.
Abstract: We present comboFM, a machine learning framework for predicting the responses of drug combinations in pre-clinical studies, such as those based on cell lines or patient-derived cells. comboFM models the cell context-specific drug interactions through higher-order tensors, and efficiently learns latent factors of the tensor using powerful factorization machines. The approach enables comboFM to leverage information from previous experiments performed on similar drugs and cells when predicting responses of new combinations in as-yet-untested cells; thereby, it achieves highly accurate predictions despite sparsely populated data tensors. We demonstrate high predictive performance of comboFM in various prediction scenarios using data from cancer cell line pharmacogenomic screens. Subsequent experimental validation of a set of previously untested drug combinations further supports the practical and robust applicability of comboFM. For instance, we confirm a novel synergy between anaplastic lymphoma kinase (ALK) inhibitor crizotinib and proteasome inhibitor bortezomib in lymphoma cells. Overall, our results demonstrate that comboFM provides an effective means for systematic pre-screening of drug combinations to support precision oncology applications. Combinatorial treatments have become a standard of care for various complex diseases including cancers. Here, the authors show that combinatorial responses of two anticancer drugs can be accurately predicted using factorization machines trained on large-scale pharmacogenomic data for guiding precision oncology studies.

48 citations
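The model family comboFM builds on, second-order factorization machines, scores an input as a bias, a linear term and pairwise interactions whose weights are factorized through low-rank latent vectors. A self-contained sketch with toy dimensions and random parameters (not comboFM's trained model):

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order factorization machine:
    y = w0 + w.x + sum_{i<j} <V_i, V_j> x_i x_j,
    computed with the O(k*n) reformulation instead of the O(n^2) sum."""
    sq_of_sum = (V.T @ x) ** 2               # (sum_i V[i,f] x_i)^2 per factor
    sum_of_sq = (V ** 2).T @ (x ** 2)        # sum_i V[i,f]^2 x_i^2 per factor
    return float(w0 + w @ x + 0.5 * np.sum(sq_of_sum - sum_of_sq))

rng = np.random.default_rng(0)
n, k = 6, 3                                  # features, latent factors
x = rng.normal(size=n)
w0, w, V = 0.1, rng.normal(size=n), rng.normal(size=(n, k))

# Sanity check against the explicit pairwise double sum.
pairwise = sum(float(V[i] @ V[j]) * x[i] * x[j]
               for i in range(n) for j in range(i + 1, n))
y = fm_predict(x, w0, w, V)
```

The factorization is what lets information flow between combinations sharing a drug or a cell line, which is why the sparsely populated tensor can still be predicted accurately.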


Journal ArticleDOI
TL;DR: A new projection technique is presented that unifies two existing techniques and is both accurate and fast to compute and a way of evaluating the feature selection process using fast leave-one-out cross-validation that allows for easy and intuitive model size selection is proposed.
Abstract: This paper reviews predictive inference and feature selection for generalized linear models with scarce but high-dimensional data. We demonstrate that in many cases one can benefit from a decision theoretically justified two-stage approach: first, construct a possibly non-sparse model that predicts well, and then find a minimal subset of features that characterize the predictions. The model built in the first step is referred to as the reference model and the operation during the latter step as predictive projection. The key characteristic of this approach is that it finds an excellent tradeoff between sparsity and predictive accuracy, and the gain comes from utilizing all available information, including the prior and the information carried by the left-out features. We review several methods that follow this principle and provide novel methodological contributions. We present a new projection technique that unifies two existing techniques and is both accurate and fast to compute. We also propose a way of evaluating the feature selection process using fast leave-one-out cross-validation that allows for easy and intuitive model size selection. Furthermore, we prove a theorem that helps to understand the conditions under which the projective approach could be beneficial. The key ideas are illustrated via several experiments using simulated and real world data.
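The two-stage reference-then-projection idea can be sketched with plain least squares: fit a ridge reference model on all features, then project its fitted values, rather than the raw data, onto a candidate feature subset. This is a toy linear-Gaussian sketch, not the paper's full Bayesian machinery:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 20
X = rng.normal(size=(n, p))
beta = np.zeros(p)
beta[:3] = [2.0, -1.5, 1.0]                  # only 3 relevant features
y = X @ beta + rng.normal(scale=0.5, size=n)

# Stage 1: reference model -- ridge regression using *all* features.
lam = 1.0
b_ref = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
mu_ref = X @ b_ref                           # reference predictions

# Stage 2: project the reference fit (not the raw data) onto a candidate
# feature subset; the discrepancy measures how much prediction is lost.
def projection_loss(subset):
    Xs = X[:, subset]
    b, *_ = np.linalg.lstsq(Xs, mu_ref, rcond=None)
    return float(np.mean((Xs @ b - mu_ref) ** 2))

relevant = projection_loss([0, 1, 2])
irrelevant = projection_loss([10, 11, 12])
```

Model size selection then amounts to growing the subset until the projection loss (evaluated with cross-validation in the paper) stops improving meaningfully.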


Journal ArticleDOI
TL;DR: A web-based open-source tool, SynToxProfiler (Synergy-Toxicity-Profiler), prioritized as top hits those synergistic drug pairs that showed higher selective efficacy (difference between efficacy and toxicity), which offers an improved likelihood for clinical success.
Abstract: Drug combinations are becoming a standard treatment of many complex diseases due to their capability to overcome resistance to monotherapy. In the current preclinical drug combination screening, the top combinations for further study are often selected based on synergy alone, without considering the combination efficacy and toxicity effects, even though these are critical determinants for the clinical success of a therapy. To promote the prioritization of drug combinations based on integrated analysis of synergy, efficacy and toxicity profiles, we implemented a web-based open-source tool, SynToxProfiler (Synergy-Toxicity-Profiler). When applied to 20 anti-cancer drug combinations tested both in healthy control and T-cell prolymphocytic leukemia (T-PLL) patient cells, as well as to 77 anti-viral drug pairs tested in Huh7 liver cell line with and without Ebola virus infection, SynToxProfiler prioritized as top hits those synergistic drug pairs that showed higher selective efficacy (difference between efficacy and toxicity), which offers an improved likelihood for clinical success.

Journal ArticleDOI
19 Jun 2020
TL;DR: Re-sequencing multiple genomes from dromedaries, Bactrian camels, and their endangered wild relatives shows that positive selection for candidate genes underlying traits collectively referred to as ‘domestication syndrome’ is consistent with neural crest deficiencies and altered thyroid hormone-based signaling.
Abstract: Domestication begins with the selection of animals showing less fear of humans. In most domesticates, selection signals for tameness have been superimposed by intensive breeding for economical or other desirable traits. Old World camels, conversely, have maintained high genetic variation and lack secondary bottlenecks associated with breed development. By re-sequencing multiple genomes from dromedaries, Bactrian camels, and their endangered wild relatives, here we show that positive selection for candidate genes underlying traits collectively referred to as ‘domestication syndrome’ is consistent with neural crest deficiencies and altered thyroid hormone-based signaling. Comparing our results with other domestic species, we postulate that the core set of domestication genes is considerably smaller than the pan-domestication set – and overlapping genes are likely a result of chance and redundancy. These results, along with the extensive genomic resources provided, are an important contribution to understanding the evolutionary history of camels and the genomic features of their domestication. Robert R. Fitak et al. investigate the genetic basis for domestication in camels. They found that the positive selection of candidate domestication genes is consistent with neural crest deficiencies and altered thyroid hormone-based signaling. Their work provides insights to the evolutionary history of camels and genetics of domestication.

Posted ContentDOI
11 Feb 2020-bioRxiv
TL;DR: A crowdsourced benchmarking of the accuracy of machine learning (ML) algorithms at predicting kinase inhibitor potencies across multiple kinase families demonstrated that these models and their ensemble can improve the accuracies of experimental mapping efforts, especially for so far under-studied kinases.
Abstract: Despite decades of intensive search for compounds that modulate the activity of particular targets, there are currently small-molecules available only for a small proportion of the human proteome. Effective approaches are therefore required to map the massive space of unexplored compound-target interactions for novel and potent activities. Here, we carried out a crowdsourced benchmarking of predictive models for kinase inhibitor potencies across multiple kinase families using unpublished bioactivity data. The top-performing predictions were based on kernel learning, gradient boosting and deep learning, and their ensemble resulted in predictive accuracy exceeding that of kinase activity assays. We then performed new experiments based on the model predictions, which further improved the accuracy of experimental mapping efforts and identified unexpected potencies even for under-studied kinases. The open-source algorithms together with the novel bioactivities between 95 compounds and 295 kinases provide a resource for benchmarking new prediction algorithms and for extending the druggable kinome.
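The benefit of ensembling that drove the top benchmark performance is easy to reproduce on synthetic data: averaging predictors with independent errors shrinks the error roughly by the square root of the number of models. A toy sketch with simulated "potencies", not the challenge data:

```python
import numpy as np

rng = np.random.default_rng(0)
truth = rng.normal(size=200)                  # simulated "true" potencies

# Three imperfect predictors: truth plus independent noise of equal scale.
preds = [truth + rng.normal(scale=1.0, size=200) for _ in range(3)]
ensemble = np.mean(preds, axis=0)             # simple unweighted average

def rmse(p):
    return float(np.sqrt(np.mean((p - truth) ** 2)))

errors = [rmse(p) for p in preds]             # each around 1.0
```

In the challenge, the ensembled models were of different families (kernel learning, gradient boosting, deep learning), which makes their errors less correlated and the averaging gain larger.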

Posted Content
TL;DR: A novel approach to accurate privacy accounting of the subsampled Gaussian mechanism is presented, using the recently introduced fast Fourier transform based accounting technique to give strict lower and upper bounds for the true $(\varepsilon,\delta)$-values.
Abstract: We propose a numerical accountant for evaluating the tight $(\varepsilon,\delta)$-privacy loss for algorithms with discrete one dimensional output. The method is based on the privacy loss distribution formalism and it uses the recently introduced fast Fourier transform based accounting technique. We carry out an error analysis of the method in terms of moment bounds of the privacy loss distribution which leads to rigorous lower and upper bounds for the true $(\varepsilon,\delta)$-values. As an application, we present a novel approach to accurate privacy accounting of the subsampled Gaussian mechanism. This completes the previously proposed analysis by giving strict lower and upper bounds for the privacy parameters. We demonstrate the performance of the accountant on the binomial mechanism and show that our approach allows decreasing noise variance up to 75 percent at equal privacy compared to existing bounds in the literature. We also illustrate how to compute tight bounds for the exponential mechanism applied to counting queries.
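The FFT-based accounting idea can be sketched as follows: discretize the privacy loss distribution (PLD) on a grid, compose k mechanisms by k-fold convolution computed in the Fourier domain, and read off delta(eps) as an expectation over the composed loss. The PLD below uses illustrative numbers, not a real mechanism's distribution, and omits the paper's rigorous error bounds:

```python
import numpy as np

# Discretized privacy loss distribution: grid of loss values s with p(s).
losses = np.array([-0.2, -0.1, 0.0, 0.1, 0.2])
probs = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
step = losses[1] - losses[0]
k = 8                                          # number of compositions

# Composing k mechanisms = k-fold self-convolution of the PLD, done in the
# Fourier domain; zero-pad so circular convolution equals linear convolution.
m = k * (len(probs) - 1) + 1                   # length of composed PLD
size = 1 << (m - 1).bit_length()               # power-of-two FFT length
composed = np.fft.irfft(np.fft.rfft(probs, size) ** k, size)[:m]
composed = np.clip(composed, 0.0, None)        # remove tiny FFT negatives
grid = k * losses[0] + step * np.arange(m)     # composed loss values

def delta(eps):
    """delta(eps) = E[(1 - e^(eps - s))_+] over the composed loss s."""
    mask = grid > eps
    return float(np.sum(composed[mask] * (1.0 - np.exp(eps - grid[mask]))))
```

One FFT per composition count replaces an O(m^2) direct convolution, which is what makes tight numerical accounting over many compositions practical.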

Journal ArticleDOI
TL;DR: A framework to discover daily cyber activity patterns across people's mobile app usage is proposed, which shows that people usually follow yesterday's activity patterns, but the patterns tend to deviate as the time-lapse increases.
Abstract: With the prevalence of smartphones, people have left abundant behavior records in cyberspace. Discovering and understanding individuals' cyber activities can provide useful implications for policymakers, service providers, and app developers. In this paper, we propose a framework to discover daily cyber activity patterns across people's mobile app usage. We first segment app usage traces into small time windows and then design a probabilistic topic model to infer users' cyber activities of each window. By exploring the coherence of users' activity sequences, the daily patterns of individuals are identified. Next, we recognize the common patterns across diverse groups of individuals using a hierarchical clustering algorithm. We then apply our framework on a large-scale and real-world dataset, consisting of 653,092 users with 971,818,946 usage records of 2,000 popular mobile apps. Our analysis shows that people usually follow yesterday's activity patterns, but the patterns tend to deviate as the time-lapse increases. We also discover five common daily cyber activity patterns, including afternoon reading, nightly entertainment, pervasive socializing, commuting, and nightly socializing. Our findings have profound implications on identifying the demographics of users and their lifestyles, habits, service requirements, and further detecting other disrupting trends such as working overtime and addiction to games and social media.
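The pattern-discovery step, grouping users by their daily activity profiles with hierarchical clustering, can be sketched on toy data: hourly usage-intensity vectors for "morning" and "evening" users, clustered with Ward linkage. This is illustrative only and skips the paper's topic-model stage:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)
hours = np.arange(24)

# Toy daily profiles: usage peaking around 8:00 for half the users and
# around 21:00 for the other half, plus a little noise.
morning = np.exp(-0.5 * ((hours - 8) / 2.0) ** 2)
evening = np.exp(-0.5 * ((hours - 21) / 2.0) ** 2)
users = np.vstack([morning + 0.05 * rng.normal(size=24) for _ in range(5)] +
                  [evening + 0.05 * rng.normal(size=24) for _ in range(5)])

# Hierarchical (Ward) clustering of users by daily pattern; cutting the
# dendrogram at two clusters recovers the two groups.
labels = fcluster(linkage(users, method="ward"), t=2, criterion="maxclust")
```

In the paper, the clustered vectors are per-window activity labels inferred by the topic model rather than raw intensities, and five common daily patterns emerge at the chosen cut.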

Proceedings ArticleDOI
09 Jul 2020
TL;DR: A rough estimation of complexity for word analogies and an algorithm to find the optimal transformations of minimal complexity are proposed and compared with state-of-the-art approaches, demonstrating the value of using complexity to solve analogies on words.
Abstract: Analogies are 4-ary relations of the form "A is to B as C is to D". When A, B and C are fixed, we call analogical equation the problem of finding the correct D. A direct applicative domain is Natural Language Processing, in which it has been shown successful on word inflections, such as conjugation or declension. If most approaches rely on the axioms of proportional analogy to solve these equations, these axioms are known to have limitations, in particular regarding the nature of the inflections considered. In this paper, we propose an alternative approach, based on the assumption that optimal word inflections are transformations of minimal complexity. We propose a rough estimation of complexity for word analogies and an algorithm to find the optimal transformations. We illustrate our method on a large-scale benchmark dataset and compare with state-of-the-art approaches to demonstrate the value of using complexity to solve analogies on words.
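A classic baseline for such analogical equations on words treats the a-to-b change as a prefix/suffix edit and replays it on c. The sketch below implements that baseline, not the paper's complexity-minimization method, and exhibits exactly the kind of limitation the paper addresses:

```python
def solve_analogy(a, b, c):
    """Solve 'a : b :: c : ?' for simple prefix/suffix inflections:
    factor a and b as prefix + stem + suffix, then replay the stem
    substitution on c. Returns None when the pattern does not transfer."""
    # Longest common prefix of a and b.
    i = 0
    while i < min(len(a), len(b)) and a[i] == b[i]:
        i += 1
    # Longest common suffix of the remainders.
    j = 0
    while j < min(len(a), len(b)) - i and a[len(a) - 1 - j] == b[len(b) - 1 - j]:
        j += 1
    stem_a, stem_b = a[i:len(a) - j], b[i:len(b) - j]
    if stem_a:
        if stem_a not in c:
            return None                       # transformation does not apply
        pos = c.rfind(stem_a)                 # rightmost occurrence
        return c[:pos] + stem_b + c[pos + len(stem_a):]
    # Pure affixation (e.g. walk -> walked): insert stem_b before the suffix.
    return c[:len(c) - j] + stem_b + c[len(c) - j:]

d = solve_analogy("walk", "walked", "talk")   # regular suffixation
```

Regular inflections ("walk : walked :: talk : talked") work, but irregular or stem-internal changes quickly break such axiom-based solvers, motivating the complexity-based alternative.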

Proceedings Article
01 Jan 2020
TL;DR: The methods build on a recent Markov chain Monte Carlo scheme for learning Bayesian networks, which enables efficient approximate sampling from the graph posterior, provided that each node is assigned a small number K of candidate parents.
Abstract: We give methods for Bayesian inference of directed acyclic graphs, DAGs, and the induced causal effects from passively observed complete data. Our methods build on a recent Markov chain Monte Carlo scheme for learning Bayesian networks, which enables efficient approximate sampling from the graph posterior, provided that each node is assigned a small number $K$ of candidate parents. We present algorithmic techniques to significantly reduce the space and time requirements, which make the use of substantially larger values of $K$ feasible. Furthermore, we investigate the problem of selecting the candidate parents per node so as to maximize the covered posterior mass. Finally, we combine our sampling method with a novel Bayesian approach for estimating causal effects in linear Gaussian DAG models. Numerical experiments demonstrate the performance of our methods in detecting ancestor-descendant relations, and in causal effect estimation our Bayesian method is shown to outperform previous approaches.

Journal ArticleDOI
TL;DR: Brain–computer interfaces enable active communication and execution of a pre-defined set of commands, such as typing a letter or moving a cursor, but they have thus far not been able to infer more complex intentions or adapt more complex output based on brain signals.
Abstract: Brain-computer interfaces enable active communication and execution of a pre-defined set of commands, such as typing a letter or moving a cursor. However, they have thus far not been able to infer more complex intentions or adapt more complex output based on brain signals. Here, we present neuroadaptive generative modelling, which uses a participant's brain signals as feedback to adapt a boundless generative model and generate new information matching the participant's intentions. We report an experiment validating the paradigm in generating images of human faces. In the experiment, participants were asked to specifically focus on perceptual categories, such as old or young people, while being presented with computer-generated, photorealistic faces with varying visual features. Their EEG signals associated with the images were then used as a feedback signal to update a model of the user's intentions, from which new images were generated using a generative adversarial network. A double-blind follow-up with the participant evaluating the output shows that neuroadaptive modelling can be utilised to produce images matching the perceptual category features. The approach demonstrates brain-based creative augmentation between computers and humans for producing new information matching the human operator's perceptual categories.

Posted ContentDOI
18 Apr 2020-bioRxiv
TL;DR: The broad utility of CANOPUS is demonstrated by investigating the effect of the microbial colonization in the digestive system in mice, and through analysis of the chemodiversity of different Euphorbia plants; both uniquely revealing biological insights at the compound class level.
Abstract: Metabolomics experiments can employ non-targeted tandem mass spectrometry to detect hundreds to thousands of molecules in a biological sample. Structural annotation of molecules is typically carried out by searching their fragmentation spectra in spectral libraries or, recently, in structure databases. Annotations are limited to structures present in the library or database employed, prohibiting a thorough utilization of the experimental data. We present a computational tool for systematic compound class annotation: CANOPUS uses a deep neural network to predict 1,270 compound classes from fragmentation spectra, and explicitly targets compounds where neither spectral nor structural reference data are available. CANOPUS even predicts classes for which no MS/MS training data are available. We demonstrate the broad utility of CANOPUS by investigating the effect of the microbial colonization in the digestive system in mice, and through analysis of the chemodiversity of different Euphorbia plants; both uniquely revealing biological insights at the compound class level.

Journal ArticleDOI
TL;DR: Synthetic utility of DERA aldolase was improved by protein engineering approaches, and a novel machine learning model utilising Gaussian processes and feature learning was applied for the third mutagenesis round to predict new beneficial mutant combinations.
Abstract: In this work, deoxyribose-5-phosphate aldolase (Ec DERA, EC 4.1.2.4) from Escherichia coli was chosen as the protein engineering target for improving the substrate preference towards smaller, non-phosphorylated aldehyde donor substrates, in particular towards acetaldehyde. The initial broad set of mutations was directed to 24 amino acid positions in the active site or in the close vicinity, based on the 3D complex structure of the E. coli DERA wild-type aldolase. The specific activity of the DERA variants containing one to three amino acid mutations was characterised using three different substrates. A novel machine learning (ML) model utilising Gaussian processes and feature learning was applied for the third mutagenesis round to predict new beneficial mutant combinations. This led to the most clear-cut (two- to threefold) improvement in acetaldehyde (C2) addition capability with the concomitant abolishment of the activity towards the natural donor molecule glyceraldehyde-3-phosphate (C3P) as well as the non-phosphorylated equivalent (C3). The Ec DERA variants were also tested on aldol reaction utilising formaldehyde (C1) as the donor. Ec DERA wild-type was shown to be able to carry out this reaction, and furthermore, some of the variants improved for the acetaldehyde addition reaction also turned out to have improved activity on formaldehyde. KEY POINTS: • DERA aldolases are promiscuous enzymes. • Synthetic utility of DERA aldolase was improved by protein engineering approaches. • Machine learning methods aid the protein engineering of DERA.

Journal ArticleDOI
TL;DR: Data show that the availability of interactive faceted query suggestion substantially improves whole‐session effectiveness by increasing recall without sacrificing precision, and imply that research in exploratory search should focus on measuring and designing tools that engage users with directed situated navigation support for improving whole‐session performance.
Abstract: The outcome of exploratory information retrieval is not only dependent on the effectiveness of individual responses to a set of queries, but also on relevant information retrieved during the entire exploratory search session. We study the effect of search assistance, operationalized as interactive faceted query suggestion, on both whole‐session effectiveness and engagement. A user experiment is reported, where users performed exploratory search tasks, comparing interactive faceted query suggestion and a control condition with only conventional typed‐query interaction. Data comprising interaction and search logs show that the availability of interactive faceted query suggestion substantially improves whole‐session effectiveness by increasing recall without sacrificing precision. The increased engagement with interactive faceted query suggestion is targeted to direct situated navigation around the initial query scope, but is not found to improve individual queries on average. The results imply that research in exploratory search should focus on measuring and designing tools that engage users with directed situated navigation support for improving whole‐session performance.

Journal ArticleDOI
TL;DR: The frequencies of accessory genes are used to predict changes in the pneumococcal population after vaccination, hypothesizing that these frequencies reflect negative frequency-dependent selection (NFDS) on the gene products.
Abstract: Predicting how pathogen populations will change over time is challenging. Such has been the case with Streptococcus pneumoniae, an important human pathogen, and the pneumococcal conjugate vaccines (PCVs), which target only a fraction of the strains in the population. Here, we use the frequencies of accessory genes to predict changes in the pneumococcal population after vaccination, hypothesizing that these frequencies reflect negative frequency-dependent selection (NFDS) on the gene products. We find that the standardized predicted fitness of a strain, estimated by an NFDS-based model at the time the vaccine is introduced, enables us to predict whether the strain increases or decreases in prevalence following vaccination. Further, we are able to forecast the equilibrium post-vaccine population composition and assess the invasion capacity of emerging lineages. Overall, we provide a method for predicting the impact of an intervention on pneumococcal populations with potential application to other bacterial pathogens in which NFDS is a driving force.
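The NFDS intuition, that strains carrying accessory genes currently more common than their equilibrium frequencies are penalized, can be sketched with a toy presence/absence matrix. The numbers and the linear fitness form below are illustrative simplifications, not the paper's fitted model:

```python
import numpy as np

# Toy accessory-gene presence/absence matrix: rows = strains, cols = genes.
G = np.array([[1, 0, 1, 1],
              [0, 1, 1, 0],
              [1, 1, 0, 0]])
x = np.array([0.5, 0.3, 0.2])            # current strain prevalences
f_eq = np.array([0.4, 0.5, 0.6, 0.2])    # pre-vaccine equilibrium gene freqs

def nfds_fitness(G, x, f_eq):
    """Simplified NFDS fitness: a strain is penalised for each gene it
    carries in proportion to how far that gene's current population
    frequency exceeds its pre-vaccine equilibrium frequency."""
    f = x @ G                            # current gene frequencies
    return -(G * (f - f_eq)).sum(axis=1)

fit = nfds_fitness(G, x, f_eq)           # strain 1 carries the rarest genes
```

Removing vaccine-targeted strains perturbs the gene frequencies f, and iterating such fitness updates is what lets the model forecast which remaining strains expand toward the new equilibrium.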

Journal ArticleDOI
TL;DR: The authors identify a disturbance response that presents as classic species sorting but is nevertheless accompanied by rapid within-species evolution, the magnitude of which increases with increasing antibiotic levels.
Abstract: In an era of pervasive anthropogenic ecological disturbances, there is a pressing need to understand the factors that constitute community response and resilience. A detailed understanding of disturbance response needs to go beyond associations and incorporate features of disturbances, species traits, rapid evolution and dispersal. Multispecies microbial communities that experience antibiotic perturbation represent a key system with important medical dimensions. However, previous microbiome studies on this theme have relied on high-throughput sequencing data from uncultured species without the ability to explicitly account for the role of species traits and immigration. Here, we serially passage a 34-species defined bacterial community through different levels of pulse antibiotic disturbance, manipulating the presence or absence of species immigration. To understand the ecological community response measured using amplicon sequencing, we combine initial trait data measured for each species separately and metagenome sequencing data revealing adaptive mutations during the experiment. We found that the ecological community response was highly repeatable within the experimental treatments, which could be attributed in part to key species traits (antibiotic susceptibility and growth rate). Increasing antibiotic levels were also coupled with an increasing probability of species extinction, making species immigration critical for community resilience. Moreover, we detected signals of antibiotic-resistance evolution occurring within species at the same time scale, leaving evolutionary changes in communities despite recovery at the species compositional level. Together, these observations reveal a disturbance response that presents as classic species sorting, but is nevertheless accompanied by rapid within-species evolution.

Journal ArticleDOI
30 Jan 2020
TL;DR: The mSWEEP pipeline identifies and estimates the relative sequence abundances of bacterial lineages from plate sweeps of enrichment cultures; it leverages biologically grouped sequence assembly databases, applies probabilistic modelling, and provides controls for false positive results.
Abstract: Determining the composition of bacterial communities beyond the level of a genus or species is challenging because of the considerable overlap between genomes representing close relatives. Here, we present the mSWEEP pipeline for identifying and estimating the relative sequence abundances of bacterial lineages from plate sweeps of enrichment cultures. mSWEEP leverages biologically grouped sequence assembly databases, applying probabilistic modelling, and provides controls for false positive results. Using sequencing data from major pathogens, we demonstrate significant improvements in lineage quantification and detection accuracy. Our pipeline facilitates investigating cultures comprising mixtures of bacteria, and opens up a new field of plate sweep metagenomics.
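The probabilistic modelling step can be sketched, under assumptions, as expectation-maximization for a mixture model over candidate lineages: each read has a likelihood under every lineage's grouped reference, and the relative abundances are the mixture weights that best explain all reads. This is a generic mixture-model sketch, not mSWEEP's exact likelihood or implementation.

```python
import numpy as np

def estimate_abundances(read_likelihoods, n_iter=200):
    """EM for relative abundances theta in a mixture model:
    P(read r) = sum_k theta_k * P(read r | lineage k).

    read_likelihoods: (n_reads, n_lineages) matrix of P(read | lineage).
    Returns theta, the estimated relative abundance of each lineage.
    """
    n_reads, n_lineages = read_likelihoods.shape
    theta = np.full(n_lineages, 1.0 / n_lineages)  # uniform start
    for _ in range(n_iter):
        # E-step: posterior responsibility of each lineage for each read
        weighted = read_likelihoods * theta
        resp = weighted / weighted.sum(axis=1, keepdims=True)
        # M-step: abundances proportional to total responsibility
        theta = resp.sum(axis=0) / n_reads
    return theta

# Four reads over two lineages: three reads favour lineage 0, one favours lineage 1.
lik = np.array([[0.9, 0.1],
                [0.9, 0.1],
                [0.9, 0.1],
                [0.1, 0.9]])
theta = estimate_abundances(lik)   # roughly [0.75, 0.25]-shaped mixture
```

The difficulty the abstract points to, overlap between close relatives, corresponds to rows of the likelihood matrix that are nearly flat; grouping references into lineages makes those rows more peaked, which is why lineage-level quantification is tractable when strain-level is not.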

Posted ContentDOI
08 Oct 2020-bioRxiv
TL;DR: This work identifies, characterizes and exploits a trade-off between decreasing the target population size as fast as possible and generating a surplus of treatment-induced de novo mutations, and finds the optimal treatment strategy that minimizes the probability of evolutionary rescue.
Abstract: Evolution of drug resistance to anticancer, antimicrobial and antiviral therapies is widespread among cancer and pathogen cell populations. Classical theory posits strictly that genetic and phenotypic variation is generated in evolving populations independently of the selection pressure. However, recent experimental findings among antimicrobial agents, traditional cytotoxic chemotherapies and targeted cancer therapies suggest that treatment not only imposes selection but also affects the rate of adaptation via altered mutational processes. Here we analyze a model with drug-induced increase in mutation rate and explore its consequences for treatment optimization. We argue that the true biological cost of treatment is not limited to the harmful side-effects, but instead realizes even more profoundly by fundamentally changing the underlying eco-evolutionary dynamics within the microenvironment. Factoring in such costs (or collateral damage) of control is at the core of successful therapy design and can unify different evolution-based approaches to therapy optimization. Using the concept of evolutionary rescue, we formulate the treatment as an optimal control problem and solve the optimal elimination strategy, which minimizes the probability of evolutionary rescue. Our solution exploits a trade-off, where increasing the drug concentration has two opposing effects. On the one hand, it reduces de novo mutations by decreasing the size of the target cell population faster; on the other hand, a higher dosage generates a surplus of treatment-induced mutations. We show that aggressive elimination strategies, which aim at eradication as fast as possible and which represent the current standard of care, can be detrimental even with modest drug-induced increases (fold change ≤10) to the baseline mutation rate. 
Our findings highlight the importance of dose dependencies in resistance evolution and motivate further investigation of the mutagenicity and other hidden collateral costs of therapies that promote resistance.
Author summary: The evolution of drug resistance is a particularly problematic and frequent outcome of cancer and antimicrobial therapies. Recent research suggests that these treatments may enhance the evolvability of the target population not only via inducing intense selection pressures but also via altering the underlying mutational processes. Here we investigate the consequences of such drug-induced evolution by considering a mathematical model with an explicitly dose-dependent mutation rate. We identify, characterize and exploit a trade-off between decreasing the target population size as fast as possible and generating a surplus of treatment-induced de novo mutations. By formulating the treatment as an optimal control problem over the evolution of the target population, we find the optimal treatment strategy, which minimizes the probability of evolutionary rescue. We show that this probability changes non-monotonically with the cumulative drug concentration and is minimized at an intermediate dosage. Our results are immediately amenable to experimental investigation and motivate further study of the various mutagenic and other hidden collateral costs of treatment. Taken together, our results add to the ongoing criticism of the standard practice of administering aggressive, high-dose therapies and stimulate further clinical trials on alternative treatment strategies.
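The trade-off can be made concrete with a toy calculation (not the paper's model): a sensitive population of initial size N0 declines under the drug, producing resistant mutants at a dose-dependent rate mu(c); each mutant establishes with probability p_est. For exponential decline at net rate d(c), the expected number of rescuing mutants is N0 * mu(c) * p_est / d(c), so P(rescue) = 1 - exp(-that). With a saturating kill curve and drug-induced mutagenesis, this probability is minimized at an intermediate dose. Every parameter value below is an assumption chosen only to exhibit the effect.

```python
import numpy as np

def rescue_probability(c, N0=1e8, mu0=1e-9, alpha=10.0,
                       p_est=0.01, r=0.5, d_max=2.0, K=1.0):
    """Toy evolutionary-rescue probability at drug concentration c."""
    mu = mu0 * (1.0 + alpha * c)        # drug-induced increase in mutation rate
    d = d_max * c / (K + c) - r         # saturating kill rate minus regrowth
    expected_rescuers = N0 * mu * p_est / d
    return 1.0 - np.exp(-expected_rescuers)

# Scan doses where the sensitive population actually declines (d(c) > 0).
doses = np.linspace(0.4, 10.0, 200)
p = rescue_probability(doses)
best_dose = doses[np.argmin(p)]   # interior minimum: neither minimal nor maximal dose
```

At low doses the population declines slowly and has many chances to mutate; at high doses the kill rate saturates while mutagenesis keeps growing, so aggressive escalation eventually backfires, which mirrors the abstract's conclusion about high-dose therapies.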

Posted ContentDOI
27 Feb 2020-medRxiv
TL;DR: This is the first study to show that social isolation is associated with increased risk of dementia across the spectrum of genetic risk; loneliness, although considered a significant risk factor for multiple health problems, seems to be associated with dementia only when combined with high genetic risk.
Abstract: Objective To examine the associations of social isolation and loneliness with incident dementia by level of genetic risk. Design Prospective population-based cohort study. Setting and participants 155 074 men and women (mean age 64.1, SD 2.9 years) from the UK Biobank Study, recruited between 2006 and 2010. Main exposures Self-reported social isolation and loneliness, and polygenic risk score for Alzheimer’s disease with low (lowest quintile), intermediate (quintiles 2 to 4), and high (highest quintile) risk categories. Main outcome Incident all-cause dementia ascertained using electronic health records. Results Overall, 8.6% of participants reported that they were socially isolated and 5.5% were lonely. During a mean follow-up of 8.8 years (1.36 million person-years), 1444 (0.9% of the total sample) were diagnosed with dementia. Social isolation, but not loneliness, was associated with increased risk of dementia (hazard ratio 1.62, 95% confidence interval 1.38 to 1.90). Of the participants who were socially isolated and had high genetic risk, 4.2% (2.9% to 5.5%) were estimated to develop dementia compared with 3.1% (2.7% to 3.5%) in participants who were not socially isolated but had high genetic risk. The corresponding estimated incidence in the socially isolated and not isolated were 3.9% (3.1% to 4.6%) and 2.5% (2.2% to 2.6%) in participants with intermediate genetic risk. Conclusion Socially isolated individuals are at increased risk of dementia at all levels of genetic risk. 
What is already known on this topic: Social isolation and loneliness have been associated with increased risk of dementia. It is not known whether this risk is modified or confounded by genetic risk of dementia. What this study adds: This is the first study to show that social isolation is associated with increased risk of dementia across the spectrum of genetic risk. Loneliness, although considered a significant risk factor for multiple health problems, seems to be associated with dementia only when combined with high genetic risk.

Journal ArticleDOI
TL;DR: This work proposes training a reinforcement learning agent to make the first two decisions (i.e., rescheduling timing and computing time allocation), using neuroevolution of augmenting topologies (NEAT) as the reinforcement learning algorithm; the approach yields better closed-loop solutions on three of the four studied routing problems.

Journal ArticleDOI
TL;DR: The implications of platformed interaction for the democratic process are discussed, suggesting that campaign strategy may exploit it in ways that may even necessitate regulation and contributions to theory that acknowledge platforms’ part in interaction may be needed.
Abstract: Interaction between candidates and constituents via social media is a well-studied domain. The article takes this research further through a synthesis with platform studies, emerging scholarship th...