Author

Martyna Moskal

Bio: Martyna Moskal is an academic researcher from the Polish Academy of Sciences. The author has contributed to research in topics: Computer science & Algorithm. The author has an h-index of 2, co-authored 2 publications receiving 7 citations.

Papers
Journal Article
TL;DR: In this article, the Allchemy platform is used to generate giant synthetic networks emanating from approximately 200 waste chemicals recycled on commercial scales, retrieve from these networks tens of thousands of routes leading to approximately 300 important drugs and agrochemicals, and algorithmically rank these syntheses according to the accepted metrics of sustainable chemistry.
Abstract: As the chemical industry continues to produce considerable quantities of waste chemicals [1,2], it is essential to devise 'circular chemistry' [3-8] schemes to productively back-convert at least a portion of these unwanted materials into useful products. Despite substantial progress in the degradation of some classes of harmful chemicals [9], work on 'closing the circle' (transforming waste substrates into valuable products) remains fragmented and focused on well-known areas [10-15]. Comprehensive analyses of which valuable products are synthesizable from diverse chemical wastes are difficult because even small sets of waste substrates can, within a few steps, generate millions of putative products, each synthesizable by multiple routes forming densely connected networks. Tracing all such syntheses and selecting those that also meet criteria of process and 'green' chemistries is, arguably, beyond the cognition of human chemists. Here we show how computers equipped with broad synthetic knowledge can help address this challenge. Using the forward-synthesis Allchemy platform [16], we generate giant synthetic networks emanating from approximately 200 waste chemicals recycled on commercial scales, retrieve from these networks tens of thousands of routes leading to approximately 300 important drugs and agrochemicals, and algorithmically rank these syntheses according to the accepted metrics of sustainable chemistry [17-19]. Several of these routes we validate by experiment, including an industrially realistic demonstration on a 'pharmacy on demand' flow-chemistry platform [20]. Wide adoption of computerized waste-to-valuable algorithms can accelerate productive reuse of chemicals that would otherwise incur storage or disposal costs, or even pose environmental hazards.
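The workflow summarized in this abstract (expand a network forward from substrates, retrieve routes to targets, rank them) can be illustrated with a minimal toy sketch. It is not Allchemy's implementation: the molecule names, the `RULES` table, and the functions `expand_network` and `routes_to` are hypothetical, and route length stands in as a placeholder for the sustainability metrics mentioned above.

```python
from itertools import combinations

# Hypothetical toy "reaction rules": an unordered reactant pair maps to one product.
RULES = {
    frozenset({"A", "B"}): "AB",
    frozenset({"AB", "C"}): "ABC",
    frozenset({"A", "C"}): "AC",
    frozenset({"AC", "B"}): "ABC",
}

def expand_network(substrates, rules, generations):
    """Forward expansion: repeatedly react every pair of known molecules."""
    known = set(substrates)
    producers = {}  # product -> set of reactant pairs that make it
    for _ in range(generations):
        new = set()
        for pair in combinations(sorted(known), 2):
            product = rules.get(frozenset(pair))
            if product is not None:
                producers.setdefault(product, set()).add(pair)
                if product not in known:
                    new.add(product)
        if not new:
            break  # network stopped growing
        known |= new
    return known, producers

def routes_to(target, substrates, producers):
    """Enumerate routes as lists of (reactants, product) steps.

    Assumes the network is acyclic, which holds for this toy example.
    """
    if target in substrates:
        return [[]]
    routes = []
    for pair in sorted(producers.get(target, ())):
        for left in routes_to(pair[0], substrates, producers):
            for right in routes_to(pair[1], substrates, producers):
                routes.append(left + right + [(pair, target)])
    return routes

substrates = {"A", "B", "C"}
known, producers = expand_network(substrates, RULES, generations=2)
routes = routes_to("ABC", substrates, producers)
# Placeholder ranking: fewer steps first (a real ranking would use green-chemistry metrics).
ranked = sorted(routes, key=len)
```

Even in this toy case the target is reachable by two distinct routes, which hints at why route counts grow so quickly in networks built from hundreds of substrates.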

20 citations

Journal Article
TL;DR: In this article, a computer algorithm using a comprehensive knowledge base of individual reactions constructs and evaluates myriads of putative, but chemically plausible, sequences and discovers an unprecedented number of iterative sequences.
Abstract: Iterative syntheses comprise sequences of organic reactions in which the substrate molecules grow with each iteration and the functional groups, which enable the growth step, are regenerated to allow sustained cycling. Typically, iterative sequences can be automated, for example, as in the transformative examples of the robotized syntheses of peptides, oligonucleotides, polysaccharides and even some natural products. However, iterations are not easy to identify—in particular, for sequences with cycles more complex than protection and deprotection steps. Indeed, the number of catalogued examples is in the tens to maybe a hundred. Here, a computer algorithm using a comprehensive knowledge base of individual reactions constructs and evaluates myriads of putative, but chemically plausible, sequences and discovers an unprecedented number of iterative sequences. Some of these iterations are validated by experiment and result in the synthesis of motifs commonly found in natural products. This computer-driven discovery expands the pool of iterative sequences that may be automated in the future. Iterative sequences of organic reactions can be automated but are rare and challenging to identify. Now, a computer-driven strategy is reported for the systematic discovery and evaluation of such sequences. Several of the iterative sequences are validated experimentally and enable the syntheses of useful motifs in natural product targets.

11 citations

Journal Article
TL;DR: A computer program for retrosynthetic planning helps develop multiple “synthetic contingency” plans for hydroxychloroquine and also routes leading to remdesivir, both promising but yet unproven medications against COVID-19.
Abstract: A computer program for retrosynthetic planning helps develop multiple “synthetic contingency” plans for hydroxychloroquine and also routes leading to remdesivir, both promising but yet unproven medications against COVID-19. These plans are designed to navigate, as much as possible, around known and patented routes and to commence from inexpensive and diverse starting materials, so as to ensure supply in case of anticipated market shortages of commonly used substrates. Looking beyond the current COVID-19 pandemic, development of similar contingency syntheses is advocated for other already-approved medications, in case such medications become urgently needed in mass quantities to face other public-health emergencies.

9 citations

Journal Article
TL;DR: In this article, a method to vectorize and machine-learn non-covalent interactions responsible for scaffold-directed reactions important in synthetic chemistry is described, and models trained on this representation predict the correct face of approach in ca. 90% of Michael additions or Diels-Alder cycloadditions.
Abstract: This work describes a method to vectorize and Machine-Learn, ML, non-covalent interactions responsible for scaffold-directed reactions important in synthetic chemistry. Models trained on this representation predict correct face of approach in ca. 90 % of Michael additions or Diels-Alder cycloadditions. These accuracies are significantly higher than those based on traditional ML descriptors, energetic calculations, or intuition of experienced synthetic chemists. Our results also emphasize the importance of ML models being provided with relevant mechanistic knowledge; without such knowledge, these models cannot easily "transfer-learn" and extrapolate to previously unseen reaction mechanisms.

6 citations


Cited by
Journal Article
TL;DR: It is shown that ML models cannot offer any meaningful predictions of optimum reaction conditions, even if the search space is restricted to only solvents and bases, and the likely importance of systematically generating reliable and standardized data sets for algorithm training is highlighted.
Abstract: Applications of machine learning (ML) to synthetic chemistry rely on the assumption that large numbers of literature-reported examples should enable construction of accurate and predictive models of chemical reactivity. This paper demonstrates that abundance of carefully curated literature data may be insufficient for this purpose. Using an example of Suzuki–Miyaura coupling with heterocyclic building blocks—and a carefully selected database of >10,000 literature examples—we show that ML models cannot offer any meaningful predictions of optimum reaction conditions, even if the search space is restricted to only solvents and bases. This result holds irrespective of the ML model applied (from simple feed-forward to state-of-the-art graph-convolution neural networks) or the representation to describe the reaction partners (various fingerprints, chemical descriptors, latent representations, etc.). In all cases, the ML methods fail to perform significantly better than naive assignments based on the sheer frequency of certain reaction conditions reported in the literature. These unsatisfactory results likely reflect subjective preferences of various chemists to use certain protocols, other biasing factors as mundane as availability of certain solvents/reagents, and/or a lack of negative data. These findings highlight the likely importance of systematically generating reliable and standardized data sets for algorithm training.
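The "naive assignments based on the sheer frequency" that this abstract uses as a point of comparison can be illustrated with a minimal sketch. The (solvent, base) records and the function `top_k_frequency_accuracy` are hypothetical and are not taken from the paper; the sketch only shows what a frequency-based baseline looks like, which is the bar any ML model would need to clear.

```python
from collections import Counter

def top_k_frequency_accuracy(train, test, k=1):
    """Fraction of test records whose conditions are among the k most
    frequent conditions in the training records (no chemistry involved)."""
    counts = Counter(train)
    top = {label for label, _ in counts.most_common(k)}
    hits = sum(1 for record in test if record in top)
    return hits / len(test)

# Hypothetical (solvent, base) condition labels, for illustration only.
train = (
    [("dioxane", "K2CO3")] * 5
    + [("DMF", "Cs2CO3")] * 3
    + [("toluene", "K3PO4")] * 2
)
test = [
    ("dioxane", "K2CO3"),
    ("DMF", "Cs2CO3"),
    ("dioxane", "K2CO3"),
    ("toluene", "K3PO4"),
]

top1 = top_k_frequency_accuracy(train, test, k=1)
top2 = top_k_frequency_accuracy(train, test, k=2)
```

A model trained on reaction features is only informative if its top-k accuracy is clearly above these frequency-only values on the same test split.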

35 citations

Journal Article
TL;DR: The efforts to build a self-driving lab for the development of a new class of materials: organic semiconductor lasers (OSLs) are described, and a flexible system for automated synthesis via iterative Suzuki-Miyaura cross-coupling reactions is developed.
Abstract: We must accelerate the pace at which we make technological advancements to address climate change and disease risks worldwide. This swifter pace of discovery requires faster research and development cycles enabled by better integration between hypothesis generation, design, experimentation, and data analysis. Typical research cycles take months to years. However, data-driven automated laboratories, or self-driving laboratories, can significantly accelerate molecular and materials discovery. Recently, substantial advancements have been made in the areas of machine learning and optimization algorithms that have allowed researchers to extract valuable knowledge from multidimensional data sets. Machine learning models can be trained on large data sets from the literature or databases, but their performance can often be hampered by a lack of negative results or metadata. In contrast, data generated by self-driving laboratories can be information-rich, containing precise details of the experimental conditions and metadata. Consequently, much larger amounts of high-quality data are gathered in self-driving laboratories. When placed in open repositories, this data can be used by the research community to reproduce experiments, for more in-depth analysis, or as the basis for further investigation. Accordingly, high-quality open data sets will increase the accessibility and reproducibility of science, which is sorely needed.
In this Account, we describe our efforts to build a self-driving lab for the development of a new class of materials: organic semiconductor lasers (OSLs). Since they have only recently been demonstrated, little is known about the molecular and material design rules for thin-film, electrically-pumped OSL devices as compared to other technologies such as organic light-emitting diodes or organic photovoltaics.
To realize high-performing OSL materials, we are developing a flexible system for automated synthesis via iterative Suzuki-Miyaura cross-coupling reactions. This automated synthesis platform is directly coupled to the analysis and purification capabilities. Subsequently, the molecules of interest can be transferred to an optical characterization setup. We are currently limited to optical measurements of the OSL molecules in solution. However, material properties are ultimately most important in the solid state (e.g., as a thin-film device). To that end and for a different scientific goal, we are developing a self-driving lab for inorganic thin-film materials focused on the oxygen evolution reaction.
While the future of self-driving laboratories is very promising, numerous challenges still need to be overcome. These challenges can be split into cognition and motor function. Generally, the cognitive challenges are related to optimization with constraints or unexpected outcomes for which general algorithmic solutions have yet to be developed. A more practical challenge that could be resolved in the near future is that of software control and integration because few instrument manufacturers design their products with self-driving laboratories in mind. Challenges in motor function are largely related to handling heterogeneous systems, such as dispensing solids or performing extractions. As a result, it is critical to understand that adapting experimental procedures that were designed for human experimenters is not as simple as transferring those same actions to an automated system, and there may be more efficient ways to achieve the same goal in an automated fashion. Accordingly, for self-driving laboratories, we need to carefully rethink the translation of manual experimental protocols.

29 citations

Journal Article
TL;DR: As remdesivir is the first approved treatment for COVID-19 (SARS-CoV-2), its production is likely to be of vital importance in the near future.

29 citations

Journal Article
TL;DR: Chematica is a software platform capable of planning synthetic routes to complex natural products. To plan syntheses at an expert level, the machine must know the rules describing chemical reactions and use these rules to expand and search the networks of synthetic options.
Abstract: Teaching computers to plan multistep syntheses of arbitrary target molecules, including natural products, has been one of the oldest challenges in chemistry, dating back to the 1960s. This Account recapitulates two decades of our group's work on the software platform called Chematica, which very recently achieved this long-sought objective and has been shown capable of planning synthetic routes to complex natural products, several of which were validated in the laboratory.
For the machine to plan syntheses at an expert level, it must know the rules describing chemical reactions and use these rules to expand and search the networks of synthetic options. The rules must be of high quality: They must delineate accurately the scope of admissible substituents, capture all relevant stereochemical information, detect potential reactivity conflicts, and protection requirements. They should yield only those synthons that are chemically stable and energetically allowed (e.g., not too strained) and should be able to extrapolate beyond examples already published in the literature. In parallel, the network-search algorithms must be able to assign meaningful scores to the sets of synthons they encounter, make judicious choices which of the network's branches to expand, and when to withdraw from unpromising ones. They must be able to strategize over multiple steps to resolve intermittent reactivity conflicts, exchange functional groups, or overcome local maxima of molecular complexity.
Meeting all these requirements makes the problem of computer-driven retrosynthesis very multifaceted, combining expert and AI approaches further supplemented by quantum-mechanical and molecular-mechanics calculations. Development of Chematica has been a very long and gradual process because all these components are needed. Any shortcuts (for example, reliance on only expert or only data-based approaches) yield chemically naive and often erroneous syntheses, especially for complex targets.
On the bright side, once all the requisite algorithms are implemented (as they now are), they not only streamline conventional synthetic planning but also enable completely new modalities that would challenge any human chemist, for example, synthesis with multiple constraints imposed simultaneously or library-wide syntheses in which the machine constructs "global plans" leading to multiple targets and benefiting from the use of common intermediates. These types of analyses will have profound impact on the practice of chemical industry, designing more economical, more green, and less hazardous pathways.

17 citations