Open Access Proceedings Article

signac: A Python framework for data and workflow management

TLDR
This talk showcases signac, an open-source Python framework that offers modular and scalable data and workflow management tools that adapt easily to the highly dynamic requirements of scientific investigations.
Abstract
Computational research requires versatile data and workflow management tools that can easily adapt to the highly dynamic requirements of scientific investigations. Many existing tools require strict adherence to a particular usage pattern, so researchers often use less robust ad hoc solutions that they find easier to adopt. The resulting data fragmentation and methodological incompatibilities significantly impede research. Our talk showcases signac, an open-source Python framework that offers highly modular and scalable solutions for this problem. The framework, named for the Pointillist painter Paul Signac, provides powerful workflow management tools that enable users to construct and automate workflows that transition seamlessly from laptops to HPC clusters. Crucially, the underlying data model is completely independent of the workflow. The flexible, serverless, and schema-free signac database can be introduced into other workflows with essentially no overhead and no recourse to the signac workflow model. Additionally, the data model's simplicity makes it easy to parse the underlying data without using signac at all. This modularity and simplicity eliminate significant barriers to consistent data management across projects, facilitating improved provenance management and data sharing with minimal
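
The abstract's claim that the data can be parsed without signac can be illustrated with a minimal sketch that uses only the Python standard library. It assumes a layout in which each job is a directory inside a workspace folder and stores its parameters in a file named `signac_statepoint.json`; that file name and layout are assumptions about signac's on-disk format rather than something stated in the abstract. The helper names `read_state_points` and `find_jobs`, the directory names `job_a` and `job_b`, and the parameters `T` and `N` are hypothetical and chosen only for illustration.

```python
import json
import tempfile
from pathlib import Path

# Assumed name of the per-job metadata file (not stated in the abstract).
STATEPOINT_FILE = "signac_statepoint.json"


def read_state_points(workspace):
    """Map each job directory name to the state point stored inside it."""
    state_points = {}
    for job_dir in sorted(Path(workspace).iterdir()):
        sp_file = job_dir / STATEPOINT_FILE
        if sp_file.is_file():
            state_points[job_dir.name] = json.loads(sp_file.read_text())
    return state_points


def find_jobs(state_points, **criteria):
    """Return the job directory names whose state point matches all criteria."""
    return [
        name
        for name, sp in state_points.items()
        if all(sp.get(key) == value for key, value in criteria.items())
    ]


# Build a tiny mock workspace so the sketch runs without signac installed.
# Hypothetical directory names; a real workspace would use its own identifiers.
workspace = Path(tempfile.mkdtemp()) / "workspace"
mock_jobs = {"job_a": {"T": 1.0, "N": 100}, "job_b": {"T": 2.0, "N": 100}}
for name, sp in mock_jobs.items():
    (workspace / name).mkdir(parents=True)
    (workspace / name / STATEPOINT_FILE).write_text(json.dumps(sp))

state_points = read_state_points(workspace)
hot_jobs = find_jobs(state_points, T=2.0)
print(hot_jobs)
```

Because the sketch relies only on directories and JSON files, the same loop could in principle be written in any language with a JSON parser, which is the kind of independence from the framework that the abstract describes.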


Citations
Journal Article

Machine Learning Directed Optimization of Classical Molecular Modeling Force Fields

TL;DR: In this paper, a machine learning directed, multiobjective optimization workflow for force field parameterization is presented, which evaluates millions of prospective force field parameters while requiring only a small fraction of them to be tested with molecular simulations.
Journal Article

MoSDeF Cassandra: A complete Python interface for the Cassandra Monte Carlo software.

TL;DR: A new Python interface for the Cassandra Monte Carlo software, molecular simulation design framework (MoSDeF) Cassandra, provides a simplified user interface, offers broader interoperability with other molecular simulation codes, enables the construction of programmatic and reproducible molecular simulation workflows, and builds the infrastructure necessary for high‐throughput Monte Carlo studies.
Journal Article

High-throughput screening of tribological properties of monolayer films using molecular dynamics and machine learning.

TL;DR: The ML model served as a predictive tool that greatly sped up the initial screening of promising candidate films for future simulation studies, suggesting that computational screening combined with ML can greatly increase throughput in combinatorial approaches that generate in silico data and then train ML models in a controlled, self-consistent fashion.
Journal Article

NMR and Theoretical Study of In-Pore Diffusivity of Ionic Liquid-Solvent Mixtures.

TL;DR: In this article, the authors applied two-dimensional exchange nuclear magnetic resonance spectroscopy (2D EXSY NMR) and molecular dynamics (MD) simulations to investigate the diffusivity of the anions of a room-temperature ionic liquid (RTIL), namely 1-butyl-3-methyl-imidazolium bis(trifluoromethylsulfonyl)imide (BMIM+-TFSI-), dissolved in five different organic solvents, in the micropores of activated carbon.