Molecular simulation workflows as parallel algorithms: the execution engine of Copernicus, a distributed high-performance computing platform.
Frequently Asked Questions (14)
Q2. What type of dynamism is supported in the dataflow network?
In order to enable dynamic execution (such as iterations and conditionals), two types of dynamism are supported in the dataflow network.
Q3. How many structures were generated from each trajectory?
Relaxation simulations of 25 ps at 300 K with dihedral restraints (4000 kJ mol−1 rad−2) were used to generate 20 structures from each trajectory, all of which were run for 30 fs without restraints.
Q4. What is the main advantage of the dataflow network formalism?
The dataflow network formalism also enables more sophisticated approaches such as altering the simulation setup to achieve more efficient overlap with a different distribution of stages based on short initial runs (known as adaptive lambda spacing).
Q5. What constraint applies to the type of an input socket on a function instance?
The data in the dataflow program flows from output sockets to input sockets, both of which are strongly typed: the type of an input socket on a function instance must match the type of the output socket to which it is connected.
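The type-matching rule can be sketched in a few lines of Python. This is only an illustration of the idea, not the Copernicus API: the `Socket` class, the `connect` helper, and the socket names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Socket:
    """A named, typed connection point on a function instance (hypothetical)."""
    name: str
    dtype: type


def connect(output: Socket, input_: Socket) -> tuple:
    """Return the connection if the two socket types match, else raise TypeError."""
    if output.dtype is not input_.dtype:
        raise TypeError(
            f"cannot connect {output.name} ({output.dtype.__name__}) "
            f"to {input_.name} ({input_.dtype.__name__})"
        )
    return (output, input_)


# Illustrative socket names; they are assumptions, not taken from the paper.
dg_out = Socket("fe.out.dg", float)
dg_in = Socket("report.in.dg", float)
n_in = Socket("report.in.n_steps", int)

link = connect(dg_out, dg_in)  # float -> float: accepted

try:
    connect(dg_out, n_in)      # float -> int: rejected
    mismatch_rejected = False
except TypeError:
    mismatch_rejected = True
```

Checking types when the connection is made, rather than when data arrives, means a mis-wired network fails before any compute time is spent.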
Q6. What is the advantage of using explicit dataflow descriptions?
An advantage of using explicit dataflow descriptions is that program execution becomes transparent to the user; any value can be examined or set at any time.
Q7. Why can a small system not be spread efficiently over 100,000 cores?
With a few thousand particles there are not enough floating-point operations to spread over 100,000 cores in less than a millisecond, no matter what algorithm or code is used.
Q8. How many short simulations can the swarms module perform?
For large solvated protein complexes, the Copernicus swarms module can simultaneously execute over 10,000 short simulations if given a sufficient pool of workers.
Q9. What efforts have been made to improve the performance of molecular dynamics?
Large efforts have been invested in improving performance through simplified models, new algorithms, and better scaling of simulations [4–7], not to mention special-purpose hardware.
Q10. What is the second type of dynamism associated with arrays?
The second type of dynamism is associated with arrays: instance arrays will instantiate as many copies of a function as there are inputs in its array of function inputs; the output is an array of function outputs (Fig. 4).
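As a rough sketch of the instance-array idea, the following Python maps one function over an array of inputs, creating one call per input and collecting the outputs in the same order. The helper `instance_array` and the toy function `steps_for` (with its assumed 2 fs time step) are hypothetical and not part of Copernicus.

```python
from typing import Callable, List, Sequence, TypeVar

A = TypeVar("A")
B = TypeVar("B")


def instance_array(func: Callable[[A], B], inputs: Sequence[A]) -> List[B]:
    """Instantiate one call of func per input; outputs keep the input order."""
    return [func(x) for x in inputs]


def steps_for(length_ps: int) -> int:
    """Toy function: MD steps for a run length in ps, assuming a 2 fs time step."""
    return length_ps * 500


inputs = [2, 10, 25]                      # run lengths in ps (illustrative)
outputs = instance_array(steps_for, inputs)
```

The number of instances is decided by the length of the input array at run time, which is what makes this construct dynamic. In a real dataflow engine the instances would be independent and could run concurrently; the list comprehension here runs them one after another.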
Q11. How can Copernicus use a large worker allocation?
Copernicus is also capable of using e.g. a 10,000-core worker allocation to execute 100 separate function instances each needing 100 cores.
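The arithmetic behind this example can be sketched as a simple partition of a core allocation into equal slots. The function `partition_cores` is a hypothetical helper for illustration and does not describe how Copernicus actually schedules work.

```python
from typing import List


def partition_cores(total_cores: int, cores_per_instance: int) -> List[range]:
    """Split a core allocation into equal, non-overlapping slots of core indices."""
    if cores_per_instance <= 0:
        raise ValueError("cores_per_instance must be positive")
    n_instances = total_cores // cores_per_instance
    return [
        range(i * cores_per_instance, (i + 1) * cores_per_instance)
        for i in range(n_instances)
    ]


# The case from the text: a 10,000-core allocation, 100 cores per instance.
slots = partition_cores(10_000, 100)
```

Each slot can host one function instance, so the allocation from the text yields 100 slots that together cover all 10,000 cores.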
Q12. Do studies typically rely on a single simulation trajectory?
In computational chemistry and related disciplines, a study almost never relies on a single simulation trajectory; multiple runs are used even in simple studies for uncertainty quantification and for comparison between conditions.
Q13. What makes MSM a very attractive sampling method for distributed computing?
combined with the high level of parallelism inherent in many hundreds of trajectories, makes MSM a very attractive sampling method for distributed computing.
Q14. What is the easiest way to illustrate this?
The easiest way to illustrate this is to use an example:
> cpcc get fe.iter_lj_1.out.dg
Here, the authors use the top-level function fe, within which they access the instance called iter_lj_1, which is the first iteration of the Lennard-Jones decoupling.
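To make the dotted name concrete, here is a minimal Python sketch that resolves such a path against a nested dictionary. It is not how `cpcc` is implemented: the `resolve` helper, the dictionary layout, and the stored value (a placeholder, not a real result) are all assumptions.

```python
from typing import Any, Mapping


def resolve(tree: Mapping[str, Any], path: str) -> Any:
    """Walk a nested mapping along a dot-separated path and return the value."""
    node: Any = tree
    for key in path.split("."):
        node = node[key]  # raises KeyError if a path component is missing
    return node


# Hypothetical project state; 0.0 is a placeholder, not a computed free energy.
project = {"fe": {"iter_lj_1": {"out": {"dg": 0.0}}}}

value = resolve(project, "fe.iter_lj_1.out.dg")
```

Each path component selects one level: the top-level function, the instance inside it, its output sockets, and finally the named value.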