Comparing molecules and solids across structural and alchemical space.
Citations
Recent advances and applications of machine learning in solid-state materials science
Quantum-chemical insights from deep tensor neural networks.
Machine learning in materials informatics: recent applications and prospects
Perspective: Machine learning potentials for atomistic simulations.
Machine learning of accurate energy-conserving molecular force fields
References
The Elements of Statistical Learning
The Hungarian method for the assignment problem
Nonlinear component analysis as a kernel eigenvalue problem
Survey of clustering algorithms
Clustering by fast search and find of density peaks
Related Papers (5)
Generalized neural-network representation of high-dimensional potential-energy surfaces.
Frequently Asked Questions (13)
Q2. What is the important descriptor of oligopeptide structure?
Conventional wisdom [57] assumes that the Cα dihedral angles φ and ψ are the most important descriptors of oligopeptide structure.
Q3. Why did the authors use the conventional best-match distance for the rest of their analyses?
For the sake of simplicity (and given that the authors had reduced the size of the environment covariance matrix C by not considering H atoms as environment centers), the authors used the conventional best-match distance for the rest of their analyses.
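A minimal sketch of a best-match comparison may help here. It assumes a square matrix C whose entry C[i][j] is the similarity between environment i of one structure and environment j of the other, and it scores the two structures by the one-to-one pairing of environments with the highest average similarity. The function name and the brute-force search are illustrative assumptions, not the authors' implementation; the search over all permutations is only practical for a handful of environments, whereas the Hungarian method listed in the references solves the same assignment problem in polynomial time.

```python
from itertools import permutations


def best_match_kernel(C):
    """Average similarity under the best one-to-one pairing of environments.

    C is a square list of lists; C[i][j] is the similarity between
    environment i of structure A and environment j of structure B.
    Brute force over all pairings: illustration only, O(n!) cost.
    """
    n = len(C)
    if any(len(row) != n for row in C):
        raise ValueError("C must be square for a one-to-one pairing")
    # Each permutation p assigns environment i of A to environment p[i] of B.
    best_total = max(
        sum(C[i][p[i]] for i in range(n)) for p in permutations(range(n))
    )
    return best_total / n


# Hypothetical 2x2 similarity matrix: the swapped pairing wins.
example = [[0.1, 0.9],
           [0.8, 0.2]]
score = best_match_kernel(example)
```

For the hypothetical matrix above, the swapped pairing (0.9 and 0.8) beats the identity pairing (0.1 and 0.2), so the score is approximately 0.85.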
Q4. What is the significance of the REMatch-SOAP approach?
Reaching chemical accuracy in the automated prediction of atomization energies is an important milestone, and the fact that the authors could achieve that without fully exploring the flexibility of the REMatch-SOAP framework (e.g. by optimizing the entropy regularization parameter, the environment cutoff, eliminating the outliers, combining multiple layers of description or using a non-diagonal alchemical similarity matrix) highlights the potential of their approach.
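To make the role of the entropy regularization parameter concrete, the sketch below uses standard Sinkhorn iterations to compute an entropy-regularized match between two sets of environments. Everything in it is an assumption for illustration: the function name, the uniform weights 1/n and 1/m on the environments, the cost 1 - C[i][j], and the default values of gamma and n_iter are not taken from the source and should not be read as the authors' REMatch implementation.

```python
from math import exp


def regularized_match_kernel(C, gamma=0.1, n_iter=500):
    """Entropy-regularized matching score between two sets of environments.

    C[i][j] is the similarity (assumed in [0, 1]) between environment i of
    structure A and environment j of structure B. gamma is the hypothetical
    entropy regularization parameter; n_iter is the number of Sinkhorn sweeps.
    """
    n, m = len(C), len(C[0])
    # Gibbs kernel built from the cost 1 - C[i][j].
    K = [[exp(-(1.0 - C[i][j]) / gamma) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Rescale rows, then columns, towards uniform marginals 1/n and 1/m.
        u = [(1.0 / n) / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [(1.0 / m) / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Matching weights are P[i][j] = u[i] * K[i][j] * v[j]; return sum of P * C.
    return sum(
        u[i] * K[i][j] * v[j] * C[i][j] for i in range(n) for j in range(m)
    )


identity_like = [[1.0, 0.0],
                 [0.0, 1.0]]
sharp = regularized_match_kernel(identity_like, gamma=0.05)
smooth = regularized_match_kernel(identity_like, gamma=100.0)
```

In this sketch a small gamma concentrates the weights on the best pairing (sharp is close to 1), while a large gamma spreads them almost uniformly over all pairs (smooth is close to the plain average, 0.5), which is why the parameter is a natural target for tuning.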
Q5. How many conformers of arginine dipeptide were selected?
The authors selected a library of 5062 locally stable conformers of arginine dipeptide (845 with and 4217 without a Ca2+ counterion) from a public database of oligopeptide structures developed by Ropo et al. [56].
Q6. What are the main descriptors of local environments?
Among the many descriptors of local environments that have been developed in recent years [1–3, 5, 6, 17–22, 24–28, 33, 36], the authors refer in particular to the SOAP fingerprints [38], which have proven to be a very elegant and robust strategy for describing coordination environments in a way that is naturally invariant with respect to translations, rotations and permutations of atoms.
Q7. What could be used to accelerate the exploration of chemical and conformational space of materials and molecules?
For instance, the similarity metric could be used to detect outliers in automated high-throughput screenings of materials, to cluster similar configurations together, or to accelerate the exploration of the chemical and conformational space of materials and molecules.
Q8. How many hypothetical structures were used in the map?
Although the map has been built using only reference configurations from a few of the conventional Si phases, the authors have also projected onto it (using out-of-sample embedding) two sets of hypothetical configurations obtained by minima hopping [53] and by ab initio random structure search (AIRSS) [52, 55].
Q9. What is the smallest number of local minima?
In the absence of a complexing cation, the dipeptide can exist in a very large number of local minima, spanning a relatively narrow range of energies.
Q10. What is the way to define a metric in structural and alchemical space?
Distances between atomic structures based on combinations of local similarity kernels provide a flexible framework to define a metric in structural and alchemical space.
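As a small illustration of how a similarity kernel yields a distance, the sketch below uses the standard kernel-induced form d(A, B) = sqrt(K(A, A) + K(B, B) - 2 K(A, B)), which reduces to sqrt(2 - 2 K(A, B)) when the kernel is normalized so that each structure has unit self-similarity. The function name and the default self-similarities are assumptions made for the example; the source does not spell out the formula in this excerpt.

```python
from math import sqrt


def kernel_distance(k_ab, k_aa=1.0, k_bb=1.0):
    """Distance induced by a similarity kernel.

    k_ab is the kernel between structures A and B; k_aa and k_bb are the
    self-similarities (1.0 for a normalized kernel, assumed by default).
    """
    squared = k_aa + k_bb - 2.0 * k_ab
    # Guard against tiny negative values caused by floating-point round-off.
    return sqrt(max(0.0, squared))


d_identical = kernel_distance(1.0)   # identical structures
d_partial = kernel_distance(0.5)     # partially similar structures
```

Identical structures (kernel equal to 1) are at distance 0, and the distance grows as the kernel value decreases.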
Q11. Why did the authors not include H atoms explicitly as centers of atomic environments?
Since H atoms stay at almost fixed positions relative to their neighboring atoms, the authors decided to include them in the environment descriptors of other atoms, but did not include them explicitly as centers of atomic environments.
Q12. What is the advantage of this procedure?
The advantage of this procedure is that one does not need to explicitly find the relation between the shape of the two unit cells and replicate them to perform the comparison: the environment similarities can be evaluated including periodic replicas, and the minimum number of comparisons will be naturally performed among any pairs of structures.
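One way to picture the use of periodic replicas, assumed here for illustration rather than taken from the source, is to tile a rectangular environment similarity matrix until both structures contribute the same number of environments. Since a replicated cell contains exact copies of the original environments, tiling up to the least common multiple of the two environment counts gives a square matrix on which a one-to-one comparison is defined. The function name is hypothetical.

```python
from math import gcd


def tile_to_square(C):
    """Tile an n x m similarity matrix up to lcm(n, m) x lcm(n, m).

    Row i of the result repeats row i % n of C, and column j repeats
    column j % m, mimicking periodic replicas of each unit cell.
    """
    n, m = len(C), len(C[0])
    size = n * m // gcd(n, m)  # least common multiple of n and m
    return [[C[i % n][j % m] for j in range(size)] for i in range(size)]


# One environment in structure A, two in structure B (hypothetical values).
rectangular = [[0.3, 0.7]]
square = tile_to_square(rectangular)
```

Here the single environment of the first structure is replicated once, giving the 2 x 2 matrix [[0.3, 0.7], [0.3, 0.7]].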
Q13. What is the way to compare crystalline, periodic structures?
When comparing crystalline, periodic structures, it may be the case that one of the structures corresponds to a slight distortion of the other, that needs a larger unit cell for a proper representation.