Showing papers by "Anubhav Jain" published in 2019


Journal ArticleDOI
03 Jul 2019-Nature
TL;DR: It is shown that materials science knowledge present in the published literature can be efficiently encoded as information-dense word embeddings [11–13] (vector representations of words) without human labelling or supervision, suggesting that latent knowledge regarding future discoveries is to a large extent embedded in past publications.
Abstract: The overwhelming majority of scientific knowledge is published as text, which is difficult to analyse by either traditional statistical analysis or modern machine learning methods. By contrast, the main source of machine-interpretable data for the materials research community has come from structured property databases [1,2], which encompass only a small fraction of the knowledge present in the research literature. Beyond property values, publications contain valuable knowledge regarding the connections and relationships between data items as interpreted by the authors. To improve the identification and use of this knowledge, several studies have focused on the retrieval of information from scientific literature using supervised natural language processing [3–10], which requires large hand-labelled datasets for training. Here we show that materials science knowledge present in the published literature can be efficiently encoded as information-dense word embeddings [11–13] (vector representations of words) without human labelling or supervision. Without any explicit insertion of chemical knowledge, these embeddings capture complex materials science concepts such as the underlying structure of the periodic table and structure-property relationships in materials. Furthermore, we demonstrate that an unsupervised method can recommend materials for functional applications several years before their discovery. This suggests that latent knowledge regarding future discoveries is to a large extent embedded in past publications. Our findings highlight the possibility of extracting knowledge and relationships from the massive body of scientific literature in a collective manner, and point towards a generalized approach to the mining of scientific literature.

653 citations
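
The abstract above describes ranking materials for an application using word embeddings but does not spell out the scoring. A minimal sketch of one common way to compare embedding vectors is cosine similarity; the choice of cosine similarity, the function names, and the 3-dimensional toy vectors below are all assumptions for illustration, not values or methods taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_by_similarity(query, embeddings):
    """Return names sorted by descending cosine similarity to the query."""
    scores = {name: cosine_similarity(query, vec) for name, vec in embeddings.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical toy vectors; real embeddings are learned from text
# and have far more dimensions.
toy_embeddings = {
    "material_A": [0.9, 0.1, 0.0],
    "material_B": [0.1, 0.9, 0.0],
    "material_C": [0.7, 0.3, 0.1],
}
application_vector = [1.0, 0.0, 0.0]  # stands in for an application word

ranking = rank_by_similarity(application_vector, toy_embeddings)
print(ranking)
```

In this toy setup, the materials whose vectors point most nearly in the same direction as the application vector are listed first.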


Journal ArticleDOI
TL;DR: In this paper, the authors present an overview of the current state of computational materials prediction, synthesis and characterization approaches, materials design needs for various technologies, and future challenges and opportunities that must be addressed.
Abstract: Advances in renewable and sustainable energy technologies critically depend on our ability to design and realize materials with optimal properties. Materials discovery and design efforts ideally involve close coupling between materials prediction, synthesis and characterization. The increased use of computational tools, the generation of materials databases, and advances in experimental methods have substantially accelerated these activities. It is therefore an opportune time to consider future prospects for materials by design approaches. The purpose of this Roadmap is to present an overview of the current state of computational materials prediction, synthesis and characterization approaches, materials design needs for various technologies, and future challenges and opportunities that must be addressed. The various perspectives cover topics on computational techniques, validation, materials databases, materials informatics, high-throughput combinatorial methods, advanced characterization approaches, and materials design issues in thermoelectrics, photovoltaics, solid state lighting, catalysts, batteries, metal alloys, complex oxides and transparent conducting materials. It is our hope that this Roadmap will guide researchers and funding agencies in identifying new prospects for materials design.

257 citations


Journal ArticleDOI
TL;DR: It is demonstrated that simple database queries can be used to answer complex "meta-questions" of the published literature that would have previously required laborious, manual literature searches to answer.
Abstract: The number of published materials science articles has increased manyfold over the past few decades. Now, a major bottleneck in the materials discovery pipeline arises in connecting new results with the previously established literature. A potential solution to this problem is to map the unstructured raw text of published articles onto structured database entries that allow for programmatic querying. To this end, we apply text mining with named entity recognition (NER) for large-scale information extraction from the published materials science literature. The NER model is trained to extract summary-level information from materials science documents, including inorganic material mentions, sample descriptors, phase labels, material properties and applications, as well as any synthesis and characterization methods used. Our classifier achieves an accuracy (F1 score) of 87%, and is applied to information extraction from 3.27 million materials science abstracts. We extract more than 80 million materials-science-related named entities, and the content of each abstract is represented as a database entry in a structured format. We demonstrate that simple database queries can be used to answer complex "meta-questions" of the published literature that would have previously required laborious, manual literature searches to answer. All of our data and functionality have been made freely available on our GitHub (https://github.com/materialsintelligence/matscholar) and website (http://matscholar.com), and we expect these results to accelerate the pace of future materials science discovery.

128 citations
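
To make the idea of answering a "meta-question" with a simple query over structured entries concrete, here is a small sketch. The entry layout, field names (`materials`, `applications`), placeholder material names, and the helper function are hypothetical; they are not the schema or API of the matscholar project.

```python
from collections import Counter

# Hypothetical structured entries, one per abstract, as an NER step might
# produce them. Field names and values are illustrative only.
entries = [
    {"materials": ["material_X"], "applications": ["thermoelectric"]},
    {"materials": ["material_X", "material_Y"], "applications": ["thermoelectric", "battery"]},
    {"materials": ["material_Z"], "applications": ["battery"]},
]


def materials_for_application(entries, application):
    """Count how often each material co-occurs with the given application."""
    counts = Counter()
    for entry in entries:
        if application in entry["applications"]:
            counts.update(entry["materials"])
    return counts.most_common()


result = materials_for_application(entries, "thermoelectric")
print(result)
```

The point of the sketch is only that, once each abstract is a structured record, a question such as "which materials are most often mentioned alongside a given application?" reduces to a filter and a count.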


Journal ArticleDOI
TL;DR: In this work, the use of tantalum sealing during melting enables n-type Mg3Sb2 alloys to show substantially higher mobility than previously reported, which can be attributed to the purification of phases and to the coarse grains.
Abstract: In recent years, thermoelectric Mg3Sb2 alloys, particularly those with n-type conduction, have attracted increasing attention for thermoelectric applications due to their multivalley conduction band, abundant constituents, and lower toxicity. However, the high vapor pressure and causticity of Mg and the high melting point of Mg3Sb2 tend to introduce boundary phases and defects into the materials, which affect the transport properties. In this work, the use of tantalum sealing during melting enables n-type Mg3Sb2 alloys to show substantially higher mobility than previously reported, which can be attributed to the purification of phases and to the coarse grains. Importantly, the inherently high mobility enables the thermoelectric figure of merit in optimal compositions to be highly competitive with that of commercially available n-type Bi2Te3 alloys and higher than that of other known n-type thermoelectrics at 300-500 K. This work reveals Mg3Sb2 alloys as a top candidate for near-room-temperature thermoelectric applications.

63 citations
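
The abstract above refers to the thermoelectric figure of merit without defining it. The standard dimensionless definition is zT = S²σT/κ, where S is the Seebeck coefficient, σ the electrical conductivity, T the absolute temperature, and κ the thermal conductivity. The sketch below just evaluates that formula; the input numbers are made up for illustration and are not measurements from the paper.

```python
def figure_of_merit(seebeck_v_per_k, conductivity_s_per_m, kappa_w_per_m_k, temperature_k):
    """Dimensionless thermoelectric figure of merit zT = S^2 * sigma * T / kappa."""
    return seebeck_v_per_k ** 2 * conductivity_s_per_m * temperature_k / kappa_w_per_m_k


# Illustrative inputs only (not values for Mg3Sb2 or any specific material).
zt = figure_of_merit(
    seebeck_v_per_k=200e-6,      # 200 microvolts per kelvin
    conductivity_s_per_m=1.0e5,  # siemens per metre
    kappa_w_per_m_k=1.0,         # watts per metre per kelvin
    temperature_k=400.0,         # kelvin
)
print(round(zt, 3))
```

Because σ is proportional to carrier mobility at fixed carrier concentration, a higher mobility raises zT when the other quantities are held constant, which is why the mobility gain matters here.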


Journal ArticleDOI
TL;DR: It is demonstrated that novel site environment features that characterize interstice distributions around atoms combined with machine learning (ML) can reliably identify plastic sites in several Cu-Zr compositions, and a quench-in softness model trained on a single composition and quench rate substantially improves upon previous models.
Abstract: When metallic glasses (MGs) are subjected to mechanical loads, the plastic response of atoms is non-uniform. However, the extent and manner in which atomic environment signatures present in the undeformed structure determine this plastic heterogeneity remain elusive. Here, we demonstrate that novel site environment features that characterize interstice distributions around atoms combined with machine learning (ML) can reliably identify plastic sites in several Cu-Zr compositions. Using only quenched structural information as input, the ML-based plastic probability estimates ("quench-in softness" metric) can identify plastic sites that could activate at high strains, losing predictive power only upon the formation of shear bands. Moreover, we reveal that a quench-in softness model trained on a single composition and quench rate substantially improves upon previous models in generalizing to different compositions and completely different MG systems (Ni62Nb38, Al90Sm10 and Fe80P20). Our work presents a general, data-centric framework that could potentially be used to address the structural origin of any site-specific property in MGs.

54 citations


Journal ArticleDOI
TL;DR: In this article, the authors cover a limited selection of ways in which thermoelectrics can benefit from new discoveries in physics: wave effects in phonon transport, correlated electron physics, and unconventional transport in organic materials.
Abstract: Thermoelectrics represent a unique opportunity in energy to directly convert thermal energy or secondary waste heat into a primary resource. The development of thermoelectric materials has improved over the decades in leaps, rather than by increments—each leap forward has recapitulated the science of its time: from the crystal growth of semiconductors, to controlled doping, to nanostructuring, and to 2D confinement. Each of those leaps forward was, arguably, more a result of materials science than physics. Thermoelectrics is now ripe for another leap forward, and many probable advances rely on new physics outside of the standard band transport model of thermoelectrics. This perspective will cover a limited selection of how thermoelectrics can benefit from new discoveries in physics: wave effects in phonon transport, correlated electron physics, and unconventional transport in organic materials. We also highlight recent developments in thermoelectrics discovery aided by machine learning that may be needed to realize some of these new concepts practically. Looking ahead, developing new thermoelectric physics will also have a concomitant domino effect on adjacent fields, furthering the understanding of nonequilibrium thermal and electronic transport in novel materials.

49 citations


Journal ArticleDOI
TL;DR: In this article, the authors examined the relationship between employee silence and job burnout as well as the possible mediating role of emotional intelligence (EI) on the silence-burnout relationship.
Abstract: Although considerable research has been completed on employee voice, relatively few studies have investigated employee silence. The purpose of this paper is to examine the relationship between employee silence and job burnout as well as the possible mediating role of emotional intelligence (EI) on the silence-burnout relationship. This paper reports the findings of an empirical study based upon the survey of 286 managers working in four different states in India. Correlational and mediated regression analyses were performed to test four hypotheses. Contrary to findings from studies conducted in Western countries in which employee silence was positively related to undesirable work outcomes, in this study, employee silence was negatively related to job burnout. Additionally, results indicated that the relationship between employee silence and job burnout was mediated by EI. These findings suggest the importance of considering country context and potential mediating variables when investigating employee silence. This study demonstrates how Indian employees may strategically choose employee silence in order to enhance job outcomes. This study is one of the few efforts to investigate employee silence in a non-western country. This is the first study that has examined the role of EI as a mediating variable of the relationship between employee silence and job burnout in India.

29 citations


Journal ArticleDOI
TL;DR: In this paper, the authors report low thermal transport properties of four selenide compounds (BaAg2SnSe4, BaCu2GeSe4, BaCu2SnSe4 and SrCu2GeSe4), with experimentally measured thermal conductivity as low as 0.31 ± 0.03 W m−1 K−1 at 673 K for BaAg2SnSe4; density functional theory calculations predict κ < 0.3 W m−1 K−1 for BaAg2SnSe4 due to scattering from weakly-bonded Ag–Ag dimers.
Abstract: Engineering the thermal properties in solids is important for both fundamental physics (e.g. electric and phonon transport) and device applications (e.g. thermal insulating coating, thermoelectrics). In this paper, we report low thermal transport properties of four selenide compounds (BaAg2SnSe4, BaCu2GeSe4, BaCu2SnSe4 and SrCu2GeSe4) with experimentally-measured thermal conductivity as low as 0.31 ± 0.03 W m−1 K−1 at 673 K for BaAg2SnSe4. Density functional theory calculations predict κ < 0.3 W m−1 K−1 for BaAg2SnSe4 due to scattering from weakly-bonded Ag–Ag dimers. Defect calculations suggest that achieving high hole doping levels in these materials could be challenging due to monovalent (e.g., Ag) interstitials acting as hole killers, resulting in overall low electrical conductivity in these compounds.

24 citations


Journal ArticleDOI
TL;DR: Robocrystallographer is an open-source toolkit for analyzing crystal structures that provides structural information including the local coordination and polyhedral type, polyhedral connectivity, octahedral tilt angles, component-dimensionality, and molecule-within-crystal and fuzzy prototype identification.
Abstract: Our ability to describe crystal structure features is of crucial importance when attempting to understand structure–property relationships in the solid state. In this paper, the authors introduce robocrystallographer, an open-source toolkit for analyzing crystal structures. This package combines new and existing open-source analysis tools to provide structural information, including the local coordination and polyhedral type, polyhedral connectivity, octahedral tilt angles, component-dimensionality, and molecule-within-crystal and fuzzy prototype identification. Using this information, robocrystallographer can generate text-based descriptions of crystal structures that resemble descriptions written by human crystallographers. The authors use robocrystallographer to investigate the dimensionalities of all compounds in the Materials Project database and highlight its potential in machine learning studies.

23 citations



Proceedings ArticleDOI
01 Jun 2019
TL;DR: In this paper, the authors introduce a climate zone classification system specific to photovoltaics (PV), PhotoVoltaic Climate Zones (PVCZ-2019 or PVCZ), that defines zones based on the geographic distribution in PV stressor intensity.
Abstract: A large body of previous research indicates that climate affects photovoltaic (PV) degradation both in terms of steady power loss and hazardous failures. However, the geographic distribution of climate stressors has not yet been characterized in a systematic way. Most typically the Koppen-Geiger classification scheme is used for comparing PV degradation across different climates. However, Koppen-Geiger uses temperature and rainfall to develop zones relevant for botany and lacks the ability to distinguish locations based on climate stressors more relevant to PV degradation. Prior work has shown that specific stressors (e.g. high temperature, temperature cycling, damp heat, wind stress and UV exposure) induce multiple PV degradation modes such as solder bond degradation, corrosion by moisture intrusion, wind-induced cell cracking, encapsulant discoloration and others. We introduce a climate zone classification system specific to PV, PhotoVoltaic Climate Zones (PVCZ-2019 or PVCZ) that defines zones based on the geographic distribution in PV stressor intensity. This climate zone scheme provides quantitative thresholds on the climate stress experienced in each zone which can provide a basis for future work on the impact of climate on PV degradation and failure.

Journal ArticleDOI
TL;DR: In this article, the authors examined the relationship between careerism and organizational attitudes among workers in India and found that careerism was negatively related to affective commitment, organization satisfaction and perceived organizational performance.
Abstract: Using psychological contract theory as its foundation, the purpose of this paper is to examine the important, but under-explored, relationship between careerism and organizational attitudes among workers in India. In total, 250 middle-level executives, working in six manufacturing plants of motorbike companies located in Northern India, were surveyed. As hypothesized, careerism was found to be negatively related to affective commitment, organization satisfaction and perceived organizational performance. Contrary to expectations, however, careerism was positively related to continuance and normative commitment. The study is based on a cross-sectional survey. Also, because the motorbike industry is male dominated, all the executives surveyed are men. Despite concerns that employees with more transactional relationships with their employers are no longer loyal to their organizations, this study demonstrates that Indian employees with higher careerism also have higher levels of normative and continuance organizational commitment. Prior research has produced conflicting results as to whether employees with more careerist, transactional psychological contracts with their employers have more negative organizational attitudes. This study contributes to research on psychological contract theory and careerism in today’s turbulent career landscape while also answering calls to examine the generalizability of western theories of careers in non-western countries.

Journal ArticleDOI
TL;DR: In this paper, the authors consider the specific heat capacity of thermal fluids, which are used in the storage and transfer of thermal energy and play a key role in heating, cooling, refrigeration, and power generation.
Abstract: Thermal fluids have many applications in the storage and transfer of thermal energy, playing a key role in heating, cooling, refrigeration, and power generation. However, the specific heat capacity...

Journal ArticleDOI
TL;DR: In this article, the authors used clear sky classifications determined from satellite data to develop an algorithm that determines clear sky periods using only measured irradiance values and modeled clear sky irradiance as inputs.
Abstract: Recent degradation studies have highlighted the importance of considering cloud cover when calculating degradation rates, finding more reliable values when the data are restricted to clear sky periods. Several automated methods of determining clear sky periods have been previously developed, but parameterizing and testing the models has been difficult. In this paper, we use clear sky classifications determined from satellite data to develop an algorithm that determines clear sky periods using only measured irradiance values and modeled clear sky irradiance as inputs. This method is tested on global horizontal irradiance (GHI) data from ground collectors at six sites across the United States and compared against independent satellite-based classifications. First, 30 separate models were optimized, one for each individual site at each GHI data interval of 1, 5, 10, 15, and 30 min (sampled on the first minute of the interval). The models had an average F0.5 score of 0.949 ± 0.035 on a holdout test set. Next, optimizations were performed by aggregating data from different locations at the same interval, yielding one model per data interval. This yielded an average F0.5 of 0.946 ± 0.037. A final, “universal” optimization that was trained on data from all sites at all intervals provided an F0.5 score of 0.943 ± 0.040. The optimizations all provide improvements on a prior, unoptimized clear sky detection algorithm that produces F0.5 scores that average to 0.903 ± 0.067. Our results indicate that a single algorithm can accurately classify clear sky periods across locations and data sampling intervals.
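
The F0.5 scores quoted in this abstract are instances of the general F-beta score, which with beta = 0.5 weights precision more heavily than recall. The sketch below computes it from confusion-matrix counts; the function name and the example counts are illustrative assumptions, not data from the study.

```python
def f_beta(true_pos, false_pos, false_neg, beta=0.5):
    """F-beta score: (1 + b^2) * P * R / (b^2 * P + R), with b = beta."""
    if true_pos == 0:
        return 0.0  # no correct positives: precision and recall are both zero
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Made-up counts: 90 periods correctly labelled clear, 5 cloudy periods
# wrongly labelled clear, 20 clear periods missed.
score = f_beta(true_pos=90, false_pos=5, false_neg=20)
print(round(score, 4))
```

With these counts precision exceeds recall, so the F0.5 score sits above the balanced F1 score computed from the same counts (`beta=1.0`).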

Proceedings ArticleDOI
16 Jun 2019
TL;DR: In this article, the authors performed a literature search to identify models that estimate the damage caused by exposure to various environmental stressors, including temperature, radiation, and humidity, and compared these PV-specific variables to identify correlations and the translation required to represent the stressor accurately.
Abstract: Environmental stress can degrade photovoltaic (PV) modules. We perform a literature search to identify models that estimate the damage caused by exposure to various environmental stressors, including temperature, radiation, and humidity. The weather-related variables, including ambient temperature, irradiance and humidity, are calculated using the Global Land Data Assimilation System (GLDAS). The analysis also calculated degradation-model stressors (module temperature, plane-of-array irradiance, and relative humidity) and compared these PV-specific variables to identify correlations and the translation required to represent the stressor accurately. The results show that global horizontal irradiance (GHI) can be used instead of plane-of-array irradiance to represent radiation dose. However, module temperature can be significantly different from ambient temperature, and specific humidity is significantly different from relative humidity.

Journal ArticleDOI
TL;DR: In this paper, a machine learning-based approach was used to identify plastic sites that could activate at high strains, losing predictive power only upon the formation of shear bands, and showed that a quench-in softness model trained on a single composition and quenching rate substantially improved upon previous models in generalizing to different compositions.
Abstract: When metallic glasses (MGs) are subjected to mechanical loads, the plastic response of atoms is non-uniform. However, the extent and manner in which atomic environment signatures present in the undeformed structure determine this plastic heterogeneity remain elusive. Here, we demonstrate that novel site environment features that characterize interstice distributions around atoms combined with machine learning (ML) can reliably identify plastic sites in several Cu-Zr compositions. Using only quenched structural information as input, the ML-based plastic probability estimates ("quench-in softness" metric) can identify plastic sites that could activate at high strains, losing predictive power only upon the formation of shear bands. Moreover, we reveal that a quench-in softness model trained on a single composition and quenching rate substantially improves upon previous models in generalizing to different compositions and completely different MG systems (Ni62Nb38, Al90Sm10 and Fe80P20). Our work presents a general, data-centric framework that could potentially be used to address the structural origin of any site-specific property in MGs.

Posted Content
TL;DR: In this article, a machine learning framework was proposed to predict the plastic heterogeneity of atoms in Cu-Zr metallic glasses solely from the undeformed, quenched configuration of the glass.
Abstract: When metallic glasses are subjected to mechanical loads, the plastic response of atoms is heterogeneous. However, the degree to which the plastic units are correlated with the structural defects frozen in the quenched glass structure is still elusive. Here, we introduce a machine learning framework to predict the plastic heterogeneity of atoms in Cu-Zr metallic glasses solely from the undeformed, quenched configuration. We propose that an atomic-scale quantity, "quench-in softness", calibrated from a gradient boosted decision tree model trained on a set of short- and medium-range site features, can identify plastically susceptible sites at various strain levels with high accuracy. The predictive ability is further confirmed in that a model trained on a single composition and quench rate retains high accuracy on other compositions and quench rates without any further training. We also quantitatively assess historical site descriptors against our method, demonstrating that the regularity-related features introduced in this work are more predictive and may play an important role in future glass characterization. Our work presents a general, data-centric framework that could potentially be used to address the structural origin of any site-specific property in metallic glasses.