Building Predictive Models in R Using the caret Package
TLDR
The caret package, short for classification and regression training, contains numerous tools for developing predictive models using the rich set of models available in R, and simplifies model training and tuning across a wide variety of modeling techniques.
Abstract:
The caret package, short for classification and regression training, contains numerous tools for developing predictive models using the rich set of models available in R. The package focuses on simplifying model training and tuning across a wide variety of modeling techniques. It also includes methods for pre-processing training data, calculating variable importance, and model visualizations. An example from computational chemistry is used to illustrate the functionality on a real data set and to benchmark the benefits of parallel processing with several types of models.
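The workflow the abstract describes (pre-processing, tuning by resampling, and variable importance) can be sketched with a few caret calls. This is a minimal illustration, not the paper's own example: the built-in iris data stands in for the computational chemistry data set, and the k-nearest-neighbors model, the 80/20 split, and the 5-fold cross-validation are arbitrary choices made here for brevity.

```r
library(caret)

set.seed(42)  # make the data split and resampling reproducible

# Assumption: iris is a stand-in data set, not the one used in the paper.
# Hold out 20% of the rows, stratified by class.
in_train  <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
train_set <- iris[in_train, ]
test_set  <- iris[-in_train, ]

# 5-fold cross-validation is used to pick the tuning parameter (k for knn).
ctrl <- trainControl(method = "cv", number = 5)

fit <- train(
  Species ~ .,
  data       = train_set,
  method     = "knn",                  # illustrative model choice
  preProcess = c("center", "scale"),   # pre-processing applied inside resampling
  tuneLength = 5,                      # evaluate 5 candidate values of k
  trControl  = ctrl
)

print(fit$bestTune)   # tuning value selected by cross-validation
print(varImp(fit))    # variable importance (filter-based for this model)

# Evaluate on the held-out rows.
pred <- predict(fit, newdata = test_set)
print(confusionMatrix(pred, test_set$Species))
```

Swapping `method = "knn"` for another supported model name leaves the rest of the call unchanged, which is the uniform interface the abstract refers to.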
Citations
Journal Article
Accurate ethnicity prediction from placental DNA methylation data.
Victor Yuan, E. Magda Price, Giulia F. Del Gobbo, Sara Mostafavi, B. Cox, Alexandra M. Binder, Karin B. Michels, Carmen J. Marsit, Wendy P. Robinson
TL;DR: An ethnicity classifier using five cohorts with Infinium Human Methylation 450k BeadChip array data from placental samples that is also compatible with the newer EPIC platform, which provides an improved approach to address population stratification in placental DNAme association studies.
Journal Article
Immunometabolic Signatures Predict Risk of Progression to Active Tuberculosis and Disease Outcome.
Fergal J. Duffy, January Weiner, Scott G. Hansen, David L. Tabb, Sara Suliman, Ethan G. Thompson, Jeroen Maertzdorf, Smitha Shankar, Gerard Tromp, Shreemanta K. Parida, Drew Dover, Michael K. Axthelm, Jayne S. Sutherland, Hazel M. Dockrell, Tom H. M. Ottenhoff, Thomas J. Scriba, Louis J. Picker, Gerhard Walzl, Stefan H. E. Kaufmann, Daniel E. Zak
TL;DR: Using cohorts of household contacts of TB index cases and a stringent non-human primate challenge model, this study evaluated whether integrating blood transcriptional profiling with serum metabolomic profiling can provide new understanding of disease processes and improve prediction of TB progression, and found that it can.
Journal Article
High‐resolution mapping of the global silicate weathering carbon sink and its long‐term changes
Chaojun Li, Xiaoyong Bai, Qiu Tan, Guangjie Luo, Luhua Wu, Fei Chen, Hui-Hui Xi, Xuling Luo, Chen Ran, Huan Chen, Sirui Zhang, Min Liu, Suhua Gong, Liangping Xiong, Fengjiao Song, B.-L. Xiao, Chaochao Du
TL;DR: The authors used an improved first-order model with correlated factors, together with nonparametric methods, to produce spatiotemporal data sets (0.25° × 0.75°) of the global silicate weathering carbon-sink flux (SCSFα) under different scenarios (SSPs) for the present (1950-2014) and future (2015-2100) periods, based on the Global River Chemistry Database and CMIP6 data sets.
Journal Article
Ensemble methods of classification for power systems security assessment
TL;DR: Novel decision-tree-based techniques are used to evaluate the reliability of electric power system regimes, using a hybrid approach that combines random forest and boosting models for enhanced decision making.
Journal Article
Data-driven fraud detection in international shipping.
TL;DR: A Bayesian network is developed that predicts the presence of goods on the cargo list of a shipment; the prediction is compared with the shipment's accompanying documentation to determine whether document fraud has been perpetrated.
References
Book
Modern Applied Statistics with S
W. N. Venables, Brian D. Ripley
TL;DR: A guide to using S environments to perform statistical analyses, providing both an introduction to the use of S and a course in modern statistical methods.
Classification and Regression by randomForest
Andy Liaw, Matthew C. Wiener
TL;DR: Random forests are proposed, which add an additional layer of randomness to bagging and are robust against overfitting; the randomForest package provides an R interface to the Fortran programs by Breiman and Cutler.
Proceedings Article
Validity of the single processor approach to achieving large scale computing capabilities
TL;DR: The paper responds to the long-voiced contention that the organization of a single computer has reached its limits and that truly significant advances can be made only by interconnecting a multiplicity of computers so as to permit cooperative solution, and argues for the continued validity of the single-processor approach.