Building Predictive Models in R Using the caret Package
TL;DR: The caret package, short for classification and regression training, provides tools for developing predictive models with the rich set of models available in R, and simplifies model training and tuning across a wide variety of modeling techniques.
Abstract:
The caret package, short for classification and regression training, contains numerous tools for developing predictive models using the rich set of models available in R. The package focuses on simplifying model training and tuning across a wide variety of modeling techniques. It also includes methods for pre-processing training data, calculating variable importance, and model visualizations. An example from computational chemistry is used to illustrate the functionality on a real data set and to benchmark the benefits of parallel processing with several types of models.
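The workflow the abstract describes (pre-processing, tuning, variable importance, visualization) can be sketched in a few lines. This is a minimal illustration, not the paper's example: the built-in iris data, the k-nearest-neighbors model, the 5-fold cross-validation, and the seed are all assumptions chosen for brevity, whereas the paper itself uses a computational chemistry data set.

```r
# Minimal sketch of a caret workflow (assumed setup: iris data, kNN model).
library(caret)

set.seed(1)  # arbitrary seed so the resampling splits are reproducible

# Resampling scheme used to evaluate each candidate tuning value
ctrl <- trainControl(method = "cv", number = 5)

# Center and scale the predictors, then tune k over three candidate values
fit <- train(
  Species ~ ., data = iris,
  method = "knn",
  preProcess = c("center", "scale"),
  tuneLength = 3,
  trControl = ctrl
)

print(fit$bestTune)  # tuning value selected by cross-validation
imp <- varImp(fit)   # variable importance scores for the predictors
print(imp)
plot(fit)            # resampled performance across the tuning grid
```

Swapping the `method` string is, in most cases, all that is needed to fit a different model with the same call, which is the simplification the abstract refers to.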
Citations
Journal Article
Machine Learning for Seed Quality Classification: An Advanced Approach Using Merger Data from FT-NIR Spectroscopy and X-ray Imaging
André Dantas de Medeiros, Laércio Junio da Silva, João Paulo Oliveira Ribeiro, Kamylla Calzolari Ferreira, Jorge Tadeu Fim Rosas, Abraão Almeida Santos, Clíssia Barboza da Silva
TL;DR: The models developed using both NIR spectra and X-ray imaging data in machine learning algorithms are efficient in quickly, non-destructively, and accurately identifying the capacity of seed to germinate.
Journal Article
Isoelectric point optimization using peptide descriptors and support vector machines.
Yasset Perez-Riverol, Enrique Audain, Aleli Millan, Yassel Ramos, Aniel Sanchez, Juan Antonio Vizcaíno, Rui Wang, Markus Müller, Yoan Machado, Lazaro Betancourt, Luis Javier González, Gabriel Padrón, Vladimir Besada
TL;DR: This manuscript presents a new approach that can significantly improve pI estimation by using Support Vector Machines (SVM), an experimental amino acid descriptor taken from the AAIndex database, and the isoelectric point predicted by the charge-state model.
Journal Article
Assessing models for prediction of some soil chemical properties from portable X-ray fluorescence (pXRF) spectrometry data in Brazilian Coastal Plains
Renata Andrade, Sérgio Henrique Godinho Silva, David C. Weindorf, Somsubhra Chakraborty, Wilson Missina Faria, Luiz Felipe Mesquita, Luiz Roberto Guimarães Guilherme, Nilton Curi
TL;DR: In this paper, the authors used portable X-ray fluorescence (pXRF) spectrometry to characterize the Brazilian Coastal Plains (BCP) soils and assess four machine learning algorithms [ordinary least squares regression (OLS), cubist regression (CR), XGBoost (XGB), and random forest (RF)] for prediction of total nitrogen (TN), cation exchange capacity (CEC), and soil organic matter (SOM) using pXRF data.
Journal Article
Dual RNA-seq of Orientia tsutsugamushi informs on host-pathogen interactions for this neglected intracellular human pathogen.
Bozena Mika-Gospodorz, Suparat Giengkam, Alexander J. Westermann, Jantana Wongsantichon, Willow Kion-Crosby, Suthida Chuenklin, Loo Chien Wang, Piyanate Sunyakumthorn, Radoslaw M. Sobota, Selvakumar Subbian, Jörg Vogel, Lars Barquist, Jeanne Salje
TL;DR: The authors show that dual RNA-seq, profiling the host and pathogen transcriptomes simultaneously, helps uncover the biology of Orientia tsutsugamushi, a major cause of febrile illness in South-East Asia, and its interaction with the host.
Journal Article
Inferring Roll-Call Scores from Campaign Contributions Using Supervised Machine Learning
TL;DR: The authors developed a generalized supervised learning methodology for inferring roll call scores for incumbent and non-incumbent candidates from campaign contribution data, which is shown to significantly outperform alternative measures of ideology in predicting legislative voting behavior.
References
Book
Modern Applied Statistics with S
W. N. Venables, Brian D. Ripley
TL;DR: A guide to using S environments to perform statistical analyses, providing both an introduction to the use of S and a course in modern statistical methods.
Classification and Regression by randomForest
Andy Liaw, Matthew C. Wiener
TL;DR: Random forests are proposed, which add an additional layer of randomness to bagging and are robust against overfitting; the randomForest package provides an R interface to the Fortran programs by Breiman and Cutler.
Proceedings Article
Validity of the single processor approach to achieving large scale computing capabilities
TL;DR: In this paper, the authors argue that the organization of a single computer has reached its limits and that truly significant advances can be made only by interconnection of a multiplicity of computers in such a manner as to permit cooperative solution.