Open Access Journal Article

Building Predictive Models in R Using the caret Package

Max Kuhn
10 Nov 2008, Vol. 28, Iss. 5, pp. 1-26
TLDR
The caret package, short for classification and regression training, provides tools that simplify training and tuning predictive models across the wide variety of modeling techniques available in R.
Abstract
The caret package, short for classification and regression training, contains numerous tools for developing predictive models using the rich set of models available in R. The package focuses on simplifying model training and tuning across a wide variety of modeling techniques. It also includes methods for pre-processing training data, calculating variable importance, and model visualizations. An example from computational chemistry is used to illustrate the functionality on a real data set and to benchmark the benefits of parallel processing with several types of models.

Citations
Journal Article

Machine Learning for Seed Quality Classification: An Advanced Approach Using Merger Data from FT-NIR Spectroscopy and X-ray Imaging

TL;DR: Machine learning models built on both NIR spectra and X-ray imaging data identify the capacity of seeds to germinate quickly, non-destructively, and accurately.
Journal Article

Isoelectric point optimization using peptide descriptors and support vector machines.

TL;DR: This manuscript presents a new approach that can significantly improve pI estimation by using Support Vector Machines (SVM), an experimental amino acid descriptor taken from the AAIndex database, and the isoelectric point predicted by the charge-state model.
Journal Article

Assessing models for prediction of some soil chemical properties from portable X-ray fluorescence (pXRF) spectrometry data in Brazilian Coastal Plains

TL;DR: In this paper, the authors used portable X-ray fluorescence (pXRF) spectrometry to characterize the Brazilian Coastal Plains (BCP) soils and assess four machine learning algorithms [ordinary least squares regression (OLS), cubist regression (CR), XGBoost (XGB), and random forest (RF)] for prediction of total nitrogen (TN), cation exchange capacity (CEC), and soil organic matter (SOM) using pXRF data.
Journal Article

Inferring Roll-Call Scores from Campaign Contributions Using Supervised Machine Learning

TL;DR: The authors developed a generalized supervised learning methodology for inferring roll call scores for incumbent and non-incumbent candidates from campaign contribution data, which is shown to significantly outperform alternative measures of ideology in predicting legislative voting behavior.
References
Book

Modern Applied Statistics with S

TL;DR: A guide to using S environments to perform statistical analyses providing both an introduction to the use of S and a course in modern statistical methods.

Classification and Regression by randomForest

TL;DR: Random forests add an additional layer of randomness to bagging and are robust against overfitting; the randomForest package provides an R interface to the Fortran programs by Breiman and Cutler.

Proceedings Article

Validity of the single processor approach to achieving large scale computing capabilities

TL;DR: The paper argues for the continued validity of the single-processor approach, countering the contention that the organization of a single computer has reached its limits and that truly significant advances can be made only by interconnecting a multiplicity of computers in such a manner as to permit cooperative solution.