Building Predictive Models in R Using the caret Package

Max Kuhn
10 Nov 2008, Vol. 28, Iss. 5, pp. 1-26
Abstract
The caret package, short for classification and regression training, contains numerous tools for developing predictive models using the rich set of models available in R. The package focuses on simplifying model training and tuning across a wide variety of modeling techniques. It also includes methods for pre-processing training data, calculating variable importance, and model visualizations. An example from computational chemistry is used to illustrate the functionality on a real data set and to benchmark the benefits of parallel processing with several types of models.
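
As a rough illustration of the workflow the abstract describes (pre-processing, resampling-based tuning, variable importance, and visualization), the following is a minimal sketch rather than an example taken from the article. It assumes a current release of caret is installed, uses R's built-in `iris` data instead of the computational chemistry data set, and picks k-nearest neighbors, 5-fold cross-validation, and a five-value tuning grid purely for illustration.

```r
library(caret)

set.seed(1)  # make the data split and resampling reproducible

# Hold out 20% of the rows, stratified by class (illustrative choice)
in_train <- createDataPartition(iris$Species, p = 0.8, list = FALSE)
training <- iris[in_train, ]
testing  <- iris[-in_train, ]

# Resampling scheme used to compare candidate tuning values
ctrl <- trainControl(method = "cv", number = 5)

# Center and scale the predictors, then tune k over 5 candidate values
fit <- train(Species ~ ., data = training,
             method     = "knn",
             preProcess = c("center", "scale"),
             tuneLength = 5,
             trControl  = ctrl)

print(fit$bestTune)   # tuning value selected by cross-validation
print(varImp(fit))    # variable importance scores
plot(fit)             # resampled accuracy versus number of neighbors

# Predict the held-out rows and report simple accuracy
pred <- predict(fit, newdata = testing)
print(mean(pred == testing$Species))
```

Changing the `method` string is generally enough to fit a different model type through the same interface, provided the package that implements that model is installed.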
