Evaluating multiple classifiers for stock price direction prediction

Abstract
We predict long-term stock price direction. We benchmark three ensemble methods against four single classifiers. We use five times twofold cross-validation and AUC as a performance measure. Random Forest is the top algorithm. This study is the first to make such an extensive benchmark in this domain.

Stock price direction prediction is an important issue in the financial world. Even small improvements in predictive performance can be very profitable. The purpose of this paper is to benchmark ensemble methods (Random Forest, AdaBoost and Kernel Factory) against single classifier models (Neural Networks, Logistic Regression, Support Vector Machines and K-Nearest Neighbors). We gathered data from 5767 publicly listed European companies and used the area under the receiver operating characteristic curve (AUC) as a performance measure. Our predictions are one year ahead. The results indicate that Random Forest is the top algorithm, followed by Support Vector Machines, Kernel Factory, AdaBoost, Neural Networks, K-Nearest Neighbors and Logistic Regression. This study contributes to the literature in that it is, to the best of our knowledge, the first to make such an extensive benchmark. The results clearly suggest that new studies in the domain of stock price direction prediction should include ensembles in their sets of algorithms. Our extensive literature review indicates that this is currently not the case.
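The evaluation protocol named in the abstract (five times twofold cross-validation scored by AUC) can be illustrated with a minimal, dependency-free sketch. This is not the authors' implementation: the synthetic two-class data, the nearest-centroid scorer standing in for a real classifier, the unstratified random halving, and the use of the mean to summarize folds are all assumptions made for illustration, and every function name below is hypothetical.

```python
import random
from statistics import mean


def auc(labels, scores):
    """AUC as the chance that a random positive outranks a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def five_by_two_folds(n, repeats=5, seed=0):
    """Yield (train, test) index lists: `repeats` random halvings,
    with each half used once for training and once for testing."""
    rng = random.Random(seed)
    for _ in range(repeats):
        idx = list(range(n))
        rng.shuffle(idx)
        first, second = idx[: n // 2], idx[n // 2:]
        yield first, second
        yield second, first


def centroid_scorer(X_train, y_train):
    """Toy stand-in for a classifier (assumption, not from the paper).

    Returns a function whose output grows as a point moves closer to the
    class-1 centroid and away from the class-0 centroid.
    """
    def centroid(label):
        rows = [x for x, y in zip(X_train, y_train) if y == label]
        return [mean(col) for col in zip(*rows)]

    def dist2(x, c):
        return sum((xi - ci) ** 2 for xi, ci in zip(x, c))

    c0, c1 = centroid(0), centroid(1)
    return lambda x: dist2(x, c0) - dist2(x, c1)


# Synthetic data (assumption): class 1 is shifted by +1 in each of 3 features.
data_rng = random.Random(42)
y = [i % 2 for i in range(200)]
X = [[data_rng.gauss(float(label), 1.0) for _ in range(3)] for label in y]

# One AUC per fold: 5 repeats x 2 folds = 10 values.
fold_aucs = []
for train, test in five_by_two_folds(len(y)):
    score = centroid_scorer([X[i] for i in train], [y[i] for i in train])
    fold_aucs.append(auc([y[i] for i in test], [score(X[i]) for i in test]))

print(f"folds: {len(fold_aucs)}, mean AUC: {mean(fold_aucs):.3f}")
```

To compare several algorithms in this setup, each one would be run over the same ten train/test splits and ranked by its summarized AUC. Reusing identical splits keeps the comparison paired.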
