Stock Prediction Using Machine Learning Algorithms
01 Jan 2019, pp. 405–414
TL;DR: In this paper, the authors take factors that influence the stock trend, such as commodity prices (crude oil, gold, silver), market history, and the foreign exchange (FEX) rate, as input attributes for various machine learning models to predict the behavior of the Bombay Stock Exchange (BSE).
Abstract: Market systems are so complex that they overwhelm any individual's ability to predict them, yet predicting stock market prices is crucial for investors seeking notable profit. The ultimate aim of this project is to predict the behavior of the Bombay Stock Exchange (BSE). We take into account factors that influence the stock trend, such as commodity prices (crude oil, gold, silver), market history, and the foreign exchange (FEX) rate, as input attributes for various machine learning models. The performances of the models are then compared against other benchmarks, and a structured relationship is determined among the different attributes used. The gold price attribute was found to have the highest positive correlation with market performance, and the AdaBoost algorithm performed best compared with the other techniques.
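As a hedged illustration of the setup this abstract describes, the sketch below trains scikit-learn's AdaBoostClassifier on synthetic stand-ins for the paper's attributes (crude oil, gold, silver, FEX rate, market history); the data, column meanings, and target construction are illustrative assumptions, not the authors' actual BSE dataset:

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 500
# Synthetic daily attributes standing in for the paper's inputs:
# crude oil, gold, silver, FEX rate, market history (all assumed).
X = rng.normal(size=(n, 5))
# Toy next-day direction, deliberately driven by the "gold" column.
y = (X[:, 1] + 0.3 * rng.normal(size=n) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(f"AdaBoost test accuracy: {acc:.2f}")
```

Tying the toy target to the "gold" column echoes, in miniature, the abstract's finding that the gold price correlated most strongly with market performance.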
TL;DR: An extensive comparative analysis of ensemble techniques such as boosting, bagging, blending and super learners (stacking) suggests that innovative studies in the domain of stock market direction prediction ought to include ensemble techniques in their sets of algorithms.
Abstract: Stock-market prediction using machine-learning techniques aims at developing effective and efficient models that can provide a better and higher rate of prediction accuracy. Numerous ensemble regressors and classifiers have been applied in stock market predictions, using different combination techniques. However, three critical issues arise when constructing ensemble classifiers and regressors. The first concerns the choice of base regressor or classifier technique; the second, the combination technique used to assemble multiple regressors or classifiers; and the third, the number of regressors or classifiers to be ensembled. Moreover, the number of relevant studies scrutinising these concerns is limited. In this study, we performed an extensive comparative analysis of ensemble techniques such as boosting, bagging, blending and super learners (stacking). Using Decision Trees (DT), Support Vector Machines (SVM) and Neural Networks (NN), we constructed twenty-five (25) different ensembled regressors and classifiers. We compared their execution times, accuracy, and error metrics over stock data from the Ghana Stock Exchange (GSE), Johannesburg Stock Exchange (JSE), Bombay Stock Exchange (BSE-SENSEX) and New York Stock Exchange (NYSE), from January 2012 to December 2018. The study outcome shows that the stacking and blending ensemble techniques offer higher prediction accuracies (90–100% and 85.7–100%, respectively) compared with bagging (53–97.78%) and boosting (52.7–96.32%). Furthermore, the root mean square error (RMSE) recorded by stacking (0.0001–0.001) and blending (0.002–0.01) shows a better fit of ensemble classifiers and regressors based on these two techniques in market analyses compared with bagging (0.01–0.11) and boosting (0.01–0.443). Finally, the results strongly suggest that innovative studies in the domain of stock market direction prediction ought to include ensemble techniques in their sets of algorithms.
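A minimal sketch of a stacking ensemble along the lines this abstract describes, using the same three base learners (DT, SVM, NN) with scikit-learn's StackingClassifier; the synthetic data and the logistic-regression meta-learner are assumptions for illustration, not the study's configuration:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary "direction" data in place of the GSE/JSE/BSE/NYSE series.
X, y = make_classification(n_samples=400, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Stacking: DT, SVM and NN base learners; a logistic-regression
# meta-learner combines their out-of-fold predictions.
stack = StackingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),
        ("nn", MLPClassifier(max_iter=2000, random_state=0)),
    ],
    final_estimator=LogisticRegression(),
)
acc = stack.fit(X_tr, y_tr).score(X_te, y_te)
print(f"Stacking test accuracy: {acc:.2f}")
```

Blending differs mainly in using a single held-out split, rather than cross-validated predictions, to train the meta-learner.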
17 Oct 2019
TL;DR: This work attempts to provide an extensive and objective walkthrough of the applicability of machine learning algorithms to financial or stock market prediction.
Abstract: Finance is one of the pioneering industries that started using Machine Learning (ML), a subset of Artificial Intelligence (AI), for market prediction in the early 80s. Since then, major firms and hedge funds have adopted machine learning for stock prediction, portfolio optimization, credit lending, stock betting, etc. In this paper, we survey the different approaches of machine learning that can be incorporated in applied finance. The major motivation behind ML is to draw out specifics from the available data from different sources and to forecast from them. Different machine learning algorithms have their own abilities for prediction and depend heavily on the number and quality of the parameters used as input features. This work attempts to provide an extensive and objective walkthrough of the applicability of machine learning algorithms to financial or stock market prediction.
TL;DR: In this paper, the authors propose an N-Period Min-Max (NPMM) labeling that labels data only at definite time points, to help overcome sensitivity to small price changes.
Abstract:
• This study discusses the importance of data labeling for the development of trading systems in the stock market.
• N-Period Min-Max (NPMM) labeling resolves the drawbacks of conventional labeling methods.
• NPMM labeling identifies the stock price trend for period N and labels its minimum and maximum values.
• An empirical analysis is conducted by applying the proposed method to the Nasdaq stock market.
Many researchers attempt to accurately predict stock price trends using technologies such as machine learning and deep learning to achieve high returns in the stock market. However, it is difficult to predict the exact trend, since stock prices are nonlinear and often appear random. To improve accuracy, modelers usually focus on improving the performance of the prediction model; however, examining the data used to train the model is just as imperative. Most studies of stock price trend prediction use an up-down labeling that labels data at all time points. The drawback of this labeling method is that it is sensitive to small price changes, causing inefficient model training. Therefore, this study proposes an N-Period Min-Max (NPMM) labeling that labels data only at definite time points, helping to overcome small-price-change sensitivity. The study also develops a trading system using XGBoost to automate trading and verify the proposed labeling method. The trading system is evaluated through an empirical analysis of 92 companies listed on the NASDAQ, and the trading performance of the proposed labeling method is compared against other prominent labeling methods. NPMM labeling was found to be an efficient labeling method for stock price trend prediction, in addition to generating trading outperformance compared to other labeling methods.
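A minimal sketch of the labeling idea described above, under a simplified reading of NPMM (label only the minimum and maximum of each N-period window and leave other points unlabeled); the function name and the −1/0/+1 encoding are illustrative, not the authors' code:

```python
import numpy as np

def npmm_labels(prices, n):
    """Label only per-window extremes: -1 at the window minimum
    (buy candidate), +1 at the maximum (sell candidate), 0 elsewhere.
    A simplified reading of NPMM labeling, not the authors' implementation."""
    prices = np.asarray(prices, dtype=float)
    labels = np.zeros(len(prices), dtype=int)
    for start in range(0, len(prices), n):
        window = prices[start:start + n]
        labels[start + np.argmin(window)] = -1
        labels[start + np.argmax(window)] = +1
    return labels

prices = [10, 12, 9, 11, 15, 14, 13, 16]
print(npmm_labels(prices, 4))  # -> [ 0  1 -1  0  0  0 -1  1]
```

Unlike up-down labeling, most time points stay unlabeled (0), so a model trained on the ±1 points is not whipsawed by small day-to-day fluctuations.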
TL;DR: This paper transforms the daily stock price time series object into a data frame format where the dependent variable is the stock trend label and the independent variables are the stock variations of the last few days and proposes a new method for stock selection and a new stock trading strategy.
Abstract: In this paper, an application of the Bayesian classifier for short-term stock trend prediction is presented. In order to use Bayesian classifier effectively, we transform the daily stock price time series object into a data frame format where the dependent variable is the stock trend label and the independent variables are the stock variations of the last few days. Based on the posterior probability density function, we propose a new method for stock selection and then propose a new stock trading strategy. The numerical examples demonstrate the potential of the proposed strategy for application to short-term stock trading.
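A minimal sketch of the data-frame transformation and Bayesian classifier this abstract describes, using scikit-learn's GaussianNB on a synthetic random-walk price series; the lag count, helper name, and data are illustrative assumptions:

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB

def to_frame(prices, lags=3):
    """Lay out daily variations as rows: X holds the last `lags`
    variations, y says whether the next variation is positive.
    A simplified version of the paper's transformation (lag count assumed)."""
    r = np.diff(np.asarray(prices, dtype=float))
    X = np.array([r[i:i + lags] for i in range(len(r) - lags)])
    y = (r[lags:] > 0).astype(int)
    return X, y

rng = np.random.default_rng(1)
prices = 100 + np.cumsum(rng.normal(size=300))  # synthetic random-walk prices
X, y = to_frame(prices)
clf = GaussianNB().fit(X, y)
# Posterior class probabilities: the quantity the paper's selection rule uses.
proba = clf.predict_proba(X[:1])
```

The posterior probabilities, rather than hard class labels, are what a selection rule of this kind would rank stocks by.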
04 Jul 2021
TL;DR: This literature review identifies and analyzes research topic trends, types of data sets, learning algorithms, method improvements, and frameworks used in stock exchange prediction, and surveys techniques proposed to improve prediction accuracy by combining several methods.
Abstract: This literature review identifies and analyzes research topic trends, types of data sets, learning algorithms, method improvements, and frameworks used in stock exchange prediction. A total of 81 studies published on stock prediction between January 2015 and June 2020 were investigated, taking inclusion and exclusion criteria into account. The literature review methodology is carried out in three major phases: review planning, implementation, and report preparation, in nine steps from defining systematic review requirements to presentation of results. Estimation or regression, clustering, association, classification, and preprocessing analysis of data sets are the five main focuses revealed in the study of stock prediction research. Among the related studies, the estimation method accounts for 56.79%, classification for 35.80%, data analytics for 4.94%, and clustering and association for the remaining 1.23%. Furthermore, 74.07% use technical indicator data sets; the rest use combinations of data sets. To develop stock prediction models, 48 different methods have been applied, of which the 9 most widely applied were identified. The best methods in terms of accuracy and small error rates include SVM, DNN, CNN, RNN, LSTM, bagging ensembles such as RF, boosting ensembles such as XGBoost, ensemble majority voting, and the meta-learner approach of ensemble stacking. Several techniques are proposed to improve prediction accuracy by combining several methods, using boosting algorithms, adding feature selection, and using parameter and hyper-parameter optimization.
TL;DR: High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated and the performance of the support- vector network is compared to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
Abstract: The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensure high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. Here we extend this result to non-separable training data. High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
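As a hedged modern echo of the paper's polynomial-transformation experiment, the sketch below fits scikit-learn's SVC with a polynomial kernel on its small bundled digits set; this stands in for, and is not, the original OCR benchmark:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Digit images play the role of the OCR benchmark; a degree-3 polynomial
# kernel realizes the "polynomial input transformation" implicitly,
# without ever constructing the high-dimensional feature space.
X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = SVC(kernel="poly", degree=3, gamma="scale").fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(f"Polynomial-kernel SVC accuracy: {acc:.3f}")
```

The kernel trick is what makes the "very high-dimension feature space" tractable: only inner products between input vectors are ever computed.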
TL;DR: This paper investigates the predictability of financial movement direction with SVM by forecasting the weekly movement direction of NIKKEI 225 index and proposes a combining model by integrating SVM with the other classification methods.
01 Sep 2012
TL;DR: The classification results show that Random Forest gives better results for the same number of attributes on large data sets (i.e. those with a greater number of instances), while J48 is handy with small data sets (fewer instances).
Abstract: In this paper, we have compared the classification results of two models, Random Forest and J48, for classifying twenty versatile datasets. We took 20 data sets from the UCI repository, containing from 148 to 20,000 instances, and compared the classification results obtained from the two methods, Random Forest and Decision Tree (J48). The classification parameters consist of correctly classified instances, incorrectly classified instances, F-Measure, Precision, Accuracy and Recall. We discussed the pros and cons of using these models for large and small data sets. The classification results show that Random Forest gives better results for the same number of attributes on large data sets (i.e. those with a greater number of instances), while J48 is handy with small data sets (fewer instances). The results from the breast cancer data set show that when the number of instances increased from 286 to 699, the percentage of correctly classified instances for Random Forest increased from 69.23% to 96.13%; i.e., for a data set with the same number of attributes but more instances, Random Forest accuracy increased.
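A rough present-day analogue of this comparison, with scikit-learn's DecisionTreeClassifier standing in for Weka's J48 (a C4.5 implementation) and the bundled breast cancer set (569 instances) echoing the paper's data set; these are cross-validated accuracies under assumed defaults, not the paper's exact protocol:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)  # 569 instances, 30 attributes
# 5-fold cross-validated accuracy for each model.
rf_acc = cross_val_score(RandomForestClassifier(random_state=0), X, y, cv=5).mean()
dt_acc = cross_val_score(DecisionTreeClassifier(random_state=0), X, y, cv=5).mean()
print(f"Random Forest: {rf_acc:.3f}  Decision Tree: {dt_acc:.3f}")
```

On a data set of this size the forest's variance reduction from averaging many trees typically shows up as a few points of accuracy over a single tree, consistent with the paper's large-data finding.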
01 Aug 2016
TL;DR: The results suggest that the performance of the KSE-100 index can be predicted with machine learning techniques.
Abstract: The main objective of this research is to predict the market performance of the Karachi Stock Exchange (KSE) at day closing using different machine learning techniques. The prediction model takes different attributes as input and predicts the market as Positive or Negative. The attributes used in the model include oil rates, gold and silver rates, the interest rate, the foreign exchange (FEX) rate, and NEWS and social media feeds. Older statistical techniques, including the Simple Moving Average (SMA) and Autoregressive Integrated Moving Average (ARIMA), are also used as input. The machine learning techniques compared include the Single Layer Perceptron (SLP), Multi-Layer Perceptron (MLP), Radial Basis Function (RBF) and Support Vector Machine (SVM). Each of these attributes is also studied separately. The MLP algorithm performed best compared with the other techniques, and the oil rate attribute was found to be most relevant to market performance. The results suggest that the performance of the KSE-100 index can be predicted with machine learning techniques.
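A hedged sketch of the comparison this abstract describes, limited to MLP and SVM on synthetic stand-ins for the oil, gold/silver, interest and FEX attributes, with a binary positive/negative market label; the data and the feature weights in the toy label are illustrative assumptions, not the KSE study's inputs:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(2)
n = 600
# Synthetic stand-ins for oil, gold, silver, interest and FEX attributes.
X = rng.normal(size=(n, 5))
# Toy positive/negative market label (feature weights are assumptions).
y = (X[:, 0] - 0.5 * X[:, 3] + 0.3 * rng.normal(size=n) > 0).astype(int)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Standardize features first; both models are scale-sensitive.
models = {
    "MLP": make_pipeline(StandardScaler(),
                         MLPClassifier(max_iter=2000, random_state=0)),
    "SVM": make_pipeline(StandardScaler(), SVC(random_state=0)),
}
scores = {name: m.fit(X_tr, y_tr).score(X_te, y_te)
          for name, m in models.items()}
print(scores)
```

The study's SLP and RBF models could be added to the same dictionary to reproduce the full four-way comparison.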