
Showing papers in "arXiv: Statistical Finance in 2018"


Journal Article (DOI)
TL;DR: In this article, the authors survey the cumulating evidence for the presence of multifractality in financial time series in different markets and at different time periods and discuss the sources of multifractality.
Abstract: Multifractality is ubiquitously observed in complex natural and socioeconomic systems. Multifractal analysis provides powerful tools to understand the complex nonlinear nature of time series in diverse fields. Inspired by its striking analogy with hydrodynamic turbulence, from which the idea of multifractality originated, multifractal analysis of financial markets has bloomed, forming one of the main directions of econophysics. We review the multifractal analysis methods and multifractal models adopted in or invented for financial time series and their subtle properties, which are applicable to time series in other disciplines. We survey the cumulating evidence for the presence of multifractality in financial time series in different markets and at different time periods and discuss the sources of multifractality. The usefulness of multifractal analysis in quantifying market inefficiency, in supporting risk management and in developing other applications is presented. We finally discuss open problems and further directions of multifractal analysis.

154 citations


Posted Content
TL;DR: In this article, the behavior of conditional correlations among main cryptocurrencies, stock and bond indices, and gold, using a generalized DCC class model, is explored, and it is shown that correlations among cryptocurrencies are positive, albeit varying across time; correlations with Monero are more stable across time.
Abstract: This letter explores the behavior of conditional correlations among main cryptocurrencies, stock and bond indices, and gold, using a generalized DCC class model. From a portfolio management point of view, asset correlation is a key metric in order to construct efficient portfolios. We find that: (i) correlations among cryptocurrencies are positive, albeit varying across time; (ii) correlations with Monero are more stable across time; (iii) correlations between cryptocurrencies and traditional financial assets are negligible.

73 citations


Journal Article (DOI)
TL;DR: In this article, the fluctuation properties of the rapidly emerging Bitcoin market are assessed over chosen sub-periods, in terms of return distributions, volatility autocorrelation, Hurst exponents and multiscaling effects.
Abstract: Based on 1-minute price changes recorded since 2012, the fluctuation properties of the rapidly emerging Bitcoin (BTC) market are assessed over chosen sub-periods, in terms of return distributions, volatility autocorrelation, Hurst exponents and multiscaling effects. The findings are compared to the stylized facts of mature world markets. While early trading was affected by system-specific irregularities, it is found that over the months preceding April 2018 all these statistical indicators approach the features hallmarking maturity. This can be taken as an indication that the Bitcoin market, and possibly other cryptocurrencies, carries concrete potential of imminently becoming a regular market, alternative to the foreign exchange (Forex). Because high-frequency price data have been available since the beginning of trading, Bitcoin offers a unique window into the statistical characteristics of a market maturation trajectory.

65 citations


Posted Content
TL;DR: In this article, the authors used deep learning to predict one-month-ahead stock returns in the cross-section in the Japanese stock market and investigated the performance of the method, showing that deep neural networks generally outperform shallow neural networks and the best networks also outperform representative machine learning models.
Abstract: Many studies have been undertaken by using machine learning techniques, including neural networks, to predict stock returns. Recently, a method known as deep learning, which achieves high performance mainly in image recognition and speech recognition, has attracted attention in the machine learning field. This paper implements deep learning to predict one-month-ahead stock returns in the cross-section in the Japanese stock market and investigates the performance of the method. Our results show that deep neural networks generally outperform shallow neural networks, and the best networks also outperform representative machine learning models. These results indicate that deep learning shows promise as a skillful machine learning method to predict stock returns in the cross-section.

61 citations


Posted Content
TL;DR: This work establishes the existence and uniqueness of state-dependent Hawkes processes, develops maximum likelihood estimation methodology for parametric specifications of the process, and finds that excitation effects in the order flow are strongly state-dependent.
Abstract: We study statistical aspects of state-dependent Hawkes processes, which are an extension of Hawkes processes where a self- and cross-exciting counting process and a state process are fully coupled, interacting with each other. The excitation kernel of the counting process depends on the state process that, reciprocally, switches state when there is an event in the counting process. We first establish the existence and uniqueness of state-dependent Hawkes processes and explain how they can be simulated. Then we develop maximum likelihood estimation methodology for parametric specifications of the process. We apply state-dependent Hawkes processes to high-frequency limit order book data, allowing us to build a novel model that captures the feedback loop between the order flow and the shape of the limit order book. We estimate two specifications of the model, using the bid-ask spread and the queue imbalance as state variables, and find that excitation effects in the order flow are strongly state-dependent. Additionally, we find that the endogeneity of the order flow, measured by the magnitude of excitation, is also state-dependent, being more pronounced in disequilibrium states of the limit order book.

53 citations


Posted Content
TL;DR: In this paper, a large-scale deep learning approach applied to a high-frequency database containing billions of electronic market quotes and transactions for US equities was used to uncover nonparametric evidence for the existence of a universal and stationary price formation mechanism relating the dynamics of supply and demand for a stock, as revealed through the order book, to subsequent variations in its market price.
Abstract: Using a large-scale Deep Learning approach applied to a high-frequency database containing billions of electronic market quotes and transactions for US equities, we uncover nonparametric evidence for the existence of a universal and stationary price formation mechanism relating the dynamics of supply and demand for a stock, as revealed through the order book, to subsequent variations in its market price. We assess the model by testing its out-of-sample predictions for the direction of price moves given the history of price and order flow, across a wide range of stocks and time periods. The universal price formation model is shown to exhibit a remarkably stable out-of-sample prediction accuracy across time, for a wide range of stocks from different sectors. Interestingly, these results also hold for stocks which are not part of the training sample, showing that the relations captured by the model are universal and not asset-specific. The universal model, trained on data from all stocks, outperforms, in terms of out-of-sample prediction accuracy, asset-specific linear and nonlinear models trained on time series of any given stock, showing that the universal nature of price formation weighs in favour of pooling together financial data from various stocks, rather than designing asset- or sector-specific models as commonly done. Standard data normalizations based on volatility, price level or average spread, or partitioning the training data into sectors or categories such as large/small tick stocks, do not improve training results. On the other hand, inclusion of price and order flow history over many past observations is shown to improve forecasting performance, showing evidence of path-dependence in price dynamics.

51 citations


Journal Article (DOI)
TL;DR: The present study shows that the long-term records of the S&P500 and NASDAQ develop the multifractal features, but these features evolve through a variety of shapes, most often strongly asymmetric, whose changes typically are correlated with the historically most significant events experienced by the world economy.
Abstract: The concept of multifractality offers a powerful formal tool to filter out a multitude of the most relevant characteristics of complex time series. The related studies thus far presented in the scientific literature typically limit themselves to evaluating whether or not a time series is multifractal, and the width of the resulting singularity spectrum is considered a measure of the degree of complexity involved. However, the character of the complexity of time series generated by natural processes usually appears much more intricate than such a bare statement can reflect. As an example, based on the long-term records of the S&P500 and NASDAQ, the two world-leading stock market indices, the present study shows that they indeed develop multifractal features, but these features evolve through a variety of shapes, most often strongly asymmetric, whose changes typically are correlated with the historically most significant events experienced by the world economy. Relating, at the same time, the multifractal singularity spectra of an index to those of its component stocks reflects the varying degree of correlations among the stocks.

32 citations


Posted Content
TL;DR: The Wang–Mendel method is used to design a deep convolutional fuzzy system based on input–output data pairs, and the models are applied to predict a synthetic chaotic plus random time-series and the real Hang Seng Index of the Hong Kong stock market.
Abstract: A deep convolutional fuzzy system (DCFS) on a high-dimensional input space is a multi-layer connection of many low-dimensional fuzzy systems, where the input variables to the low-dimensional fuzzy systems are selected through a moving window across the input spaces of the layers. To design the DCFS based on input-output data pairs, we propose a bottom-up layer-by-layer scheme. Specifically, by viewing each of the first-layer fuzzy systems as a weak estimator of the output based only on a very small portion of the input variables, we design these fuzzy systems using the WM Method. After the first-layer fuzzy systems are designed, we pass the data through the first layer to form a new data set and design the second-layer fuzzy systems based on this new data set in the same way as designing the first-layer fuzzy systems. Repeating this process layer-by-layer we design the whole DCFS. We also propose a DCFS with parameter sharing to save memory and computation. We apply the DCFS models to predict a synthetic chaotic plus random time-series and the real Hang Seng Index of the Hong Kong stock market.

28 citations


Posted Content
TL;DR: In this article, it was shown that the macroscopic price is diffusive with rough volatility, with a one-to-one correspondence between the exponent of the impact function and the Hurst parameter of the volatility.
Abstract: Market impact is the link between the volume of a (large) order and the price move during and after the execution of this order. We show that under a no-arbitrage assumption, the market impact function can only be of power-law type. Furthermore, we prove that this implies that the macroscopic price is diffusive with rough volatility, with a one-to-one correspondence between the exponent of the impact function and the Hurst parameter of the volatility. Hence we simply explain the universal rough behavior of the volatility as a consequence of the no-arbitrage property. From a mathematical viewpoint, our study relies in particular on new results about hyper-rough stochastic Volterra equations.

25 citations


Journal Article (DOI)
TL;DR: In this article, the human capital component was introduced to the Fama and French five-factor model proposing an equilibrium six-factor asset pricing model, which employs an aggregate of four sets of portfolios mimicking size and industry with varying dimensions.
Abstract: The present study introduces the human capital component to the Fama and French five-factor model, proposing an equilibrium six-factor asset pricing model. The study employs an aggregate of four sets of portfolios mimicking size and industry with varying dimensions. The first set consists of three sets of six portfolios, each sorted on size to B/M, size to investment, and size to momentum. The second set comprises five index portfolios; the third, four sets of twenty-five portfolios, each sorted on size to B/M, size to investment, size to profitability, and size to momentum; and the final set constitutes thirty industry portfolios. To estimate the parameters of the six-factor asset pricing model for the four sets of variant portfolios, we use OLS and a generalized method of moments based robust instrumental variables technique (IVGMM). The results obtained from the relevance, endogeneity, overidentifying restrictions, and Hausman specification tests indicate that the parameter estimates of the six-factor model using IVGMM are robust and perform better than the OLS approach. The human capital component shares predictive power equally with the other factors in the framework in explaining the variations in return on portfolios. Furthermore, we assess the t-ratio of the human capital component of each IVGMM estimate of the six-factor asset pricing model for the four sets of variant portfolios. The t-ratios of the human capital component of the eighty-three IVGMM estimates are more than 3.00 with reference to the standard proposed by Harvey et al. (2016). This indicates the empirical success of the six-factor asset pricing model in explaining the variation in asset returns.

23 citations


Journal Article (DOI)
TL;DR: In this paper, the authors analyzed the connection between innovation activities of companies and their performance measured at time of crisis and found that the behavior of the performance of the companies is not univocal when they innovate.
Abstract: This paper analyzes the connection between the innovation activities of companies, implemented before the crisis, and their performance, measured at the time of the crisis. The companies listed in the STAR Market Segment of the Italian Stock Exchange are analyzed. Innovation is measured through the level of investments in total tangible and intangible fixed assets in 2006-2007, while performance is captured through growth (expressed by variations of sales, total assets and employees), profitability (through ROI or ROS) and productivity (through asset turnover or sales per employee) in the period 2008-2010. The variables of interest are analyzed and compared through statistical techniques and by adopting cluster analysis. In particular, a Voronoi tessellation is also implemented in a varying centroids framework. In accord with a large part of the literature, we find that the behavior of the performance of the companies is not univocal when they innovate.

Posted Content
TL;DR: In this article, a behavioral finance perspective was used to find the parallelism between biases present in financial markets that could be applied to cryptomarkets, and it is suggested that cryptocurrencies' prices are driven by herding, hence they test herding behavior under asymmetric and symmetric conditions and the existence of different herding regimes by employing the Markov-Switching approach.
Abstract: There are no solid arguments to sustain that digital currencies are the future of online payments or the disruptive technology that some of its former participants declared when used to face critiques. This paper aims to solve the cryptocurrency puzzle from a behavioral finance perspective by finding the parallelism between biases present in financial markets that could be applied to cryptomarkets. Moreover, it is suggested that cryptocurrencies' prices are driven by herding, hence this study tests herding behavior under asymmetric and symmetric conditions and the existence of different herding regimes by employing the Markov-Switching approach.

Journal Article (DOI)
TL;DR: The dynamics of intraday prices of 12 cryptocurrencies during the past months' boom and bust are discussed, revealing two currencies that exhibit a more persistent stochastic dynamics and two other currencies whose behavior is closer to a random walk.
Abstract: This paper discusses the dynamics of intraday prices of twelve cryptocurrencies during the last months' boom and bust. The importance of this study lies in the extended coverage of the cryptoworld, accounting for more than 90% of the total daily turnover. By using the complexity-entropy causality plane, we could discriminate three different dynamics in the data set. Whereas most of the cryptocurrencies follow a similar pattern, there are two currencies (ETC and ETH) that exhibit a more persistent stochastic dynamics, and two other currencies (DASH and XEM) whose behavior is closer to a random walk. Consequently, similar financial assets, using blockchain technology, are differentiated by market participants.

Journal Article (DOI)
TL;DR: In this paper, the ID$_3$-Price in the German Intraday Continuous electricity market is analyzed using an econometric time series model, and the model's performance is compared with benchmark models and discussed in detail.
Abstract: In the following paper, we analyse the ID$_3$-Price in the German Intraday Continuous electricity market using an econometric time series model. A multivariate approach is conducted for hourly and quarter-hourly products separately. We estimate the model using lasso and elastic net techniques and perform an out-of-sample, very short-term forecasting study. The model's performance is compared with benchmark models and is discussed in detail. Forecasting results provide new insights to the German Intraday Continuous electricity market regarding its efficiency and to the ID$_3$-Price behaviour.

Posted Content
TL;DR: In this paper, the authors proposed an estimator of the Ornstein-Uhlenbeck process based on the maximum likelihood which is robust to the noise and utilizes irregularly spaced data.
Abstract: When stock prices are observed at high frequencies, more information can be utilized in estimation of parameters of the price process. However, high-frequency data are contaminated by the market microstructure noise which causes significant bias in parameter estimation when not taken into account. We propose an estimator of the Ornstein-Uhlenbeck process based on the maximum likelihood which is robust to the noise and utilizes irregularly spaced data. We also show that the Ornstein-Uhlenbeck process contaminated by the independent Gaussian white noise and observed at discrete equidistant times follows an ARMA(1,1) process. To illustrate benefits of the proposed noise-robust approach, we analyze an intraday pairs trading strategy based on the mean-variance optimization. In an empirical study of 7 Big Oil companies, we show that the use of the proposed estimator of the Ornstein-Uhlenbeck process leads to an increase in profitability of the pairs trading strategy.
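The ARMA(1,1) statement in this abstract can be illustrated with a short sketch. It assumes the standard exact discretization of an Ornstein-Uhlenbeck process, so that the sampled process is AR(1) with coefficient exp(-theta * dt), and adds independent Gaussian white noise with standard deviation `noise_std`. The function name and parameter names are hypothetical and chosen for illustration; this is not the estimator proposed in the paper, only the mapping from the model parameters to the implied ARMA(1,1) coefficients.

```python
import math


def noisy_ou_arma_params(theta, sigma, noise_std, dt):
    """Return (phi, psi, innovation_var) of the ARMA(1,1) implied by a noisy OU process.

    theta, sigma: mean-reversion speed and diffusion of the OU process (both > 0)
    noise_std:    standard deviation of the additive Gaussian white noise
    dt:           spacing between equidistant observations
    """
    # The exactly discretized OU process is AR(1) with this coefficient.
    phi = math.exp(-theta * dt)
    # Variance of the OU innovation over one step.
    q = sigma ** 2 * (1.0 - phi ** 2) / (2.0 * theta)
    # With Y_k = X_k + eps_k, the residual Y_k - phi * Y_{k-1} (after centering)
    # equals eta_k + eps_k - phi * eps_{k-1}, which is correlated only at lag 1.
    gamma0 = q + (1.0 + phi ** 2) * noise_std ** 2
    gamma1 = -phi * noise_std ** 2
    rho1 = gamma1 / gamma0
    if rho1 == 0.0:
        psi = 0.0  # no noise: the observations are a pure AR(1)
    else:
        # Invertible root of rho1 * psi^2 - psi + rho1 = 0.
        disc = max(0.0, 1.0 - 4.0 * rho1 ** 2)
        psi = (1.0 - math.sqrt(disc)) / (2.0 * rho1)
    innovation_var = gamma0 / (1.0 + psi ** 2)
    return phi, psi, innovation_var
```

For example, `noisy_ou_arma_params(2.0, 0.3, 0.05, 0.1)` gives a negative moving-average coefficient, which is the usual signature of observation noise on top of a mean-reverting signal.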

Journal Article (DOI)
TL;DR: It is demonstrated that this procedure can be efficiently used to forecast off-sample future market states with significant prediction accuracy and opens the way to a range of applications in risk management and trading strategies in the context where the correlation structure plays a central role.
Abstract: We propose a novel methodology to define, analyze and forecast market states. In our approach market states are identified by a reference sparse precision matrix and a vector of expectation values. In our procedure, each multivariate observation is associated with a given market state according to a minimization of a penalized Mahalanobis distance. The procedure is made computationally very efficient and can be used with a large number of assets. We demonstrate that this procedure is successful at clustering different states of the markets in an unsupervised manner. In particular, we describe an experiment with one hundred log-returns and two states in which the methodology automatically associates states prevalently to pre- and post-crisis periods, with one state gathering periods with average positive returns and the other state periods with average negative returns, therefore discovering spontaneously the common classification of 'bull' and 'bear' markets. In another experiment, with again one hundred log-returns and two states, we demonstrate that this procedure can be efficiently used to forecast off-sample future market states with significant prediction accuracy. This methodology opens the way to a range of applications in risk management and trading strategies in the context where the correlation structure plays a central role.
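The assignment step described in this abstract can be sketched as follows. Each state is represented by a mean vector and a precision matrix, and an observation is assigned to the state with the smallest squared Mahalanobis distance. The abstract does not specify the penalty term, so this sketch omits it; the function names and the toy numbers are hypothetical and not taken from the paper.

```python
def mahalanobis_sq(x, mean, precision):
    """Squared Mahalanobis distance (x - mean)^T P (x - mean) for a precision matrix P."""
    diff = [xi - mi for xi, mi in zip(x, mean)]
    n = len(diff)
    return sum(diff[i] * precision[i][j] * diff[j]
               for i in range(n) for j in range(n))


def assign_state(x, states):
    """Return the index of the state closest to x.

    states: list of (mean, precision) pairs, one per market state.
    The penalty mentioned in the abstract is NOT included (assumption of this sketch).
    """
    distances = [mahalanobis_sq(x, mean, precision) for mean, precision in states]
    return min(range(len(states)), key=distances.__getitem__)


# Toy example with two assets and two states (illustrative values only).
identity = [[1.0, 0.0], [0.0, 1.0]]
toy_states = [
    ([0.01, 0.01], identity),    # state 0: positive average returns
    ([-0.01, -0.01], identity),  # state 1: negative average returns
]
label = assign_state([0.02, 0.0], toy_states)
```

In a full procedure this assignment would alternate with re-estimating each state's mean and sparse precision matrix, which is left out here.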

Posted Content
TL;DR: The paper solves the problem of optimal portfolio choice when the parameters of the asset returns distribution, for example the mean vector and the covariance matrix, are unknown and have to be estimated by using historical data on asset returns using the Bayesian posterior predictive distribution.
Abstract: The paper solves the problem of optimal portfolio choice when the parameters of the asset returns distribution, like the mean vector and the covariance matrix, are unknown and have to be estimated by using historical data on the asset returns. The new approach employs the Bayesian posterior predictive distribution, which is the distribution of the future realization of the asset returns given the observable sample. The parameters of the posterior predictive distributions are functions of the observed data values and, consequently, the solution of the optimization problem is expressed in terms of data only and does not depend on unknown quantities. In contrast, the optimization problem of the traditional approach is based on unknown quantities which are estimated in the second step, leading to a suboptimal solution. We also derive a very useful stochastic representation of the posterior predictive distribution whose application leads not only to the solution of the considered optimization problem, but also provides the posterior predictive distribution of the optimal portfolio return used to construct a prediction interval. A Bayesian efficient frontier, a set of optimal portfolios obtained by employing the posterior predictive distribution, is constructed as well. Theoretically and using real data we show that the Bayesian efficient frontier outperforms the sample efficient frontier, a common estimator of the set of optimal portfolios known to be overoptimistic.

Posted Content (DOI)
TL;DR: This thesis studied corporate bankruptcy of manufacturing companies in Korea and Poland using experts' opinions and financial measures, respectively, applying several machine learning methods to learn the relationship between the company's current state and its fate in the near future.
Abstract: Corporate insolvency can have a devastating effect on the economy. With an increasing number of companies making expansion overseas to capitalize on foreign resources, a multinational corporate bankruptcy can disrupt the world's financial ecosystem. Corporations do not fail instantaneously; objective measures and rigorous analysis of qualitative (e.g. brand) and quantitative (e.g. econometric factors) data can help identify a company's financial risk. Gathering and storage of data about a corporation has become less difficult with recent advancements in communication and information technologies. The remaining challenge lies in mining relevant information about a company's health hidden under the vast amounts of data, and using it to forecast insolvency so that managers and stakeholders have time to react. In recent years, machine learning has become a popular field in big data analytics because of its success in learning complicated models. Methods such as support vector machines, adaptive boosting, artificial neural networks, and Gaussian processes can be used for recognizing patterns in the data (with a high degree of accuracy) that may not be apparent to human analysts. This thesis studied corporate bankruptcy of manufacturing companies in Korea and Poland using experts' opinions and financial measures, respectively. Using publicly available datasets, several machine learning methods were applied to learn the relationship between the company's current state and its fate in the near future. Results showed that predictions with accuracy greater than 95% were achievable using any machine learning technique when informative features like experts' assessment were used. However, when using purely financial factors to predict whether or not a company will go bankrupt, the correlation is not as strong.

Journal Article (DOI)
TL;DR: In this article, a cross-shareholding matrix is considered, along with two key factors: the node out-degree distribution, which represents the diversification of investments in terms of the number of involved companies, and the node in-degree distribution, which reports the integration of a company due to the sales of its own shares to other companies.
Abstract: The companies populating a stock market, along with their connections, can be effectively modeled through a directed network, where the nodes represent the companies, and the links indicate the ownership. This paper deals with this theme and discusses the concentration of a market. A cross-shareholding matrix is considered, along with two key factors: the node out-degree distribution, which represents the diversification of investments in terms of the number of involved companies, and the node in-degree distribution, which reports the integration of a company due to the sales of its own shares to other companies. While diversification is widely explored in the literature, integration is most present in literature on contagions. This paper captures such quantities of interest in the two frameworks and studies the stochastic dependence of diversification and integration through a copula approach. We adopt entropies as measures for assessing the concentration in the market. The main question is to assess the dependence structure leading to a better description of the data or to market polarization (minimal entropy) or market fairness (maximal entropy). In so doing, we derive information on the way in which the in- and out-degrees should be connected in order to shape the market. The question is of interest to regulatory bodies, as witnessed by the specific alert thresholds published in the US merger guidelines for limiting the possibility of acquisitions and the prevalence of a single company on the market. Indeed, all countries and the EU also have rules or guidelines in order to limit concentrations, in a country or across borders, respectively. The calibration of copulas and model parameters on the basis of real data serves as an illustrative application of the theoretical proposal.

Posted Content
TL;DR: A comparative analytical approach and numerical technique are presented to find the prices of call and put options, which are treated as the buying and selling prices of stocks in frontier markets in order to predict the stock price (close price).
Abstract: The Black-Scholes Option pricing model (BSOPM) has long been in use for valuation of equity options to find the prices of stocks. In this work, using BSOPM, we have come up with a comparative analytical approach and numerical technique to find the price of call option and put option and considered these two prices as buying price and selling price of stocks of frontier markets so that we can predict the stock price (close price). Changes have been made to the model to find the parameters strike price and the time of expiration for calculating stock price of frontier markets. To verify the result obtained using modified BSOPM we have used a machine learning approach using the software Rapidminer, where we have adopted different algorithms like the decision tree, ensemble learning method and neural network. It has been observed that the prediction of close price using machine learning is very similar to the one obtained using BSOPM. The machine learning approach stands out as a better predictor than BSOPM, because the Black-Scholes-Merton equation includes risk and dividend parameters, which change continuously. We have also numerically calculated volatility. As the prices of the stocks go high due to overpricing, volatility increases at a tremendous rate, and when volatility becomes very high the market tends to fall, which can be observed and determined using our modified BSOPM. The proposed modified BSOPM has also been explained based on the analogy of the Schrödinger equation (and heat equation) of quantum physics.
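For orientation, the textbook Black-Scholes prices of a European call and put can be sketched as below. This is the standard formula for a stock paying no dividends, not the modified BSOPM described in the abstract; the function and parameter names are chosen for illustration.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes(spot, strike, rate, vol, maturity):
    """Return (call, put) prices of European options on a non-dividend-paying stock.

    spot:     current stock price
    strike:   strike price
    rate:     continuously compounded risk-free rate
    vol:      annualized volatility (> 0)
    maturity: time to expiration in years (> 0)
    """
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    discount = math.exp(-rate * maturity)
    call = spot * norm_cdf(d1) - strike * discount * norm_cdf(d2)
    put = strike * discount * norm_cdf(-d2) - spot * norm_cdf(-d1)
    return call, put


call_price, put_price = black_scholes(100.0, 100.0, 0.05, 0.2, 1.0)
```

The two prices satisfy put-call parity, call - put = spot - strike * exp(-rate * maturity), which is a quick sanity check on any implementation.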

Posted Content
TL;DR: In this article, the Chiarella model is extended by adding noise traders and a non-linear demand of fundamentalists, and Bayesian filtering techniques are used to calibrate the model on time series of prices across a variety of asset classes.
Abstract: Trend and Value are pervasive anomalies, common to all financial markets. We address the problem of their co-existence and interaction within the framework of Heterogeneous Agent Based Models (HABM). More specifically, we extend the Chiarella (1992) model by adding noise traders and a non-linear demand of fundamentalists. We use Bayesian filtering techniques to calibrate the model on time series of prices across a variety of asset classes since 1800. The fundamental value is an output of the calibration, and does not require the use of an external pricing model. Our extended model reproduces many empirical observations, including the non-monotonic relation between past trends and future returns. The destabilizing activity of trend-followers leads to a qualitative change of mispricing distribution, from unimodal to bimodal, meaning that some markets tend to be over- (or under-) valued for long periods of time.

Posted Content
TL;DR: An efficient fat-tail measurement framework that is based on the conditional second moments is introduced, and a goodness-of-fit statistic that has a direct interpretation and can be used to assess the impact of fat-tails on central data conditional dispersion is constructed.
Abstract: In this paper we introduce an efficient fat-tail measurement framework that is based on the conditional second moments. We construct a goodness-of-fit statistic that has a direct interpretation and can be used to assess the impact of fat-tails on central data conditional dispersion. Next, we show how to use this framework to construct a powerful normality test. In particular, we compare our methodology to various popular normality tests, including the Jarque-Bera test that is based on third and fourth moments, and show that in many cases our framework outperforms all others, both on simulated and market stock data. Finally, we derive asymptotic distributions for conditional mean and variance estimators, and use this to show asymptotic normality of the proposed test statistic.
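The Jarque-Bera benchmark mentioned in this abstract combines sample skewness S and kurtosis K into the statistic n/6 * (S^2 + (K - 3)^2 / 4). A minimal sketch of that benchmark follows; it is not the conditional-second-moment test proposed in the paper, and the function name is chosen for illustration.

```python
def jarque_bera(sample):
    """Jarque-Bera statistic n/6 * (S^2 + (K - 3)^2 / 4) from biased sample moments."""
    n = len(sample)
    mean = sum(sample) / n
    # Central sample moments of order 2, 3 and 4.
    m2 = sum((x - mean) ** 2 for x in sample) / n
    m3 = sum((x - mean) ** 3 for x in sample) / n
    m4 = sum((x - mean) ** 4 for x in sample) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


# Tiny symmetric two-point sample: skewness 0, kurtosis 1.
jb = jarque_bera([-1.0, 1.0, -1.0, 1.0, -1.0, 1.0])
```

Under normality the statistic is asymptotically chi-squared with two degrees of freedom, so large values point to skewness or tail behavior that departs from the Gaussian case.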

Posted Content
TL;DR: This research aims to identify how Bitcoin-related news publications and online discourse are expressed in Bitcoin exchange movements of price and volume, and finds weak to moderate correlations between forum, news, and Reddit sentiment and movements in price andVolume from 1 to 5 days after the sentiment was expressed.
Abstract: This research aims to identify how Bitcoin-related news publications and online discourse are expressed in Bitcoin exchange movements of price and volume. Being inherently digital, all Bitcoin-related fundamental data (from exchanges, as well as transactional data directly from the blockchain) is available online, something that is not true for traditional businesses or currencies traded on exchanges. This makes Bitcoin an interesting subject for such research, as it enables the mapping of sentiment to fundamental events that might otherwise be inaccessible. Furthermore, Bitcoin discussion largely takes place on online forums and chat channels. In stock trading, the value of sentiment data in trading decisions has been demonstrated numerous times [1] [2] [3], and this research aims to determine whether there is value in such data for Bitcoin trading models. To achieve this, data over the year 2015 has been collected from this http URL, (the biggest Bitcoin forum in post volume), established news sources such as Bloomberg and the Wall Street Journal, the complete /r/btc and /r/Bitcoin subreddits, and the bitcoin-otc and bitcoin-dev IRC channels. By analyzing this data on sentiment and volume, we find weak to moderate correlations between forum, news, and Reddit sentiment and movements in price and volume from 1 to 5 days after the sentiment was expressed. A Granger causality test confirms the predictive causality of the sentiment on the daily percentage price and volume movements, and at the same time underscores the predictive causality of market movements on sentiment expressions in online communities.
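
The "1 to 5 days after" comparison described in the abstract can be illustrated with a lagged Pearson correlation. This is a minimal sketch assuming two aligned daily series; the function names and the toy data are hypothetical, and it does not reproduce the authors' sentiment scoring or their Granger causality test.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("a sequence has zero variance")
    return cov / math.sqrt(vx * vy)


def lagged_correlations(sentiment, moves, max_lag=5):
    """Correlate sentiment on day t with market moves on day t + lag."""
    return {
        lag: pearson(sentiment[:-lag], moves[lag:])
        for lag in range(1, max_lag + 1)
    }


# Toy data (hypothetical): moves echo the previous day's sentiment.
sentiment = [1.0, -1.0, 2.0, 0.0, 3.0, -2.0, 1.0, 4.0, -1.0, 2.0]
moves = [0.0] + [0.5 * s for s in sentiment[:-1]]
correlations = lagged_correlations(sentiment, moves)
```

A high correlation at some lag shows only co-movement; the abstract's causality claim rests on the separate Granger test.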

Posted Content
TL;DR: This paper quantitatively characterizes the nonlinearity in stock time series and the effect it has on stock network properties by applying a systematic multi-step approach to stocks included in three prominent indices, and establishes that the apparent nonlinearity that has been observed is largely due to univariate non-Gaussianity.
Abstract: Stock networks, constructed from stock price time series, are a well-established tool for the characterization of complex behavior in stock markets. Following Mantegna's seminal paper, the linear Pearson's correlation coefficient between pairs of stocks has been the usual way to determine network edges. Recently, possible effects of nonlinearity on the graph-theoretical properties of such networks have been demonstrated when using nonlinear measures such as mutual information instead of linear correlation. In this paper, we quantitatively characterize the nonlinearity in stock time series and the effect it has on stock network properties. This is achieved by a systematic multi-step approach that allows us to quantify the nonlinearity of coupling; correct its effects wherever it is caused by simple univariate non-Gaussianity; potentially localize in space and time any remaining strong sources of this nonlinearity; and, finally, study the effect nonlinearity has on global network properties. By applying this multi-step approach to stocks included in three prominent indices (NYSE100, FTSE100 and SP500), we establish that the apparent nonlinearity that has been observed is largely due to univariate non-Gaussianity. Furthermore, strong nonstationarity in a few specific stocks may play a role. In particular, the sharp decrease in some stocks during the global financial crisis of 2008 gives rise to apparent nonlinear dependencies among stocks.

Journal ArticleDOI
TL;DR: It is suggested that massive data sources resulting from human interaction with the Internet may offer a new perspective on the behavior of market participants in periods of large market movements, which demonstrates the effectiveness of the LSTM neural network in volatility forecasting.
Abstract: Intense volatility in financial markets affects humans worldwide. Therefore, relatively accurate prediction of volatility is critical. We suggest that massive data sources resulting from human interaction with the Internet may offer a new perspective on the behavior of market participants in periods of large market movements. First, we select 28 keywords related to finance as indicators of the public mood and macroeconomic factors. The daily search volumes of those 28 words, based on the Baidu index, are then collected manually from June 1, 2006 to October 29, 2017. We apply a Long Short-Term Memory neural network to forecast CSI300 volatility using those search volume data. Compared to the benchmark GARCH model, our forecast is more accurate, which demonstrates the effectiveness of the LSTM neural network in volatility forecasting.

Posted Content
TL;DR: In this article, a simple non-equilibrium model of a financial market as an open system with a possible exchange of money with an outside world and market frictions (trade impacts) incorporated into asset price dynamics via a feedback mechanism is proposed.
Abstract: We propose a simple non-equilibrium model of a financial market as an open system with a possible exchange of money with an outside world and market frictions (trade impacts) incorporated into asset price dynamics via a feedback mechanism. Using a linear market impact model, this produces a non-linear two-parametric extension of the classical Geometric Brownian Motion (GBM) model, that we call the "Quantum Equilibrium-Disequilibrium" (QED) model. The QED model gives rise to non-linear mean-reverting dynamics, broken scale invariance, and corporate defaults. In the simplest one-stock (1D) formulation, our parsimonious model has only one degree of freedom, yet calibrates to both equity returns and credit default swap spreads. Defaults and market crashes are associated with dissipative tunneling events, and correspond to instanton (saddle-point) solutions of the model. When market frictions and inflows/outflows of money are neglected altogether, "classical" GBM scale-invariant dynamics with an exponential asset growth and without defaults are formally recovered from the QED dynamics. However, we argue that this is only a formal mathematical limit, and in reality the GBM limit is non-analytic due to non-linear effects that produce both defaults and divergence of perturbation theory in a small market friction parameter.

Posted Content
TL;DR: In this paper, the authors examined the time series properties of cryptocurrency assets, such as Bitcoin, using established econometric inference techniques, namely models of the GARCH family, and argued that there is a strong empirical argument against modelling innovations under some common assumptions.
Abstract: This paper examines the time series properties of cryptocurrency assets, such as Bitcoin, using established econometric inference techniques, namely models of the GARCH family. The contribution of this study is twofold. I explore the time series properties of cryptocurrencies, a new type of financial asset on which there appears to be little or no literature. I suggest an improved econometric specification over that recently proposed in Chu et al. (2017), the first econometric study to examine the price dynamics of the most popular cryptocurrencies. Questions regarding the reliability of their study stem from the authors misdiagnosing the distribution of GARCH innovations. Checks are performed on whether innovations are Gaussian or GED by using Kolmogorov-type non-parametric tests and Khmaladze's martingale transformation. The null of Gaussianity is strongly rejected for all GARCH(p,q) models, with $p,q \in \{1,\ldots,5 \}$, for all cryptocurrencies in the sample. For tests of normality, I make use of the Gauss-Kronrod quadrature. Parameters of GARCH models are estimated with generalized error distribution innovations using maximum likelihood. For calculating p-values, the parametric bootstrap method is used. Arguing against Chu et al. (2017), I show that there is a strong empirical argument against modelling innovations under some common assumptions.

Posted Content
TL;DR: An extended coupled hidden Markov model is introduced incorporating the news events with the historical trading data to address the data sparsity issue of news events for each single stock and incorporate the correlations into the model to facilitate the prediction task.
Abstract: Traditional stock market prediction methods commonly only utilize the historical trading data, ignoring the fact that stock market fluctuations can be impacted by various other information sources such as stock related events. Although some recent works propose event-driven prediction approaches by considering the event data, how to leverage the joint impacts of multiple data sources still remains an open research problem. In this work, we study how to explore multiple data sources to improve the performance of the stock prediction. We introduce an Extended Coupled Hidden Markov Model incorporating the news events with the historical trading data. To address the data sparsity issue of news events for each single stock, we further study the fluctuation correlations between the stocks and incorporate the correlations into the model to facilitate the prediction task. Evaluations on China A-share market data in 2016 show the superior performance of our model against previous methods.

Posted Content
TL;DR: This paper revisits the Kalman filter theory, revisiting well-known and established results, gives new algorithms for inference for extended Kalman filters, and presents an alternative to the traditional estimation of parameters with the EM algorithm, based on CMA-ES optimization.
Abstract: [Author note: affiliated researcher at LAMSADE (UMR CNRS 7243) and the QMI (Quantitative Management Initiative) chair.] In this paper, we revisit the Kalman filter theory. After giving the intuition on a simplified financial markets example, we revisit the maths underlying it. We then show that the Kalman filter can be presented in a very different fashion using graphical models. This enables us to establish the connection between the Kalman filter and Hidden Markov Models. We then look at their application in financial markets and provide various intuitions in terms of their applicability for complex systems such as financial markets. Although this paper has been written more like a self-contained work connecting the Kalman filter to Hidden Markov Models, and hence revisits well-known and established results, it contains new results and brings additional contributions to the field. First, leveraging the link between the Kalman filter and HMMs, it gives new algorithms for inference for extended Kalman filters. Second, it presents an alternative to the traditional estimation of parameters with the EM algorithm, based on CMA-ES optimization. Third, it examines the application of the Kalman filter and its Hidden Markov Model version to financial markets, providing various dynamics assumptions and tests. We conclude by connecting the Kalman filter approach to trend-following technical analysis systems and showing their superior performance for trend detection.
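
As a concrete illustration of the predict/update cycle that the abstract revisits, here is a minimal sketch of a scalar Kalman filter for a local-level model (latent level follows a random walk, observed with noise). The model choice, the function name `kalman_local_level`, and the parameter names `q` and `r` are assumptions made for illustration; the paper's own models and its CMA-ES parameter estimation are not reproduced here.

```python
def kalman_local_level(observations, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter for the local-level model.

    State:       x_t = x_{t-1} + w_t,  w_t ~ N(0, q)
    Observation: z_t = x_t + v_t,      v_t ~ N(0, r)

    Returns a list of (filtered mean, filtered variance) pairs.
    """
    if q < 0 or r <= 0:
        raise ValueError("q must be >= 0 and r must be > 0")
    x, p = x0, p0
    estimates = []
    for z in observations:
        # Predict: a random-walk state keeps its mean, variance grows by q.
        p = p + q
        # Update: the Kalman gain weighs the prediction against the new data.
        k = p / (p + r)
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append((x, p))
    return estimates


# Hypothetical noisy price series; q and r are chosen by hand here.
prices = [100.2, 100.9, 100.4, 101.5, 101.1, 102.3]
filtered = kalman_local_level(prices, q=0.01, r=1.0, x0=prices[0])
```

A small `q` relative to `r` gives a smoother, slower-reacting level estimate, which is the trade-off a trend-following application has to tune.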

Posted Content
TL;DR: This article investigated the interaction between news and prices for the one-day-ahead volatility prediction using state-of-the-art deep learning approaches and found that adding news improves the volatility forecasting as compared to the mainstream models that rely only on price data.
Abstract: Stock market volatility forecasting is a task relevant to assessing market risk. We investigate the interaction between news and prices for the one-day-ahead volatility prediction using state-of-the-art deep learning approaches. The proposed models are trained either end-to-end or using sentence encoders transferred from other tasks. We evaluate a broad range of stock market sectors, namely Consumer Staples, Energy, Utilities, Healthcare, and Financials. Our experimental results show that adding news improves the volatility forecasting as compared to the mainstream models that rely only on price data. In particular, our model outperforms the widely-recognized GARCH(1,1) model for all sectors in terms of coefficient of determination $R^2$, $MSE$ and $MAE$, achieving the best performance when training from both news and price data.
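
For readers unfamiliar with the GARCH(1,1) baseline named in the abstract, its conditional variance follows the standard recursion σ²ₜ = ω + α·r²ₜ₋₁ + β·σ²ₜ₋₁. The sketch below applies that recursion with parameters supplied by hand; the function name, parameter values, and return series are hypothetical, and no parameter estimation (normally done by maximum likelihood) is attempted.

```python
def garch11_variance(returns, omega, alpha, beta):
    """Conditional variance path of a GARCH(1,1) model.

    path[t] is the variance forecast for period t given returns before t,
    so path[-1] is the one-step-ahead forecast after the last return.
    """
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha, beta >= 0, alpha + beta < 1")
    # Start from the unconditional (long-run) variance.
    var = omega / (1.0 - alpha - beta)
    path = [var]
    for r in returns:
        var = omega + alpha * r * r + beta * var
        path.append(var)
    return path


# Hypothetical daily returns in percent and hand-picked parameters.
daily_returns = [0.3, -1.2, 0.8, 2.5, -0.4]
variance_path = garch11_variance(daily_returns, omega=0.1, alpha=0.1, beta=0.8)
next_day_variance = variance_path[-1]
```

Large squared returns raise the next forecast through `alpha`, while `beta` controls how slowly the variance decays back to its long-run level.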