
Detecting correlation in stock market

TL;DR: In order to find hidden correlations in the daily returns, this work builds cross prediction models and uses the normalized modeling error as a generalized correlation measure that extends the concept of the classical correlation matrix.
Abstract: We present a new method for detecting dependencies in the stock market. In order to find hidden correlations in the daily returns, we build cross prediction models and use the normalized modeling error as a generalized correlation measure that extends the concept of the classical correlation matrix.

Summary (1 min read)

1 Introduction

  • In the definition of the cross-correlation matrix, the brackets indicate the time average over all trading days in the investigated period.
  • Following their investigations the authors see strong indications that this asymmetric interaction exists in a way that the dynamics of single stocks are leading the dynamics of others significantly.
  • The authors indicate this with a cross modeling scheme which is described in the following section.

2 Mixed State Analysis

  • For δ(i, j) > 0 the authors have cp(i, j) > cp(j, i) which means that the returns of the i-th stock contain more useful information to model the returns of the j-th stock than the other way around.
  • In the terms of synchronization this indicates an asymmetrical coupling strength between the two stocks.

3 Numerical Simulations

  • For all 30 stocks in the DJIA, the authors build the time series of daily returns and calculate the cross-correlation matrix ρ(i, j) as defined in the introduction.
  • The stocks that behave anti-correlated with respect to the index (the blue stripes in the correlation matrix) occur in cp(i, j) with a modeling error near one.
  • In the matrix of the error differences δ(i, j) the authors find the amount of asymmetry regarding their mixed state analysis, which offers a field for further investigations.
  • The next step will be a detailed analysis of the time dependence of these asymmetries and the nonlinear dependencies in the stock market.


Detecting Correlation in Stock Market
Jörg D. Wichard (a), Christian Merkwirth (b), Maciej Ogorzałek (a)
(a) AGH University of Science and Technology
Department of Electrical Engineering
al. Mickiewicza 30
30-059 Kraków, Poland
(b) Max-Planck-Institut für Informatik
Stuhlsatzenhausweg 85
66123 Saarbrücken, Germany
Abstract
We present a new method for detecting dependencies in the stock market. In order
to find hidden correlations in the daily returns, we build cross prediction models
and use the normalized modeling error as a generalized correlation measure that
extends the concept of the classical correlation matrix.
Key words: Econophysics, Multivariate analysis, Time series analysis
PACS: 89.65.Gh, 02.50.Sk, 05.45.Tp
1 Introduction
The analysis of the cross-correlation matrix of the returns plays an important
role in portfolio theory and financial analysis. We build the time series
of daily returns

$$R_i(t) = \frac{Y_i(t+1) - Y_i(t)}{Y_i(t)},$$

wherein $Y_i(t)$ denotes the closing price of the i-th stock at day t. The
cross-correlation matrix of the returns is defined as
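
$$\rho_{ij} = \frac{\langle R_i R_j \rangle - \langle R_i \rangle \langle R_j \rangle}{\sqrt{\langle R_i^2 - \langle R_i \rangle^2 \rangle \, \langle R_j^2 - \langle R_j \rangle^2 \rangle}},$$
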
where the brackets indicate the time average over all trading days in the
investigated period. The analysis of $\rho_{ij}$ leads to some interesting
insights into the market dynamics. Mantegna (see Mantegna (1999)) discovered a
hierarchical organization inside a portfolio of stocks by introducing a metric
related to the correlation coefficients. By definition the correlation matrix is
symmetric with respect to i and j and thus cannot be used to distinguish a
symmetrical interaction between different stocks from an asymmetric one. Our
investigations give strong indications that such asymmetric interactions exist,
in the sense that the dynamics of single stocks significantly lead the dynamics
of others. We demonstrate this with a cross modeling scheme, which is described
in the following section.

Preprint submitted to Elsevier Science, 25 April 2004
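
The two definitions above, the daily return and the cross-correlation coefficient, can be sketched in a few lines of Python. This is only an illustration using the standard library: the function names and the toy closing prices are assumptions made here, not part of the original work, and the time average is taken over all available days.

```python
from math import sqrt


def daily_returns(prices):
    """R(t) = (Y(t+1) - Y(t)) / Y(t) for a list of closing prices Y."""
    return [(nxt - cur) / cur for cur, nxt in zip(prices, prices[1:])]


def mean(values):
    """Time average <.> over all available days."""
    return sum(values) / len(values)


def correlation(ri, rj):
    """Cross-correlation coefficient rho_ij of two return series."""
    mi, mj = mean(ri), mean(rj)
    cov = mean([a * b for a, b in zip(ri, rj)]) - mi * mj
    var_i = mean([a * a for a in ri]) - mi * mi
    var_j = mean([b * b for b in rj]) - mj * mj
    return cov / sqrt(var_i * var_j)


def correlation_matrix(returns):
    """Symmetric matrix rho[i][j] for a list of return series."""
    n = len(returns)
    return [[correlation(returns[i], returns[j]) for j in range(n)]
            for i in range(n)]


# Made-up closing prices for three hypothetical stocks (not market data).
prices = [
    [100.0, 102.0, 101.0, 105.0, 104.0],  # stock 0
    [50.0, 51.0, 50.5, 52.5, 52.0],       # stock 1: half the price of stock 0
    [20.0, 19.0, 19.5, 18.0, 18.5],       # stock 2: moves against stock 0
]
returns = [daily_returns(p) for p in prices]
rho = correlation_matrix(returns)
for row in rho:
    print(["%+.3f" % value for value in row])
```

Because the matrix is built from products that do not depend on the order of i and j, rho[i][j] equals rho[j][i] by construction, which is the symmetry the text refers to.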
2 Mixed State Analysis
The scheme we introduce for market analysis is related to the “mixed state
analysis” of multivariate time series which was developed to detect weak
coupling between dynamical systems in the framework of chaotic synchronization
(see Wiesenfeldt et al. (2001)). This approach is based on the reconstruction
of mixed states consisting of delayed samples taken from simultaneously
measured time series of both systems under investigation.
We adopted this idea and changed it for our purpose in a way that a linear
model $f(\vec{R}_{i,j}(t))$ is constructed that maps the time-lagged returns of
the j-th stock together with the time-lagged returns of the i-th stock

$$\vec{R}_{i,j}(t) = \left(R_j(t), R_j(t-1), \ldots, R_j(t-\tau), R_i(t-1), \ldots, R_i(t-\tau)\right) \qquad (1)$$

onto the actual returns of the i-th stock $R_i(t)$. The model $f(\cdot)$ is a
linear function that is fitted using the standard least squares approach (see
for example Hastie et al. (2001)) for multiple linear regression models, i.e. it
should minimize the residual sum of squares
$\sum_t \left(R_i(t) - f(\vec{R}_{i,j}(t))\right)^2$. We would like to remark
that this model $f(\cdot)$ is certainly not able to make predictions of the
returns for the next day; however, it is able to find the relationship between
the actual returns $R_i(t)$ and $R_j(t)$ with respect to the time-lagged
returns, which may contain some information about linear trends on short time
scales.
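
The construction of the mixed state in equation 1 and the least squares fit can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, an intercept term is added (the text does not say whether the model has one), and the normal equations are solved by hand with the standard library, where a library routine such as `numpy.linalg.lstsq` would normally be used. The toy data are noise-free and synthetic, chosen so that the true coefficients are known.

```python
import random


def mixed_state_rows(ri, rj, tau):
    """Return (rows, targets): each row is the mixed state of equation 1,
    (R_j(t), R_j(t-1), ..., R_j(t-tau), R_i(t-1), ..., R_i(t-tau)),
    and the matching target is R_i(t)."""
    rows, targets = [], []
    for t in range(tau, len(ri)):
        row = [rj[t - k] for k in range(tau + 1)]       # R_j(t) ... R_j(t-tau)
        row += [ri[t - k] for k in range(1, tau + 1)]   # R_i(t-1) ... R_i(t-tau)
        rows.append(row)
        targets.append(ri[t])
    return rows, targets


def solve(a, b):
    """Solve the square system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]      # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):                       # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_linear_model(rows, targets):
    """Least-squares coefficients (intercept first) via the normal equations."""
    design = [[1.0] + row for row in rows]              # assumed intercept term
    p = len(design[0])
    xtx = [[sum(d[a] * d[b] for d in design) for b in range(p)] for a in range(p)]
    xty = [sum(d[a] * y for d, y in zip(design, targets)) for a in range(p)]
    return solve(xtx, xty)


# Noise-free toy data: R_i(t) = 0.5 R_j(t) + 0.3 R_j(t-1), so the fit is exact.
rng = random.Random(0)
rj = [rng.gauss(0.0, 1.0) for _ in range(200)]
ri = [0.5 * rj[0]] + [0.5 * rj[t] + 0.3 * rj[t - 1] for t in range(1, 200)]

rows, targets = mixed_state_rows(ri, rj, tau=1)
coefficients = fit_linear_model(rows, targets)
print([round(c, 6) for c in coefficients])
```

With tau = 1 the coefficient order is intercept, R_j(t), R_j(t-1), R_i(t-1), so the fitted values should be close to 0, 0.5, 0.3 and 0. Note that the row contains the same-day return R_j(t), which is why such a model describes a relationship between the series and is not a next-day forecast.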
If we consider a portfolio of N different stocks, we can define the N × N matrix
of the normalized modeling error as

$$cp(i,j) = \frac{\langle (R_i - f(\vec{R}_{i,j}))^2 \rangle}{\langle R_i^2 - \langle R_i \rangle^2 \rangle}, \qquad (2)$$

where the brackets denote the time average. The modeling error is normalized
with the variance of the time series $R_i(t)$ for a simple reason: A value of
cp(i, j) ≥ 1.0 indicates that the mean value $\langle R_i \rangle$ is a more
appropriate model than $f(\cdot)$, which means that there is no linear
dependence in the time series under investigation. Smaller values of cp(i, j)
give an indication that there is at least a weak linear interrelation between
the dynamics of the returns. In general, the matrix cp(i, j) is not symmetric,
i.e. cp(i, j) ≠ cp(j, i). We define the matrix of differences δ(i, j) as

$$\delta(i,j) = cp(i,j) - cp(j,i). \qquad (3)$$

The values of δ(i, j) reflect asymmetric dependencies in the market dynamics.
If the returns of i and j are uncorrelated or they interact on the same level,
then we expect δ(i, j) ≈ 0.
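
The normalization in equation 2 can be checked with a short worked step (a sketch added for illustration, using only the definitions above): if the fitted model is replaced by the constant mean value, $f \equiv \langle R_i \rangle$, the numerator of cp(i, j) becomes the variance of $R_i$ and the ratio is exactly one.

$$
cp(i,j)\Big|_{f \equiv \langle R_i \rangle}
= \frac{\langle (R_i - \langle R_i \rangle)^2 \rangle}{\langle R_i^2 - \langle R_i \rangle^2 \rangle}
= \frac{\langle R_i^2 \rangle - \langle R_i \rangle^2}{\langle R_i^2 \rangle - \langle R_i \rangle^2}
= 1
$$

A value cp(i, j) < 1 therefore means that the linear model leaves less unexplained variance than the mean value alone.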
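For δ(i, j) > 0 we have cp(i, j) > cp(j, i), i.e. modeling the returns of the
i-th stock with the help of the j-th stock leaves a larger normalized error than
modeling the j-th stock with the help of the i-th. This means that the returns
of the i-th stock contain more useful information to model the returns of the
j-th stock than the other way around. In terms of synchronization this
indicates an asymmetrical coupling strength between the two stocks.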
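
The matrices of equations 2 and 3 can be sketched end to end on synthetic data. Everything below is an assumption made for illustration: the function names are hypothetical, the model includes an intercept, the error is measured in-sample on the same days used for the fit, and the three return series are random numbers in which stock 0 leads stock 1 by one day. The diagonal of cp is set to zero, since a stock is trivially modeled by itself.

```python
import random


def solve(a, b):
    """Solve the square system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def modeling_error(ri, rj, tau):
    """Normalized in-sample error cp(i, j) of the linear model for R_i."""
    rows = [[1.0]                                       # assumed intercept term
            + [rj[t - k] for k in range(tau + 1)]       # R_j(t) ... R_j(t-tau)
            + [ri[t - k] for k in range(1, tau + 1)]    # R_i(t-1) ... R_i(t-tau)
            for t in range(tau, len(ri))]
    y = ri[tau:]
    p = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]
    xty = [sum(r[a] * v for r, v in zip(rows, y)) for a in range(p)]
    w = solve(xtx, xty)
    residuals = [v - sum(c * x for c, x in zip(w, r)) for r, v in zip(rows, y)]
    mse = sum(e * e for e in residuals) / len(y)
    mean_y = sum(y) / len(y)
    variance = sum(v * v for v in y) / len(y) - mean_y ** 2
    return mse / variance


def cp_and_delta(returns, tau):
    """Matrices cp(i, j) and delta(i, j) = cp(i, j) - cp(j, i)."""
    n = len(returns)
    cp = [[0.0 if i == j else modeling_error(returns[i], returns[j], tau)
           for j in range(n)] for i in range(n)]
    delta = [[cp[i][j] - cp[j][i] for j in range(n)] for i in range(n)]
    return cp, delta


# Synthetic returns: stock 0 leads stock 1 by one day, stock 2 is unrelated.
rng = random.Random(1)
days = 600
r0 = [rng.gauss(0.0, 1.0) for _ in range(days)]
r1 = [rng.gauss(0.0, 1.0)] + [0.6 * r0[t - 1] + 0.8 * rng.gauss(0.0, 1.0)
                              for t in range(1, days)]
r2 = [rng.gauss(0.0, 1.0) for _ in range(days)]

cp, delta = cp_and_delta([r0, r1, r2], tau=3)
print("cp(0,1) = %.2f, cp(1,0) = %.2f, delta(0,1) = %+.2f"
      % (cp[0][1], cp[1][0], delta[0][1]))
```

In this toy setting the lagged returns of stock 0 help to model stock 1, so cp(1, 0) falls clearly below one, while cp(0, 1) stays close to one. The difference δ(0, 1) is then positive, matching the reading given above that the i-th stock (here stock 0) carries more useful information about the j-th stock than the other way around.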
3 Numerical Simulations
We investigate 600 trading days of the Dow-Jones Industrial Average (DJIA)
between 2-Oct-2000 and 3-Mar-2003. For all 30 stocks in the DJIA, we build
the time series of daily returns and calculate the cross-correlation matrix
ρ(i, j) as defined in Section 1. For the mixed state analysis we use a time lag
of τ = 3 and we calculate the matrix of the modeling error (see footnote 1) as
defined in equation 2 and further the matrix of differences δ(i, j) from
equation 3. The results are shown in Figure 1. The cross-correlation matrix
shows some interesting structures; for example, there are obvious clusters,
which were described by Mantegna (1999). Part of these structures can also be
found in the matrix of the modeling error cp(i, j). The stocks that behave
anti-correlated with respect to the index (the blue stripes in the correlation
matrix) occur in cp(i, j) with a modeling error near one. In the matrix of the
error differences δ(i, j) we find the amount of asymmetry regarding our mixed
state analysis, which offers a field for further investigations. The next step
will be a detailed analysis of the time dependence of these asymmetries and the
nonlinear dependencies in the stock market.
References
Hastie, T., Tibshirani, R., Friedman, J., 2001. The Elements of Statistical
Learning. Springer Series in Statistics. Springer-Verlag.
Mantegna, R., 1999. Hierarchical structure in financial markets. Eur. Phys. J.
B. 11, 193–197.
Wiesenfeldt, M., Parlitz, U., Lauterborn, W., 2001. Mixed state analysis of
multivariate time series. Int. J. Bifurcation and Chaos 11 (8), 2217–2226.
Footnote 1: In order to achieve a better graphical resolution in the plots, we
set the zero diagonal elements to one.

[Figure 1: three matrix plots over the 30 DJIA ticker symbols (AA, AXP, BA, C, CAT, DD, DIS, EK, GE, GM, HD, HON, HPQ, IBM, INTC, IP, JNJ, JPM, KO, MCD, MMM, MO, MRK, MSFT, PG, SBC, T, UTX, WMT, XOM). Color scales: −0.5 to 1 (top), 0.50 to 1.00 (middle), −0.2 to 0.2 (bottom).]
Fig. 1. The cross-correlation matrix (top), the matrix of the normalized modeling
error cp(i, j) (middle) and the matrix δ(i, j) of the error differences as defined in
equation 3 (bottom) for 600 days of the DJIA (Ticker symbols on the left).
Citations
Proceedings ArticleDOI
24 Jul 2016
TL;DR: A hybrid ensemble combines the output of several different models by a weighted mean that forms the final forecast that shows the application in the WCCI 2016 CIF challenge.
Abstract: We propose hybrid ensemble models for time series forecasting. A hybrid ensemble combines the output of several different models by a weighted mean that forms the final forecast. The final hybrid ensemble model consists of several individual models: A nearest neighbor/trajectory ensemble model, a feed-forward neural network ensemble, a trend cycle model, an autoregressive model and an ensemble model based on the returns of the time series. The best performing models with respect to a left-out part of the time series are selected by cross-validation and are combined by taking the inverse SMAPE prediction error as a weight for the combination of the respective single forecasts. We show the application of this approach in the WCCI 2016 CIF challenge.

6 citations

Journal ArticleDOI
01 May 2018
TL;DR: A stochastic time strength neural network model is developed to predict the return scaling cross-correlation relationships between two real stock market indexes and the empirical results show that the proposed neural network is advantageous in increasing the forecasting precision.
Abstract: A return scaling cross-correlation function of exponential parameter is introduced in the present work, and a stochastic time strength neural network model is developed to predict the return scaling cross-correlations between two real stock market indexes, Shanghai Composite Index and Shenzhen Component Index. In the proposed model, the stochastic time strength function gives a weight for each historical data and makes the model have the effect of random movement. The empirical research is performed in testing the model forecasting effect of long-term cross-correlation relationships by training short-term cross-correlations, and a corresponding comparison analysis is made to the backpropagation neural network model. The empirical results show that the proposed neural network is advantageous in increasing the forecasting precision.

6 citations

Journal Article
TL;DR: An efficient algorithm is proposed which is able to track the lagged correlation and compute the leaders incrementally, while still achieving good accuracy, and the detected leaders demonstrate high predictive power on the event of general time series entities, which can enlighten both climate monitoring and financial risk control.
Abstract: Analyzing the relationships of time series is an important problem for many applications, including climate monitoring, stock investment, traffic control, etc. Existing research mainly focuses on studying the relationship between a pair of time series. In this paper, we study the problem of discovering leaders among a set of time series by analyzing lead-lag relations. A time series is considered to be one of the leaders if its rise or fall impacts the behavior of many other time series. At each time point, we compute the lagged correlation between each pair of time series and model them in a graph. Then, the leadership rank is computed from the graph, which brings order to time series. Based on the leadership ranking, the leaders of time series are extracted. However, the problem poses great challenges as time goes by, since the dynamic nature of time series results in highly evolving relationships between time series. We propose an efficient algorithm which is able to track the lagged correlation and compute the leaders incrementally, while still achieving good accuracy. Our experiments on real climate science data and stock data show that our algorithm is able to compute time series leaders efficiently in a real-time manner and the detected leaders demonstrate high predictive power on the event of general time series entities, which can enlighten both climate monitoring and financial risk control.

4 citations


Cites background from "Detecting correlation in stock mark..."

  • ...We are also aware of a stream of work [6, 17 ,8,9] that constructs a weighted graph on time series in order to discover different interesting patterns....

    [...]

Dissertation
01 Jan 2015

3 citations


Cites background or methods from "Detecting correlation in stock mark..."

  • ...Among these, Yamashita et al. (2005) applied a multi-branch artificial neural network (MBNN) to financial market applications. After investigating the predictive accuracy of the TOPIX index of the Tokyo Stock market using MBNN, the results evidenced that these multi-branch neural networks based on artificial intelligence might be more capable of generating greater generalization and representation, compared to simple conventional neural networks. Using the index value of TOPIX, multi-branch neural networks are better at predicting the next day TOPIX values. After various simulations were conducted to compare the multi-branch neural networks with other conventional neural networks, it was concluded that investors and economists can achieve a higher accuracy of forecasting with the proposed MBNN model. Moreover, Afolabi and Olatoyosiuse (2007) used the “Kohonen Self Organising Map (SOM) and hybrid Kohonen SOM” prediction of stock prices....


  • ...However, the prevalence of complexity in stock market prices made intelligent prediction paradigms highly significant, as well as forecasting stock prices using the conventional prediction models of CAPM and Fama and French (Huang et al., 2004; Wichard et al., 2004)....

    [...]

  • ...7 Adaptive Neural Fuzzy Inference Systems Zadeh (1965) introduced fuzzy logic to show and manipulate data and information involving several types of uncertainty....

    [...]

Weigang Qie
01 Jan 2011
TL;DR: In this paper, the authors investigated the relationship between market volatility and pairwise correlations of stocks and how portfolio managers' performances vary during turbulent periods and stable periods via empirical data, and they concluded that there exists a significantly positive relationship between the market volatility of the stocks and their correlations.
Abstract: The objective of this article is to deal with two questions. First, what is the relationship between market volatility and pairwise correlations of stocks? Second, how portfolio managers’ performances vary during turbulent periods and stable periods? Two parts are employed to answer those questions separately via empirical data. In Part I, a data set consisting of OMXS30 Index and five stocks is investigated and the relationship between market volatility and pairwise correlations of the stocks is quantified by a linear regression model. The slope of linear model represents the strength of market volatility’s influence on the pairwise correlation of the stocks. Therefore, we conclude that there exists significantly positive relationship between the market volatility and pairwise correlations of the stocks. In part II, we investigate a data set consisting of OMXS30 Return Index and 69 funds. The excess return of the funds is measured by ´ A and Jensen’s ´ A respectively. Four portfolios, Average Portfolio,T5, M5 and B5 which represent the average performance of all the 69 funds, top 5 funds, median 5 funds and bottom 5 funds respectively are set up for comparison. Two conclusions are derived. First, considering the magnitude of the excess return, Average Portfolio, M5 and B5 in times of high market volatility are inferior to those during periods with low market volatility, whereas T5 is superior. Second, in times of high market volatility T5 is superior to the other three portfolios while M5 performs better than B5. In times of low market volatility B5 is inferior to the other three portfolios. Besides, based on the other intercomparisons of the four portfolios, no significant difference is observed.

1 citations

References
Journal ArticleDOI
TL;DR: Chapter 11 includes more case studies in other areas, ranging from manufacturing to marketing research, and a detailed comparison with other diagnostic tools, such as logistic regression and tree-based methods.
Abstract: Chapter 11 includes more case studies in other areas, ranging from manufacturing to marketing research. Chapter 12 concludes the book with some commentary about the scientific contributions of MTS. The Taguchi method for design of experiment has generated considerable controversy in the statistical community over the past few decades. The MTS/MTGS method seems to lead another source of discussions on the methodology it advocates (Montgomery 2003). As pointed out by Woodall et al. (2003), the MTS/MTGS methods are considered ad hoc in the sense that they have not been developed using any underlying statistical theory. Because the “normal” and “abnormal” groups form the basis of the theory, some sampling restrictions are fundamental to the applications. First, it is essential that the “normal” sample be uniform, unbiased, and/or complete so that a reliable measurement scale is obtained. Second, the selection of “abnormal” samples is crucial to the success of dimensionality reduction when OAs are used. For example, if each abnormal item is really unique in the medical example, then it is unclear how the statistical distance MD can be guaranteed to give a consistent diagnosis measure of severity on a continuous scale when the larger-the-better type S/N ratio is used. Multivariate diagnosis is not new to Technometrics readers and is now becoming increasingly more popular in statistical analysis and data mining for knowledge discovery. As a promising alternative that assumes no underlying data model, The Mahalanobis–Taguchi Strategy does not provide sufficient evidence of gains achieved by using the proposed method over existing tools. Readers may be very interested in a detailed comparison with other diagnostic tools, such as logistic regression and tree-based methods. Overall, although the idea of MTS/MTGS is intriguing, this book would be more valuable had it been written in a rigorous fashion as a technical reference. There is some lack of precision even in several mathematical notations. Perhaps a follow-up with additional theoretical justification and careful case studies would answer some of the lingering questions.

11,507 citations


"Detecting correlation in stock mark..." refers methods in this paper

  • ...The model f(·) is a linear function that is fitted using the standard least squares approach (see for example Hastie et al. (2001)) for multiple linear regression models, i.e. it should minimize the residual sum of squares $\sum_t (R_i(t) - f(\vec{R}_{i,j}(t)))^2$....

    [...]

Journal ArticleDOI
TL;DR: A hierarchical arrangement of stocks traded in a financial market is found by investigating the daily time series of the logarithm of stock price and the hierarchical tree of the subdominant ultrametric space associated with the graph provides a meaningful economic taxonomy.
Abstract: I find a hierarchical arrangement of stocks traded in a financial market by investigating the daily time series of the logarithm of stock price. The topological space is a subdominant ultrametric space associated with a graph connecting the stocks of the portfolio analyzed. The graph is obtained starting from the matrix of correlation coefficient computed between all pairs of stocks of the portfolio by considering the synchronous time evolution of the difference of the logarithm of daily stock price. The hierarchical tree of the subdominant ultrametric space associated with the graph provides a meaningful economic taxonomy.

1,808 citations


"Detecting correlation in stock mark..." refers background in this paper

  • ...Mantegna (see Mantegna (1999)) discovered a hierarchical organization inside a portfolio of stocks by introducing a metric related to the correlation coefficients....

    [...]

  • ...The cross-correlation matrix shows some interesting structures; for example, there are obvious clusters, which were described by Mantegna (1999)....

    [...]

Journal ArticleDOI
TL;DR: A method is presented for detecting weak coupling between (chaotic) dynamical systems below the threshold of (generalized) synchronization using reconstruction of mixed states consisting of delayed samples taken from simultaneously measured time series of both systems.
Abstract: A method is presented for detecting weak coupling between (chaotic) dynamical systems below the threshold of (generalized) synchronization. This approach is based on reconstruction of mixed states consisting of delayed samples taken from simultaneously measured time series of both systems.

50 citations


"Detecting correlation in stock mark..." refers background in this paper

  • ...The scheme we introduce for market analysis is related to the “mixed state analysis” of multivariate time series which was developed to detect weak coupling between dynamical systems in the framework of chaotic synchronization (see Wiesenfeldt et al. (2001))....

    [...]

  • ...We build the time series of daily returns $R_i(t) = (Y_i(t+1) - Y_i(t)) / Y_i(t)$, wherein $Y_i(t)$ denotes the closing price of the i-th stock at day t....

    [...]
