Journal Article

Detecting correlation in stock market

01 Dec 2004-Physica A-statistical Mechanics and Its Applications (North-Holland)-Vol. 344, Iss: 1, pp 308-311

TL;DR: In order to find hidden correlations in the daily returns, this work builds cross prediction models and uses the normalized modeling error as a generalized correlation measure that extends the concept of the classical correlation matrix.
Abstract: We present a new method for detecting dependencies in the stock market. In order to find hidden correlations in the daily returns, we build cross prediction models and use the normalized modeling error as a generalized correlation measure that extends the concept of the classical correlation matrix.
Topics: Stock market (55%), Order (exchange) (53%)

Summary (1 min read)

1 Introduction

  • In the definition of the cross-correlation matrix, the brackets indicate the time average over all trading days in the investigated period.
  • Following their investigations the authors see strong indications that this asymmetric interaction exists in a way that the dynamics of single stocks are leading the dynamics of others significantly.
  • The authors indicate this with a cross modeling scheme which is described in the following section.

2 Mixed State Analysis

  • For δ(i, j) > 0 the authors have cp(i, j) > cp(j, i) which means that the returns of the i-th stock contain more useful information to model the returns of the j-th stock than the other way around.
  • In the terms of synchronization this indicates an asymmetrical coupling strength between the two stocks.

3 Numerical Simulations

  • The authors investigate 600 trading days of the Dow-Jones Industrial Average (DJIA) between 2-Oct-2000 and 3-Mar-2003.
  • For all 30 stocks in the DJIA, the authors build the time series of daily returns and calculate the cross-correlation matrix ρ(i, j) (see equation 1).
  • The stocks that behave anti-correlated with respect to the index (the blue stripes in the correlation matrix) occur in cp(i, j) with a modeling error near one.
  • In the matrix of the error differences δ(i, j) the authors find the amount of asymmetry regarding their mixed state analysis that offers a field of further investigations.
  • The next step will be a detailed analysis of the time dependence of these asymmetries and the nonlinear dependencies in the stock market.

Detecting Correlation in Stock Market
Jörg D. Wichard (a), Christian Merkwirth (b), Maciej Ogorzałek (a)
(a) AGH University of Science and Technology, Department of Electrical Engineering,
al. Mickiewicza 30, 30-059 Kraków, Poland
(b) Max-Planck-Institut für Informatik, Stuhlsatzenhausweg 85,
66123 Saarbrücken, Germany
Abstract
We present a new method for detecting dependencies in the stock market. In order
to find hidden correlations in the daily returns, we build cross prediction models
and use the normalized modeling error as a generalized correlation measure that
extends the concept of the classical correlation matrix.
Key words: Econophysics, Multivariate analysis, Time series analysis
PACS: 89.65.Gh, 02.50.Sk, 05.45.Tp
Preprint submitted to Elsevier Science, 25 April 2004
1 Introduction
The analysis of the cross-correlation matrix of the returns plays an important
role in portfolio theory and financial analysis. We build the time series
of daily returns
$$R_i(t) = \frac{Y_i(t+1) - Y_i(t)}{Y_i(t)},$$
wherein $Y_i(t)$ denotes the closing price of the i-th stock at day t. The
cross-correlation matrix of the returns is defined as
$$\rho_{ij} = \frac{\langle R_i R_j \rangle - \langle R_i \rangle \langle R_j \rangle}{\sqrt{\left(\langle R_i^2 \rangle - \langle R_i \rangle^2\right)\left(\langle R_j^2 \rangle - \langle R_j \rangle^2\right)}},$$
where the brackets indicate the time average over all trading days in the
investigated period. The analysis of $\rho_{ij}$ leads to some interesting
insights into the market dynamics. Mantegna (1999) discovered a hierarchical
organization inside a portfolio of stocks by introducing a metric related to
the correlation coefficients. By definition the correlation matrix is symmetric
with respect to i and j and thus cannot be used to distinguish a symmetrical
interaction between different stocks from an asymmetric one. Our investigations
give strong indications that such asymmetric interactions exist, in the sense
that the dynamics of single stocks significantly lead the dynamics of others.
We demonstrate this with a cross modeling scheme, which is described
in the following section.
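The two definitions above can be sketched in a few lines of NumPy. This is only an illustration: the function names `daily_returns` and `cross_correlation` and the toy price table are assumptions made for the example and do not come from the paper.

```python
import numpy as np

def daily_returns(prices):
    """R_i(t) = (Y_i(t+1) - Y_i(t)) / Y_i(t) for a (T, N) array of closing prices."""
    prices = np.asarray(prices, dtype=float)
    return (prices[1:] - prices[:-1]) / prices[:-1]

def cross_correlation(returns):
    """rho_ij built from plain time averages, term by term as in the definition."""
    mean = returns.mean(axis=0)                        # <R_i>
    second = returns.T @ returns / returns.shape[0]    # <R_i R_j>
    cov = second - np.outer(mean, mean)                # <R_i R_j> - <R_i><R_j>
    std = np.sqrt(np.diag(cov))                        # sqrt(<R_i^2> - <R_i>^2)
    return cov / np.outer(std, std)

# Hypothetical toy prices (5 days, 2 stocks), not market data.
prices = np.array([[100.0, 50.0],
                   [110.0, 49.0],
                   [ 99.0, 51.0],
                   [105.0, 50.0],
                   [108.0, 52.0]])
R = daily_returns(prices)      # shape (4, 2): one row per day with a next-day price
rho = cross_correlation(R)     # symmetric 2 x 2 matrix with ones on the diagonal
print(np.round(rho, 3))
```

The result agrees with `np.corrcoef(R, rowvar=False)`; the explicit version is shown only because it mirrors the bracket notation used in the text.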
2 Mixed State Analysis
The scheme we introduce for market analysis is related to the “mixed state
analysis” of multivariate time series which was developed to detect weak cou-
pling between dynamical systems in the framework of chaotic synchronization
(see Wiesenfeldt et al. (2001)). This approach is based on the reconstruction
of mixed states consisting of delayed samples taken from simultaneously mea-
sured time series of both systems under investigation.
We adopted this idea and changed it for our purpose in a way that a linear
model $f(\vec{R}_{i,j}(t))$ is constructed that maps the current and time-lagged
returns of the j-th stock together with the time-lagged returns of the i-th stock
$$\vec{R}_{i,j}(t) = \left(R_j(t), R_j(t-1), \ldots, R_j(t-\tau), R_i(t-1), \ldots, R_i(t-\tau)\right) \qquad (1)$$
onto the actual returns of the i-th stock $R_i(t)$. The model f(·) is a linear
function that is fitted using the standard least squares approach (see for
example Hastie et al. (2001)) for multiple linear regression models, i.e. it
should minimize the residual sum of squares
$\sum_t \left(R_i(t) - f(\vec{R}_{i,j}(t))\right)^2$. We would like
to remark that this model f(·) is certainly not able to make predictions of the
returns for the next day; however, it is able to find the relationship between
the actual returns $R_i(t)$ and $R_j(t)$ with respect to the time-lagged returns,
which may contain some information about linear trends on short time scales.
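As a rough sketch of how the mixed state of equation 1 and the least squares fit could be set up, consider the following NumPy code. The helper names `mixed_state` and `fit_linear_model`, the synthetic series, and the intercept column are assumptions for illustration; the paper does not say whether f(·) includes a constant term.

```python
import numpy as np

def mixed_state(r_i, r_j, tau):
    """Rows are (R_j(t), ..., R_j(t-tau), R_i(t-1), ..., R_i(t-tau)) for t = tau..T-1."""
    T = len(r_i)
    cols = [r_j[tau - k: T - k] for k in range(0, tau + 1)]    # R_j(t-k), k = 0..tau
    cols += [r_i[tau - k: T - k] for k in range(1, tau + 1)]   # R_i(t-k), k = 1..tau
    return np.column_stack(cols)

def fit_linear_model(r_i, r_j, tau):
    """Least squares fit of R_i(t) on the mixed state; returns (coef, fitted, target)."""
    X = mixed_state(r_i, r_j, tau)
    X = np.column_stack([np.ones(len(X)), X])   # intercept column: an assumption
    y = r_i[tau:]                               # target R_i(t), aligned with the rows of X
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef, X @ coef, y

# Synthetic series (not market data): r_i depends on r_j(t) and on its own past.
rng = np.random.default_rng(0)
T, tau = 2000, 3
r_j = rng.normal(size=T)
r_i = np.zeros(T)
for t in range(1, T):
    r_i[t] = 0.5 * r_j[t] + 0.3 * r_i[t - 1] + 0.1 * rng.normal()

coef, fitted, target = fit_linear_model(r_i, r_j, tau)
# coef layout: [intercept, R_j(t), R_j(t-1..t-tau), R_i(t-1..t-tau)]
print(np.round(coef, 2))
```

Each row of the design matrix has 2τ + 1 entries plus the optional constant, so with τ = 3 the fit estimates eight coefficients per ordered pair of stocks.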
If we consider a portfolio of N different stocks, we can define the N × N matrix
of the normalized modeling error as
$$\mathrm{cp}(i,j) = \frac{\left\langle \left(R_i - f(\vec{R}_{i,j})\right)^2 \right\rangle}{\langle R_i^2 \rangle - \langle R_i \rangle^2}, \qquad (2)$$
where the brackets denote the time average. The modeling error is normalized
with the variance of the time series $R_i(t)$ for a simple reason: A value of
cp(i, j) ≈ 1.0 indicates that the mean value $\langle R_i \rangle$ is a more
appropriate model than f(·), which means that there is no linear dependence in
the time series under investigation. Smaller values of cp(i, j) give an
indication that there is at least a weak linear interrelation between the
dynamics of the returns. In general, the matrix cp(i, j) is not symmetric,
i.e. cp(i, j) ≠ cp(j, i). We define the matrix of differences δ(i, j) as
$$\delta(i,j) = \mathrm{cp}(i,j) - \mathrm{cp}(j,i). \qquad (3)$$
The values of δ(i, j) reflect asymmetric dependencies in the market dynamics.
If the returns of i and j are uncorrelated or they interact on the same level,
then we expect δ(i, j) ≈ 0.
For δ(i, j) > 0 we have cp(i, j) > cp(j, i). Since cp(i, j) is the error of the
model for the i-th stock that uses the j-th stock as input, this means that the
returns of the i-th stock contain more useful information to model the returns
of the j-th stock than the other way around. In terms of synchronization this
indicates an asymmetrical coupling strength between the two stocks.
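Equations 2 and 3 can be illustrated with a small self-contained sketch. The names `modeling_error` and `cp_and_delta` are hypothetical, the intercept in the fit is an assumption, and the variance is taken over the same time window as the fit, which the paper does not specify. The three series are synthetic: series 1 follows series 0 with a one-day delay, series 2 is independent noise.

```python
import numpy as np

def modeling_error(r_i, r_j, tau):
    """cp(i, j): mean squared residual of the fit of R_i(t), divided by the variance of R_i."""
    T = len(r_i)
    cols = [np.ones(T - tau)]                                   # intercept (assumption)
    cols += [r_j[tau - k: T - k] for k in range(0, tau + 1)]    # R_j(t), ..., R_j(t-tau)
    cols += [r_i[tau - k: T - k] for k in range(1, tau + 1)]    # R_i(t-1), ..., R_i(t-tau)
    X = np.column_stack(cols)
    y = r_i[tau:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.mean((y - X @ coef) ** 2) / np.var(y)

def cp_and_delta(returns, tau=3):
    """Return the matrices cp(i, j) and delta(i, j) = cp(i, j) - cp(j, i)."""
    n = returns.shape[1]
    cp = np.zeros((n, n))   # diagonal stays zero: R_i(t) would be its own input
    for i in range(n):
        for j in range(n):
            if i != j:
                cp[i, j] = modeling_error(returns[:, i], returns[:, j], tau)
    return cp, cp - cp.T

# Synthetic data (not market data).
rng = np.random.default_rng(1)
T = 500
x0 = rng.normal(size=T)
x1 = np.empty(T)
x1[0] = rng.normal()
x1[1:] = 0.8 * x0[:-1] + 0.5 * rng.normal(size=T - 1)   # x1 lags x0 by one day
x2 = rng.normal(size=T)

cp, delta = cp_and_delta(np.column_stack([x0, x1, x2]), tau=3)
print(np.round(cp, 2))
print(np.round(delta, 2))
```

In this toy setup the error cp(1, 0) is clearly below one while cp(0, 1) stays close to one, so δ(0, 1) is positive, in line with the reading given above that series 0 carries more information about series 1 than vice versa.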
3 Numerical Simulations
We investigate 600 trading days of the Dow-Jones Industrial Average (DJIA)
between 2-Oct-2000 and 3-Mar-2003. For all 30 stocks in the DJIA, we build
the time series of daily returns and calculate the cross-correlation matrix ρ(i, j)
(as defined in Section 1). For the mixed state analysis we use a time lag of
τ = 3 and we calculate the matrix of the modeling error [footnote 1] as defined
in equation 2 and further the matrix of differences δ(i, j) from equation 3.
The results are shown in Figure 1. The cross-correlation matrix shows some
interesting structures; for example, there are obvious clusters, as described by
Mantegna (1999). Some of these structures can also be found in the matrix of the
modeling error cp(i, j). The stocks that behave anti-correlated with respect to
the index (the blue stripes in the correlation matrix) occur in cp(i, j) with a
modeling error near one. In the matrix of the error differences δ(i, j) we find
the amount of asymmetry regarding our mixed state analysis, which offers a field
for further investigations. The next step will be a detailed analysis of the time
dependence of these asymmetries and the nonlinear dependencies in the stock market.
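As a possible starting point for such a follow-up, the sketch below shows one way to prepare cp(i, j) for plotting and to list the most asymmetric pairs. The function name `strongest_asymmetries`, the labels A, B, C and the 3 × 3 matrix are hypothetical illustration values, not results from the paper.

```python
import numpy as np

def strongest_asymmetries(delta, labels, top=3):
    """List (i, j, delta(i, j)) for the largest positive entries of delta."""
    delta = np.asarray(delta)
    order = np.argsort(delta, axis=None)[::-1][:top]       # flat indices, largest first
    rows, cols = np.unravel_index(order, delta.shape)
    return [(labels[i], labels[j], float(delta[i, j])) for i, j in zip(rows, cols)]

# Hypothetical normalized modeling errors for three stocks labelled A, B, C.
labels = ["A", "B", "C"]
cp = np.array([[0.00, 0.90, 0.80],
               [0.60, 0.00, 0.95],
               [0.70, 0.90, 0.00]])

delta = cp - cp.T                 # equation 3
cp_plot = cp.copy()
np.fill_diagonal(cp_plot, 1.0)    # zero diagonal set to one for plotting only

for first, second, value in strongest_asymmetries(delta, labels):
    print(f"delta({first}, {second}) = {value:.2f}")
```

Following the interpretation in Section 2, a large positive δ(i, j) would flag stock i as the more informative partner of the pair; whether that reflects a stable lead-lag relation is exactly the open question raised in the text.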
References
Hastie, T., Tibshirani, R., Friedman, J., 2001. The Elements of Statistical
Learning. Springer Series in Statistics. Springer-Verlag.
Mantegna, R., 1999. Hierarchical structure in financial markets. Eur. Phys. J.
B. 11, 193–197.
Wiesenfeldt, M., Parlitz, U., Lauterborn, W., 2001. Mixed state analysis of
multivariate time series. Int. J. Bifurcation and Chaos 11 (8), 2217–2226.
[footnote 1] In order to achieve a better graphical resolution in the plots, we
set the zero diagonal elements to one.

[Figure 1: three matrix plots. Axis labels (ticker symbols of the 30 DJIA stocks):
AA, AXP, BA, C, CAT, DD, DIS, EK, GE, GM, HD, HON, HPQ, IBM, INTC, IP, JNJ, JPM,
KO, MCD, MMM, MO, MRK, MSFT, PG, SBC, T, UTX, WMT, XOM. Color-bar ranges in order
of appearance: −0.5 to 1, 0.50 to 1.00, −0.2 to 0.2.]
Fig. 1. The cross-correlation matrix (top), the matrix of the normalized modeling
error cp(i, j) (middle) and the matrix δ(i, j) of the error differences as defined in
equation 3 (bottom) for 600 days of the DJIA (Ticker symbols on the left).