Journal ArticleDOI

Shrinking the Cross Section

TL;DR: In this article, a robust stochastic discount factor (SDF) is proposed to summarize the joint explanatory power of a large number of cross-sectional stock return predictors.
Abstract: We construct a robust stochastic discount factor (SDF) that summarizes the joint explanatory power of a large number of cross-sectional stock return predictors. Our method achieves robust out-of-sample performance in this high-dimensional setting by imposing an economically motivated prior on SDF coefficients that shrinks the contributions of low-variance principal components of the candidate factors. While empirical asset pricing research has focused on SDFs with a small number of characteristics-based factors --- e.g., the four- or five-factor models discussed in the recent literature --- we find that such a characteristics-sparse SDF cannot adequately summarize the cross-section of expected stock returns. However, a relatively small number of principal components of the universe of potential characteristics-based factors can approximate the SDF quite well.
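
The shrinkage mechanism described above can be illustrated with a short numerical sketch. Assuming a linear SDF of the form M_t = 1 - b'(F_t - E[F_t]) in candidate factor returns, a ridge-style estimator b_hat = (Sigma + gamma*I)^{-1} mu scales the coefficient on the j-th principal component by d_j / (d_j + gamma), so low-variance PCs contribute little. This is only a hedged illustration of that idea, not the paper's exact estimator (which derives the penalty from an economically motivated prior and optionally adds an L1 term), and the factor data below are simulated.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-in for a panel of candidate characteristics-based
# factor returns: T months by H factors, with cross-correlation.
T, H = 600, 50
F = rng.standard_normal((T, H)) @ rng.standard_normal((H, H)) * 0.01

mu = F.mean(axis=0)                 # sample mean factor returns
Sigma = np.cov(F, rowvar=False)     # sample factor covariance matrix

# Ridge-style shrinkage of the SDF coefficients b in M_t = 1 - b'(F_t - mu):
# b_hat = (Sigma + gamma*I)^{-1} mu.  The penalty gamma stands in for the
# prior tightness; its level here is arbitrary.
gamma = 10.0 * np.trace(Sigma) / H
b_hat = np.linalg.solve(Sigma + gamma * np.eye(H), mu)

# The same estimator in principal-component coordinates: the coefficient on
# PC j is multiplied by d_j / (d_j + gamma), so low-variance PCs (small
# eigenvalues d_j) are shrunk heavily toward zero.
d, Q = np.linalg.eigh(Sigma)
pc_mu = Q.T @ mu
b_pc_unshrunk = pc_mu / d
b_pc_shrunk = pc_mu / (d + gamma)
assert np.allclose(Q @ b_pc_shrunk, b_hat)
print("shrinkage factor per PC (ascending variance):",
      np.round(d / (d + gamma), 3))
```
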
Citations
Journal ArticleDOI
TL;DR: The authors performed a comparative analysis of machine learning methods for the canonical problem of empirical asset pricing: measuring asset risk premia, and demonstrated large economic gains to investors using machine learning forecasts, in some cases doubling the performance of leading regression-based strategies from the literature.
Abstract: We perform a comparative analysis of machine learning methods for the canonical problem of empirical asset pricing: measuring asset risk premia. We demonstrate large economic gains to investors using machine learning forecasts, in some cases doubling the performance of leading regression-based strategies from the literature. We identify the best performing methods (trees and neural networks) and trace their predictive gains to allowance of nonlinear predictor interactions that are missed by other methods. All methods agree on the same set of dominant predictive signals which includes variations on momentum, liquidity, and volatility. Improved risk premium measurement through machine learning simplifies the investigation into economic mechanisms of asset pricing and highlights the value of machine learning in financial innovation.
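
As a rough illustration of the comparison described in this abstract (not the authors' data, predictor set, or tuning choices), the scikit-learn sketch below fits a linear benchmark and a boosted-tree model to a simulated panel in which the true risk premium contains an interaction between two characteristics; the three stylized predictors and the data-generating process are invented for the example.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)

# Simulated firm-month observations: three stylized characteristics
# (say momentum, liquidity, volatility) and a nonlinear "true" risk premium.
n = 20_000
X = rng.standard_normal((n, 3))
y = 0.02 * X[:, 0] - 0.01 * X[:, 1] * X[:, 2] + 0.05 * rng.standard_normal(n)

train, test = slice(0, 15_000), slice(15_000, None)

linear = LinearRegression().fit(X[train], y[train])
trees = GradientBoostingRegressor(max_depth=2, n_estimators=300,
                                  learning_rate=0.05).fit(X[train], y[train])

# Out-of-sample R^2: the tree model can pick up the interaction term
# that the linear benchmark misses by construction.
print("OLS  R2:", linear.score(X[test], y[test]))
print("GBRT R2:", trees.score(X[test], y[test]))
```
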

236 citations

Posted Content
TL;DR: The authors proposed a nonparametric method to test which characteristics provide independent information for the cross-section of expected returns, and used the adaptive group LASSO to select characteristics and to estimate how they affect expected returns nonparametrically.
Abstract: We propose a nonparametric method to test which characteristics provide independent information for the cross section of expected returns. We use the adaptive group LASSO to select characteristics and to estimate how they affect expected returns nonparametrically. Our method can handle a large number of characteristics, allows for a flexible functional form, and is insensitive to outliers. Many of the previously identified return predictors do not provide incremental information for expected returns, and nonlinearities are important. Our proposed method has higher out-of-sample explanatory power compared to linear panel regressions, and increases Sharpe ratios by 50%.
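
As a concrete, hedged sketch of the ingredients named here (spline expansion of characteristics, group-wise selection), the code below runs a plain, non-adaptive group lasso on simulated data. scikit-learn has no built-in group lasso, so a small proximal-gradient solver is hand-rolled; the adaptive reweighting step of the paper's estimator is omitted and the penalty level is arbitrary.

```python
import numpy as np
from sklearn.preprocessing import SplineTransformer

rng = np.random.default_rng(2)

# Simulated characteristics: only the first one matters, and nonlinearly.
n, p = 5_000, 8
C = rng.uniform(-1, 1, size=(n, p))
r = 0.05 * np.sin(2 * C[:, 0]) + 0.05 * rng.standard_normal(n)

# Expand each characteristic into a spline basis; each characteristic's
# basis functions form one group of coefficients.
spline = SplineTransformer(degree=3, n_knots=6, include_bias=False)
X = spline.fit_transform(C)
k = X.shape[1] // p
groups = [slice(j * k, (j + 1) * k) for j in range(p)]

def group_lasso(X, y, groups, lam, n_iter=1000):
    """Plain (non-adaptive) group lasso fit by proximal gradient descent."""
    n, d = X.shape
    beta = np.zeros(d)
    L = np.linalg.norm(X, ord=2) ** 2 / n   # Lipschitz constant of the gradient
    step = 1.0 / L
    for _ in range(n_iter):
        grad = -X.T @ (y - X @ beta) / n
        z = beta - step * grad
        for g in groups:                    # block soft-thresholding
            norm = np.linalg.norm(z[g])
            z[g] = 0.0 if norm == 0 else max(0.0, 1 - step * lam / norm) * z[g]
        beta = z
    return beta

beta = group_lasso(X, r, groups, lam=2e-3)
kept = [j for j, g in enumerate(groups) if np.linalg.norm(beta[g]) > 1e-8]
print("characteristics selected:", kept)    # with a suitable lam, ideally just [0]
```
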

199 citations

Journal ArticleDOI
TL;DR: A model-selection method to systematically evaluate the contribution to asset pricing of any new factor, above and beyond what a high-dimensional set of existing factors explains, is proposed.
Abstract: We propose a model-selection method to systematically evaluate the contribution to asset pricing of any new factor, above and beyond what a high-dimensional set of existing factors explains. Our methodology explicitly accounts for potential model-selection mistakes that produce a bias due to the omitted variables, unlike the standard approaches that assume perfect variable selection, which rarely occurs in practice. We apply our procedure to a set of factors recently discovered in the literature. While most of these new factors are found to be redundant relative to the existing factors, a few — such as profitability — have statistically significant explanatory power beyond the hundreds of factors proposed in the past. In addition, we show that our estimates and their significance are stable, whereas the model selected by simple LASSO is not. Finally, we provide additional applications of our procedure that illustrate how it could help control the proliferation of factors in the zoo.
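
In a similar spirit, though not the authors' exact procedure (which works with cross-sectional pricing moments and delivers valid inference), a generic double-selection sketch with simulated data looks like this: lasso the outcome on the existing factor exposures, lasso the candidate factor's exposure on the same set, then re-estimate by OLS on the candidate plus the union of selected controls.

```python
import numpy as np
from sklearn.linear_model import LassoCV, LinearRegression

rng = np.random.default_rng(3)

# Cross-section of N test assets: average returns, exposures to a large set
# of existing factors (the "zoo"), and exposure to one candidate factor.
N, K = 200, 100
betas = rng.standard_normal((N, K))
beta_new = 0.5 * betas[:, 0] + rng.standard_normal(N)
avg_ret = (betas[:, :3] @ np.array([0.4, 0.3, 0.2])
           + 0.25 * beta_new + 0.1 * rng.standard_normal(N))

# Double selection: lasso the returns on the zoo, lasso the candidate
# exposure on the zoo, and keep the union of selected controls.
sel_y = np.flatnonzero(LassoCV(cv=5).fit(betas, avg_ret).coef_)
sel_d = np.flatnonzero(LassoCV(cv=5).fit(betas, beta_new).coef_)
controls = sorted(set(sel_y) | set(sel_d))

# Post-selection OLS of returns on the candidate exposure plus the selected
# controls; the coefficient on beta_new is the quantity of interest (its
# standard error would still need the appropriate adjustment).
Z = np.column_stack([beta_new, betas[:, controls]])
fit = LinearRegression().fit(Z, avg_ret)
print("estimated premium of the candidate factor:", fit.coef_[0])
```
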

127 citations


Cites background or methods from "Shrinking the Cross Section"

  • ...selection on top of PCs similar to Kozak et al. (2017))....

    [...]

  • ...…selection step corresponds closely to the approach taken in the current literature dealing with the proliferation of asset pricing factors (e.g., Kozak et al. (2017)): take a large set of factors (ht), apply some dimension-reduction method (LASSO, Elastic net, PCA, etc.), and interpret the…...

    [...]

  • ...Kozak et al. (2017) use model-selection techniques to approximate the SDF and the mean-variance efficient portfolio as a function of many test portfolios, and compare sparse models based on principal...

    [...]

  • ...…the cross-sectional LASSO, closely related to the dimension-reduction methods that recent papers in asset pricing have been using to tackle the factor zoo (e.g., Kozak et al. (2017)): the objective of this first LASSO is to select a parsimonious model that explains the cross-section of risk premia....

    [...]

Posted Content
TL;DR: The authors used deep neural networks to estimate an asset pricing model for individual stock returns that takes advantage of the vast amount of conditioning information, while keeping a fully flexible form and accounting for time-variation.
Abstract: We use deep neural networks to estimate an asset pricing model for individual stock returns that takes advantage of the vast amount of conditioning information, while keeping a fully flexible form and accounting for time-variation. The key innovations are to use the fundamental no-arbitrage condition as criterion function, to construct the most informative test assets with an adversarial approach and to extract the states of the economy from many macroeconomic time series. Our asset pricing model outperforms out-of-sample all benchmark approaches in terms of Sharpe ratio, explained variation and pricing errors and identifies the key factors that drive asset prices.
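
A stripped-down sketch of the kind of criterion this abstract describes is shown below, using PyTorch and simulated data: a small network maps characteristics to SDF portfolio weights, and the loss is the sum of squared no-arbitrage pricing errors E[M R_i]. The adversarial construction of test assets and the macroeconomic state variables from the paper are omitted, and all dimensions and names are invented for the example.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Simulated panel: T months, N stocks, each with a vector of characteristics.
T, N, n_char = 120, 50, 10
Z = torch.randn(T, N, n_char)       # characteristics z_{i,t}
R = 0.01 * torch.randn(T, N)        # excess returns R_{i,t+1}

# SDF portfolio weights omega(z_{i,t}) produced by a small feed-forward net;
# the SDF is M_{t+1} = 1 - sum_i omega(z_{i,t}) * R_{i,t+1}.
omega_net = nn.Sequential(nn.Linear(n_char, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(omega_net.parameters(), lr=1e-3)

for epoch in range(200):
    opt.zero_grad()
    w = omega_net(Z).squeeze(-1)                # (T, N) weights
    M = 1.0 - (w * R).sum(dim=1, keepdim=True)  # (T, 1) SDF realizations
    # No-arbitrage moment conditions E[M R_i] = 0, enforced here for the
    # individual stocks themselves (the paper's adversarial choice of test
    # assets is omitted in this sketch).
    pricing_err = (M * R).mean(dim=0)           # (N,) sample moments
    loss = (pricing_err ** 2).mean()
    loss.backward()
    opt.step()

print("final average squared pricing error:", float(loss))
```
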

119 citations


Cites background or methods from "Shrinking the Cross Section"

  • ...…onto quantiles of characteristics are exactly the input to PCA in Kelly, Pruitt, and Su (2019) or the elastic net mean-variance optimization in Kozak, Nagel, and Santosh (2020). The solution to minimizing the sum of squared errors in these moment conditions is a simple mean-variance…...

    [...]

  • ...Kozak, Nagel, and Santosh (2020) estimate the SDF based on characteristic-sorted factors with a modified elastic net regression. Kelly, Pruitt, and Su (2019) apply PCA to stock returns projected on characteristics to obtain a conditional multi-factor model where the loadings are linear in the…...

    [...]

  • ...This is a standard transformation to deal with the different scales and has also been used in Kelly, Pruitt, and Su (2019), Kozak, Nagel, and Santosh (2020) or Freyberger, Neuhierl, and Weber (2020) among others....

    [...]

  • ...The linear approach with elastic net is closely related to Kozak, Nagel, and Santosh (2020) who perform mean-variance optimization with an elastic net penalty on characteristic-based factors. In addition we also report the maximum Sharpe ratios for the tangency portfolios based on the Fama-French…...

    [...]

  • ...28Kelly, Pruitt, and Su (2019) and Kozak, Nagel, and Santosh (2020) construct factors in this way....

    [...]

ReportDOI
TL;DR: This paper proposed a nonparametric method to test which characteristics provide independent information for the cross-section of expected returns, and used the adaptive group LASSO to select characteristics and to estimate how they affect expected returns nonparametrically.
Abstract: We propose a nonparametric method to test which characteristics provide independent information for the cross section of expected returns. We use the adaptive group LASSO to select characteristics and to estimate how they affect expected returns nonparametrically. Our method can handle a large number of characteristics, allows for a flexible functional form, and is insensitive to outliers. Many of the previously identified return predictors do not provide incremental information for expected returns, and nonlinearities are important. Our proposed method has higher out-of-sample explanatory power compared to linear panel regressions, and increases Sharpe ratios by 50%.

100 citations

References
Journal ArticleDOI
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
Abstract: SUMMARY We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.
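
A minimal scikit-learn sketch of the key property mentioned here, that the L1 constraint sets some coefficients exactly to zero, is given below; it uses the library's coordinate-descent solver rather than the original paper's formulation, with simulated data and an arbitrary penalty level.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression

rng = np.random.default_rng(4)

# Linear model in which only 3 of 20 predictors are truly relevant.
n, p = 200, 20
X = rng.standard_normal((n, p))
y = 2.0 * X[:, 0] - 1.5 * X[:, 1] + 1.0 * X[:, 2] + rng.standard_normal(n)

ols = LinearRegression().fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

# OLS keeps all coefficients nonzero; the lasso zeroes most of them out,
# which is what makes the fitted model interpretable.
print("nonzero OLS coefficients:  ", np.sum(ols.coef_ != 0))
print("nonzero lasso coefficients:", np.sum(lasso.coef_ != 0))
```
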

40,785 citations


"Shrinking the Cross Section" refers background or methods in this paper

  • ...…have noted that Lasso does not perform well when regressors are correlated and that ridge regression (with L2-norm penalty) or elastic net (with a combination of L1- and L2-norm penalties) delivers better prediction performance than Lasso in these cases (Tibshirani, 1996; Zou and Hastie, 2005)....

    [...]

  • ...Lasso is known to suffer from relatively poor performance compared with ridge and elastic net when regressors are highly correlated (Tibshirani, 1996; Zou and Hastie, 2005)....

    [...]

  • ...To allow for factor selection, we augment the estimation criterion with an additional penalty on the sum of absolute SDF coefficients (L1 norm), which is typically used in Lasso regression (Tibshirani, 1996) and naturally leads to sparse solutions....

    [...]

Journal ArticleDOI
TL;DR: In this article, the authors identify five common risk factors in the returns on stocks and bonds, including three stock-market factors: an overall market factor and factors related to firm size and book-to-market equity.
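
As a hedged illustration of how these factors are typically used (with simulated series standing in for the actual Mkt-RF, SMB, and HML data), the sketch below runs the standard time-series regression of a portfolio's excess return on the three stock-market factors by plain least squares.

```python
import numpy as np

rng = np.random.default_rng(5)

# Simulated monthly factor returns (stand-ins for Mkt-RF, SMB, HML) and one
# test portfolio's excess return that loads on all three.
T = 360
factors = 0.01 * rng.standard_normal((T, 3))
portfolio = factors @ np.array([1.0, 0.5, 0.3]) + 0.02 * rng.standard_normal(T)

# Time-series regression R_p - R_f = a + b*MktRF + s*SMB + h*HML + e,
# estimated by least squares with an intercept column.
X = np.column_stack([np.ones(T), factors])
coef, *_ = np.linalg.lstsq(X, portfolio, rcond=None)
alpha, (b, s, h) = coef[0], coef[1:]
print(f"alpha={alpha:.4f}, b={b:.2f}, s={s:.2f}, h={h:.2f}")
```
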

24,874 citations


"Shrinking the Cross Section" refers background or methods or result in this paper

  • ...We focus on the well known 25 ME/BM sorted portfolios from Fama and French (1993). We show that our method automatically recovers an SDF that is similar to the one based on the SMB and HML factors constructed intuitively by Fama and French (1993)....

    [...]

  • ...For instance, the CAPM predicts a single factor representation; the 5-factor model of Fama and French (2016) and investment-based asset pricing models represent an SDF in terms of size, book-to-market and/or investment, and profitability characteristics....

    [...]

  • ...We focus on the well known 25 ME/BM sorted portfolios from Fama and French (1993)....

    [...]

  • ...For example, Fama and French (1993) use two characteristics: market capitalization and the book-to-market equity ratio....

    [...]

  • ...To summarize, these results confirm that our method can recover the SDF that Fama and French (1993) constructed intuitively for this set of portfolios....

    [...]

Book
28 Jul 2013
TL;DR: In this paper, the authors describe the important ideas in these areas in a common conceptual framework, and the emphasis is on concepts rather than mathematics, with a liberal use of color graphics.
Abstract: During the past decade there has been an explosion in computation and information technology. With it have come vast amounts of data in a variety of fields such as medicine, biology, finance, and marketing. The challenge of understanding these data has led to the development of new tools in the field of statistics, and spawned new areas such as data mining, machine learning, and bioinformatics. Many of these tools have common underpinnings but are often expressed with different terminology. This book describes the important ideas in these areas in a common conceptual framework. While the approach is statistical, the emphasis is on concepts rather than mathematics. Many examples are given, with a liberal use of color graphics. It is a valuable resource for statisticians and anyone interested in data mining in science or industry. The book's coverage is broad, from supervised learning (prediction) to unsupervised learning. The many topics include neural networks, support vector machines, classification trees and boosting (the first comprehensive treatment of this topic in any book). This major new edition features many topics not covered in the original, including graphical models, random forests, ensemble methods, least angle regression and path algorithms for the lasso, non-negative matrix factorization, and spectral clustering. There is also a chapter on methods for "wide" data (p bigger than n), including multiple testing and false discovery rates. Trevor Hastie, Robert Tibshirani, and Jerome Friedman are professors of statistics at Stanford University. They are prominent researchers in this area: Hastie and Tibshirani developed generalized additive models and wrote a popular book of that title. Hastie co-developed much of the statistical modeling software and environment in R/S-PLUS and invented principal curves and surfaces. Tibshirani proposed the lasso and is co-author of the very successful An Introduction to the Bootstrap. Friedman is the co-inventor of many data-mining tools including CART, MARS, projection pursuit and gradient boosting.

19,261 citations

Journal ArticleDOI
TL;DR: It is shown that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation, and an algorithm called LARS-EN is proposed for computing elastic net regularization paths efficiently, much like algorithm LARS does for the lasso.
Abstract: We propose the elastic net, a new regularization and variable selection method. Real world data and a simulation study show that the elastic net often outperforms the lasso, while enjoying a similar sparsity of representation. In addition, the elastic net encourages a grouping effect, where strongly correlated predictors tend to be in or out of the model together. The elastic net is particularly useful when the number of predictors (p) is much bigger than the number of observations (n). By contrast, the lasso is not a very satisfactory variable selection method in the p >> n case.
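
The grouping effect mentioned in this abstract can be seen in a small scikit-learn sketch with simulated data; the paper's LARS-EN algorithm is replaced here by the library's coordinate-descent solver, and the penalty levels are arbitrary.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso

rng = np.random.default_rng(6)

# Two blocks of highly correlated predictors plus pure-noise features.
n = 300
z1 = rng.standard_normal(n)
z2 = rng.standard_normal(n)
block1 = z1[:, None] + 0.05 * rng.standard_normal((n, 3))   # 3 near-copies of z1
block2 = z2[:, None] + 0.05 * rng.standard_normal((n, 3))   # 3 near-copies of z2
noise = rng.standard_normal((n, 4))
X = np.column_stack([block1, block2, noise])
y = 3 * z1 - 2 * z2 + 0.5 * rng.standard_normal(n)

lasso = Lasso(alpha=0.1).fit(X, y)
enet = ElasticNet(alpha=0.1, l1_ratio=0.5).fit(X, y)

# The lasso tends to keep one predictor per correlated block; the elastic
# net tends to spread similar coefficients across the block (grouping effect).
print("lasso coefficients:      ", np.round(lasso.coef_, 2))
print("elastic net coefficients:", np.round(enet.coef_, 2))
```
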

16,538 citations


"Shrinking the Cross Section" refers background or methods in this paper

  • ...Lasso is known to suffer from relatively poor performance compared with ridge and elastic net when regressors are highly correlated (Tibshirani, 1996; Zou and Hastie, 2005)....

    [...]

  • ...…have noted that Lasso does not perform well when regressors are correlated and that ridge regression (with L2-norm penalty) or elastic net (with a combination of L1- and L2-norm penalties) delivers better prediction performance than Lasso in these cases (Tibshirani, 1996; Zou and Hastie, 2005)....

    [...]

  • ...(28) we use the LARS-EN algorithm in Zou and Hastie (2005).... To represent the value effect in an SDF, it may be advantageous to consider a weighted average of multiple measures of value, such as book-to-market, price-dividend, and cashflow-to-price ratios....

    [...]

  • ...We use the LARS-EN algorithm in Zou and Hastie (2005), with a few small modifications that impose economic restrictions specific to our setup....

    [...]

  • ...The approach is motivated by lasso regression and elastic net (Zou and Hastie, 2005)....

    [...]

Journal ArticleDOI
TL;DR: In this article, the authors show that strategies that buy stocks that have performed well in the past and sell stocks that have performed poorly in the past generate significant positive returns over 3- to 12-month holding periods.
Abstract: This paper documents that strategies which buy stocks that have performed well in the past and sell stocks that have performed poorly in the past generate significant positive returns over 3- to 12-month holding periods. We find that the profitability of these strategies is not due to their systematic risk or to delayed stock price reactions to common factors. However, part of the abnormal returns generated in the first year after portfolio formation dissipates in the following two years. A similar pattern of returns around the earnings announcements of past winners and losers is also documented.
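
A hedged pandas sketch of the mechanics of such a strategy is given below, with simulated i.i.d. returns (so the resulting spread is roughly zero by construction), a single one-month holding period instead of the paper's overlapping 3- to 12-month portfolios, and a skip month between formation and holding; it is meant only to show how the winner-minus-loser portfolio is formed.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)

# Simulated monthly return panel: 240 months x 500 stocks (real return data
# would go here).
months = pd.period_range("2000-01", periods=240, freq="M")
rets = pd.DataFrame(0.01 + 0.08 * rng.standard_normal((240, 500)), index=months)

# Formation signal at month t: cumulative return over months t-12..t-2,
# skipping the most recent month as is common in the momentum literature.
formation = (1 + rets).rolling(11).apply(np.prod, raw=True).shift(2) - 1

# Each month, go long the top decile of past winners and short the bottom
# decile of past losers, holding for one month.
ranks = formation.rank(axis=1, pct=True)
long_leg = rets[ranks >= 0.9].mean(axis=1)
short_leg = rets[ranks <= 0.1].mean(axis=1)
momentum = (long_leg - short_leg).dropna()

print("annualized mean winner-minus-loser spread:", 12 * momentum.mean())
```
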

10,806 citations