Estimating Standard Errors in Finance Panel Data Sets: Comparing Approaches

01 Jan 2009-Review of Financial Studies (Oxford University Press)-Vol. 22, Iss: 1, pp 435-480
TL;DR: In this article, the authors examine the different methods used in the literature and explain when the different approaches yield the same (and correct) standard errors and when they diverge, and give researchers guidance for their use.
Abstract: In both corporate finance and asset pricing empirical work, researchers are often confronted with panel data. In these data sets, the residuals may be correlated across firms and across time, and OLS standard errors can be biased. Historically, the two literatures have used different solutions to this problem. Corporate finance has relied on clustered standard errors, while asset pricing has used the Fama-MacBeth procedure to estimate standard errors. This paper examines the different methods used in the literature and explains when the different methods yield the same (and correct) standard errors and when they diverge. The intent is to provide intuition as to why the different approaches sometimes give different answers and give researchers guidance for their use.

Summary

Introduction

  • It is well known that OLS standard errors are unbiased when the residuals are independent and identically distributed.
  • Thirty-four percent of the papers estimated both the coefficients and the standard errors using the Fama-MacBeth procedure (Fama and MacBeth, 1973).
  • There are two general forms of dependence which are most common in finance applications.
  • The residuals of a given firm may be correlated across years (time series dependence).
  • Of the most common approaches used in the literature and examined in this paper, only clustered standard errors are unbiased as they account for the residual dependence created by the firm effect.

A)

  • To provide intuition on why the standard errors produced by OLS are incorrect and how alternative estimation methods correct this problem, it is helpful to very briefly review the expression for the variance of the estimated coefficients.
  • This is the standard OLS formula and is based on the assumption that the errors are independent and identically distributed (Greene, 1990).
  • Each observation of the dependent variable is a monthly equity return.
  • Since the adjustment in the standard error, and the bias in White standard errors, is a function of the monthly auto-correlation in the Xs (a large number) times the auto-correlation in the residuals (zero), the standard errors clustered by firm are equal to the White standard errors.
  • If the time effect influenced each firm in a given month by the same amount, the time dummies would absorb the effect and clustering by time would not change the reported standard errors.

Var[β_OLS]

  • I use the assumption that residuals are independent across firms in deriving the second line.
  • To understand this intuition, consider the extreme case where the independent variables and residuals are perfectly correlated across time (i.e. ρ_X = 1 and ρ_ε = 1).
  • The basic program which I used to simulate the data and estimate the coefficients and standard errors is posted on my web site.
  • The estimated standard error will shrink accordingly, and incorrectly.
  • The correlation of the residuals within cluster is the problem the clustered standard errors are designed to correct.
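
The intuition in these bullets can be summarized in one expression. This is a sketch that assumes a single regressor, a balanced panel of N firms and T years, and a fixed firm effect that induces within-firm correlations ρ_X in the independent variable and ρ_ε in the residual; the σ symbols are notation introduced here for the variances of X and ε.

```latex
\operatorname{Var}\left[\hat{\beta}_{OLS} - \beta\right]
  = \frac{\sigma_{\varepsilon}^{2}}{N T \sigma_{X}^{2}}
    \left(1 + (T-1)\,\rho_{X}\,\rho_{\varepsilon}\right)
```

The usual OLS formula keeps only the first factor. In the extreme case ρ_X = ρ_ε = 1 the bracket equals T, so the true variance is σ_ε²/(N σ_X²): the panel carries no more information than N observations, while the OLS formula behaves as if there were NT independent ones.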

B)

  • Testing the Standard Error Estimates by Simulation.
  • The estimated standard errors are extremely close to the true standard errors and the number of statistically significant t-statistics is close to three percent across the simulations (using a 1 percent critical value).
  • Once the firm effect is temporary, the OLS standard errors again underestimate the true standard errors even when firm dummies are included in the regression (Wooldridge, 2003; Baker, Stein, and Wurgler, 2003).
  • In the asset pricing example, these standard errors were identical to the standard errors clustered by time, since there was no firm effect (Table 6).
  • The results are similar for firm size, firm age, asset tangibility (the ratio of property, plant, and equipment to assets), and R&D expenditure.
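
A simulation of this kind can be sketched in a few lines. The example below is not the author's posted program; it is a minimal stdlib-only illustration in which the function names, the seed, and the parameter values (500 firms, 10 years, standard deviations of 1 and 2, half of each variance firm-specific) are assumptions chosen for illustration. It draws one panel with a fixed firm effect and compares the conventional OLS standard error with the standard error clustered by firm.

```python
import math
import random


def simulate_panel(n_firms=500, n_years=10, beta=1.0, sd_x=1.0, sd_e=2.0,
                   rho_x=0.5, rho_e=0.5, seed=0):
    """Simulate y = beta * x + e, where x and e each contain a fixed firm component.

    rho_x and rho_e are the fractions of the variance of x and e that are
    firm-specific (assumed values, for illustration only).
    """
    rng = random.Random(seed)
    rows = []  # each row is (firm, year, x, y)
    for i in range(n_firms):
        mu = rng.gauss(0.0, sd_x * math.sqrt(rho_x))      # firm component of x
        gamma = rng.gauss(0.0, sd_e * math.sqrt(rho_e))   # firm component of e
        for t in range(n_years):
            x = mu + rng.gauss(0.0, sd_x * math.sqrt(1.0 - rho_x))
            e = gamma + rng.gauss(0.0, sd_e * math.sqrt(1.0 - rho_e))
            rows.append((i, t, x, beta * x + e))
    return rows


def ols_with_se(rows, cluster_index=0):
    """Slope of y on x (with intercept), plus OLS and clustered standard errors."""
    n = len(rows)
    x_bar = sum(r[2] for r in rows) / n
    y_bar = sum(r[3] for r in rows) / n
    sxx = sum((r[2] - x_bar) ** 2 for r in rows)
    slope = sum((r[2] - x_bar) * (r[3] - y_bar) for r in rows) / sxx
    intercept = y_bar - slope * x_bar

    scores = {}  # cluster id -> sum of (x - x_bar) * residual within the cluster
    rss = 0.0
    for r in rows:
        resid = r[3] - intercept - slope * r[2]
        rss += resid ** 2
        key = r[cluster_index]
        scores[key] = scores.get(key, 0.0) + (r[2] - x_bar) * resid

    # Conventional OLS standard error: sqrt(s^2 / sum((x - x_bar)^2)).
    se_ols = math.sqrt(rss / (n - 2) / sxx)
    # Clustered standard error: squared within-cluster score sums, added across
    # clusters and scaled by 1 / sxx^2 (no small-sample correction applied).
    se_cluster = math.sqrt(sum(s ** 2 for s in scores.values())) / sxx
    return slope, se_ols, se_cluster


rows = simulate_panel()
slope, se_ols, se_cluster = ols_with_se(rows, cluster_index=0)
print(f"slope={slope:.4f}  OLS SE={se_ols:.4f}  SE clustered by firm={se_cluster:.4f}")
```

Under these assumed parameters the clustered standard error should come out well above the OLS one, since the OLS formula ignores the within-firm correlation. Repeating the draw over many seeds and comparing the average estimated standard error with the standard deviation of the slope estimates would reproduce the spirit of the tests described above.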

C)

  • An alternative way to estimate the regression coefficients and standard errors when the residuals are not independent is the Fama-MacBeth approach (Fama and MacBeth, 1973).
  • The estimated variance of the Fama-MacBeth estimate is calculated from the variation in the period-by-period coefficient estimates.
  • This is rarely done in the finance literature.
  • The GLS estimates are more efficient than the OLS estimates (with or without firm dummies) when the residuals are correlated (compare Table 5 -Panels A and B).
  • If the firm effect is temporary, then the residuals are still correlated within cluster and this is the source of the bias in the standard errors.

Fama-MacBeth coefficient estimates.

  • This result is the same as the expression for the variance of the OLS coefficient (see equation 7).
  • The Fama-MacBeth standard errors are biased in exactly the same way as the OLS estimates.
  • In both cases, the magnitude of the bias is a function of the serial correlation of both the independent variable and the residual within a cluster and the number of time periods per firm.
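
The Fama-MacBeth procedure discussed here can be sketched as follows. This is a minimal single-regressor illustration, not code from the paper: the function name `fama_macbeth` and the toy data are assumptions. It runs one cross-sectional regression per year, averages the yearly slopes, and derives the standard error from the time-series variation of those slopes, treating them as independent across years.

```python
import math
import random


def fama_macbeth(rows):
    """rows: iterable of (firm, year, x, y). Returns (coefficient, standard error).

    Step 1: run a cross-sectional regression of y on x in every year.
    Step 2: average the yearly slopes; the standard error comes from the
    time-series variation of those slopes, treated as independent draws.
    """
    by_year = {}
    for _, year, x, y in rows:
        by_year.setdefault(year, []).append((x, y))

    slopes = []
    for year in sorted(by_year):
        obs = by_year[year]
        n = len(obs)
        x_bar = sum(x for x, _ in obs) / n
        y_bar = sum(y for _, y in obs) / n
        sxx = sum((x - x_bar) ** 2 for x, _ in obs)
        sxy = sum((x - x_bar) * (y - y_bar) for x, y in obs)
        slopes.append(sxy / sxx)

    t = len(slopes)
    beta_fm = sum(slopes) / t
    # Variance of the mean of T yearly slopes, assuming no serial correlation.
    var_fm = sum((b - beta_fm) ** 2 for b in slopes) / (t - 1) / t
    return beta_fm, math.sqrt(var_fm)


# Hypothetical panel: 200 firms, 10 years, true slope of 1, independent residuals.
rng = random.Random(1)
rows = []
for firm in range(200):
    for year in range(10):
        x = rng.gauss(0.0, 1.0)
        rows.append((firm, year, x, 1.0 * x + rng.gauss(0.0, 2.0)))

beta_fm, se_fm = fama_macbeth(rows)
print(f"Fama-MacBeth slope={beta_fm:.4f}, SE={se_fm:.4f}")
```

Because the variance step treats the yearly slopes as independent, a firm effect that makes them correlated across years is not picked up, which is the source of the bias described above.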

D)

  • Since the average first-order auto-correlation is negative, the adjusted Fama-MacBeth standard errors are even more biased than the unadjusted standard errors.
  • To verify that this is correct, I re-ran the simulation using 20 years of data per firm and the average estimated serial correlation moved closer to zero, rising from -0.1157 to -0.0556.
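
The direction of this effect can be checked with simple arithmetic. The sketch below assumes the adjustment takes the common AR(1) form in which the Fama-MacBeth variance is multiplied by (1 + θ)/(1 − θ), with θ the first-order auto-correlation of the yearly slope estimates; the function name is hypothetical, and the exact form of the correction used in any given paper may differ.

```python
import math


def fm_serial_correlation_factor(theta):
    """Multiplier applied to the Fama-MacBeth standard error when its variance
    is scaled by (1 + theta) / (1 - theta) (assumed AR(1)-style correction)."""
    if not -1.0 < theta < 1.0:
        raise ValueError("theta must lie strictly between -1 and 1")
    return math.sqrt((1.0 + theta) / (1.0 - theta))


# The two average auto-correlations quoted in the text above.
for theta in (-0.1157, -0.0556):
    factor = fm_serial_correlation_factor(theta)
    print(f"theta={theta:+.4f} -> standard error multiplier {factor:.3f}")
```

With a negative θ the multiplier is below one, so the adjustment shrinks standard errors that are already too small.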

E)

  • Incorrect Standard Error Estimates in Published Papers.
  • As part of my literature survey, I looked for papers which ran a regression of one persistent firm characteristic on other persistent firm characteristics (i.e. the serial correlation of the variables is large and dies away slowly as the lag increases).
  • Both of these papers correct the Fama-MacBeth standard errors for the first-order auto-correlation of the estimated slopes.
  • Pastor and Veronesi (2003) report that this does not change their answer.
  • I will show in Section V-C that this correction still produces biased standard errors and this probably explains Pastor and Veronesi's finding that the adjustment has little effect on their estimated standard errors.
  • Baker and Wurgler (2002) estimate both White and Fama-MacBeth standard errors but do not report the Fama-MacBeth standard errors since they are the same as the White standard errors.

F)

  • An alternative approach for addressing the correlation of errors across observations is the Newey-West procedure (Newey and West, 1987).
  • Thus having a lag length of less than the maximum (T-1) will cause the Newey-West standard errors to underestimate the true standard error when the firm effect is fixed.
  • When I drew observations as a cluster (e.g. I drew 500 firms with replacement and took all 10 years for any firm which was drawn), the estimated standard errors are the same as the clustered standard errors (e.g. 0.0505 for bootstrap versus 0.0508 for clustered).
  • Newey and West show that if M is allowed to grow at the correct rate with the sample size (T), then their estimate is consistent.
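
The role of the lag length M can be seen from the middle term of the sandwich estimator. The expression below is a sketch under assumptions: a single regressor x, residuals e, the standard Bartlett weight w(j), and cross-products between different firms set to zero as in the panel modification. The symbols are notation introduced here.

```latex
\sum_{i=1}^{N} \left[ \sum_{t=1}^{T} x_{it}^{2} e_{it}^{2}
  + 2 \sum_{j=1}^{M} w(j) \sum_{t=j+1}^{T} x_{it} e_{it}\, x_{i,t-j} e_{i,t-j} \right],
\qquad w(j) = 1 - \frac{j}{M+1}
```

Setting every w(j) to one and M to T-1 gives the corresponding term of the standard error clustered by firm. Any M below T-1 drops some within-firm cross-products entirely, and the remaining ones are down-weighted because w(j) is less than one, which is why the estimate falls short when the firm effect is fixed.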

III)

  • To demonstrate how the techniques work in the presence of a time effect, I generated data sets which contain only a time effect (observations on different firms within the same year are correlated).
  • The expressions for the standard errors in the presence of only a time effect are correct once I exchange N and T.
  • A) Clustered Standard Error Estimates.
  • The problem arises due to the limited number of clusters (e.g. years).
  • To explore this issue, I simulated data sets of 5,000 observations with the number of years (or clusters) ranging from 5 to 100.
  • The bias in the clustered standard error estimates declines with the number of clusters, dropping from 27 percent when there are 5 years (or clusters) to 3 percent when there are 40 years to 1 percent when there are 100 years.

IV)

  • Estimating Standard Errors in the Presence of a Fixed Firm and Time Effect.
  • Since researchers do not always know the precise form of the dependence, a less parametric approach may be preferred.
  • To illustrate the performance of standard errors clustered by firm, year, or both, I simulated data sets with a fixed firm and time effect.
  • Clustering by two dimensions produces less biased standard errors.
  • In my simulations, the number of t-statistics which are greater than 2.58 rises to 5% when the number of firms or time periods falls to 10 (see Thompson, 2005, for more complete results).
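
Clustering by two dimensions can be sketched with the usual combination rule: add the variance clustered by firm to the variance clustered by year and subtract the White variance, which both of them contain. The code below is a minimal stdlib-only illustration for a single regressor; the function name, the simulated data, and the variance shares are assumptions, and no small-sample correction is applied.

```python
import math
import random


def two_way_clustered_se(rows):
    """rows: list of (firm, year, x, y), one row per firm-year.

    Returns the OLS slope and standard errors clustered by firm, by year,
    and by both dimensions.
    """
    n = len(rows)
    x_bar = sum(r[2] for r in rows) / n
    y_bar = sum(r[3] for r in rows) / n
    sxx = sum((r[2] - x_bar) ** 2 for r in rows)
    slope = sum((r[2] - x_bar) * (r[3] - y_bar) for r in rows) / sxx
    intercept = y_bar - slope * x_bar

    def clustered_variance(key):
        # Sum the scores (x - x_bar) * residual within each cluster, then
        # square the sums and add them up across clusters.
        sums = {}
        for r in rows:
            score = (r[2] - x_bar) * (r[3] - intercept - slope * r[2])
            sums[key(r)] = sums.get(key(r), 0.0) + score
        return sum(s * s for s in sums.values()) / sxx ** 2

    v_firm = clustered_variance(lambda r: r[0])
    v_year = clustered_variance(lambda r: r[1])
    v_white = clustered_variance(lambda r: (r[0], r[1]))  # one observation per cluster
    v_both = v_firm + v_year - v_white  # can turn negative in small samples
    return slope, {
        "firm": math.sqrt(v_firm),
        "year": math.sqrt(v_year),
        "white": math.sqrt(v_white),
        "both": math.sqrt(max(v_both, 0.0)),
    }


# Hypothetical data: x and the residual each contain a firm and a year component.
rng = random.Random(42)
n_firms, n_years = 200, 10
firm_x = [rng.gauss(0, 0.5) for _ in range(n_firms)]
firm_e = [rng.gauss(0, 0.5) for _ in range(n_firms)]
year_x = [rng.gauss(0, 0.5) for _ in range(n_years)]
year_e = [rng.gauss(0, 0.5) for _ in range(n_years)]
rows = []
for i in range(n_firms):
    for t in range(n_years):
        x = firm_x[i] + year_x[t] + rng.gauss(0, math.sqrt(0.5))
        e = firm_e[i] + year_e[t] + rng.gauss(0, math.sqrt(0.5))
        rows.append((i, t, x, 1.0 * x + e))

slope, se = two_way_clustered_se(rows)
print(f"slope={slope:.4f}", {k: round(v, 4) for k, v in se.items()})
```

With only 10 year clusters in this toy panel, the year dimension is estimated from few clusters, so the caution above about small numbers of firms or time periods applies to the output as well.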

V) Estimating Standard Errors in the Presence of a Temporary Firm Effect

  • The analysis thus far has assumed that the firm effect is fixed.
  • The dependence between residuals may decay as the time between them increases (e.g. ρ(ε_t, ε_{t-k}) may decline with k).
  • Assuming homoscedasticity makes the interpretation of the results simpler.
  • In addition, if the performance of the different standard error estimates depends on the permanence of the firm effect, researchers need to know this.

VI)

  • I used simulated data in the previous sections.
  • In real world applications, researchers may have priors about the data's structure (are firm effects or time effects more important, are they permanent or temporary), but they do not know the data structure for certain.
  • This way I can demonstrate how the different methods for estimating standard errors compare, confirm that the methods used by some published papers have produced significantly biased results, and show what researchers can learn from the different standard error estimates.
  • The constant is calculated as the average of the yearly intercepts.
  • Thus the Fama-MacBeth R² does not include the explanatory power of time dummies.

VII) Conclusions.

  • It is well known from first-year econometrics classes that OLS and White standard errors are biased when the residuals are not independent.
  • The standard errors clustered by firm are unbiased and produce correctly sized confidence intervals whether the firm effect is permanent or temporary.
  • Alternatively, researchers can cluster by multiple dimensions, assuming there are a sufficient number of clusters in each dimension.
  • The fraction of the independent variable's variance which is due to a firm specific component varies across the columns of the table from 0 percent (no firm effect) to 75 percent.
  • The second entry is the standard deviation of the coefficient estimated by Fama-MacBeth.

NBER WORKING PAPER SERIES
ESTIMATING STANDARD ERRORS
IN FINANCE PANEL DATA SETS:
COMPARING APPROACHES
Mitchell A. Petersen
Working Paper 11280
http://www.nber.org/papers/w11280
NATIONAL BUREAU OF ECONOMIC RESEARCH
1050 Massachusetts Avenue
Cambridge, MA 02138
April 2005
I thank the Financial Institutions and Markets Research Center at Northwestern University’s Kellogg School
for support. In writing this paper, I have benefitted greatly from discussions with Kent Daniel, Mariassunta
Giannetti, Toby Moskowitz, Joshua Rauh, Michael Roberts, Paola Sapienza, Doug Staiger, and Annette
Vissing-Jorgensen as well as the comments of seminar participants at the Federal Reserve Bank of Chicago,
Northwestern University, and the Universities of Chicago, Columbia, and Iowa. The research assistance of
Sungjoon Park, Nick Halpern, and Casey Liang is greatly appreciated. The views expressed herein are those
of the author(s) and do not necessarily reflect the views of the National Bureau of Economic Research.
©2005 by Mitchell A. Petersen. All rights reserved. Short sections of text, not to exceed two paragraphs,
may be quoted without explicit permission provided that full credit, including © notice, is given to the
source.

Estimating Standard Errors in Finance Panel Data Sets: Comparing Approaches
Mitchell A. Petersen
NBER Working Paper No. 11280
April 2005, Revised June 2006
JEL No. G1, G3, C1
ABSTRACT
In both corporate finance and asset pricing empirical work, researchers are often confronted with
panel data. In these data sets, the residuals may be correlated across firms and across time, and OLS
standard errors can be biased. Historically, the two literatures have used different solutions to this
problem. Corporate finance has relied on Rogers standard errors, while asset pricing has used the
Fama-MacBeth procedure to estimate standard errors. This paper will examine the different methods
used in the literature and explain when the different methods yield the same (and correct) standard
errors and when they diverge. The intent is to provide intuition as to why the different approaches
sometimes give different answers and thus give researchers guidance for their use.
Mitchell A. Petersen
Kellogg Graduate School of Management
Northwestern University
2001 Sheridan Road
Evanston, IL 60208
and NBER
petersen@northwestern.edu

1
I searched papers published in the Journal of Finance, the Journal of Financial Economics, and the Review
of Financial Studies in the years 2001-2004 for a description of how the coefficients and standard errors were estimated
in a panel data set. I included both linear regressions as well as non-linear techniques such as logits and tobits in my
survey. Panel data sets are data sets where observations can be grouped into clusters (e.g. multiple observations per firm,
per industry, per year, or per country). I included only papers which report at least five observations in each dimension
(e.g. firms and years). 207 papers met the selection criteria. Papers which did not report the method for estimating the
standard errors, or reported correcting the standard errors only for heteroscedasticity (i.e. White standard errors which
are not robust to within cluster dependence) are coded as not having corrected the standard errors for within cluster
dependence. Where the paper’s description was ambiguous, I contacted the authors.
Although White or OLS standard errors may be correct, many of the published papers report regressions where
I would expect the residuals to be correlated across observations on the same firm in different years (e.g. bid-ask spread
regressed on exchange dummies, stock price, volatility, and average daily volume or leverage regressed on the market
to book ratio and firm size) or correlated across observations on different firms in the same year (e.g. equity returns
regressed on earnings surprises). In these cases, the bias in the standard errors can be quite large. See Section VI for two
illustrations.
I) Introduction
It is well known that OLS standard errors are unbiased when the residuals are independent
and identically distributed. When the residuals are correlated across observations, OLS standard
errors can be biased and either over- or underestimate the true variability of the coefficient estimates.
Although the use of panel data sets (e.g. data sets that contain observations on multiple firms in
multiple years) is common in finance, the ways that researchers have addressed possible biases in
the standard errors vary widely and in many cases are incorrect. In recently published finance papers
which include a regression on panel data, forty-two percent of the papers did not adjust the standard
errors for possible dependence in the residuals.¹ Approaches for estimating the coefficients and
standard errors in the presence of within cluster correlation varied among the remaining papers.
Thirty-four percent of the papers estimated both the coefficients and the standard errors using the
Fama-MacBeth procedure (Fama and MacBeth, 1973). Twenty-nine percent of the papers included
dummy variables for each cluster (e.g. fixed effects or within estimation). The next two most
common methods used OLS (or an analogous method) to estimate the coefficients but reported
standard errors adjusted for correlation within a cluster. Seven percent of the papers adjusted the
standard errors using the Newey-West procedure (Newey and West, 1987) modified for use in a
panel data set, while 23 percent of the papers reported clustered standard errors (Williams, 2000;
Rogers, 1993; Andrews, 1991; Moulton, 1990; Arellano, 1987; Moulton, 1986; Liang and Zeger,
1986), which are White standard errors adjusted to account for possible correlation within a cluster.
These are also called Rogers standard errors in the finance literature.
Although the literature has used a diversity of methods to estimate standard errors in panel
data sets, the chosen method is often incorrect and the literature provides little guidance to
researchers as to which method should be used. In addition, some of the advice in the literature is
simply wrong. Since the methods sometimes produce incorrect estimates, it is important to
understand how the methods compare and how to select the correct one. This is the paper’s
objective.
There are two general forms of dependence which are most common in finance applications.
They will serve as the basis for the analysis. The residuals of a given firm may be correlated across
years (time series dependence). I will call this an unobserved firm effect (see
Wooldridge, 2002). Alternatively, the residuals of a given year may be correlated across different
firms (cross-sectional dependence). I will call this a time effect. I will simulate panel data with both
forms of dependence, first individually and then jointly. With the simulated data, I can estimate the
coefficients and standard errors using each of the methods and compare their relative performance.
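
The two forms of dependence can be written compactly. The decomposition below is a sketch in assumed notation: the labels γ, δ, and η for the firm, time, and idiosyncratic components are introduced here for illustration.

```latex
\begin{aligned}
\varepsilon_{it} &= \gamma_i + \delta_t + \eta_{it} \\
\text{firm effect:}\quad & \operatorname{Corr}(\varepsilon_{it}, \varepsilon_{is}) \neq 0 \quad (t \neq s) \\
\text{time effect:}\quad & \operatorname{Corr}(\varepsilon_{it}, \varepsilon_{jt}) \neq 0 \quad (i \neq j)
\end{aligned}
```

A pure firm effect corresponds to δ_t = 0, a pure time effect to γ_i = 0, and the joint case keeps both components.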
Section II contains the standard error estimates in the presence of an unobserved firm effect.
My results show that both OLS and the Fama-MacBeth standard errors are biased downward. The
Newey-West standard errors, as modified for panel data, are also biased but the bias is small. Of the
most common approaches used in the literature and examined in this paper, only clustered standard
errors are unbiased as they account for the residual dependence created by the firm effect. In Section
III, the same analysis is conducted with an unobserved time effect instead of a firm effect. Since the
Fama-MacBeth procedure is designed to address a time effect, the Fama-MacBeth standard errors
are unbiased. The intuition of these first two sections carries over to Section IV, where I simulate data
with both a firm and a time effect.
I initially specified the firm effect as a constant (i.e. it does not decay over time). In practice,
the firm effect may decay and so the correlation between residuals declines as the time between them
grows. In Section V, I simulate data with a more general correlation structure. This allows me to
compare OLS, clustered, and Fama-MacBeth standard errors in a more general setting. Simulating
the temporary firm effect also allows me to examine the relative accuracy of two additional methods
for adjusting standard errors: fixed effects (firm dummies) and adjusted Fama-MacBeth standard
errors whose use is becoming more common. I show that including firm dummies eliminates the bias
in OLS standard errors only when the firm effect is fixed. I also show that even after adjusting
Fama-MacBeth standard errors, as suggested by some authors (Cochrane, 2001), they are still
biased.
Most papers do not report standard errors estimated by multiple methods. Thus in Section
VI, I apply the various estimation techniques to two real data sets and compare their relative
performance. This serves two purposes. First, it demonstrates that the methods used in some
published papers produce biases in the standard errors and t-statistics which are very large. This is
why using the correct method to estimate standard errors is important. Examining actual data also
allows me to show how differences in standard error estimates can provide information about the
deficiencies in a model and directions for improving it.

Citations
Journal ArticleDOI
TL;DR: This work considers statistical inference for regression when data are grouped into clusters, with regression model errors independent across clusters but correlated within clusters, when the number of clusters is large and default standard errors can greatly overstate estimator precision.
Abstract: We consider statistical inference for regression when data are grouped into clusters, with regression model errors independent across clusters but correlated within clusters. Examples include data on individuals with clustering on village or region or other category such as industry, and state-year differences-in-differences studies with clustering on state. In such settings default standard errors can greatly overstate estimator precision. Instead, if the number of clusters is large, statistical inference after OLS should be based on cluster-robust standard errors. We outline the basic method as well as many complications that can arise in practice. These include cluster-specific fixed effects, few clusters, multi-way clustering, and estimators other than OLS.

3,236 citations

Journal ArticleDOI
TL;DR: The authors proposed a variance estimator for the OLS estimator as well as for nonlinear estimators such as logit, probit, and GMM that enables cluster-robust inference when there is two-way or multiway clustering that is nonnested.
Abstract: In this article we propose a variance estimator for the OLS estimator as well as for nonlinear estimators such as logit, probit, and GMM. This variance estimator enables cluster-robust inference when there is two-way or multiway clustering that is nonnested. The variance estimator extends the standard cluster-robust variance estimator or sandwich estimator for one-way clustering (e.g., Liang and Zeger 1986; Arellano 1987) and relies on similar relatively weak distributional assumptions. Our method is easily implemented in statistical packages, such as Stata and SAS, that already offer cluster-robust standard errors when there is one-way clustering. The method is demonstrated by a Monte Carlo analysis for a two-way random effects model; a Monte Carlo analysis of a placebo law that extends the state–year effects example of Bertrand, Duflo, and Mullainathan (2004) to two dimensions; and by application to studies in the empirical literature where two-way clustering is present.

2,542 citations

Journal ArticleDOI
TL;DR: This article examined the relative importance of many factors in the capital structure decisions of publicly traded American firms from 1950 to 2003 and found that the most reliable factors for explaining market leverage are: median industry leverage, market-to-book assets ratio (−), tangibility (+), profits (−), log of assets (+), and expected inflation (+).
Abstract: This paper examines the relative importance of many factors in the capital structure decisions of publicly traded American firms from 1950 to 2003. The most reliable factors for explaining market leverage are: median industry leverage (+ effect on leverage), market-to-book assets ratio (−), tangibility (+), profits (−), log of assets (+), and expected inflation (+). In addition, we find that dividend-paying firms tend to have lower leverage. When considering book leverage, somewhat similar effects are found. However, for book leverage, the impact of firm size, the market-to-book ratio, and the effect of inflation are not reliable. The empirical evidence seems reasonably consistent with some versions of the trade-off theory of capital structure.

2,380 citations

Journal ArticleDOI
TL;DR: In this article, the authors present a new Stata program, xtscc, that estimates pooled ordinary least-squares/weighted least-squares regression and fixed-effects (within) regression models with Driscoll and Kraay (Review of Economics and Statistics 80: 549-560) standard errors.
Abstract: I present a new Stata program, xtscc, that estimates pooled ordinary least-squares/weighted least-squares regression and fixed-effects (within) regression models with Driscoll and Kraay (Review of Economics and Statistics 80: 549-560) standard errors. By running Monte Carlo simulations, I compare the finite-sample properties of the cross-sectional dependence-consistent Driscoll-Kraay estimator with the properties of other, more commonly used covariance matrix estimators that do not account for cross-sectional dependence. The results indicate that Driscoll-Kraay standard errors are well calibrated when cross-sectional dependence is present. However, erroneously ignoring cross-sectional correlation in the estimation of panel models can lead to severely biased statistical results. I illustrate the xtscc program by considering an application from empirical finance. Thereby, I also propose a Hausman-type test for fixed effects that is robust to general forms of cross-sectional and temporal dependence.

1,995 citations

Journal ArticleDOI
TL;DR: In this article, the authors investigate how corporate governance impacts firm value by examining both the value and the use of cash holdings in poorly and well governed firms, and show that firms with poor corporate governance dissipate cash quickly and in ways that significantly reduce operating performance.
Abstract: In this paper, we investigate how corporate governance impacts firm value by examining both the value and the use of cash holdings in poorly and well governed firms. Cash represents a large and growing fraction of corporate assets and generally is at the discretion of management. We use several measures of corporate governance and show that governance has a substantial impact on firm value through its impact on cash: $1.00 of cash in a poorly governed firm is valued by the market at only $0.42 to $0.88, depending on the measure of governance. Good governance approximately doubles this value of cash. Furthermore, governance has a significant impact on how firms use cash. We show that firms with poor corporate governance dissipate cash quickly and in ways that significantly reduce operating performance. This negative impact of large cash holdings on future operating performance is cancelled out if the firm is well governed. All of our results hold after controlling for the level of acquisitions undertaken by cash rich firms, indicating that acquisitions are not solely responsible for the value destruction in poorly governed, cash rich firms. The findings presented in this paper provide direct evidence of how governance can improve or destroy firm value and insight into the importance of governance in determining corporate cash policy.

1,554 citations

References
Book
01 Jan 2001
TL;DR: This is the essential companion to Jeffrey Wooldridge's widely-used graduate text Econometric Analysis of Cross Section and Panel Data (MIT Press, 2001).
Abstract: The second edition of this acclaimed graduate text provides a unified treatment of two methods used in contemporary econometric research, cross section and data panel methods. By focusing on assumptions that can be given behavioral content, the book maintains an appropriate level of rigor while emphasizing intuitive thinking. The analysis covers both linear and nonlinear models, including models with dynamics and/or individual heterogeneity. In addition to general estimation frameworks (particular methods of moments and maximum likelihood), specific linear and nonlinear methods are covered in detail, including probit and logit models and their multivariate, Tobit models, models for count data, censored and missing data schemes, causal (or treatment) effects, and duration analysis. Econometric Analysis of Cross Section and Panel Data was the first graduate econometrics text to focus on microeconomic data structures, allowing assumptions to be separated into population and sampling assumptions. This second edition has been substantially updated and revised. Improvements include a broader class of models for missing data problems; more detailed treatment of cluster problems, an important topic for empirical researchers; expanded discussion of "generalized instrumental variables" (GIV) estimation; new coverage (based on the author's own recent research) of inverse probability weighting; a more complete framework for estimating treatment effects with panel data, and a firmly established link between econometric approaches to nonlinear panel data and the "generalized estimating equation" literature popular in statistics and other fields. New attention is given to explaining when particular econometric methods can be applied; the goal is not only to tell readers what does work, but why certain "obvious" procedures do not. 
The numerous included exercises, both theoretical and computer-based, allow the reader to extend methods covered in the text and discover new insights.

28,298 citations

Journal ArticleDOI
TL;DR: In this article, a parameter covariance matrix estimator which is consistent even when the disturbances of a linear regression model are heteroskedastic is presented; it does not depend on a formal model of the structure of the heteroskedasticity.
Abstract: This paper presents a parameter covariance matrix estimator which is consistent even when the disturbances of a linear regression model are heteroskedastic. This estimator does not depend on a formal model of the structure of the heteroskedasticity. By comparing the elements of the new estimator to those of the usual covariance estimator, one obtains a direct test for heteroskedasticity, since in the absence of heteroskedasticity, the two estimators will be approximately equal, but will generally diverge otherwise. The test has an appealing least squares interpretation.

25,689 citations

ReportDOI
TL;DR: In this article, a simple method of calculating a heteroskedasticity and autocorrelation consistent covariance matrix that is positive semi-definite by construction is described.
Abstract: This paper describes a simple method of calculating a heteroskedasticity and autocorrelation consistent covariance matrix that is positive semi-definite by construction. It also establishes consistency of the estimated covariance matrix under fairly general conditions.

18,117 citations

Journal ArticleDOI
TL;DR: In this article, an extension of generalized linear models to the analysis of longitudinal data is proposed, which gives consistent estimates of the regression parameters and of their variance under mild assumptions about the time dependence.
Abstract: SUMMARY This paper proposes an extension of generalized linear models to the analysis of longitudinal data. We introduce a class of estimating equations that give consistent estimates of the regression parameters and of their variance under mild assumptions about the time dependence. The estimating equations are derived without specifying the joint distribution of a subject's observations yet they reduce to the score equations for multivariate Gaussian outcomes. Asymptotic theory is presented for the general class of estimators. Specific cases in which we assume independence, m-dependence and exchangeable correlation structures from each subject are discussed. Efficiency of the proposed estimators in two simple situations is considered. The approach is closely related to quasi-likelihood. Some key words: Estimating equation; Generalized linear model; Longitudinal data; Quasi-likelihood; Repeated measures.

17,111 citations

Journal ArticleDOI
TL;DR: In this article, the result that under the null hypothesis of no misspecification an asymptotically efficient estimator must have zero covariance with its difference from a consistent but asymptotically inefficient estimator is used to devise specification tests for a number of model specifications in econometrics.
Abstract: Using the result that under the null hypothesis of no misspecification an asymptotically efficient estimator must have zero asymptotic covariance with its difference from a consistent but asymptotically inefficient estimator, specification tests are devised for a number of model specifications in econometrics. Local power is calculated for small departures from the null hypothesis. An instrumental variable test as well as tests for a time series cross section model and the simultaneous equation model are presented. An empirical model provides evidence that unobserved individual factors are present which are not orthogonal to the included right-hand-side variable in a common econometric specification of an individual wage equation.

16,198 citations

Frequently Asked Questions (8)
Q1. What contributions have the authors mentioned in the paper "Estimating Standard Errors in Finance Panel Data Sets: Comparing Approaches"?

This paper will examine the different methods used in the literature and explain when the different methods yield the same (and correct) standard errors and when they diverge. The intent is to provide intuition as to why the different approaches sometimes give different answers and thus give researchers guidance for their use. Papers which did not report the method for estimating the standard errors, or reported correcting the standard errors only for heteroscedasticity (i.e. White standard errors, which are not robust to within-cluster dependence), are coded as not having corrected the standard errors for within-cluster dependence. Where the paper's description was ambiguous, I contacted the authors. Although White or OLS standard errors may be correct, many of the published papers report regressions where I would expect the residuals to be correlated across observations on the same firm in different years (e.g. bid-ask spread regressed on exchange dummies, stock price, volatility, and average daily volume, or leverage regressed on the market-to-book ratio and firm size) or correlated across observations on different firms in the same year (e.g. equity returns regressed on earnings surprises). In recently published finance papers which include a regression on panel data, forty-two percent of the papers did not adjust the standard errors for possible dependence in the residuals. Thirty-four percent of the papers estimated both the coefficients and the standard errors using the Fama-MacBeth procedure (Fama-MacBeth, 1973). Twenty-nine percent of the papers included dummy variables for each cluster (e.g. fixed effects or within estimation). The next two most common methods used OLS (or an analogous method) to estimate the coefficients but reported standard errors adjusted for correlation within a cluster. Seven percent of the papers adjusted the 

The independence assumption is used to move from the first to the second line in equation (3) (i.e., the covariance between residuals is zero). 
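
The step can be written out as a sketch for a single-regressor model with a zero-mean regressor. The notation here (X_it and ε_it for firm i in year t, N firms, T years, variances σ_X² and σ_ε²) is assumed for illustration and may differ in detail from the paper's equation (3); the last line is a large-sample approximation.

```latex
\begin{aligned}
\operatorname{Var}\!\left[\hat\beta_{OLS}-\beta\right]
&= E\!\left[\left(\frac{\sum_{i=1}^{N}\sum_{t=1}^{T} X_{it}\,\varepsilon_{it}}
                       {\sum_{i=1}^{N}\sum_{t=1}^{T} X_{it}^{2}}\right)^{2}\right] \\
&= E\!\left[\frac{\sum_{i=1}^{N}\sum_{t=1}^{T} X_{it}^{2}\,\varepsilon_{it}^{2}}
                 {\left(\sum_{i=1}^{N}\sum_{t=1}^{T} X_{it}^{2}\right)^{2}}\right]
   \qquad \text{(all cross terms } X_{it}\varepsilon_{it}X_{js}\varepsilon_{js} \text{ have zero expectation)} \\
&\approx \frac{N T\,\sigma_X^{2}\,\sigma_\varepsilon^{2}}{\left(N T\,\sigma_X^{2}\right)^{2}}
 = \frac{\sigma_\varepsilon^{2}}{N T\,\sigma_X^{2}}
\end{aligned}
```

When residuals are correlated within a firm or within a year, the cross terms dropped in the second line no longer vanish, which is why the OLS formula can misstate the variance.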

By estimating a generalized least squares version of the random effects model (i.e. a panel data set with an unobserved firm effect), more efficient coefficient estimates can be obtained (see Wooldridge, 2002). 

Since the firm effect influences both the yearly coefficient estimate and the sample average of the yearly coefficient estimates, it does not appear in the estimated variance. 
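
A small Monte Carlo sketch can illustrate this point. The code below is not the paper's simulation code: the function names, the panel size (500 firms, 10 years), the 50 percent firm-effect share, and the standard deviations of the regressor and the residual are all assumptions chosen for illustration. It runs a no-intercept cross-sectional regression each year, averages the yearly slopes, and compares the reported Fama-MacBeth standard error with the actual dispersion of the estimate across simulated panels.

```python
import numpy as np

def fama_macbeth(x, y):
    """Fama-MacBeth slope and standard error for a single regressor, no intercept.

    x, y: arrays of shape (n_firms, n_years).
    """
    # One cross-sectional OLS slope per year (sum over firms, axis 0).
    betas = (x * y).sum(axis=0) / (x ** 2).sum(axis=0)
    n_years = betas.size
    # The standard error treats the yearly slopes as independent draws.
    return betas.mean(), betas.std(ddof=1) / np.sqrt(n_years)

def simulate_firm_effect_panel(rng, n_firms=500, n_years=10, share=0.5,
                               sd_x=1.0, sd_eps=2.0, beta=1.0):
    """Panel in which `share` of the variance of x and of the residual
    comes from a firm component that is constant over time (hypothetical setup)."""
    def draw(sd):
        fixed = rng.normal(0.0, sd * np.sqrt(share), size=(n_firms, 1))
        idio = rng.normal(0.0, sd * np.sqrt(1.0 - share), size=(n_firms, n_years))
        return fixed + idio  # firm component broadcasts across years
    x = draw(sd_x)
    return x, beta * x + draw(sd_eps)

rng = np.random.default_rng(0)
results = np.array([fama_macbeth(*simulate_firm_effect_panel(rng))
                    for _ in range(200)])
true_sd = results[:, 0].std(ddof=1)      # actual dispersion of the estimate
mean_reported_se = results[:, 1].mean()  # average Fama-MacBeth standard error
print(f"true sd: {true_sd:.4f}, mean reported SE: {mean_reported_se:.4f}")
```

Under these assumptions the yearly slopes are positively correlated through the firm effect, so the reported standard error should come out noticeably smaller than the true dispersion.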

By examining how standard errors change when the authors cluster by firm or time (i.e., compare columns I to II and I to III), the authors can determine the nature of the dependence which remains in the residuals, and this can guide them on how to improve their models. 
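
The comparison can be sketched on simulated data. The function below is an illustrative implementation, not the paper's code: it assumes a single regressor with no intercept, applies no finite-sample corrections, and uses a hypothetical panel with a firm effect in both the regressor and the residual. It returns White standard errors along with standard errors clustered by firm and by time.

```python
import numpy as np

def slope_and_ses(x, y):
    """Pooled no-intercept OLS slope with White and one-way clustered SEs.

    x, y: arrays of shape (n_firms, n_years).
    """
    sxx = (x ** 2).sum()
    beta = (x * y).sum() / sxx
    score = x * (y - beta * x)  # x_it times the OLS residual
    se_white = np.sqrt((score ** 2).sum()) / sxx
    # Cluster by firm: sum scores over years within each firm, then square.
    se_firm = np.sqrt((score.sum(axis=1) ** 2).sum()) / sxx
    # Cluster by time: sum scores over firms within each year, then square.
    se_time = np.sqrt((score.sum(axis=0) ** 2).sum()) / sxx
    return beta, se_white, se_firm, se_time

# Hypothetical panel: half the variance of x and of the residual is a firm effect.
rng = np.random.default_rng(1)
n_firms, n_years = 500, 10
x = (rng.normal(0.0, np.sqrt(0.5), (n_firms, 1))
     + rng.normal(0.0, np.sqrt(0.5), (n_firms, n_years)))
eps = (rng.normal(0.0, np.sqrt(2.0), (n_firms, 1))
       + rng.normal(0.0, np.sqrt(2.0), (n_firms, n_years)))
y = 1.0 * x + eps

beta, se_white, se_firm, se_time = slope_and_ses(x, y)
print(f"beta={beta:.3f} white={se_white:.4f} firm={se_firm:.4f} time={se_time:.4f}")
```

In this setup the firm-clustered standard error should be clearly larger than the White standard error, which points to dependence within firms. With only ten years, the time-clustered estimate rests on ten clusters and is noisy.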

In recently published finance papers which include a regression on panel data, forty-two percent of the papers did not adjust the standard errors for possible dependence in the residuals. 

Thirty-four percent of the papers estimated both the coefficients and the standard errors using the Fama-MacBeth procedure (Fama-MacBeth, 1973). 

I allowed the fraction of variability in both the residual and the independent variable which is due to the time effect to range from zero to seventy-five percent in twenty-five percent increments.
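
A simulation grid of this kind can be sketched as follows. The data-generating function, its name, its default standard deviations, and the true slope are assumptions made for illustration, not the paper's code. Each variable is the sum of a yearly component shared by all firms and an idiosyncratic component, with the shared component's variance share set by the grid.

```python
import itertools
import numpy as np

def simulate_time_effect_panel(rng, n_firms, n_years, share_x, share_eps,
                               sd_x=1.0, sd_eps=2.0, beta=1.0):
    """Return (x, y), each of shape (n_firms, n_years).

    share_x and share_eps are the fractions of variance due to the time effect.
    """
    def draw(share, sd):
        # One draw per year, common to every firm in that year.
        common = rng.normal(0.0, sd * np.sqrt(share), size=(1, n_years))
        idio = rng.normal(0.0, sd * np.sqrt(1.0 - share), size=(n_firms, n_years))
        return common + idio
    x = draw(share_x, sd_x)
    eps = draw(share_eps, sd_eps)
    return x, beta * x + eps

# Zero to seventy-five percent in twenty-five percent increments,
# for the independent variable and the residual separately.
shares = [0.0, 0.25, 0.5, 0.75]
grid = list(itertools.product(shares, shares))

rng = np.random.default_rng(2)
for share_x, share_eps in grid:
    x, y = simulate_time_effect_panel(rng, n_firms=500, n_years=10,
                                      share_x=share_x, share_eps=share_eps)
    beta_hat = (x * y).sum() / (x ** 2).sum()
    print(f"share_x={share_x:.2f} share_eps={share_eps:.2f} beta_hat={beta_hat:.3f}")
```

Repeating each cell many times and recording both the estimated standard error and the dispersion of `beta_hat` would show, cell by cell, how the bias of a given standard-error method varies with the size of the time effect.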