scispace - formally typeset
Journal ArticleDOI

Propensity Score Matching in Accounting Research

TL;DR: Propensity score matching (PSM) has become a popular technique for estimating average treatment effects (ATEs) in accounting research, but studies often oversell the capabilities of PSM, fail to disclose important design choices, and/or implement PSM in a theoretically inconsistent manner.
Abstract: Propensity score matching (PSM) has become a popular technique for estimating average treatment effects (ATEs) in accounting research. In this study, we discuss the usefulness and limitations of PSM relative to more traditional multiple regression (MR) analysis. We discuss several PSM design choices and review the use of PSM in 86 articles in leading accounting journals from 2008-2014. We document a significant increase in the use of PSM from 0 studies in 2008 to 26 studies in 2014. However, studies often oversell the capabilities of PSM, fail to disclose important design choices, and/or implement PSM in a theoretically inconsistent manner. We then empirically illustrate complications associated with PSM in three accounting research settings. We first demonstrate that when the treatment is not binary, PSM tends to confine analyses to a subsample of observations where the effect size is likely to be smallest. We also show that seemingly innocuous design choices greatly influence sample composition and estimates of the ATE. We conclude with suggestions for future research considering the use of matching methods.
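The core PSM procedure the paper evaluates — estimate a propensity score, match treated to control observations on it, then compare outcomes — can be sketched in a few lines. The sketch below is purely illustrative (synthetic data, invented covariates, a known true effect of 2, a hand-rolled logit fit); it is not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
x = rng.normal(size=(n, 2))                          # covariates (e.g., size, leverage)
d = rng.binomial(1, 1 / (1 + np.exp(-(x[:, 0] + 0.5 * x[:, 1]))))  # treatment indicator
y = 2.0 * d + x.sum(axis=1) + rng.normal(size=n)     # outcome; true effect = 2

# Step 1: estimate propensity scores with a logit fit by gradient ascent
X = np.column_stack([np.ones(n), x])
b = np.zeros(3)
for _ in range(5000):
    p = 1 / (1 + np.exp(-X @ b))
    b += 0.5 * X.T @ (d - p) / n                     # log-likelihood gradient step
pscore = 1 / (1 + np.exp(-X @ b))

# Step 2: one-nearest-neighbor matching with replacement on the score
treated, control = np.flatnonzero(d == 1), np.flatnonzero(d == 0)
gap = np.abs(pscore[treated][:, None] - pscore[control][None, :])
matches = control[gap.argmin(axis=1)]

# Step 3: the treatment effect estimate is the mean outcome gap across pairs
att = (y[treated] - y[matches]).mean()
```

Note that "with replacement" in Step 2 is itself one of the design choices the paper discusses; matching without replacement would instead consume each control at most once.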
Citations
Journal ArticleDOI
TL;DR: Examining random combinations of PSM design choices that achieve covariate balance, together with four commonly used audit quality measures, the authors find that most design choices support a Big N effect and conclude that it is premature to suggest that propensity score matching (PSM) eliminates it.
Abstract: A large auditing literature concludes that Big N auditors provide higher audit quality than non-Big N auditors. Recently, however, a high-profile study suggests that propensity score matching (PSM) on client characteristics eliminates the Big N effect [Lawrence A, Minutti-Meza M, Zhang P (2011) Can Big 4 versus non-Big 4 differences in audit-quality proxies be attributed to client characteristics? Accounting Rev. 86(1):259–286]. We conjecture that this finding may be affected by PSM’s sensitivity to its design choices and/or by the validity of the audit quality measures used in the analysis. To investigate, we examine random combinations of PSM design choices that achieve covariate balance, and four commonly used audit quality measures. We find that the majority of these design choices support a Big N effect for most of the audit quality measures. Overall, our findings show that it is premature to suggest that PSM eliminates the Big N effect. This paper was accepted by Suraj Srinivasan, accounting.
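The design-choice sensitivity examined here can be illustrated by varying a single choice — the caliper — and tracking the resulting sample size and covariate balance. A toy sketch on synthetic data (the standardized mean difference is one common balance diagnostic; all numbers and the assumption of a known propensity score are for illustration only):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2000
x = rng.normal(size=n)                         # a single covariate
d = rng.binomial(1, 1 / (1 + np.exp(-x)))      # treatment depends on x
pscore = 1 / (1 + np.exp(-x))                  # assume the true propensity is known

treated, control = np.flatnonzero(d == 1), np.flatnonzero(d == 0)
dist = np.abs(pscore[treated][:, None] - pscore[control][None, :])
nn = dist.argmin(axis=1)                       # one-nearest-neighbor match
pair_gap = dist[np.arange(len(treated)), nn]

def std_mean_diff(a, b):
    """Standardized mean difference: a common covariate-balance diagnostic."""
    return abs(a.mean() - b.mean()) / np.sqrt((a.var() + b.var()) / 2)

results = {}
for caliper in (0.3, 0.1, 0.03, 0.01):         # one of many PSM design choices
    ok = pair_gap < caliper                    # drop pairs with poor matches
    smd = std_mean_diff(x[treated[ok]], x[control[nn[ok]]])
    results[caliper] = (int(ok.sum()), smd)    # retained pairs vs. balance
```

Tighter calipers retain fewer pairs but yield better balance — exactly the sample-composition sensitivity the paper investigates.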

253 citations

Journal ArticleDOI
TL;DR: The authors document the extent to which each of four popular databases that identify restatements, securities class action lawsuits, and Accounting and Auditing Enforcement Releases is subject to coverage, timing, and classification concerns, and offer suggestions for researchers using these databases.
Abstract: An extensive accounting and finance literature examines the causes and effects of financial misreporting or misconduct based on samples drawn from four popular databases that identify restatements, securities class action lawsuits, and Securities and Exchange Commission (SEC) Accounting and Auditing Enforcement Releases (AAERs). We show, however, that the results from empirical tests can depend on which database is accessed. To examine the causes of such discrepancies, we compare the information in each database to a detailed sample of 1,243 case histories in which the SEC brought enforcement action for financial misrepresentation. These comparisons allow us to identify, measure, and estimate the economic importance of four characteristics of each database that affect inferences from empirical tests. First, these databases contain information on only the event that is used to proxy for misconduct (e.g., restatements), so they omit other relevant announcements that affect a researcher’s interpretation and use of the events. Second, the initial public revelation of financial misconduct occurs, on average, months before the initial coverage in these databases, leading to discrepancies in event study measures and pre/post comparison tests. Third, most of the events captured by these databases are unrelated to financial fraud, and efforts to cull out non-fraud events yield heterogeneous results. Fourth, the databases omit large numbers of events they were designed to capture. We show the extent to which each database is subject to these concerns and offer suggestions for researchers seeking to use these databases.

177 citations

Journal ArticleDOI
TL;DR: In this article, the authors investigate the effects of influential observations in capital market accounting research (CMAR) studies and show that robust regression generally outperforms winsorization and truncation, especially in the presence of unusual or infrequent economic events that are correlated with the dependent and independent variables of interest.
Abstract: Capital market accounting research (CMAR) studies routinely encounter observations taking on extreme values that are likely to affect statistical inferences. Ex-ante univariate approaches such as truncation and winsorization are the most common methods used in CMAR to mitigate the effect of extreme data values. While expedient, each relies on researcher-selected cut-offs which possibly alter legitimate observations in the process. More importantly, there is no empirical evidence in CMAR that either approach is effective at identifying and dealing with the effect of influential observations. We document the efficacy and trade-offs associated with using winsorization, truncation, and robust regression to address the effects of influential observations in CMAR. We first replicate three published CMAR studies to show how the approaches can yield different estimates and statistical inferences. We then use simulations to compare the approaches in controlled settings where we hold the data-generating process constant. The results indicate that robust regression generally outperforms winsorization and truncation, especially in the presence of unusual (or infrequent) economic events that are correlated with the dependent and independent variables of interest. The findings lead us to recommend that future CMAR studies consider using robust regression, or at least report sensitivity/robustness tests using robust regression, especially because robust regression focuses on overall model fit to deal with influential observations, in addition to being relatively straightforward to implement in typical CMAR settings.
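The trade-off the paper documents can be seen in a small simulation: with influential observations placed at extreme covariate values, both OLS and winsorized OLS miss the true slope, while a Huber M-estimator (one standard robust-regression method, fit here by iteratively reweighted least squares) recovers it. Everything below is synthetic and illustrative, not the paper's design:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
x = rng.normal(size=n)
y = 1.0 + 2.0 * x + rng.normal(size=n)        # true slope = 2
y[np.argsort(x)[-20:]] -= 30                  # influential outliers at extreme x

X = np.column_stack([np.ones(n), x])

# OLS on the raw data is dragged away from the true slope
ols = np.linalg.lstsq(X, y, rcond=None)[0]

# Winsorizing y at the 1st/99th percentiles barely helps here: the clipped
# values are still extreme, because the cut-off is researcher-selected
yw = np.clip(y, *np.percentile(y, [1, 99]))
wins = np.linalg.lstsq(X, yw, rcond=None)[0]

# Huber M-estimation via iteratively reweighted least squares
# (k = 1.345 is the usual tuning constant; scale via the MAD)
b = ols.copy()
for _ in range(100):
    r = y - X @ b
    s = np.median(np.abs(r - np.median(r))) / 0.6745          # robust scale
    sw = np.sqrt(np.minimum(1.0, 1.345 * s / np.maximum(np.abs(r), 1e-12)))
    b = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
```

The robust fit downweights large residuals based on overall model fit rather than univariate cut-offs, which is the mechanism the paper credits for its better performance.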

176 citations

Journal ArticleDOI
TL;DR: This article found that firm political connections positively predict comment letter (CL) reviews and substantive characteristics of such reviews, including the number of issues evaluated and the seniority of SEC staff involved.

102 citations


Cites background from "Propensity Score Matching in Accoun..."

  • “…to ensure that we find a meaningful match for each of the PC firms (Shipman et al. 2017). For the 4,301 PC firm-years defined as having non-zero…” (footnote 7: “Shipman et al. (2017) argue that matching without replacement may result in lower quality matches and smaller sample size than matching with…”)


  • “…(or “caliper”) of 0.0005, and allow for replacement in the selection of matches to ensure that we find a meaningful match for each of the PC firms (Shipman et al. 2017). For the 4,301 PC firm-years defined as having non-zero…” (footnote 7: “Shipman et al. (2017) argue that matching without replacement may…”)


Journal ArticleDOI
TL;DR: This article proposes entropy balancing, a recently developed matching technique, as a means to improve the measurement of normal accruals: it identifies weights for each control-sample observation such that the distributions of the underlying fundamental determinants of accruals are identical across the treatment and control samples.
Abstract: Extant research in accounting raises issues with the specification and power of discretionary accrual measures. We propose entropy balancing, a recently developed matching technique, as a means to improve the measurement of a normal accrual. Entropy balancing identifies weights for each control sample observation such that the distributions of underlying fundamental determinants of accruals are designed to be identical across treatment and control samples. We show that entropy-balanced accruals are well-specified within samples of extreme financial performance, are significantly associated with ex post indicators of earnings management, and, based on simulations, exhibit sufficient power for detecting earnings management. Finally, in contrast to existing discretionary accruals measures, we find that entropy-balanced discretionary accruals are insignificant (significantly positive) in the year of an initial public offering (seasoned equity offering).
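Entropy balancing can be sketched directly: choose control weights of exponential-tilting form so that the weighted control moments equal the treated moments, solving the smooth dual problem by plain gradient descent. A minimal illustration on synthetic covariates (first moments only; the actual method can also match variances and higher moments):

```python
import numpy as np

rng = np.random.default_rng(2)
xt = rng.normal(1.0, 1.0, size=(300, 2))   # treated covariates (means shifted)
xc = rng.normal(0.0, 1.0, size=(3000, 2))  # control covariates

m = xt.mean(axis=0)                        # target: treated first moments

# Dual problem: weights w_j proportional to exp(lam @ x_j), with lam chosen
# so the weighted control mean hits the target (minimum-divergence reweighting)
lam = np.zeros(2)
for _ in range(2000):
    z = xc @ lam
    w = np.exp(z - z.max())                # subtract max for numerical stability
    w /= w.sum()
    grad = w @ xc - m                      # weighted control mean minus target
    lam -= 0.5 * grad                      # gradient step on the dual

balanced_mean = w @ xc                     # matches the treated mean at convergence
```

Unlike nearest-neighbor matching, no control observation is discarded; each is simply reweighted, which is why the technique can serve as a drop-in replacement for the matched control portfolio in accrual models.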

94 citations


Cites background or methods from "Propensity Score Matching in Accoun..."

  • ...…linear regression, thereby relaxing the requirement to specify a functional form between independent and dependent variables (Rosenbaum and Rubin, 1983; Shipman et al., 2017).2 In particular, these approaches allow us to avoid assuming linearity for the full set of determinants we examine....


  • “…the Logit propensity model (and second-stage regression). We conduct PSM using both one-nearest neighbor (with replacement) and five-nearest neighbor matching with a caliper of 0.03 following the recommendation by Shipman et al. (2017).”


References
Journal ArticleDOI
TL;DR: The authors provide a structure for thinking about matching methods and guidance on their use, coalescing the existing research (both old and new) and summarizing where the literature on matching methods stands and where it should be headed.
Abstract: When estimating causal effects using observational data, it is desirable to replicate a randomized experiment as closely as possible by obtaining treated and control groups with similar covariate distributions. This goal can often be achieved by choosing well-matched samples of the original treated and control groups, thereby reducing bias due to the covariates. Since the 1970's, work on matching methods has examined how to best choose treated and control subjects for comparison. Matching methods are gaining popularity in fields such as economics, epidemiology, medicine, and political science. However, until now the literature and related advice has been scattered across disciplines. Researchers who are interested in using matching methods-or developing methods related to matching-do not have a single place to turn to learn about past and current research. This paper provides a structure for thinking about matching methods and guidance on their use, coalescing the existing research (both old and new) and providing a summary of where the literature on matching methods is now and where it should be headed.

3,952 citations

Journal ArticleDOI
TL;DR: In this article, a rigorous distribution theory for kernel-based matching is presented, and the method of matching is extended to more general conditions than the ones assumed in the statistical literature on the topic.
Abstract: This paper develops the method of matching as an econometric evaluation estimator. A rigorous distribution theory for kernel-based matching is presented. The method of matching is extended to more general conditions than the ones assumed in the statistical literature on the topic. We focus on the method of propensity score matching and show that it is not necessarily better, in the sense of reducing the variance of the resulting estimator, to use the propensity score method even if propensity score is known. We extend the statistical literature on the propensity score by considering the case when it is estimated both parametrically and nonparametrically. We examine the benefits of separability and exclusion restrictions in improving the efficiency of the estimator. Our methods also apply to the econometric selection bias estimator. Matching is a widely-used method of evaluation. It is based on the intuitively attractive idea of contrasting the outcomes of programme participants (denoted Y1) with the outcomes of "comparable" nonparticipants (denoted Y0). Differences in the outcomes between the two groups are attributed to the programme. Let I0 and I1 denote the sets of indices for nonparticipants and participants, respectively. The following framework describes conventional matching methods as well as the smoothed versions of these methods analysed in this paper. To estimate a treatment effect for each treated person i ∈ I1, outcome Y1i is compared to an average of the outcomes Y0j for matched persons j ∈ I0 in the untreated sample. Matches are constructed on the basis of observed characteristics X in R^d. Typically, when the observed characteristics of an untreated person are closer to those of the treated person i ∈ I1, using a specific distance measure, the untreated person gets a higher weight in constructing the match. The estimated gain for each person i in the treated sample is
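A kernel-based matching estimator of the kind analysed here replaces one-to-one matches with a kernel-weighted average of control outcomes: closer controls get higher weight. A toy sketch assuming, for simplicity, that the propensity score is known (the Gaussian kernel, bandwidth, and true effect of 1.5 are all illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3000
p = rng.uniform(0.1, 0.9, size=n)            # propensity scores, assumed known
d = rng.binomial(1, p)                       # treatment assignment
y = 1.5 * d + 3.0 * p + rng.normal(size=n)   # outcome; true effect = 1.5

treated, control = np.flatnonzero(d == 1), np.flatnonzero(d == 0)
h = 0.05                                     # bandwidth (a tuning choice)

# Each treated unit's counterfactual is a smooth weighted average of control
# outcomes, with Gaussian-kernel weights in propensity-score distance
k = np.exp(-0.5 * ((p[treated][:, None] - p[control][None, :]) / h) ** 2)
y0_hat = (k * y[control]).sum(axis=1) / k.sum(axis=1)
att = (y[treated] - y0_hat).mean()
```

Because every control contributes (with declining weight), kernel matching trades a little smoothing bias for lower variance relative to one-nearest-neighbor matching.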

3,861 citations

Journal ArticleDOI
TL;DR: A unified approach is proposed that makes it possible for researchers to preprocess data with matching and then to apply the best parametric techniques they would have used anyway; this procedure makes parametric models produce more accurate and considerably less model-dependent causal inferences.
Abstract: Although published works rarely include causal estimates from more than a few model specifications, authors usually choose the presented estimates from numerous trial runs readers never see. Given the often large variation in estimates across choices of control variables, functional forms, and other modeling assumptions, how can researchers ensure that the few estimates presented are accurate or representative? How do readers know that publications are not merely demonstrations that it is possible to find a specification that fits the author's favorite hypothesis? And how do we evaluate or even define statistical properties like unbiasedness or mean squared error when no unique model or estimator even exists? Matching methods, which offer the promise of causal inference with fewer assumptions, constitute one possible way forward, but crucial results in this fast-growing methodological literature are often grossly misinterpreted. We explain how to avoid these misinterpretations and propose a unified approach that makes it possible for researchers to preprocess data with matching (such as with the easy-to-use software we offer) and then to apply the best parametric techniques they would have used anyway. This procedure makes parametric models produce more accurate and considerably less model-dependent causal inferences.
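The preprocessing idea can be demonstrated by measuring model dependence — the gap between estimates from two plausible specifications — before and after matching. In the synthetic sketch below (caliper, functional forms, and the true effect of 2 are all invented), a linear and a quadratic specification disagree on the raw sample but nearly agree once the sample is pruned to matched pairs:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4000
x = rng.normal(size=n)
d = rng.binomial(1, 1 / (1 + np.exp(-(2 * x - 1))))    # treated skew to high x
y = 2.0 * d + x + 1.5 * x**2 + rng.normal(size=n)      # outcome nonlinear in x

X_lin = np.column_stack([np.ones(n), d, x])            # specification 1
X_quad = np.column_stack([np.ones(n), d, x, x**2])     # specification 2

def d_coef(rows, design):
    """Treatment coefficient from OLS on the given rows and design matrix."""
    return np.linalg.lstsq(design[rows], y[rows], rcond=None)[0][1]

# Model dependence on the raw sample: the specifications disagree because
# treated and control units occupy different regions of x
full = np.arange(n)
dep_full = abs(d_coef(full, X_lin) - d_coef(full, X_quad))

# Preprocess: one-nearest-neighbor caliper matching on x, pruning poor matches
treated, control = np.flatnonzero(d == 1), np.flatnonzero(d == 0)
dist = np.abs(x[treated][:, None] - x[control][None, :])
nn = dist.argmin(axis=1)
ok = dist[np.arange(len(treated)), nn] < 0.05          # caliper
keep = np.concatenate([treated[ok], control[nn[ok]]])

# After matching, the same two parametric models nearly coincide
dep_matched = abs(d_coef(keep, X_lin) - d_coef(keep, X_quad))
```

This mirrors the paper's prescription: match first, then run whatever parametric model you would have used anyway.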

3,601 citations

Journal ArticleDOI
TL;DR: In this article, the authors examined the relation between audit quality and earnings management and found that clients of non-Big Six auditors report discretionary accruals that increase income relatively more than the discretionary accruals reported by clients of Big Six auditors.
Abstract: This study examines the relation between audit quality and earnings management. Consistent with prior research, we treat audit quality as a dichotomous variable and assume that Big Six auditors are of higher quality than non-Big Six auditors. Earnings management is captured by discretionary accruals that are estimated using a cross-sectional version of the Jones 1991 model. Prior literature suggests that auditors are more likely to object to management's accounting choices that increase earnings (as opposed to decrease earnings) and that auditors are more likely to be sued when they are associated with financial statements that overstate earnings (as compared to understate earnings). Therefore, we hypothesize that clients of non-Big Six auditors report discretionary accruals that increase income relatively more than the discretionary accruals reported by clients of Big Six auditors. This hypothesis is supported by evidence from a sample of 10,379 Big Six and 2,179 non-Big Six firm years. Specifically, clients of non-Big Six auditors report discretionary accruals that are, on average, 1.5-2.1 percent of total assets higher than the discretionary accruals reported by clients of Big Six auditors. Also, consistent with earnings management, we find that the mean and median of the absolute value of discretionary accruals are greater for firms with non-Big Six auditors. This result also indicates that lower audit quality is associated with more “accounting flexibility”.
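The discretionary-accrual machinery used in this study can be sketched on synthetic data: estimate the cross-sectional Jones (1991) model within an industry-year and take the residual as the discretionary-accrual proxy. Variable names, coefficients, and scales below are illustrative, not estimates from any real sample:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 2000
assets = rng.lognormal(4, 1, size=n)               # lagged total assets (deflator)
d_rev = assets * rng.normal(0.10, 0.10, size=n)    # change in revenues
ppe = assets * rng.normal(0.50, 0.15, size=n)      # gross property, plant & equipment
disc_true = rng.normal(0, 0.02, size=n)            # "discretionary" component
ta = 0.05 * d_rev - 0.07 * ppe + disc_true * assets  # total accruals

# Cross-sectional Jones (1991) model: TA/A = a(1/A) + b(dREV/A) + c(PPE/A) + e,
# estimated by OLS; the residual e proxies for the discretionary accrual
X = np.column_stack([1 / assets, d_rev / assets, ppe / assets])
coef, *_ = np.linalg.lstsq(X, ta / assets, rcond=None)
disc_hat = ta / assets - X @ coef
```

Deflating every term by lagged assets, as above, is the standard way the model controls for scale across firms.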

3,100 citations

Posted Content
TL;DR: The applied econometrician is like a farmer who notices that the yield is somewhat higher under trees where birds roost, and he uses this as evidence that bird droppings increase yields.
Abstract: Econometricians would like to project the image of agricultural experimenters who divide a farm into a set of smaller plots of land and who select randomly the level of fertilizer to be used on each plot. If some plots are assigned a certain amount of fertilizer while others are assigned none, then the difference between the mean yield of the fertilized plots and the mean yield of the unfertilized plots is a measure of the effect of fertilizer on agricultural yields. The econometrician's humble job is only to determine if that difference is large enough to suggest a real effect of fertilizer, or is so small that it is more likely due to random variation. This image of the applied econometrician's art is grossly misleading. I would like to suggest a more accurate one. The applied econometrician is like a farmer who notices that the yield is somewhat higher under trees where birds roost, and he uses this as evidence that bird droppings increase yields. However, when he presents this finding at the annual meeting of the American Ecological Association, another farmer in the audience objects that he used the same data but came up with the conclusion that moderate amounts of shade increase yields. A bright chap in the back of the room then observes that these two hypotheses are indistinguishable, given the available data. He mentions the phrase "identification problem," which, though no one knows quite what he means, is said with such authority that it is totally convincing. The meeting reconvenes in the halls and in the bars, with heated discussion

2,228 citations