
Showing papers in "Political Analysis in 2013"


Journal ArticleDOI
TL;DR: A survey of automated text analysis for political science can be found in this article, where the authors provide guidance on how to validate the output of the models and clarify misconceptions and errors in the literature.
Abstract: Politics and political conflict often occur in the written and spoken word. Scholars have long recognized this, but the massive costs of analyzing even moderately sized collections of texts have hindered their use in political science research. Here lies the promise of automated text analysis: it substantially reduces the costs of analyzing large collections of text. We provide a guide to this exciting new area of research and show how, in many instances, the methods have already obtained part of their promise. But there are pitfalls to using automated methods—they are no substitute for careful thought and close reading and require extensive and problem-specific validation. We survey a wide range of new methods, provide guidance on how to validate the output of the models, and clarify misconceptions and errors in the literature. To conclude, we argue that for automated text methods to become a standard tool for political scientists, methodologists must contribute new methods and new methods of validation. Language is the medium for politics and political conflict. Candidates debate and state policy positions during a campaign. Once elected, representatives write and debate legislation. After laws are passed, bureaucrats solicit comments before they issue regulations. Nations regularly negotiate and then sign peace treaties, with language that signals the motivations and relative power of the countries involved. News reports document the day-to-day affairs of international relations that provide a detailed picture of conflict and cooperation. Individual candidates and political parties articulate their views through party platforms and manifestos. Terrorist groups even reveal their preferences and goals through recruiting materials, magazines, and public statements. These examples, and many others throughout political science, show that to understand what politics is about we need to know what political actors are saying and writing. 
Recognizing that language is central to the study of politics is not new. To the contrary, scholars of politics have long recognized that much of politics is expressed in words. But scholars have struggled when using texts to make inferences about politics. The primary problem is volume: there are simply too many political texts. Rarely are scholars able to manually read all the texts in even moderately sized corpora. And hiring coders to manually read all documents is still very expensive. The result is that
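The cost logic described above can be made concrete with a toy dictionary method, one of the simplest automated approaches the survey covers: scoring each document by counting words from a hand-built list. The mini-corpus and the "conflict" word list below are hypothetical stand-ins, not data from the article:

```python
from collections import Counter

# Hypothetical mini-corpus and a hand-built "conflict" dictionary;
# both are illustrative stand-ins, not data from the article.
documents = [
    "the parties negotiate a peace treaty after the war",
    "candidates debate policy positions during the campaign",
    "the conflict escalates as negotiations collapse",
]
conflict_terms = {"war", "conflict", "escalates", "collapse"}

def conflict_score(text):
    """Fraction of tokens that appear in the conflict dictionary."""
    tokens = text.lower().split()
    counts = Counter(tokens)
    hits = sum(counts[t] for t in conflict_terms)
    return hits / len(tokens)

scores = [conflict_score(d) for d in documents]
```

Once the dictionary is fixed, scoring a million documents costs the same per document as scoring three, which is the cost reduction the survey emphasizes; the validation burden the authors stress falls on whether the dictionary actually measures the concept.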

2,044 citations


Journal ArticleDOI
TL;DR: A set of alternative assumptions are considered that are sufficient to identify the average causal mediation effects when multiple, causally related mediators exist and develop a new sensitivity analysis for examining the robustness of empirical findings to the potential violation of a key identification assumption.
Abstract: Social scientists are often interested in testing multiple causal mechanisms through which a treatment affects outcomes. A predominant approach has been to use linear structural equation models and examine the statistical significance of the corresponding path coefficients. However, this approach implicitly assumes that the multiple mechanisms are causally independent of one another. In this article, we consider a set of alternative assumptions that are sufficient to identify the average causal mediation effects when multiple, causally related mediators exist. We develop a new sensitivity analysis for examining the robustness of empirical findings to the potential violation of a key identification assumption. We apply the proposed methods to three political psychology experiments, which examine alternative causal pathways between media framing and public opinion. Our analysis reveals that the validity of original conclusions is highly reliant on the assumed independence of alternative causal mechanisms, highlighting the importance of proposed sensitivity analysis. All of the proposed methods can be implemented via an open source R package, mediation.
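The product-of-coefficients logic that the article critiques can be sketched on simulated data: regress the mediator on treatment, regress the outcome on mediator and treatment, and multiply the two path coefficients. The coefficients and sample size below are made up, and note that this sketch assumes exactly what the article questions, a single mediator causally independent of any others:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
a_true, b_true, c_true = 0.5, 0.8, 0.3   # assumed path coefficients

T = rng.integers(0, 2, n).astype(float)          # randomized treatment
M = a_true * T + rng.normal(0, 1, n)             # single mediator
Y = b_true * M + c_true * T + rng.normal(0, 1, n)

def ols(X, y):
    """Least-squares coefficients, with an intercept column prepended."""
    X = np.column_stack([np.ones(len(y)), *X])
    return np.linalg.lstsq(X, y, rcond=None)[0]

a_hat = ols([T], M)[1]            # T -> M path
b_hat = ols([M, T], Y)[1]         # M -> Y path, controlling for T
indirect = a_hat * b_hat          # product-of-coefficients estimate
```

With causally related mediators, `indirect` no longer identifies the average causal mediation effect, which is what motivates the alternative assumptions and the sensitivity analysis in the article.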

281 citations


Journal ArticleDOI
TL;DR: It is demonstrated here that even with this level of advanced specification, the scope for fishing is considerable when there is latitude over selection of covariates, subgroups, and other elements of an analysis plan.
Abstract: Social scientists generally enjoy substantial latitude in selecting measures and models for hypothesis testing. Coupled with publication and related biases, this latitude raises the concern that researchers may intentionally or unintentionally select models that yield positive findings, leading to an unreliable body of published research. To combat this “fishing” problem in medical studies, leading journals now require preregistration of designs that emphasize the prior identification of dependent and independent variables. However, we demonstrate here that even with this level of advanced specification, the scope for fishing is considerable when there is latitude over selection of covariates, subgroups, and other elements of an analysis plan. These concerns could be addressed through the use of a form of comprehensive registration. We experiment with such an approach in the context of an ongoing field experiment for which we drafted a complete “mock report” of findings using fake data on treatment assignment. We describe the advantages and disadvantages of this form of registration and propose that a comprehensive but nonbinding approach be adopted as a first step to combat fishing by social scientists. Likely effects of comprehensive but nonbinding registration are discussed, the principal advantage being communication rather than commitment, in particular that it generates a clear distinction between exploratory analyses and genuine tests.
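The scale of the fishing problem is easy to see by simulation. The sketch below, with made-up sample sizes, runs 20 subgroup tests on data with no true effect anywhere; the family-wise chance of at least one "significant" finding approaches 1 − 0.95^20 ≈ 0.64:

```python
import numpy as np

rng = np.random.default_rng(1)
reps, subgroups, n = 500, 20, 50

false_positive_any = 0
for _ in range(reps):
    hit = False
    for _ in range(subgroups):
        treated = rng.normal(0, 1, n)     # no true effect anywhere
        control = rng.normal(0, 1, n)
        # z-statistic for the difference in means (known unit variance)
        z = (treated.mean() - control.mean()) / np.sqrt(2 / n)
        if abs(z) > 1.96:                 # "significant" at the 5% level
            hit = True
    if hit:
        false_positive_any += 1

familywise_rate = false_positive_any / reps   # roughly 1 - 0.95**20
```

A preregistered plan that leaves the choice of subgroups open does nothing to cap this rate, which is the article's argument for comprehensive registration of the full analysis plan.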

174 citations


Journal ArticleDOI
TL;DR: This article shows how counterfactual causal models may be written and tested when theories suggest spillover or other network-based interference among experimental units, and offers researchers the ability to model theories about how treatment given to some units may come to influence outcomes for other units.
Abstract: If an experimental treatment is experienced by both treated and control group units, tests of hypotheses about causal effects may be difficult to conceptualize, let alone execute. In this article, we show how counterfactual causal models may be written and tested when theories suggest spillover or other network-based interference among experimental units. We show that the “no interference” assumption need not constrain scholars who have interesting questions about interference. We offer researchers the ability to model theories about how treatment given to some units may come to influence outcomes for other units. We further show how to test hypotheses about these causal effects, and we provide tools to enable researchers to assess the operating characteristics of their tests given their own models, designs, test statistics, and data. The conceptual and methodological framework we develop here is particularly applicable to social networks, but may be usefully deployed whenever a researcher wonders about interference between units. Interference between units need not be an untestable assumption; instead, interference is an opportunity to ask meaningful questions about theoretically interesting phenomena.

144 citations


Journal ArticleDOI
Simon Hug
TL;DR: It is shown that using Boolean algebra in an exploratory fashion without considering possible measurement errors may lead to dramatically misleading inferences, and remedies are suggested that help researchers to circumvent some of these pitfalls.
Abstract: An increasing number of analyses in various subfields of political science employ Boolean algebra as proposed in Ragin's (1987) qualitative comparative analysis (QCA). This type of analysis is perfectly justifiable if the goal is to test deterministic hypotheses under the assumption of error-free measures of the employed variables. My contention is, however, that only in a very few research areas are our theories sufficiently advanced to yield deterministic hypotheses. Also, given the nature of our objects of study, error-free measures are largely an illusion. Hence, it is unsurprising that many studies employ QCA inductively and gloss over possible measurement errors. In this paper I address these issues and demonstrate the consequences of these problems with simple empirical examples. In an analysis similar to a Monte Carlo simulation I show that using Boolean algebra in an exploratory fashion without considering possible measurement errors may lead to dramatically misleading inferences. I then suggest remedies that help researchers to circumvent some of these pitfalls.
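A minimal simulation in the spirit of the article's Monte Carlo-style analysis shows the mechanism: a deterministic rule Y = A AND B, combined with 10% measurement error in A, produces contradictory truth-table rows that would not exist with error-free measures. The data-generating numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000

A = rng.integers(0, 2, n)
B = rng.integers(0, 2, n)
Y = A & B                         # true deterministic rule: Y = A AND B

flip = rng.random(n) < 0.10       # 10% measurement error in A
A_obs = np.where(flip, 1 - A, A)

# Build the observed truth table: a (A_obs, B) configuration is
# "contradictory" if it contains both Y = 0 and Y = 1 cases.
contradictory = 0
for a in (0, 1):
    for b in (0, 1):
        outcomes = set(Y[(A_obs == a) & (B == b)])
        if outcomes == {0, 1}:
            contradictory += 1
```

With error-free measures every configuration would be consistent; here the two rows with B = 1 become contradictory, and a researcher minimizing the observed truth table inductively would be led away from the true rule.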

101 citations


Journal ArticleDOI
TL;DR: The authors examined a larger number of cases and a greater range of opinions than in previous studies and found substantial variation in multilevel regression and poststratification performance and suggested that the conditions necessary for MRP to perform well will not always be met.
Abstract: Multilevel regression and poststratification (MRP) is a method to estimate public opinion across geographic units from individual-level survey data. If it works with samples the size of typical national surveys, then MRP offers the possibility of analyzing many political phenomena previously believed to be outside the bounds of systematic empirical inquiry. Initial investigations of its performance with conventional national samples produce generally optimistic assessments. This article examines a larger number of cases and a greater range of opinions than in previous studies and finds substantial variation in MRP performance. Through empirical and Monte Carlo analyses, we develop an explanation for this variation. The findings suggest that the conditions necessary for MRP to perform well will not always be met. Thus, we draw a less optimistic conclusion than previous studies do regarding the use of MRP with samples of the size found in typical national surveys. © The Author 2013. Published by Oxford University Press on behalf of the Society for Political Methodology. All rights reserved.
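The poststratification half of MRP is simply a census-weighted average of cell-level estimates produced by the multilevel regression. The sketch below assumes those cell estimates have already been fit elsewhere; the categories, probabilities, and counts are invented for illustration:

```python
# Hypothetical cell-level opinion estimates, e.g. from a multilevel
# model fit elsewhere; the numbers and categories are illustrative.
cell_support = {          # P(support) by (age group, education)
    ("18-34", "no degree"): 0.62,
    ("18-34", "degree"):    0.55,
    ("35+",   "no degree"): 0.48,
    ("35+",   "degree"):    0.41,
}
census_counts = {         # population counts for the same cells
    ("18-34", "no degree"): 1200,
    ("18-34", "degree"):     800,
    ("35+",   "no degree"): 2500,
    ("35+",   "degree"):    1500,
}

total = sum(census_counts.values())
state_estimate = sum(
    cell_support[c] * census_counts[c] / total for c in cell_support
)
```

The article's point is that this final average is only as good as the cell estimates feeding it: with small samples, the multilevel model's partial pooling must do a great deal of work, and the conditions for that to succeed are not always met.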

91 citations


Journal ArticleDOI
TL;DR: The authors examine one popular parametric measurement model of latent traits for text and then compare its results to systematic human judgments of the texts as a benchmark for validity, in the context of unsupervised scaling methods for latent traits.
Abstract: Automated and statistical methods for estimating latent political traits and classes from textual data hold great promise, because virtually every political act involves the production of text. Statistical models of natural language features, however, are heavily laden with unrealistic assumptions about the process that generates these data, including the stochastic process of text generation, the functional link between political variables and observed text, and the nature of the variables (and dimensions) on which observed text should be conditioned. While acknowledging statistical models of latent traits to be “wrong,” political scientists nonetheless treat their results as sufficiently valid to be useful. In this article, we address the issue of substantive validity in the face of potential model failure, in the context of unsupervised scaling methods of latent traits. We critically examine one popular parametric measurement model of latent traits for text and then compare its results to systematic human judgments of the texts as a benchmark for validity.

81 citations


Journal ArticleDOI
TL;DR: In this article, the authors present an empirical case study on the effect of Election Day Registration (EDR) on turnout and show that EDR likely had negligible effects in the states of Minnesota and Wisconsin.
Abstract: Political scientists are often interested in estimating causal effects. Identification of causal estimates with observational data invariably requires strong untestable assumptions. Here, we outline a number of the assumptions used in the extant empirical literature. We argue that these assumptions require careful evaluation within the context of specific applications. To that end, we present an empirical case study on the effect of Election Day Registration (EDR) on turnout. We show how different identification assumptions lead to different answers, and that many of the standard assumptions used are implausible. Specifically, we show that EDR likely had negligible effects in the states of Minnesota and Wisconsin. We conclude with an argument for stronger research designs.

81 citations


Journal ArticleDOI
TL;DR: In this article, the authors present a new method for estimating positions of political parties across country- and time-specific contexts by introducing a latent variable model for manifesto data. The method is illustrated by estimating the left-right positions of 388 European parties competing in 238 elections across 25 countries and over 60 years.
Abstract: This article presents a new method for estimating positions of political parties across country- and time-specific contexts by introducing a latent variable model for manifesto data. We estimate latent positions and exploit bridge observations to make the scales comparable. We also incorporate expert survey data as prior information in the estimation process to avoid ex post facto interpretation of the latent space. To illustrate the empirical contribution of our method we estimate the left-right positions of 388 European parties competing in 238 elections across 25 countries and over 60 years. Compared to the puzzling volatility of existing estimates, we find that parties change their left-right positions more modestly over time. We also show that estimates without country- and time-specific bias parameters risk serious, systematic bias in about two thirds of our data. This suggests that researchers should carefully consider the comparability of party positions across countries and/or time.

71 citations


Journal ArticleDOI
TL;DR: The authors develop a simple reweighting method for estimating the ATE, shedding light on the identification challenge posed in moving from the local average treatment effect (LATE) to the ATE.
Abstract: Political scientists frequently use instrumental variables (IV) estimation to estimate the causal effect of an endogenous treatment variable. However, when the treatment effect is heterogeneous, this estimation strategy only recovers the local average treatment effect (LATE). The LATE is an average treatment effect (ATE) for a subset of the population: units that receive treatment if and only if they are induced by an exogenous IV. However, researchers may instead be interested in the ATE for the entire population of interest. In this article, we develop a simple reweighting method for estimating the ATE, shedding light on the identification challenge posed in moving from the LATE to the ATE. We apply our method to two published experiments in political science in which we demonstrate that the LATE has the potential to substantively differ from the ATE.
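The gap between the LATE and the ATE can be seen in a small simulation using the standard Wald IV estimator (not the authors' reweighting method). The complier share and effect sizes below are made up: compliers have an effect of 2.0 and never-takers 0.5, so the population ATE is 1.1 while IV recovers the compliers' 2.0:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20000

# Illustrative population: 40% compliers (take treatment iff encouraged),
# 60% never-takers; the effect sizes are made-up numbers.
complier = rng.random(n) < 0.4
effect = np.where(complier, 2.0, 0.5)     # heterogeneous treatment effects
Z = rng.integers(0, 2, n)                 # randomized binary instrument
D = (complier & (Z == 1)).astype(float)   # actual treatment take-up
Y = 1.0 + effect * D + rng.normal(0, 1, n)

# Wald estimator: intent-to-treat effect / first-stage effect
wald = (Y[Z == 1].mean() - Y[Z == 0].mean()) / (
    D[Z == 1].mean() - D[Z == 0].mean()
)
ate = effect.mean()    # true ATE: 0.4 * 2.0 + 0.6 * 0.5 = 1.1
```

The Wald estimate converges to 2.0, the compliers' average effect, not to the population ATE of 1.1; closing that gap requires reweighting toward the full population, which is the identification challenge the article addresses.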

61 citations


Journal ArticleDOI
TL;DR: Quantitative questions are shown to be feasible and useful for the study of economic voting, and they shed light on where partisan bias enters economic assessments: in perceiving, judging, or reporting economic quantities.
Abstract: Survey questions about quantities offer a number of advantages over more common qualitative questions. However, concerns about survey respondents' abilities to accurately report numbers have limited the use of quantitative questions. This article shows quantitative questions are feasible and useful for the study of economic voting. First, survey respondents are capable of accurately assessing familiar economic quantities, such as the price of gas. Second, careful question design, in particular providing respondents with benchmark quantities, can reduce measurement error due to respondents not understanding the scale on which more complex quantities, such as the unemployment rate, are measured. Third, combining quantitative and qualitative questions sheds light on where partisan bias enters economic assessments: in perceiving, judging, or reporting economic quantities.

Journal ArticleDOI
TL;DR: The case for study registration is made, and the method of registration is illustrated through a study of the impact of the immigration issue in the 2010 election for the U.S. House of Representatives, a case in which the result could easily have been manipulated.
Abstract: This article makes the case for the systematic registration of political studies. By proposing a research design before an outcome variable is observed, a researcher commits him- or herself to a theoretically motivated method for studying the object of interest. Further, study registration prompts peers of the discipline to evaluate a study’s quality on its own merits, reducing norms to accept significant results and reject null findings. To advance this idea, the Political Science Registered Studies Dataverse (http://dvn.iq.harvard.edu/dvn/dv/registration) has been created, in which scholars may create a permanent record of a research design before completing a study. This article also illustrates the method of registration through a study of the impact of the immigration issue in the 2010 election for the U.S. House of Representatives. Prior to the election, a design for this study was posted on the Society for Political Methodology website (http://polmeth.wustl.edu/mediaDetail.php?docId=1258). After the votes were counted, the study was completed in accord with the design. The treatment effect in this theoretically specified design was indiscernible, but a specification search could yield a significant result. Hence, this article illustrates the argument for study registration through a case in which the result could easily be manipulated.

Journal ArticleDOI
TL;DR: A Bayesian dynamic panel model is presented, which facilitates the analysis of repeated preferences using individual-level panel data and captures unobserved individual preference heterogeneity both via standard parametric random effects and a robust alternative based on Bayesian nonparametric density estimation.
Abstract: Much politico-economic research on individuals’ preferences is cross-sectional and does not model dynamic aspects of preference or attitude formation. I present a Bayesian dynamic panel model, which facilitates the analysis of repeated preferences using individual-level panel data. My model deals with three problems. First, I explicitly include feedback from previous preferences taking into account that available survey measures of preferences are categorical. Second, I model individuals' initial conditions when entering the panel as resulting from observed and unobserved individual attributes. Third, I capture unobserved individual preference heterogeneity both via standard parametric random effects and a robust alternative based on Bayesian nonparametric density estimation. I use this model to analyze the impact of income and wealth on preferences for government intervention using the British Household Panel Study from 1991 to 2007.

Journal ArticleDOI
TL;DR: This article introduces the basic stages of a CAT algorithm and presents the details for one approach to item selection appropriate for public opinion research, and demonstrates the advantages of CAT via simulation and empirically comparing dynamic and static measures of political knowledge.
Abstract: Survey researchers avoid using large multi-item scales to measure latent traits due to both the financial costs and the risk of driving up nonresponse rates. Typically, investigators select a subset of available scale items rather than asking the full battery. Reduced batteries, however, can sharply reduce measurement precision and introduce bias. In this article, we present computerized adaptive testing (CAT) as a method for minimizing the number of questions each respondent must answer while preserving measurement accuracy and precision. CAT algorithms respond to individuals' previous answers to select subsequent questions that most efficiently reveal respondents' positions on a latent dimension. We introduce the basic stages of a CAT algorithm and present the details for one approach to item selection appropriate for public opinion research. We then demonstrate the advantages of CAT via simulation and empirically comparing dynamic and static measures of political knowledge.
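The item-selection step of a CAT algorithm can be sketched with a two-parameter logistic (2PL) IRT model, one common choice: ask the not-yet-administered item with maximum Fisher information at the respondent's current ability estimate. The item bank below is hypothetical:

```python
import math

# Illustrative 2PL item bank: (discrimination a, difficulty b).
item_bank = [(0.8, -1.5), (1.6, 0.1), (1.2, 1.8), (2.0, -0.2)]

def item_information(theta, a, b):
    """Fisher information of a 2PL item at ability level theta."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def select_next_item(theta, bank, asked):
    """Pick the unasked item that is most informative at theta."""
    candidates = [i for i in range(len(bank)) if i not in asked]
    return max(candidates, key=lambda i: item_information(theta, *bank[i]))

first = select_next_item(0.0, item_bank, asked=set())
```

At theta = 0 the algorithm prefers the high-discrimination item whose difficulty sits near the current estimate; after each response the ability estimate is updated and the selection repeated, which is how CAT preserves precision with far fewer questions than a full battery.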

Journal ArticleDOI
TL;DR: This article shows how to apply Bayesian methods to noisy ratio scale distances for both the classical similarities problem as well as the unfolding problem and proves that fixing the origin and rotation is sufficient to identify a configuration.
Abstract: In this article, we show how to apply Bayesian methods to noisy ratio scale distances for both the classical similarities problem as well as the unfolding problem. Bayesian methods produce essentially the same point estimates as the classical methods, but are superior in that they provide more accurate measures of uncertainty in the data. Identification is nontrivial for this class of problems because a configuration of points that reproduces the distances is identified only up to a choice of origin, angles of rotation, and sign flips on the dimensions. We prove that fixing the origin and rotation is sufficient to identify a configuration in the sense that the corresponding maxima/minima are inflection points with full-rank Hessians. However, an unavoidable result is multiple posterior distributions that are mirror images of one another. This poses a problem for Markov chain Monte Carlo (MCMC) methods. The approach we take is to find the optimal solution using standard optimizers. The configuration of points from the optimizers is then used to isolate a single Bayesian posterior that can then be easily analyzed with standard MCMC methods.
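The identification problem the abstract describes is easy to verify numerically: pairwise distances are invariant to any choice of origin, rotation, and sign flip, so infinitely many point configurations reproduce the same data. A minimal check with an arbitrary configuration and arbitrary transformation:

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 2.0]])

theta = 0.7                                   # arbitrary rotation angle
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
reflect = np.diag([1.0, -1.0])                # sign flip on one dimension
shift = np.array([3.0, -2.0])                 # change of origin

X_trans = (X @ R.T @ reflect) + shift         # distances are unchanged
```

Fixing the origin and rotation removes all but the sign-flip ambiguity, which is why the article reports mirror-image posteriors and anchors MCMC at an optimizer's solution.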

Journal ArticleDOI
TL;DR: This article shows how Dynamic Network Logistic Regression techniques can be used to implement decision theoretic models for network dynamics in a panel data context and identifies the combination of processes that best characterizes the choice behavior of the contending blogs.
Abstract: Methods for analysis of network dynamics have seen great progress in the past decade. This article shows how Dynamic Network Logistic Regression techniques (a special case of the Temporal Exponential Random Graph Models) can be used to implement decision theoretic models for network dynamics in a panel data context. We also provide practical heuristics for model building and assessment. We illustrate the power of these techniques by applying them to a dynamic blog network sampled during the 2004 US presidential election cycle. This is a particularly interesting case because it marks the debut of Internet-based media such as blogs and social networking web sites as institutionally recognized features of the American political landscape. Using a longitudinal sample of all Democratic National Convention/Republican National Convention-designated blog citation networks, we are able to test the influence of various strategic, institutional, and balance-theoretic mechanisms as well as exogenous factors such as seasonality and political events on the propensity of blogs to cite one another over time. Using a combination of deviance-based model selection criteria and simulation-based model adequacy tests, we identify the combination of processes that best characterizes the choice behavior of the contending blogs.

Journal ArticleDOI
TL;DR: This article responds to Shultziner's critique that argues that identical twins are more alike not because of genetic similarity, but because they select into more similar environments and respond to stimuli in comparable ways, and that these effects bias twin model estimates to such an extent that they are invalid.
Abstract: In this article, we respond to Shultziner’s critique that argues that identical twins are more alike not because of genetic similarity, but because they select into more similar environments and respond to stimuli in comparable ways, and that these effects bias twin model estimates to such an extent that they are invalid. The essay further argues that the theory and methods that undergird twin models, as well as the empirical studies which rely upon them, are unaware of these potential biases. We correct this and other misunderstandings in the essay and find that gene-environment (GE) interplay is a well-articulated concept in behavior genetics and political science, operationalized as gene-environment correlation and gene-environment interaction. Both are incorporated into interpretations of the classical twin design (CTD) and estimated in numerous empirical studies through extensions of the CTD. We then conduct simulations to quantify the influence of GE interplay on estimates from the CTD. Due to the criticism’s mischaracterization of the CTD and GE interplay, combined with the absence of any empirical evidence to counter what is presented in the extant literature and this article, we conclude that the critique does not enhance our understanding of the processes that drive political traits, genetic or otherwise.

Journal ArticleDOI
TL;DR: The authors argued that identical twins tend to be more alike than non-identical twins because the former are more similarly affected by the same environmental conditions, but the content of those greater trait similarities is nevertheless completely malleable and determined by particular environments.
Abstract: This article offers a new explanation for the results of twin studies in political science that supposedly disclose a genetic basis for political traits. I argue that identical twins tend to be more alike than nonidentical twins because the former are more similarly affected by the same environmental conditions, but the content of those greater trait similarities is nevertheless completely malleable and determined by particular environments. The twin studies method thus can neither prove nor refute the argument for a genetic basis of political traits such as liberal and conservative preferences or voting turnout. The meaning of heritability estimates in twin studies is discussed, as well as the definition and function of the environment in the political science twin studies. The premature attempts to associate political traits with specific genes despite countertrends in genetics are also examined. I conclude by proposing that the alternative explanation of this article may explain certain puzzles in behavioral genetics, particularly why social and political traits have higher heritability estimates than common physical and medical traits. I map the main points of disagreement with the methodology and the interpretation of its results, and delineate the main operative implications for future research.
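The heritability estimates at issue in this exchange typically come from the classical twin design; in its simplest (Falconer) form, the ACE variance components are derived directly from the identical (MZ) and fraternal (DZ) twin correlations. The correlations below are illustrative, not estimates from either article:

```python
def falconer_decomposition(r_mz, r_dz):
    """Classical twin-design (ACE) decomposition via Falconer's formulas.

    h2: additive genetic variance ("heritability")
    c2: shared-environment variance
    e2: unique-environment variance
    """
    h2 = 2 * (r_mz - r_dz)
    c2 = 2 * r_dz - r_mz
    e2 = 1 - r_mz
    return h2, c2, e2

# Illustrative twin correlations, not estimates from either article.
h2, c2, e2 = falconer_decomposition(r_mz=0.70, r_dz=0.45)
```

The critique's point maps directly onto this arithmetic: if MZ twins experience more similar environments than DZ twins, r_mz is inflated for nongenetic reasons and h2 is overstated, which is exactly the equal-environments question the two articles dispute.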

Journal ArticleDOI
TL;DR: A framework for understanding the challenges that were then emerging in data management for business, now ubiquitously referred to as “The Three Vs of Big Data,” is laid out; broadly conceived, this framework captures much of the challenge and opportunity that big data presents to political science, and social science more generally.
Abstract: In the last three years, the concept of “Big Data” has emerged from technical obscurity to fully fledged, overblown, memetic phenomenon complete with Rorschach-test perceptions and backlash. For political scientists, the phrase seems to evoke a cognitive mapping onto our old continuum of “small-N” vs. “large-N,” implying perhaps “really large N.” This in turn tends to evoke the social science instinct to sample: “You don’t have to eat the whole cow to know that it’s tough.” (Firebaugh 2008) But this misunderstands the defining characteristics of “big,” the qualitative differences from what has come before, and the challenges and opportunities that big data present to political science. Over a decade ago, IT analyst Doug Laney laid out a framework for understanding the challenges that were then emerging in data management for business, now ubiquitously referred to as “The Three Vs of Big Data”: Volume, Velocity, and Variety (Laney 2001). Simply put, data were beginning to be created at a scale, speed, and diversity of forms sufficient to overwhelm relational databases and other conventional modes of enterprise information management. This framework proved both catchy and useful for the “data science” and “analytics” communities, where technological innovations and new data-driven companies, products, and services are often characterized as addressing the challenges posed by, or extracting new value from, one of the Vs. Broadly conceived, this framework captures much of the challenge and opportunity that big data presents to political science, and social science more generally. As is de rigueur, I extend the Vs, adding vinculation and validity to capture the challenges specific to political science and those that arise at the fuzzy borders among political science, computer science, and data science.
Included in this virtual issue are ten articles from the last decade of Political Analysis, which are individually excellent and worthy of your attention, and which are collectively useful for illustrating these five Vs of big data political science: volume, velocity, variety, vinculation, and validity.

Journal ArticleDOI
TL;DR: In this article, the authors provide an analysis of differential item functioning based on grouping variables commonly used in political science research to explore the utility of each item in the construction of valid knowledge scales.
Abstract: We show the impact of invariance by comparing results from the valid and invalid scales. We provide an analysis of differential item functioning based on grouping variables commonly used in political science research to explore the utility of each item in the construction of valid knowledge scales. An application of the VTT suggests it is more appropriate to conceive of these items as effects of a latent variable rather than cause or formative indicators. These results suggest that models attempting to explain apparent knowledge gaps between subgroups have been unsuccessful because previously constructed scales were validated by fiat.

Journal ArticleDOI
TL;DR: In this paper, the authors present several new and overdue methodological improvements: coding knowledge data using formal and specific coding rules based on a substantive rationale for the validity of the codes, recognizing partially correct answers, using multiple coders working independently, using machine coding, and testing reliability and validity.
Abstract: Political knowledge research faces a problem, perhaps even a crisis. For two decades, the American National Election Studies asked open-ended questions about political knowledge and coded answers using procedures that are neither reliable nor replicable and that were never shown to be optimally valid. Consequently, conclusions based on these widely used measures of the public's competence are in doubt. This article presents several new and overdue methodological improvements: coding knowledge data using formal and specific coding rules based on a substantive rationale for the validity of the codes, recognizing partially correct answers, using multiple coders working independently, using machine coding, and testing reliability and validity. The new methods are an improvement because they are transparent and replicable and they produce valid and extremely reliable knowledge data. Further, machine coding produces codes nearly identical to those from a team of human coders, at much lower cost.
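A machine-coding scheme of the kind described, with formal rules and partial credit, can be sketched as simple keyword matching. The item, keywords, and scoring thresholds below are invented for illustration, not the article's actual coding rules:

```python
# Illustrative coding rule for an open-ended knowledge item such as
# "What job does John Roberts hold?"; keywords and scores are made up.
partial_credit = {"judge", "court", "justice"}

def code_answer(answer):
    """Return 1.0 (correct), 0.5 (partially correct), or 0.0 (incorrect)."""
    tokens = set(answer.lower().replace(".", "").split())
    if {"chief", "justice"} <= tokens or {"supreme", "court"} <= tokens:
        return 1.0                     # full-credit keyword combinations
    if tokens & partial_credit:
        return 0.5                     # partially correct answer
    return 0.0

codes = [code_answer(a) for a in
         ["Chief Justice of the United States", "he is a judge", "a senator"]]
```

Because the rules are explicit, the coding is transparent, perfectly replicable, and essentially free to rerun, which is the article's argument for machine coding over the older ad hoc hand-coding procedures.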

Journal ArticleDOI
TL;DR: A model that combines a dynamic perspective on actors' political positions with a probabilistic account of how these positions are translated into emphases of policy topics in political texts is presented, with an example application to data from the Comparative Manifesto Project.
Abstract: This article presents a new method of reconstructing actors' political positions from coded political texts. It is based on a model that combines a dynamic perspective on actors' political positions with a probabilistic account of how these positions are translated into emphases of policy topics in political texts. In the article it is shown how model parameters can be estimated based on a maximum marginal likelihood principle and how political actors' positions can be reconstructed using empirical Bayes techniques. For this purpose, a Monte Carlo Expectation Maximization algorithm is used that employs independent sample techniques with automatic Monte Carlo sample size adjustment. An example application is given by estimating a model of an economic policy space and a noneconomic policy space based on the data from the Comparative Manifesto Project. Parties' positions in policy spaces reconstructed using these models are made publicly available for download.

Journal ArticleDOI
TL;DR: The traditional system of scientific and scholarly publishing is breaking down in two different directions, as discussed in this paper.
Abstract: The traditional system of scientific and scholarly publishing is breaking down in two different directions.

Journal ArticleDOI
TL;DR: Building on Manski and Molinari, the authors develop a new Bayesian method for performing sensitivity analysis of nonparametric identification bounds for the average treatment effect of a binary treatment under general missingness or nonrandom assignment.
Abstract: How a treatment causes a particular outcome is a focus of inquiry in political science. When treatment data are either nonrandomly assigned or missing, the analyst will often invoke ignorability assumptions: that is, both the treatment and missingness are assumed to be as if randomly assigned, perhaps conditional on a set of observed covariates. But what if these assumptions are wrong? What if the analyst does not know why—or even if—a particular subject received a treatment? Building on Manski, Molinari offers an approach for calculating nonparametric identification bounds for the average treatment effect of a binary treatment under general missingness or nonrandom assignment. To make these bounds substantively more informative, Molinari’s technique permits adding monotonicity assumptions (e.g., assuming that treatment effects are weakly positive). Given the potential importance of these assumptions, we develop a new Bayesian method for performing sensitivity analysis regarding them. This sensitivity analysis allows analysts to interpret the assumptions’ consequences quantitatively and visually. We apply this method to two problems in political science, highlighting the method’s utility for applied research. How does job loss affect vote choice? Can democracy impede war onset? Do single member districts impact political party structure? Whether using experimental data with random assignment or observational data with unknown assignment mechanisms, political scientists seek to infer the causal effect of treatments, z, on outcomes, y. Unfortunately, in many, if not most, studies involving observational data, the treatment is not randomly assigned or, even worse, is missing for some observations. Nonrandom treatment assignment is the defining feature of observational data (Cochran and Rubin 1973; Rubin 2006). 
Without random assignment, variables that affect y other than z may be distributed differently across the treated and control groups in ways that cannot be statistically determined. In other words, in observational data the treatment is generally confounded with other covariates, and only some of the potential confounders may be observable. Consequently, “it is virtually impossible in many practical circumstances to be convinced that the estimates of the effects of treatments are in fact unbiased” (Cochran and Rubin 1973, 30). Nevertheless, analysts must
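The no-assumptions bounds that Molinari's approach builds on can be sketched for the simplest case: a binary treatment, a fully observed outcome bounded in [y_min, y_max], and simulated data (a minimal sketch only; Molinari's method additionally handles missing treatments and adds monotonicity restrictions):

```python
import numpy as np

def manski_bounds(y, z, y_min=0.0, y_max=1.0):
    """Worst-case (no-assumptions) bounds on the average treatment
    effect E[Y(1)] - E[Y(0)] for binary treatment z and an outcome
    bounded in [y_min, y_max]."""
    y, z = np.asarray(y, float), np.asarray(z, int)
    p1 = z.mean()          # P(Z = 1)
    p0 = 1.0 - p1
    m1 = y[z == 1].mean()  # E[Y | Z = 1]
    m0 = y[z == 0].mean()  # E[Y | Z = 0]
    # Bounds on E[Y(1)]: observed for the treated, worst case for controls.
    ey1_lo, ey1_hi = m1 * p1 + y_min * p0, m1 * p1 + y_max * p0
    # Bounds on E[Y(0)]: observed for controls, worst case for the treated.
    ey0_lo, ey0_hi = m0 * p0 + y_min * p1, m0 * p0 + y_max * p1
    return ey1_lo - ey0_hi, ey1_hi - ey0_lo

# Simulated data with no true effect: the bounds straddle zero, and their
# width equals y_max - y_min, showing why extra assumptions are needed.
rng = np.random.default_rng(0)
z = rng.integers(0, 2, 1000)
y = rng.random(1000)
lo, hi = manski_bounds(y, z)
print(round(lo, 2), round(hi, 2))
```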

Journal ArticleDOI
TL;DR: This work builds on biased coin and minimization procedures for discrete covariates and demonstrates that its methods outperform complete randomization, producing better covariate balance in simulated data.
Abstract: In typical political experiments, researchers randomize a set of households, precincts, or individuals to treatments all at once, and characteristics of all units are known at the time of randomization. However, in many other experiments, subjects “trickle in” to be randomized to treatment conditions, usually via complete randomization. To take advantage of the rich background data that researchers often have (but underutilize) in these experiments, we develop methods that use continuous covariates to assign treatments sequentially. We build on biased coin and minimization procedures for discrete covariates and demonstrate that our methods outperform complete randomization, producing better covariate balance in simulated data. We then describe how we selected and deployed a sequential blocking method in a clinical trial and demonstrate the advantages of our having done so. Further, we show how that method would have performed in two larger sequential political trials. Finally, we compare causal effect estimates from differences in means, augmented inverse propensity weighted estimators, and randomization test inversion.
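The minimization idea behind sequential assignment can be sketched for a single continuous covariate (an illustrative simplification, not the authors' implementation; the bias probability p and the tie-breaking rule are assumptions):

```python
import random

def sequential_assign(x_new, treated, control, p=0.75, rng=random):
    """Biased-coin minimization sketch for one continuous covariate:
    tentatively place the arriving unit in each arm and prefer, with
    probability p, the arm that leaves the covariate means more
    balanced. The article's method blocks on several continuous
    covariates at once; this shows only the core idea."""
    def imbalance(t, c):
        if not t or not c:
            return float("inf")
        return abs(sum(t) / len(t) - sum(c) / len(c))
    imb_if_treated = imbalance(treated + [x_new], control)
    imb_if_control = imbalance(treated, control + [x_new])
    better = 1 if imb_if_treated <= imb_if_control else 0
    # Biased coin: follow the balance-improving arm with probability p.
    arm = better if rng.random() < p else 1 - better
    (treated if arm == 1 else control).append(x_new)
    return arm

# Units "trickle in" one at a time; with p = 1.0 the rule is deterministic.
treated, control = [], []
for x in [1.0, 2.0, 1.5, 0.5]:
    sequential_assign(x, treated, control, p=1.0)
print(treated, control)
```

Setting p strictly below 1 keeps the assignment random (and hence analyzable by randomization inference) while still steering toward balance.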

Journal ArticleDOI
TL;DR: In this paper, a new measure of democracy, the DCC index, is proposed and constructed from five popular indices of democracy (Freedom House, Polity IV, Vanhanen's index of democratization, Cheibub et al.'s index of democracy and dictatorship, and the Cingranelli-Richards index of electoral self-determination).
Abstract: Utilizing hierarchical cluster analysis, a new measure of democracy, the DCC index, is proposed and constructed from five popular indices of democracy (Freedom House, Polity IV, Vanhanen’s index of democratization, Cheibub et al.’s index of democracy and dictatorship, and the Cingranelli-Richards index of electoral self-determination). The DCC was used to classify the regime types for twenty-four countries in the Americas and thirty-nine countries in Europe over a thirty-year period. The results indicated that democracy is a latent class variable. Sensitivity and specificity analyses were conducted for the five existing democracy indices as well as the newly proposed Unified Democracy Scores index and a predicted DCC score. This analysis revealed significant problems with existing measures. Overall, the predicted DCC index attained the highest level of accuracy, although one other index achieved high levels of accuracy in identifying nondemocracies.
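The clustering step can be illustrated with a minimal single-linkage agglomeration on toy index scores (the rows, rescaling, and number of clusters below are hypothetical, not the DCC construction itself):

```python
# Minimal single-linkage agglomerative clustering sketch, illustrating
# how country-years might be grouped by their scores on several
# democracy indices. Toy data only.

def euclid(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def single_linkage(points, k):
    """Agglomerate points (lists of index scores) until k clusters
    remain, merging the closest pair of clusters at each step."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = min(euclid(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)
    return clusters

# Toy rows: three index scores per country-year, rescaled to [0, 1].
rows = [[0.9, 0.95, 0.9], [0.85, 0.9, 0.95],   # democracy-like profiles
        [0.1, 0.05, 0.1], [0.15, 0.1, 0.05]]   # nondemocracy-like profiles
print(single_linkage(rows, 2))
```

In practice one would use a library routine (e.g., `scipy.cluster.hierarchy`) and validate the number of clusters, which is where the latent-class finding enters.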

Journal ArticleDOI
TL;DR: In this article, the authors developed a new methodology to measure the generosity of unemployment insurance programs with a single metric, and applied this measurement strategy to the unemployment insurance program of the United Kingdom.
Abstract: Unemployment insurance policies are multidimensional objects, with variable waiting periods, eligibility duration, benefit levels, and asset tests, making intertemporal or international comparisons very difficult. Furthermore, labor market conditions, such as the likelihood and duration of unemployment, matter when assessing the generosity of different policies. In this article, we develop a new methodology to measure the generosity of unemployment insurance programs with a single metric. We build a first model with all characteristics of the complex unemployment insurance policy. Our model features heterogeneous agents that are liquidity constrained but can self-insure. We then build a second model, similar in all aspects but one: the unemployment insurance policy is one-dimensional (no waiting periods, eligibility limits, or asset tests, but constant benefits). We then determine which level of benefits in this second model makes society indifferent between both policies. We apply this measurement strategy to the unemployment insurance program of the United Kingdom.
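The indifference calculation can be sketched with a stylized static version (the CRRA utility function, wage, and unemployment rate below are hypothetical placeholders for the paper's full dynamic model with self-insurance):

```python
# Stylized sketch of the indifference step: find the constant benefit b*
# at which a one-parameter policy yields the same expected utility as a
# given welfare level W from the richer, multidimensional policy.

def expected_utility(b, wage=1.0, u_rate=0.08, risk_aversion=2.0):
    """Expected CRRA utility under a constant-benefit policy
    (all parameter values are illustrative assumptions)."""
    crra = lambda c: (c ** (1 - risk_aversion) - 1) / (1 - risk_aversion)
    return (1 - u_rate) * crra(wage) + u_rate * crra(b)

def equivalent_benefit(w_target, lo=0.01, hi=1.0, tol=1e-10):
    """Bisect for b* with expected_utility(b*) == w_target;
    expected_utility is increasing in b."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if expected_utility(mid) < w_target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

W = expected_utility(0.4)   # stand-in for welfare from the rich model
print(round(equivalent_benefit(W), 3))  # recovers 0.4
```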

Journal ArticleDOI
Timm Betz1
TL;DR: In this article, a rank-based estimator grounded in randomization inference is proposed as an alternative to 2SLS that accounts for both nonrandom measurement error in the endogenous variable and weak instruments.
Abstract: Two common problems in applications of two-stage least squares (2SLS) are nonrandom measurement error in the endogenous variable and weak instruments. In the presence of nonrandom measurement error, 2SLS yields inconsistent estimates. In the presence of weak instruments, confidence intervals and p-values can be severely misleading. This article introduces a rank-based estimator, grounded in randomization inference, which addresses both problems within a unified framework. Monte Carlo studies illustrate the deficiencies of 2SLS and the virtues of the rank-based estimator in terms of bias and efficiency. A replication of a study of the effect of economic shocks on democratic transitions demonstrates the practical implications of accounting for nonrandom measurement error and weak instruments. In situations where ordinary least squares performs poorly, instrumental variable techniques are a popular alternative, with two-stage least squares (2SLS) being the most commonly used estimator (Sovey and Green 2011). This article introduces a rank-based instrumental variables estimator, grounded in randomization inference, which provides a unified framework to address two problems for the use of 2SLS. The first is nonrandom measurement error, such that the instrument is correlated with the measurement error in an explanatory variable. In this case, coefficient estimates obtained from 2SLS are inconsistent. The second is the presence of weak instruments, in which case 2SLS yields incorrect measures of statistical uncertainty. The estimator presented in this article is robust with respect to both problems. The estimator was first developed by Rosenbaum (1996, 2002); Imbens and Rosenbaum (2005) demonstrate that the estimator is robust to weak instruments. This article expands on Imbens and Rosenbaum by showing that variants of the estimator can also accommodate nonrandom measurement error. 
At the same time, this article provides the instrumental variables estimator for the techniques described in Keele, McConnaughy, and White (2012), who advocate the use of randomization inference in political science. The article proceeds in three parts. The first part briefly describes the problems for 2SLS arising from nonrandom measurement error and weak instruments. The second part introduces randomization inference and a rank-based instrumental variables estimator. Monte Carlo studies illustrate the deficiencies of 2SLS in the presence of nonrandom measurement error and show the superiority of the rank-based estimator.
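The rank-based test and its inversion into a confidence set can be sketched as follows (a simplified Rosenbaum-style sketch on simulated data; the rank-sum statistic, sample sizes, and grid are illustrative choices, not the article's exact specification):

```python
import numpy as np

# Under the null that the effect equals beta0, the adjusted outcome
# y - beta0 * d should be unrelated to the as-if-random binary
# instrument z; we test this by permuting z.

def rank_iv_pvalue(y, d, z, beta0, n_perm=2000, seed=0):
    rng = np.random.default_rng(seed)
    ranks = (y - beta0 * d).argsort().argsort() + 1  # ranks of adjusted outcomes
    mu = ranks.mean() * (z == 1).sum()               # expected rank sum under the null
    stat = ranks[z == 1].sum()                       # rank sum among encouraged units
    perm = np.array([ranks[rng.permutation(z) == 1].sum()
                     for _ in range(n_perm)])
    return float((np.abs(perm - mu) >= abs(stat - mu)).mean())

# Toy data: z shifts d, and y = 2 * d + noise (true effect of 2).
rng = np.random.default_rng(1)
z = rng.integers(0, 2, 200)
d = z + rng.normal(0, 0.3, 200)
y = 2.0 * d + rng.normal(0, 0.3, 200)

# A 95% confidence set inverts the test: all beta0 with p > 0.05.
grid = np.linspace(1.0, 3.0, 41)
accepted = [b for b in grid if rank_iv_pvalue(y, d, z, b) > 0.05]
print(min(accepted), max(accepted))  # an interval near the true effect of 2
```

Because inference rests on permutations of z rather than on first-stage strength, the resulting intervals remain valid even when the instrument is weak; using ranks rather than raw outcomes is what buys robustness to nonrandom measurement error.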

Journal ArticleDOI
TL;DR: The use of discrete distribution tests, specifically the chi-square test and the discrete Kolmogorov-Smirnov (KS) test, as simple devices for comparing and analyzing ordered responses typically found in surveys is demonstrated.
Abstract: Field survey experiments often measure amorphous concepts in discretely ordered categories, with postsurvey analytics that fail to account for the discrete attributes of the data. This article demonstrates the use of discrete distribution tests, specifically the chi-square test and the discrete Kolmogorov-Smirnov (KS) test, as simple devices for comparing and analyzing ordered responses typically found in surveys. In Monte Carlo simulations, we find the discrete KS test to have more power than the chi-square test when distributions are right- or left-skewed, regardless of the sample size or the number of alternatives. The discrete KS test has at least as much power as the chi-square, and sometimes more so, when distributions are bimodal or approximately uniform and samples are small. After deriving rules of usage for the two tests, we implement them in two cases typical of survey analysis. Using our own data collected after Hurricanes Katrina and Rita, we employ our rules to both validate and assess treatment effects in a natural experimental setting.
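The two statistics can be computed directly from ordered category counts, as in this sketch (the 5-point-scale counts below are illustrative, not the hurricane survey data):

```python
import numpy as np

def discrete_ks(counts_a, counts_b):
    """Two-sample KS statistic on ordered categories: the largest gap
    between the two empirical CDFs."""
    cdf_a = np.cumsum(counts_a) / np.sum(counts_a)
    cdf_b = np.cumsum(counts_b) / np.sum(counts_b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

def chi_square(counts_a, counts_b):
    """Two-sample chi-square statistic on the same category counts."""
    a, b = np.asarray(counts_a, float), np.asarray(counts_b, float)
    total = a + b
    exp_a = total * a.sum() / (a.sum() + b.sum())
    exp_b = total * b.sum() / (a.sum() + b.sum())
    return float(np.sum((a - exp_a) ** 2 / exp_a + (b - exp_b) ** 2 / exp_b))

# Illustrative 5-category responses from two survey conditions.
treat = np.array([30, 25, 20, 15, 10])
ctrl = np.array([10, 15, 20, 25, 30])
print(discrete_ks(treat, ctrl), chi_square(treat, ctrl))
```

Note that the KS statistic uses the ordering of the categories (via the cumulative sums) while the chi-square does not, which is why their relative power depends on the shape of the distributions.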

Journal ArticleDOI
TL;DR: In this article, the authors discuss the need to register empirical studies to prevent the scientific equivalent of schoolboy cheating: reporting tests of hypotheses that became evident only after the data were in hand.
Abstract: Social scientists long have debated how closely their research methods resemble the pure, classical ideal: sequentially (and absent pejorative “data snooping”) formulate a theory; develop empirically falsifiable hypotheses; collect data; and conduct the appropriate statistical tests. Meanwhile, in private quarters, they acknowledged that true research seldom proceeds in this fashion. Rather, in the event, the world is observed, data are collected, hypotheses formed, tests conducted, more data collected, and hypotheses revised (e.g., the classic Bernal 1974). Results are collected by the field's scientists into a body of knowledge that defines “known science” and sets the accepted boundaries for future research. Kuhn (1970) labeled this a paradigm. Occasionally, he argued, results appear that lie outside the bounds of the extant paradigm—then, innovation occurs. The articles included in this issue's symposium discuss “registration” of empirical studies. The purpose is to reduce “publication bias,” that is, to prevent the scientific equivalent of schoolboy cheating: reporting tests of hypotheses that became evident only after the data were in hand. Registration has little power against the type of research fraud that is discovered from time to time. There are costs, however, when scientists operating within an accepted paradigm discourage researchers from exploring and reporting any/all relationships and correlations in a data set.