
Tracking the Impact of Media on Voter Choice in Real Time: A Bayesian Dynamic Joint Model

15 Jan 2018-Journal of the American Statistical Association (Taylor & Francis)-Vol. 113, Iss: 524, pp 1457-1475
TL;DR: A Bayesian zero-inflated dynamic multinomial choice model is developed that enables the joint modeling of the interplay and dynamics of individual voters' choice intentions over time, the actual vote, and the heterogeneity in exposure to marketing communications over time.

Abstract: Commonly used methods of evaluating the impact of marketing communications during political elections struggle to account for respondents' exposures to these communications due to problems associated with recall bias. In addition, they completely fail to account for the impact of mediated or earned communications, such as newspaper articles or television news, which are typically not within the control of the advertising party, nor are they effectively able to monitor consumers' perceptual responses over time. A new data collection technique using cell-phone text messaging, called real-time experience tracking (RET), offers the potential to address these weaknesses. We propose an RET-based model of the impact of communications and apply it to a unique choice situation: voting behavior during the 2010 UK general election, which was dominated by three political parties. We develop a Bayesian zero-inflated dynamic multinomial choice model that enables the joint modeling of the interplay and dynamics of individual voters' choice intentions over time, the actual vote, and the heterogeneity in exposure to marketing communications over time.


Summary

1 Introduction

  • Recent advances in information technology, along with the advent of social media and enablement of user-generated content, have resulted in a proliferation of consumer data.
  • Importantly, the authors distinguish between exposure to communications that are controlled and paid for by a firm or party (‘paid’ media such as advertising, newspaper inserts, and billboards) and those that are not controlled by them (‘earned’ media such as news, editorials, communications about a particular brand by rival brands, etc.).
  • The authors explore the potential of a real-time experience tracking (RET) technique in addressing many of the aforementioned weaknesses.
  • The respondents, who receive prior training about sending these brief, very structured messages, provide information on three essential aspects of their encounter: the brand name, the communications type (for example, TV advertisement or press editorial), and their perceptual response to the encounter.
  • For modeling purposes, this context has three benefits: the final party choice is made simultaneously by all participants, thereby alleviating problems of data censoring; the 2010 UK election campaign spanned a four-week period from the announcement of the election to the actual vote, allowing us to track intentions for the entire choice cycle for all participants; and it enables the capture of paid and earned communications on the choice of political party for an individual in a way that many existing research techniques cannot.

2 Motivating Data

  • Many organizations continue to rely on cross-sectional surveys for collecting data, particularly in politics.
  • One approach used by market research agencies such as Ipsos MORI is self-reported influence, which is based on asking respondents what media influenced them in their voting decisions (Worcester et al. 2011).
  • A second approach is to correlate recalled exposure or perceptual response to different media with brand attitudes (O’Cass 2002).
  • The use of a panel responding to repeated surveys, such as the American National Election Studies database, arising from interviews conducted before and after each presidential election (Klein and Ahluwalia 2005), addresses the issue of common method bias but not the problem of recall bias.
  • It is well known that the time of voting decision is important and has been highlighted as a possible segmentation variable, not least because those who decide how to vote during a campaign are affected by media campaigns (Fournier et al. 2004).

2.1 Real-time Experience Tracking (RET)

  • Figure 1 illustrates how RET works by showing the data collection process for the politics study.
  • The pre-survey also captures brand intention at the start of the study.
  • In order to evaluate the relative impact on the choice of earned and paid/owned communications with more rigor and usefulness, a dynamic model is needed that takes account of: a) communications frequency, and not just dichotomous exposure; b) communications valence, as attitude towards communications influences attitude towards the party c) dynamics over time in both choice intentions and communications exposure and valence; and d) consumers’ two-stage decision-making process of whether to vote, and if so which party to choose.
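The RET message format described above (brand name, communication type, perceptual response) can be sketched as a small parser. The single-letter codes below are hypothetical stand-ins, since the actual coding scheme taught to respondents is not reproduced in this summary.

```python
# Sketch of decoding an RET text message into a structured record.
# The brand/type codes are illustrative assumptions, not the study's scheme.
from dataclasses import dataclass

BRANDS = {"C": "Conservative", "L": "Labour", "D": "Liberal Democrat"}
TYPES = {"T": "TV news", "P": "press editorial", "A": "advertisement"}


@dataclass
class Encounter:
    brand: str
    media_type: str
    valence: int  # perceptual response, 1 (very negative) .. 5 (very positive)


def parse_ret_message(msg: str) -> Encounter:
    """Decode a message like 'L P 4': brand, communication type, valence."""
    brand, mtype, val = msg.split()
    return Encounter(BRANDS[brand], TYPES[mtype], int(val))


print(parse_ret_message("L P 4"))
```

Each incoming text then yields one (brand, type, valence) tuple per encounter, which is exactly the per-exposure record the dynamic model below consumes.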

2.2 The Campaign: The 2010 UK General Election

  • The above-mentioned RET method was used during the 2010 UK general election to collect the motivating data.
  • The party shares in the sample’s final vote were 31.2% Conservative, 22.9% Labour and 34.8% Liberal Democrat.
  • The initial survey captured the respondent’s voting history: here, whether they had voted in the last general election, and if so, what their vote had been.
  • Thus, each respondent's coding scheme was provided online at the end of the pre-survey and then summarised by text message to the respondent, for easy access when texting about exposures.
  • One reason for this design choice is that broadcast political advertising is illegal in the UK, in contrast to its ubiquity in the US.

3 A Zero-Inflated Joint Model of Media Communications and Choice

  • The authors' joint model consists of five components that capture the impact of media exposures on choice.
  • (2) A zero-inflated multinomial logit model captures the final vote choice given the decision to vote.
  • A zero-inflated distribution is used to account for the excess zeros that arise because respondents are typically not exposed to the majority of marketing communications on a daily or regular basis.
  • (5) Finally, the authors join the models using a correlated random effects approach.
  • Another motivation for joint modeling is that, even when interest centers on how choice intentions depend on exposures, the exposures must be modeled jointly because they are not exogenously determined.

3.1 Dynamic Multinomial Logit Model for Choice Intentions

  • To build the model for the intention to vote for a particular political party during any week of the campaign, the authors note that voters were faced with four primary choices, viz., ‘Labour’, ‘Conservatives’, ‘Liberal Democrats’ or ‘Do Not Intend to Vote’.
  • The authors denote the random voter’s heterogeneity effect for the ith voter and kth party as bik.
  • The ℓ_1 and ℓ_2 terms are coefficients of the interaction between earned media and paid media, for frequency and valence respectively.
  • The authors formulate a dynamic model with the lagged term (yi,t−1) incorporating the presence of any stickiness in voters’ choices from previous periods.
  • There are several reasons why these coefficients might vary over time, e.g., due to changes in exposures, changes in the policy of the party, or occurrence of unexpected events which may lead the voters to think differently.
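As a concrete sketch of the dynamic multinomial logit above, the snippet below computes one voter's weekly choice probabilities from party utilities combining media frequency and valence effects, a lagged-choice stickiness term, and individual random effects. All coefficient values, and fixing the 'Do Not Intend to Vote' utility at zero, are illustrative assumptions, not the paper's specification or estimates.

```python
import numpy as np


def intention_probs(beta_f, beta_v, x_freq, x_val, gamma_lag, y_prev, b):
    """Weekly choice probabilities over K parties plus a 'do not vote' option.

    beta_f, beta_v : (K,) media frequency / valence coefficients for week t
    x_freq, x_val  : (K,) exposure frequency and average valence per party
    gamma_lag      : stickiness coefficient on last week's stated choice
    y_prev         : index of last week's choice, or None
    b              : (K,) individual-level random effects
    """
    util = beta_f * x_freq + beta_v * x_val + b
    if y_prev is not None:
        util[y_prev] += gamma_lag  # carry-over from the previous period
    # utility of 'do not intend to vote' fixed at 0 as the baseline
    expu = np.exp(np.append(util, 0.0))
    return expu / expu.sum()
```

Letting beta_f and beta_v vary by week t is what makes the coefficients dynamic in the sense described above.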

3.2 Zero-Inflated Multinomial Logit Model for Final Vote

  • Since the final choice (vote) took place at the same time for all respondents, the data for the final vote is not time varying.
  • As explained earlier, the authors model the decision of whether or not to vote with a binary distribution and the conditional choice decision as a multinomial logit model, resulting in a zero-inflated multinomial model.
  • The authors also include the factor of whether the voter intended to vote for any major parties in previous weeks.
  • Here, a_t is the effect of the t-th week's voting intention on the final vote; since these effects diminish over time, the authors assume that a_t = a × φ^(3−t).
  • Here a is the overall effect and φ measures the decay over weeks.
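The decaying intention effect a_t = a × φ^(3−t) can be sketched directly; the values of a and φ below are hypothetical, chosen only to show that with φ < 1 the most recent week carries the largest weight.

```python
def intention_weights(a, phi, weeks=(1, 2, 3)):
    """Effect of week-t voting intention on the final vote: a_t = a * phi**(3 - t)."""
    return {t: a * phi ** (3 - t) for t in weeks}


# with phi < 1, week 3 (closest to the vote) gets the full weight a,
# and earlier weeks are geometrically discounted
w = intention_weights(a=1.0, phi=0.6)
```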

3.3 Model for the Frequency and Valence of Communications Exposures

  • The authors include both the frequency and the average valence of communications exposures within this model to separately capture the effect of the number of exposures and the encounter valence in the model.
  • Second, any associations that are influenced by the message may grow in attitude strength through familiarity (Erdem and Keane 1996).
  • The valence of communications may also have an impact on the choice of the party.
  • Since the presence of these excessive zeros leads to spurious over-dispersion (Park et al. 2011), the authors fit a zero-inflated Poisson (ZIP) regression model which is a discrete mixture of true zeros (degenerate distribution) and positive counts for frequency of exposures (Lambert 1992).
  • Note that although exposure valence is measured on a five-point scale, in their data the authors have these scores averaged over each week, which makes x^V_itkl a continuous variable.
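A minimal sketch of the zero-inflated Poisson distribution used for exposure counts, assuming (as in the likelihood notation later in the summary) that α is the probability of being in the Poisson "exposure-possible" state and 1 − α the probability of a structural zero:

```python
import numpy as np
from scipy.stats import poisson


def zip_pmf(x, alpha, lam):
    """P(X = x) under a zero-inflated Poisson:
    with probability 1 - alpha the count is a structural zero,
    with probability alpha it is drawn from Poisson(lam)."""
    p = alpha * poisson.pmf(x, lam)
    return p + (1 - alpha) * (x == 0)


# the zero-inflated pmf still sums to 1, but puts extra mass at zero
xs = np.arange(50)
total = zip_pmf(xs, 0.3, 2.0).sum()
```

The extra mass at zero is what absorbs the many days on which a respondent simply encounters no communication for a given party and medium.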

3.4 Correlation Structure and Heterogeneity: Combining the Models

  • All the three models described above carry information about the voting pattern of the voters and are therefore inter-related.
  • Since these outcomes are measured on a variety of different scales (viz., multinomial, ZIP), it is not possible to directly model the joint predictors’ effects due to the lack of any natural multivariate distribution for characterizing such a dependency.
  • Moreover, without inter-relating or jointly considering these outcomes, it is hard to answer questions about how the evolution of one response (e.g., intention to vote) is related to the evolution of another (e.g., final vote).
  • A flexible solution is to model the association between different responses by correlating the random heterogeneous effects from each of the responses.
  • Due to the IIA property of the multinomial logit model, it is not possible to incorporate cross effects (e.g., how a negative exposure to Labour may affect the voting intention for the Liberal Democrats) directly in the utility equation (3).
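The joining device described above can be illustrated by drawing the submodels' random effects from a common multivariate normal, so that correlation between effects induces association between responses. The covariance matrix and dimensions below are hypothetical, not the paper's estimates.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical covariance across three submodels' random effects for one
# voter and party: (intention, exposure frequency, exposure valence).
Sigma = np.array([[1.0, 0.5, 0.2],
                  [0.5, 1.0, 0.1],
                  [0.2, 0.1, 1.0]])

n_voters = 5000
# one correlated random-effect vector per voter; feeding these into the
# separate multinomial / ZIP submodels ties the responses together
b = rng.multivariate_normal(np.zeros(3), Sigma, size=n_voters)

# the sample correlations recover Sigma's off-diagonal structure
C = np.corrcoef(b, rowvar=False)
```

This sidesteps the lack of a natural multivariate distribution over multinomial and ZIP outcomes: the dependence lives entirely in the latent effects.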

4.1 Likelihood

  • The likelihood for the final vote includes indicator terms of the form 1{F_i ≠ 0 and F_i = k}, where Voted-2005_i represents whether the individual voted in the 2005 UK general election (from equation (7)), together with the decayed weekly-intention terms ς_t.
  • Next, the likelihood function for exposure frequency is:
    f(x^F_itkl | Ω, V_i) ∝ ∏_{i,t,k,l} [(1 − α^F_il) + α^F_il e^{−λ_itkl}]^{1(x^F_itkl = 0)} × [α^F_il e^{−λ_itkl} λ_itkl^{x^F_itkl} / x^F_itkl!]^{1(x^F_itkl > 0)}   (15)
    where α^F_il and λ_itkl are given in equation (10).

4.2 Prior Specification and Posterior Inference

  • The authors estimate the model parameters using a Bayesian framework.
  • The authors now describe the prior distributions for the model parameters.
  • A similar approach has been used in mixed effects model settings (Natarajan and McCulloch 1998).
  • This prior implies that the odds ratio (centered at 1) has a 95% interval between exp(-4) to exp(4), which has a very wide range and is weakly informative (Dunson et al. 2003; Gelman et al. 2008).
  • Samples from the posterior distribution, obtained via Markov chain Monte Carlo (MCMC) simulation, were used to compute summary measures of the parameter estimates.
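The weakly informative prior described above can be checked numerically. The normal scale below is reverse-engineered from the stated interval (an assumption, not a value given in the summary): choosing σ so that the 95% interval of the log-odds coefficient is (−4, 4) implies an odds-ratio interval of (exp(−4), exp(4)).

```python
import numpy as np
from scipy.stats import norm

# scale chosen so the prior's central 95% interval on the log-odds is (-4, 4)
sigma = 4 / norm.ppf(0.975)          # roughly 2.04

lo, hi = norm.ppf([0.025, 0.975], loc=0.0, scale=sigma)
odds_ratio_interval = np.exp([lo, hi])   # roughly (0.018, 54.6)
```

An odds ratio allowed to range over three orders of magnitude is why the prior counts as weakly informative rather than flat.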

4.3 Model Selection and Model Fit

  • Before discussing their results, the authors first compare their proposed model with some alternative models to test the quality of model fit.
  • The authors also compute Log-Pseudo Marginal Likelihood (LPML) as an additional model selection criterion and Posterior Predictive P-value for model fit.
  • Hence, the authors follow the approach in Jiang et al. (2015) and Celeux et al. (2006) and calculate DIC4 by first considering the DIC measure for the "complete data" (D, b) and then integrating out the unobserved b: E_b{DIC(D, b)} = −4 E_θ[log p(D, b | θ) | D, b] + 2 log p(D, b | E_θ(θ | D, b)).
  • The summary statistics from the observed and replicated data are given by χ²(y, θ_g) and χ²(y^rep,g, θ_g), respectively, where y^rep,g denotes the replicated value of y from the posterior predictive distribution at the g-th iteration of the Gibbs sampler.
  • From the above results, the authors see that for their proposed model (model-2) the DIC4 value is the lowest and the LPML value is the highest.
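The posterior predictive p-value mentioned above can be sketched on a toy conjugate model (not the paper's data): for each posterior draw, compare a chi-square discrepancy of replicated data against that of the observed data, and report the exceedance frequency. Values near 0 or 1 signal misfit; here the model is correctly specified, so the p-value lands in the interior.

```python
import numpy as np

rng = np.random.default_rng(0)


def chisq(y, mu):
    """Chi-square discrepancy between counts y and their expected value mu."""
    return np.sum((y - mu) ** 2 / mu)


# Toy setup: Poisson counts with a conjugate Gamma(1, 1) prior on the rate,
# giving a Gamma(sum(y) + 1, 1 / (n + 1)) posterior.
y_obs = rng.poisson(2.5, size=50)
lam_draws = rng.gamma(y_obs.sum() + 1, 1 / (len(y_obs) + 1), size=2000)

# share of posterior draws where the replicated discrepancy exceeds the
# observed one -- the posterior predictive p-value
p = np.mean([chisq(rng.poisson(l, y_obs.size), l) >= chisq(y_obs, l)
             for l in lam_draws])
```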

5.1 The Impact of Paid and Earned Media Exposures on Voting Intention

  • Table 2 depicts the results of the lagged variables in the voting intentions model.
  • The authors find that the valence of both paid and earned media has a positive and significant impact on voting intentions (with the exception of earned media valence for week 2 and paid media valence for week 1, which are non-significant).
  • This effect is more prominent with less well known brands, where attitudes are not already accessible (Goh et al. 2011).
  • Further work is required to check these observations about valence wearout, which ideally would be studied over longer periods and in alternative decision contexts.
  • Parameter estimates for the impact of demographics and socio-political issues on voting intentions also provide some interesting insight.

5.1.1 Impact of Increased Exposure Frequency and Valence

  • Having established the importance of these media effects, the authors assess the impact of marginal increases in exposure frequency or valence on choice behavior.
  • Z are other control variables and b are random effects.
  • The authors can compare the effects of changes in exposure frequencies using β^F_{1,t} and β^F_{2,t}, compare them with the exposure-frequency effects for other media, and compare the values of β^F_{·,t} across different values of t.
  • For illustrative purposes, the authors show the impact of a small increase in earned and paid media frequency and valence for the average voter with 30%, 50% and 80% probability of voting for party 3 (Liberal Democrats) at one, two and three weeks prior to the final vote.
  • The authors find that the marginal effect of improved earned media valence is positive and, by some margin, highest in week 1.
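A marginal-effect calculation of the kind described above can be sketched under a simple softmax utility. The utilities and coefficient below are hypothetical, chosen so that the baseline probability of voting for party 3 is 50%, echoing one of the scenarios in the text.

```python
import numpy as np


def softmax(u):
    e = np.exp(u - u.max())  # subtract max for numerical stability
    return e / e.sum()


def marginal_effect(util, party, beta_f, delta=1.0):
    """Change in a party's choice probability from a small increase `delta`
    in its exposure frequency, holding all other utilities fixed."""
    bumped = util.copy()
    bumped[party] += beta_f * delta
    return softmax(bumped)[party] - softmax(util)[party]


# hypothetical utilities putting 50% probability on party 3 (index 2)
u = np.array([0.0, 0.0, np.log(2.0)])
me = marginal_effect(u, party=2, beta_f=0.3)
```

Repeating the calculation with the week-specific β^F_{·,t} values is what lets the authors contrast the payoff of an extra exposure one, two, or three weeks before the vote.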

5.2 Results from the Final Vote model

  • Table 6 contains the results from the logistic regression analysis of whether or not a person voted.
  • The positive and significant coefficient φ2 suggests that those who voted in 2005 were also more likely to vote in the current election, as were those who intended to vote for one of the three major parties (φ3).
  • As expected, the decay parameter tells us that the most recent decisions have the greatest impact.
  • As exposure valence continues to have an effect on intentions in each week, this emphasizes the need to prioritize the quality of media exposures right up until the final choice decision.
  • Also, a being positive and significant reaffirms their assumption that the effectiveness of these exposures depreciates over time.

5.3 Results from the Exposure Frequency and Average Valence Model

  • The results from estimating exposure frequency and valence (Tables 9 and 10) provide several interesting results.
  • Those from lower social categories (manual, semi-manual, and lowest-grade workers) have higher paid media frequencies (ζ^F1_{1,2}); this may result in part from their greater television consumption.
  • Older voters report higher frequencies, particularly of earned media (ζ^F1_{3,1}), and they report a lower valence for these earned media exposures (ζ^V1_{3,1}).
  • People in employment report higher exposure frequencies (ζ^F1_{6,1}, ζ^F1_{6,2}).
  • This corresponds with their higher likelihood of voting, particularly for those in the public sector (Corey and Garand, 2002).

5.4 Correlation Matrix

  • The estimated value of the correlation matrix is given in Table 11.
  • This is the estimate of the variance-covariance matrix from Section 3.4 and equation (12).
  • In order to interpret the results, the authors label the strength of the associations using 0-0.19 as very weak, 0.2-0.39 weak, 0.40-0.59 moderate, 0.6-0.79 strong and 0.8-1 very strong.
  • These cut-offs reflect somewhat arbitrary limits, and results should be considered in the context from which they are derived.
  • The authors find a very weak positive linear association between the exposure valence of paid media and the exposure frequency of earned media (corr(h^V_{i1}, h^F_{i2}) = 0.168, significant), indicating that how positively or negatively a paid communication is perceived is only weakly related to how often earned media is experienced.
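The labeling convention above is mechanical enough to express as a small helper, using the authors' (self-described as somewhat arbitrary) cut-offs:

```python
def association_label(r):
    """Label a correlation's strength using the cut-offs from the text:
    0-0.19 very weak, 0.2-0.39 weak, 0.40-0.59 moderate,
    0.6-0.79 strong, 0.8-1 very strong (applied to |r|)."""
    r = abs(r)
    if r < 0.20:
        return "very weak"
    if r < 0.40:
        return "weak"
    if r < 0.60:
        return "moderate"
    if r < 0.80:
        return "strong"
    return "very strong"


label = association_label(0.168)  # the paid-valence / earned-frequency estimate
```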

5.5 Simulation Study

  • The simulation evaluates the finite-sample performance of the proposed Bayesian estimation when data are generated to mimic the real data.
  • Based on the results, the authors find that the estimates under the proposed model reliably recovered the true parameter values with reasonable coverage probability.

6 Conclusions

  • The authors have reviewed the relative strengths and limitations of existing methods of evaluating voters’ response to communications, proposed the use of RET for this purpose, presented a dynamic model of the impact of communications on voters’ choice, and applied the model to a three-way political party choice.
  • This once again emphasizes the managerial importance of understanding communications valence and not just exposure, contrary to the accepted gold standard in commercial practice of market mix modeling based on exposure or proxies for it, and contrary to the common managerial emphasis on ‘share of voice’ metrics.
  • The authors' findings suggest that this might have had a residual effect on the final vote, even after party touchpoint valences evened out towards election day, since the decay for the impact of touchpoint valence is lower than that for touchpoint frequency; furthermore, positively-valenced exposures continue to have an impact beyond their influence on voting intentions, also appearing in the final vote model.
  • While this has many benefits as compared with the cross-sectional survey, clearly a brand manager would ideally want insight into drivers of sales, not just attitude shift: after all, their final choice model shows that it is not just the final week’s attitudes which shape the actual decision.
  • Overall, the authors believe that RET could have a profound impact on the measurement of communications effectiveness, given its ability to model individual consumer response to communication encounters and, importantly, evaluate the relative influence of those encounters.


Figures (14)
  • Table 6: Decision to Vote: Final Vote Model
  • Table 1: Summary Statistics
  • Table 2: Parameter Estimates - Dynamic Coefficients (Intention to Vote Model)
  • Table 5: Sensitivity Analysis for Average Level of Encounters
  • Figure 1: RET Data Collection
  • Table 4: Average Frequency and Average Valence of Encounters
  • Table 10: Average Valence Model: Demographics
  • Table 9: Exposure Frequency Model: Demographics
References

  • Gelman, A., Carlin, J. B., Stern, H. S., and Rubin, D. B. (1995). Bayesian Data Analysis. Chapman & Hall.

  • Spiegelhalter, D. J., Best, N. G., Carlin, B. P., and van der Linde, A. (2002). Bayesian Measures of Model Complexity and Fit. Journal of the Royal Statistical Society: Series B, 64(4), 583-639.

  • Lambert, D. (1992). Zero-Inflated Poisson Regression, with an Application to Defects in Manufacturing. Technometrics, 34(1), 1-14.


  • Journal article on attitude toward the ad (Aad) as a causal mediating variable in the process through which advertising influences brand attitudes and purchase intentions.

  • West, M., and Harrison, J. (1989). Bayesian Forecasting and Dynamic Models. Springer-Verlag.
