Advertising Content and Consumer Engagement on Social Media: Evidence from Facebook
Summary
1 Introduction
- Models of informative advertising (cf. Butters (1977); Grossman and Shapiro (1984)) allow advertising to inform agents only about price and product existence; yet casual observation and several studies in lab settings (cf. Armstrong (2010)) suggest that advertisements contain much more information and content beyond prices.
- While many brands have established a social media presence, it is not clear what kind of content works better and for which firm, and in what way.
- For each post, their data also contains time-series information on two kinds of engagement measures: Likes and comments.
- The authors' main finding from the empirical analysis is that persuasive content drives social media engagement significantly.
2 Data
- The authors' dataset is derived from the “pages” feature offered by Facebook.
- Pages enable companies to create profile pages and to post status updates, advertise new promotions, ask questions and push content directly to consumers.
- The authors' data comprises posts served from firms’ pages onto the Facebook profiles of the users that are linked to the firm on the platform.
- A sample post ends with: “Check out what the pros are wearing here: http://bit.ly/nyiPeW.”
2.1.1 Raw Data and Selection Criteria
- To collect the data, the authors partnered with an anonymous firm, henceforth referred to as Company X, that provides analytical services to Facebook Page owners by leveraging data from Facebook’s Insights.
- The data also includes two consumer engagement metrics: the number of Likes and comments for each post each day.
- The authors leverage this information in the methodology they develop later to account for the non-random assignment of posts to users by Facebook.
- The raw data contains about a million unique posts by about 2,600 unique companies.
- These finer categories are combined into 6 broader industry categories following Facebook’s page classification criteria.
2.1.2 Content-coded Data
- First, the authors contract with workers through AMT and tag 5,000 messages for a variety of content profiles.
- Best practices reported in the recent literature are used to ensure the quality of results from AMT and to improve the performance of the NLP algorithm (accuracy, recall, precision).
- The authors include these content categories to investigate more formally considerations laid out in industry white papers, trade-press articles and blog reports about the efficacy of message attributes in social media engagement.
- Table 3 shows sample messages taken from Walmart’s page in December 2012 and shows how the authors would have tagged them.
- The authors discuss their methods (which involve obtaining agreement across 9 tagging individuals) in section 2.2.
2.1.3 Data Descriptive Graphics
- This section presents descriptive statistics of the main stylized patterns in the data.
- Figure 2 shows box plots of the log of impressions, Likes, and comments versus the time (in days) since a post is released (τ).
- Emotional messages obtain the most Likes, followed by posts identified as “likely to be posted by friends” (variable: FRIENDLIKELY).
- This means that 6 in 10 posts by celebrity pages in the data have some sort of small talk and/or content that does not relate to products or brands; and that there are no posts by celebrity owned pages that feature price comparisons.
- Industry category versus message content appearance percentage: the largest share is Celebrity Smalltalk at 60.4% and the smallest is Celebrity PriceCompare at 0%.
2.2 Amazon Mechanical Turk
- The authors now describe their methodology for content-coding messages using AMT.
- Once a Turker tags more than 20 messages, a couple of tagged samples are randomly picked and manually examined for quality and performance.
- The authors believe their methodology for content-classification has good external validity.
- Finally, evaluating AMT-based studies, Buhrmester et al. (2011) conclude that (1) Turkers are demographically more diverse than the samples in regular psychometric studies, and (2) the data obtained are at least as reliable as those obtained via traditional methods as measured by psychometric standards such as Cronbach’s Alpha, a commonly used inter-rater reliability measure.
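The summary says each message is tagged by 9 individuals and that agreement across them is required, but it does not state the exact agreement rule. The following is a minimal sketch assuming a simple-majority rule; the function name `aggregate_tags` and the `min_agreement` threshold are hypothetical and not taken from the paper.

```python
from collections import Counter

def aggregate_tags(votes, min_agreement=5):
    """Return the most common label, or None if too few taggers agree.

    votes: labels from independent taggers for one message and one attribute.
    min_agreement: assumed threshold (5 of 9 is a simple majority);
                   the paper's actual rule is not given in this summary.
    """
    label, count = Counter(votes).most_common(1)[0]
    return label if count >= min_agreement else None

# Nine hypothetical votes on whether a message is "emotional" (1) or not (0).
votes = [1, 1, 0, 1, 1, 0, 1, 1, 0]
print(aggregate_tags(votes))  # six of the nine taggers chose 1
```

Returning `None` for low-agreement messages lets them be dropped or re-tagged rather than entering the training set with a noisy label.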
2.3 Natural Language Processing (NLP) for Attribute Tagging
- Natural Language Processing is an interdisciplinary field composed of techniques and ideas from computer science, statistics and linguistics for enabling computers to parse, understand, store, and convey information in human language.
- When presented with a new set of sentences, the algorithm breaks these down to building blocks, identifies sentence-level attributes and assigns labels using the statistical models that were fine-tuned in the training process.
- The authors then utilize rule-based methods to identify brand and product mentions by looking up these lists.
- Finally, the authors utilize ensemble learning methods that combine classifications from the many classifiers and rule-based algorithms they use.
- This is repeated 10 times, each time using a different subset as the validation sample, and the performance measures averaged across the 10 runs.
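The 10-fold validation loop described above can be sketched as follows. This is an illustrative sketch only: `k_fold_indices`, `cross_validate`, and `majority_trainer` are hypothetical names, the toy classifier stands in for the authors' NLP models, and only accuracy is averaged here (the paper also reports precision and recall).

```python
import random

def k_fold_indices(n_items, k=10, seed=0):
    """Yield (training, validation) index lists; each item is validated once."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        training = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield training, validation

def cross_validate(items, labels, train_fn, k=10):
    """Average validation accuracy over k runs, one per held-out fold."""
    scores = []
    for train_idx, val_idx in k_fold_indices(len(items), k):
        model = train_fn([items[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(model(items[i]) == labels[i] for i in val_idx)
        scores.append(correct / len(val_idx))
    return sum(scores) / len(scores)

def majority_trainer(train_items, train_labels):
    """Toy stand-in classifier: always predict the most frequent training label."""
    majority = max(set(train_labels), key=train_labels.count)
    return lambda item: majority

messages = [f"message {i}" for i in range(20)]
tags = [1] * 20
print(cross_validate(messages, tags, majority_trainer))
```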
3 Empirical Strategy
- Unfortunately, a complication arises because Facebook’s policy of delivery of messages to users is non-random: users more likely to find a post appealing are more likely to see the post in their newsfeed, a filtering implemented via Facebook’s “EdgeRank” algorithm.
- See http://whatisEdgeRank.com for a brief description of EdgeRank.
- Facebook categorizes post-type into 5 classes: status update, photo, video, app, or link.
- Time (τ) refers to the time since the post.
- The econometrics below sets up estimation using the aggregate post-level panel data split by demographics that the authors observe, while acknowledging the fact that non-random targeting is occurring at the individual-level.
3.1 First-stage: Approximating EdgeRank’s Assignment
- The authors represent post k’s type in a vector zk, the time since post k was released in τk, and the history of user i’s past engagement with company j on Facebook in a vector hijt.
- The authors will also estimate the right-hand function gd(.) separately for each demographic bucket, in effect allowing for slope heterogeneity in demographics in addition to intercept heterogeneity across demographics.
- S1 is a cubic spline smoothing function, essentially a piecewise-defined function consisting of many cubic polynomials joined together at regular intervals of the domain such that the fitted curve and its first and second derivatives are continuous.
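The continuity conditions in the description of the cubic spline can be written out explicitly. The notation below is an assumption made for illustration (knots \kappa_1 < ... < \kappa_M and piece coefficients a_m, b_m, c_m, d_m do not appear in this summary); it states the standard definition rather than the authors' exact specification.

```latex
% Piece m of the spline, defined on the interval [\kappa_m, \kappa_{m+1}]:
S_{1,m}(x) = a_m + b_m (x - \kappa_m) + c_m (x - \kappa_m)^2 + d_m (x - \kappa_m)^3,
\qquad m = 1, \dots, M-1 .

% Adjacent pieces agree in value, slope, and curvature at each interior knot:
S_{1,m}(\kappa_{m+1})   = S_{1,m+1}(\kappa_{m+1}), \quad
S'_{1,m}(\kappa_{m+1})  = S'_{1,m+1}(\kappa_{m+1}), \quad
S''_{1,m}(\kappa_{m+1}) = S''_{1,m+1}(\kappa_{m+1}),
\qquad m = 1, \dots, M-2 .
```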
3.2 Second-stage: Modeling Engagement given Post-Assignment
- The authors operationalize engagement via two actions, Likes and comments on the post.
- The selection functions Ψ̂(d)kjt serve as weights that reweigh the probability of Liking to account for the fact that those users were endogenously sampled, thereby correcting for the non-random nature of post assignment when estimating the outcome equation.
- Maximizing the implied binomial likelihood across all the data, treating Ψ̂kjt as given, then delivers estimates of ψ.
- This essentially serves as a “quasi” control function that corrects for the selectivity in the second stage (Blundell and Powell, 2003), where the authors measure the effect of post characteristics on outcomes.
- The only post characteristic used by EdgeRank for assignment is zk, which is controlled for.
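The two-step logic above (plug in the first-stage estimate, then maximize a binomial likelihood) can be written generically. This is a sketch under assumptions: the symbols L (Likes), Q (impressions), and p (the per-impression probability of a Like) are introduced here for illustration, and the way the weight enters p is left unspecified because the summary does not give the functional form.

```latex
% Second-stage estimator, with the first-stage \hat{\Psi} held fixed.
% The binomial coefficient is omitted because it does not depend on \psi.
\hat{\psi} = \arg\max_{\psi} \sum_{d} \sum_{k,j,t}
  \Big[ L^{(d)}_{kjt} \, \log p^{(d)}_{kjt}\big(\psi ; \hat{\Psi}^{(d)}_{kjt}\big)
      + \big( Q^{(d)}_{kjt} - L^{(d)}_{kjt} \big) \,
        \log\Big( 1 - p^{(d)}_{kjt}\big(\psi ; \hat{\Psi}^{(d)}_{kjt}\big) \Big) \Big]
```

The same form would apply to comments by replacing the Like counts with comment counts.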
4.1 First-Stage
- The first-stage model, as specified in Equation 3, approximates EdgeRank’s post assignment algorithm.
- For all demographics, the photo type has the highest coefficient (around 0.25) suggesting that photos are preferred to all other media types by EdgeRank.
- Figure 11 presents a box plot of the coefficients for τ across all 14 demographic bins.
- Finally, the coefficients for number of fans, N(d)jt , are positive and significant but they have relatively low magnitude.
4.2 Second-Stage
- In the second stage, the authors measure the effect of content characteristics on engagement using their selectivity-corrected model from the first stage.
- Interestingly, the interaction between persuasive and informative content is positive, implying that informative content increases engagement only in the presence of persuasive content in the message.
- This highlights the importance of EdgeRank correction.
- Looking at Likes, fewer persuasive content variables have a positive impact, but the results are qualitatively similar to those for comments.
- Similarly, the message type coefficients also vary by industry.
4.3 Out-of-Sample Prediction & Managerial Implications
- To conclude the paper, the authors assess the extent to which the models they develop may be used as an aid to content engineering, and to predict the expected levels of engagement for various hypothetical content profiles a firm may consider for a potential message it could serve to users.
- Then the authors discuss a back-of-the-envelope calculation to show how adding or removing particular content profiles may affect engagement outcomes for typical posts in their data.
- In the last two columns the authors present the predicted and actual ranks for the three messages in terms of their engagement.
- Now note that the standard deviation of the number of impressions is 129,874.
- For a message two standard deviations from the mean number of impressions, i.e., at 10,000 + 2×129,874 = 269,748 impressions, a 30% increase in comments and Likes translates to roughly an increase of 41 comments and 405 Likes, suggesting that content engineering can produce a fairly substantial increase in engagement for many posts.
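The back-of-the-envelope figures above can be checked with a few lines of arithmetic. The baseline engagement levels are not stated in this summary, so the last two values are back-derived from the quoted increases and should be read as rough, implied numbers only.

```python
# Figures quoted in the text.
start_impressions = 10_000   # starting point used in the text
sd_impressions = 129_874     # standard deviation of impressions
lift = 0.30                  # 30% increase in comments and Likes

# Impressions for a message two standard deviations above the starting point.
impressions_2sd = start_impressions + 2 * sd_impressions
print(impressions_2sd)

# Increases quoted in the text for such a message.
extra_comments, extra_likes = 41, 405

# Implied baselines (an inference, not reported in the summary):
# a 30% lift of size X implies a baseline of roughly X / 0.30.
implied_base_comments = extra_comments / lift
implied_base_likes = extra_likes / lift
print(round(implied_base_comments), round(implied_base_likes))
```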
5 Conclusions and Implications
- The authors show through a large-scale study that content engineering in social media has a significant impact on user engagement as measured by Likes and comments for posts.
- This presents a challenge to marketers who seek to build a large following on social media and who seek to leverage that following to disseminate information about new products and promotions.
- In addition, their results are moderated by industry type, suggesting that there is no one-size-fits-all content strategy and that firms need to test multiple content strategies.
- The authors find that posts mentioning holidays, especially by consumer product companies, have a negative effect on engagement.
- The authors hope this study contributes to improving content engineering by firms on social media sites and, more generally, creates interest in evaluating the effects of advertising content on consumer engagement.
References
14,144 citations
"Advertising Content and Consumer En..." refers methods in this paper
...To better describe the correlation matrix graphically and to cluster highly correlated variables together, we ran cluster analysis (hierarchical clustering with the number of clusters determined using the average silhouette width (Rousseeuw, 1987)), which suggested that there are two clusters in the data....
[...]
12,021 citations
"Advertising Content and Consumer En..." refers background in this paper
...Further, the branding literature suggests that functional benefits of a brand also become more persuasive when expressed by the brand’s personality (Keller 1993; Aaker 1996)....
[...]
11,507 citations
"Advertising Content and Consumer En..." refers background in this paper
...For more details, refer to Hastie et al. (2009)....
[...]
...…L1 regularization (which penalizes the number of attributes and is commonly used for attribute selection for problems with many attributes; see (Hastie et al., 2009)), Naive Bayes (a probabilistic classifier that applies Bayes theorem based on presence or absence of features), and support…...
[...]
Frequently Asked Questions (19)
Q2. What have the authors stated for future work in "Advertising content and consumer engagement on social media: evidence from facebook"?
Here again, it is possible this effect may reduce in the future if firms start using emotional content excessively, pushing consumer response to the region of declining returns. Future studies that evaluate other measures of interest can add value, particularly in validating the generalizability of their findings and in exploring mechanisms underpinning the effects the authors describe. There may be other measures worth considering, including whether users share posts with friends, visit the websites of firms posting messages, or buy more products from these firms. Kumar et al. (2013) show that social media can be used to generate growth in sales and ROI, connecting social media metrics such as “comments” to financial metrics.
Q3. What are some examples of sentence-level attributes and rules?
Some examples of sentence-level attributes and rules include: frequent noun words (bag-of-words approach), bigrams, the ratio of parts of speech used, tf-idf (term frequency and inverse document frequency) weighted informative word weights, and a “specific keyword is present” rule.
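Several of the attributes listed above can be sketched in plain Python. This is an illustrative sketch, not the authors' feature pipeline: the tokenizer, the function names, and the particular tf-idf variant (term share in the message times the log of the inverse document frequency) are assumptions.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())

def bigrams(tokens):
    """Return adjacent token pairs."""
    return list(zip(tokens, tokens[1:]))

def tf_idf(documents):
    """Return one {term: weight} dict per document."""
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

def has_keyword(text, keyword):
    """Rule-style attribute: is a specific keyword present?"""
    return keyword.lower() in tokenize(text)

posts = ["Big SALE today!", "Sale ends tomorrow"]
print(bigrams(tokenize(posts[0])))
print(tf_idf(posts)[0])
print(has_keyword(posts[0], "sale"))
```

Under this variant a term that appears in every message, such as "sale" in the two toy posts, receives a weight of zero, which is why tf-idf favors the more distinctive words.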
Q4. What are the practices used to ensure the quality of results from AMT?
Best practices reported in the recent literature are used to ensure the quality of results from AMT and to improve the performance of the NLP algorithm (accuracy, recall, precision).
Q5. What is the role of content engineering in marketing?
Content engineering seeks to develop ad content that better engages targeted users and drives the desired goals of the marketer from the campaigns they implement.
Q6. Why do the authors think their results have broad applicability?
Because of the scale of their study (over 800 firms and 100,000 messages analyzed), the authors believe their results generalize and have broad applicability.
Q7. What is the effect of the interaction between persuasive and informative content?
The interaction between persuasive and informative content is positive, implying that informative content increases engagement only in the presence of persuasive content in the message.
Q8. How many posts are removed after no significant activity?
Removing periods after which no significant activity is observed for a post reduces this to 665,916 rows of post-level snapshots (where activity is defined as either impressions, Likes, or comments).
Q9. How do they show that social media can be used to generate growth in sales?
Kumar et al. (2013) show that social media can be used to generate growth in sales, and ROI, connecting social media metrics such as “comments” to financial metrics.
Q10. What is the effect of the combining of results from a few Turkers?
Snow et al. (2008) show that combining results from a few Turkers can produce data equivalent in quality to that of expert labelers for a variety of text tagging tasks.
Q11. What is the complication of Facebook’s delivery of messages to users?
A complication arises because Facebook’s policy of delivery of messages to users is non-random: users more likely to find a post appealing are more likely to see the post in their newsfeed, a filtering implemented via Facebook’s “EdgeRank” algorithm.
Q12. What age group are the impressions from the new-born clothing brand?
For posts by the new-born clothing brand, the most impressions come from females in the age groups 25-34, 18-24 and 35-44.
Q13. Why are the coefficients for the number of fans positive?
This is because their model includes a smoothed term of the number of fans, s(N(d)jt ), which soaks up both the magnitude and nonlinearity.
Q14. What is the method for combining the prediction from individual classifiers?
This step involves combining the predictions from individual classifiers by weighted-majority voting, unweighted-majority voting, or a more elaborate method called isotonic regression (Zadrozny and Elkan, 2002), and choosing the best-performing method in terms of accuracy, precision and recall for each content profile.
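The two voting schemes named in this answer can be sketched as one function. This is a minimal sketch assuming binary (0/1) predictions; the function name, the tie-breaking rule, and the example weights are hypothetical, and isotonic regression is not covered here.

```python
def weighted_majority_vote(predictions, weights=None):
    """Combine binary predictions from several classifiers.

    predictions: one 0/1 label per classifier.
    weights: one non-negative weight per classifier (for example a
             validation accuracy); None gives unweighted majority voting.
    Ties resolve to 0 (an arbitrary choice made for this sketch).
    """
    if weights is None:
        weights = [1.0] * len(predictions)
    if len(weights) != len(predictions):
        raise ValueError("need exactly one weight per prediction")
    positive_weight = sum(w for p, w in zip(predictions, weights) if p == 1)
    return 1 if positive_weight > sum(weights) / 2 else 0

# Three hypothetical classifiers disagree about one message.
print(weighted_majority_vote([1, 0, 0]))                   # unweighted
print(weighted_majority_vote([1, 0, 0], [0.9, 0.3, 0.3]))  # weighted
```

With equal weights the two classifiers predicting 0 win; once the first classifier is trusted more than the other two combined, its prediction prevails.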
Q15. How many coefficients are excluded from the table?
The authors exclude the 16 estimated τ coefficients from the table since they are all negative and statistically significant just as in the EdgeRank model in Figure 11.
Q16. What are the intercepts for the post assignment algorithm?
The intercepts (θ(d)0 ) indicate that posts by companies in their dataset are shown most often to Females ages 35-44, Females 45-54 and Males 25-34.
Q17. What is the effect of persuasive content on engagement?
The authors find that persuasive content has a positive and statistically significant effect on both types of engagement; further, informative content reduces engagement.
Q18. What is the canonical economic model of advertising as a signal?
The canonical economic model of advertising as a signal (cf. Nelson (1974); Kihlstrom and Riordan (1984); Milgrom and Roberts (1986)) does not postulate any direct role for ad content because advertising intensity conveys all relevant information about product quality in equilibrium to market participants.
Q19. What is the likely explanation for the lack of engagement on Facebook?
One possible explanation is that near holidays, all Facebook pages indiscriminately mention holidays, leading to dulled responses.