Stock returns and investor sentiment: textual analysis and social media


(4342.(7&(918=#460.3,&5*67*6.*7 (4342.(7

Stock Returns and Investor Sentiment: Textual
Analysis and Social Media
Zachary McGurk
Canisius College2(,960>(&3.7.97*)9
Adam Nowak
West Virginia University)&24;&02&.1;:9*)9
Joshua C. Hall
West Virginia University/47-9&-&112&.1;:9*)9
4114;8-.7&3)&)).8.43&1;4607&8 -@576*7*&6(-6*547.846=;:9*)9*(43%;460.3,5&5*67
&684+8-* 42598*6(.*3(*7422437&3)8-* .3&3(*&3).3&3(.&1&3&,*2*38
422437
?.7#460.3,&5*6.7'649,-884=49+46+6**&3)45*3&((*77'=8-*(4342.(7&8?**7*&6(-*547.846=#"!8-&7'**3&((*58*)+46
.3(197.43.3(4342.(7&(918=#460.3,&5*67*6.*7'=&3&98-46.>*)&)2.3.786&8464+?**7*&6(-*547.846=#"!46246*.3+462&8.43
51*&7*(438&(8 .&3-&62432&.1;:9*)9
.,.8&1422437.8&8.43
(960$&(-&6=4;&0)&2&3)&1147-9&84(0*89637&3)3:*7846*38.2*38 *<89&13&1=7.7&3)4(.&1*).&
 Economics Faculty Working Papers Series
-@576*7*&6(-6*547.846=;:9*)9*(43%;460.3,5&5*67

Stock Returns and Investor Sentiment: Textual
Analysis and Social Media
Zachary McGurk
Department of Economics & Finance, Canisius College
Buffalo, NY United States
mcgurkz@canisius.edu
Adam Nowak
John Chambers College of Business and Economics, West Virginia University
Morgantown, WV 26506, United States
adam.nowak@mail.wvu.edu
Joshua C. Hall
John Chambers College of Business and Economics, West Virginia University
Morgantown, WV 26506, United States
joshua.hall@mail.wvu.edu
(Corresponding author)
Abstract
The behavioral finance literature has found that investor sentiment has predictive
ability for equity returns. This differs from standard finance theory, which provides
no role for investor sentiment. We examine the relationship between investor sentiment
and stock returns by employing textual analysis on social media posts. We find that
our investor sentiment measure has a positive and significant effect on abnormal stock
returns. These findings are consistent across a number of different models and
specifications, providing further evidence against non-behavioral theories.
JEL-Classification: G12, G13, G14
Keywords: Investor sentiment, supervised learning, stock returns, social media,
sufficient reduction, predictive regression

1 Introduction
As described in Malkiel and Fama (1970), the Efficient Market Hypothesis (EMH)
predicts that asset prices fully reflect all available information. Rational investors,
in response, choose asset portfolios that diversify away idiosyncratic risk. As such,
asset prices are a function of market fundamentals only. When assets are mispriced
through the actions of irrational investors, rational investors are able to use
arbitrage to correct asset prices.
In contrast to the EMH, behavioral finance theory suggests that the feelings of
irrational investors (Investor Sentiment) drive a portion of asset prices. Due to the
specific characteristics of some assets (small, hard to value, limited information, etc.),
arbitrage by rational investors becomes costly and assets are perpetually mispriced.¹
Recent empirical studies have found Investor Sentiment to be related to stock returns.²
While the empirical finance literature has found Investor Sentiment to be a valid
predictor of the cross section and time series of stock returns, studies differ in how
the Investor Sentiment measure is estimated. As noted by Baker and Wurgler (2006,
2007), Investor Sentiment is difficult to measure directly. As a result, the literature
has relied on proxies developed from market/investor surveys, data mining methods,
and textual analysis of annual reports, commercial media, and social media.
Due to data limitations, the literature based on market/investor surveys and data
mining methods focuses on the impact of investor sentiment on returns over monthly or
longer time horizons. While most of these studies show a relationship between asset
returns and investor sentiment, they may not capture the full impact of investor
sentiment. If asset markets are partially efficient (i.e., investor sentiment does not
determine a portion of stock prices), and information is randomly dispersed, then
markets should be the least efficient in the very short run.
Another critique of this literature is that these investor sentiment measures capture
overall market sentiment rather than asset-specific sentiment. Baker and Wurgler
(2006) argue that, because information about smaller firms is imperfect, any new
information causes investors to engage in irrational speculative trading. Market
sentiment may not necessarily capture this speculative feeling in smaller firms.
¹ See, for example, Baker and Wurgler (2006, 2007).
² See Nardo et al. (2016), Bukovina (2016) and Zhou (2018) for a review of the recent literature.
While much of the textual analysis literature has been able to account for the
preceding critiques, the estimation methods used may not be able to fully capture
investor sentiment. A portion of the previous literature has relied on dictionary-based
methods in determining Investor Sentiment (Loughran and McDonald, 2011; Chen et al.,
2014; Jiang et al., 2019). To estimate sentiment, these studies pre-define a dictionary
of positive and negative finance words and measure overall investor sentiment as the
net positive word count. One limitation of this approach is that important terms that
convey sentiment may be missing from the dictionary. The method also gives each word
equal weight in determining sentiment and does not account for sentiment conveyed in
multi-word phrases.³
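The dictionary approach described above can be illustrated with a minimal Python
sketch. The word lists below are hypothetical placeholders chosen for illustration;
they are not the Loughran and McDonald (2011) lists, and the function name
`net_positive_count` is an assumption of this sketch rather than anything used in the
studies cited.

```python
import re

# Hypothetical mini word lists, for illustration only.
POSITIVE = {"gain", "strong", "beat", "upgrade"}
NEGATIVE = {"loss", "weak", "miss", "downgrade"}


def net_positive_count(text):
    """Return the number of positive words minus the number of negative words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    positive = sum(token in POSITIVE for token in tokens)
    negative = sum(token in NEGATIVE for token in tokens)
    # Every dictionary word carries the same weight of one.
    return positive - negative
```

Under these assumed lists, "Strong quarter, earnings beat" scores 2, while "not a loss"
scores -1 and "to the moon" scores 0. The last two cases show the limitations noted
above: equal word weights, missing terms, and no handling of multi-word phrases.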
Other studies have used machine learning methods to estimate investor sentiment
(Bartov et al., 2018; Ranco et al., 2015; Yang et al., 2015; Sun et al., 2016;
Renault, 2017; Behrendt and Schmidt, 2018). These studies improve on dictionary-based
methods because the resulting investor sentiment indexes allow textual terms to be
weighted differently. These papers focus on the extremely short-run (5- to 30-minute
intervals) impact of investor sentiment on returns and, given data limitations, are
unable to create equity-specific investor sentiment measures.⁴
Given the limitations of the previous literature, we propose a new method for
estimating a Twitter-based, stock-specific investor sentiment index using the approach
developed in Taddy (2013a). This method differs in that estimates of sentiment do not
rely on a predefined dictionary, and individual words are not assumed to carry the
same sentiment information. Further, given the data-rich environment of Twitter, we
are able to create equity-specific investor sentiment indexes. In this method, each
post in a training set of posts by individual users on Twitter (tweets) is classified
as conveying positive, neutral, or negative sentiment. These labeled tweets are then
used to predict the sentiment information in all remaining tweets. For comparison, we
also develop a dictionary-based investor sentiment measure using a method similar to
that of Loughran and McDonald (2011).
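To make the supervised workflow concrete (label a training set, then predict the
sentiment of the remaining tweets), the sketch below uses a small multinomial naive
Bayes classifier. This is a deliberately simple stand-in and is not the Taddy (2013a)
method used in the paper; the training tweets, ticker symbols, and function names are
all made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Lowercase word tokens; a cashtag such as $abc is kept as one token.
    return re.findall(r"\$?[a-z]+", text.lower())


def train(labeled_tweets):
    """Collect per-label token counts from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> token counts
    label_counts = Counter()            # label -> number of tweets
    for text, label in labeled_tweets:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocab = {tok for counts in word_counts.values() for tok in counts}
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest posterior log-probability."""
    word_counts, label_counts, vocab = model
    n_tweets = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        total = sum(word_counts[label].values())
        score = math.log(label_counts[label] / n_tweets)
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are ignored
                # Add-one (Laplace) smoothing; weights differ by word and label.
                score += math.log((word_counts[label][tok] + 1) / (total + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up labeled training tweets, for illustration only.
training = [
    ("$abc breaking out, buying more", "positive"),
    ("great earnings from $abc", "positive"),
    ("$abc is crashing, selling everything", "negative"),
    ("terrible guidance from $xyz", "negative"),
    ("$xyz reports earnings tomorrow", "neutral"),
    ("$abc annual meeting is on friday", "neutral"),
]
model = train(training)
```

Unlike a fixed dictionary, the word weights here are learned from the labeled tweets, so
different words contribute differently to each sentiment class. For example,
`predict(model, "selling everything, crashing")` returns `"negative"` under this toy
training set.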
We further use our investor sentiment index to test the empirical validity of the
EMH and behavioral finance theories. Specifically, we estimate the relationship
between our investor sentiment measures (negative, neutral, and positive sentiment)
and the cross section of abnormal stock returns. For robustness, we test whether this
relationship is similar across firm sizes. Finally, we determine whether investor
sentiment is useful in forecasting abnormal returns at the market level and by firm size.
³ Loughran and McDonald (2011) include a method for weighting individual words; however,
this is based on word frequency rather than perceived sentiment information.
⁴ Bloomberg and Thomson Reuters have created commercial, equity-specific investor
sentiment measures based on textual analysis. These measures are proprietary and, as
such, their estimation methods are unknown. These measures are used by Sun et al.
(2016) and Behrendt and Schmidt (2018).
The social media platform Twitter is used by over 320 million users who express
opinions and thoughts on a number of different subject matters, including equity
prices.⁵ Further, Twitter is unique in that an individual can reference a specific
stock by affixing a ‘$’ before the stock symbol in a tweet. This allows all Twitter
users to search for tweets discussing a particular stock, and it allows researchers to
collect tweets by individuals that are specific to a stock. Anecdotal evidence has
shown individual Twitter posts (tweets) to influence specific stock returns. On
January 10, 2011, Business Insider reported that hip hop artist 50-Cent (Curtis
Jackson) tweeted:
HNHI is the stock symbol for TVG there launching 15 different products.
they are no joke get in now.
The article goes on to state (Weisenthal, 2011, no page number):
In the three months to the end of September, the company was operating
at a loss with cash of just $198,000 and a deficit of $3.3m. Then, on
November 23, it said it would offer 180m shares to the public at a price
of just 17 cents... trading under the stock name HNHI was worth just 4
cents each. Spurred by the tweet, the stock took off. It hit nearly 50 cents
on Monday, before closing at 39 cents.
By the end of the month, the stock was up to $1.68. This price increase was relatively
short lived. In early May, 50-Cent terminated his relationship with the company, and
the stock dropped in value to $0.10.
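The ‘$’ convention described above is what makes it possible to assign tweets to
individual stocks. A minimal sketch of such an extraction step is given below. The
regular expression, the assumed limit of one to five letters per symbol, and the
function name `extract_cashtags` are illustrative assumptions and do not describe the
paper's actual data collection.

```python
import re

# Assumption: a stock symbol is 1 to 5 letters directly after '$'.
# The lookbehind avoids matching inside words or after another '$'.
CASHTAG = re.compile(r"(?<![A-Za-z0-9_$])\$([A-Za-z]{1,5})\b")


def extract_cashtags(tweet):
    """Return the distinct stock symbols referenced in a tweet, in order of appearance."""
    symbols = []
    for match in CASHTAG.finditer(tweet):
        symbol = match.group(1).upper()
        if symbol not in symbols:
            symbols.append(symbol)
    return symbols
```

For example, `extract_cashtags("Buying $aapl and $MSFT, sold $AAPL at $150")` returns
`["AAPL", "MSFT"]`, while dollar amounts such as "$198,000" are not treated as stock
symbols.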
Overall, we find a relationship between abnormal stock returns and our estimated
investor sentiment indexes. An increase in positive sentiment is related to an
increase in abnormal returns, while estimated negative sentiment has a limited
relationship with abnormal returns. These results are consistent across firm sizes.
Using out-of-sample forecasting tests, we find that investor sentiment produces
marginally more accurate forecasts than a constant-only model. Gains in forecast
accuracy, however, are limited to around one percent. Our results indicate that
individuals on Twitter are relaying stale information as opposed to providing novel
insights.
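The comparison against a constant-only model can be sketched as follows. This is a
generic expanding-window illustration and not the paper's forecasting procedure, which
is described in Section 5. The lagged-sentiment regression, the function names, and the
synthetic data are assumptions of this sketch; the data are constructed so that returns
depend exactly on lagged sentiment, which real returns do not.

```python
def ols_fit(x, y):
    """Intercept and slope of a univariate least-squares regression."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx > 0 else 0.0
    return mean_y - slope * mean_x, slope


def out_of_sample_mse(sentiment, returns, first_forecast):
    """Expanding-window one-step-ahead forecasts of returns[t].

    Returns (mse_constant, mse_sentiment). Both models use only data
    observed before period t; the sentiment model regresses returns[k]
    on sentiment[k - 1].
    """
    if first_forecast < 3:
        raise ValueError("need at least two past observations to fit the regression")
    sq_err_const, sq_err_sent = [], []
    for t in range(first_forecast, len(returns)):
        past_returns = returns[1:t]
        past_sentiment = sentiment[0:t - 1]  # lagged one period
        const_forecast = sum(past_returns) / len(past_returns)
        intercept, slope = ols_fit(past_sentiment, past_returns)
        sent_forecast = intercept + slope * sentiment[t - 1]
        sq_err_const.append((returns[t] - const_forecast) ** 2)
        sq_err_sent.append((returns[t] - sent_forecast) ** 2)
    n = len(sq_err_const)
    return sum(sq_err_const) / n, sum(sq_err_sent) / n


# Synthetic data for illustration only: returns are exactly 0.5 times lagged sentiment.
sentiment = [0.2, -0.1, 0.4, 0.0, -0.3, 0.1, 0.5, -0.2, 0.3, -0.4]
returns = [0.0] + [0.5 * s for s in sentiment[:-1]]
mse_const, mse_sent = out_of_sample_mse(sentiment, returns, first_forecast=4)
```

The relative gain in accuracy can then be summarized as `1 - mse_sent / mse_const`. In
this artificial example the gain is close to one by construction, whereas the gains
reported above are around one percent.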
The remainder of the paper proceeds as follows. Section 2 reviews the relevant
literature on investor sentiment and textual analysis, Section 3 describes the
methodology and data, Section 4 details the cross-sectional analysis, Section 5
discusses the forecasting method and forecasting results, and Section 6 concludes.
⁵ Source: Twitter 2018 Annual Report.

References
Baker, M. and Wurgler, J. (2006). Investor sentiment and the cross-section of stock returns. The Journal of Finance.
Fama, E. F. (1970). Efficient capital markets: a review of theory and empirical work. The Journal of Finance, 25(2), 383-417.
Loughran, T. and McDonald, B. (2011). When is a liability not a liability? Textual analysis, dictionaries, and 10-Ks. The Journal of Finance.