
Cross-domain sentiment classification using a sentiment
sensitive thesaurus
Article (Accepted Version)
http://sro.sussex.ac.uk
Bollegala, Danushka, Weir, David and Carroll, John (2013) Cross-domain sentiment classification
using a sentiment sensitive thesaurus. IEEE Transactions on Knowledge and Data Engineering,
25 (8). pp. 1719-1731. ISSN 1041-4347
This version is available from Sussex Research Online: http://sro.sussex.ac.uk/id/eprint/43452/
This document is made available in accordance with publisher policies and may differ from the
published version or from the version of record. If you wish to cite this item you are advised to
consult the publisher’s version. Please see the URL above for details on accessing the published
version.

Cross-Domain Sentiment Classification
using a Sentiment Sensitive Thesaurus
Danushka Bollegala, Member, IEEE, David Weir and John Carroll
Abstract—Automatic classification of sentiment is important for numerous applications such as opinion mining, opinion summarization,
contextual advertising, and market analysis. Typically, sentiment classification has been modeled as the problem of training a binary
classifier using reviews annotated for positive or negative sentiment. However, sentiment is expressed differently in different domains,
and annotating corpora for every possible domain of interest is costly. Applying a sentiment classifier trained using labeled data for a
particular domain to classify sentiment of user reviews on a different domain often results in poor performance because words that
occur in the train (source) domain might not appear in the test (target) domain. We propose a method to overcome this problem in
cross-domain sentiment classification. First, we create a sentiment sensitive distributional thesaurus using labeled data for the source
domains and unlabeled data for both source and target domains. Sentiment sensitivity is achieved in the thesaurus by incorporating
document level sentiment labels in the context vectors used as the basis for measuring the distributional similarity between words.
Next, we use the created thesaurus to expand feature vectors during train and test times in a binary classifier. The proposed method
significantly outperforms numerous baselines and returns results that are comparable with previously proposed cross-domain sentiment
classification methods on a benchmark dataset containing Amazon user reviews for different types of products. We conduct an
extensive empirical analysis of the proposed method on single and multi-source domain adaptation, unsupervised and supervised
domain adaptation, and numerous similarity measures for creating the sentiment sensitive thesaurus. Moreover, our comparisons
against SentiWordNet, a lexical resource for word polarity, show that the created sentiment-sensitive thesaurus accurately captures
words that express similar sentiments.
Index Terms—Cross-Domain Sentiment Classification, Domain Adaptation, Thesauri Creation
1 INTRODUCTION
Users express their opinions about products or services
they consume in blog posts, shopping sites, or
review sites. Reviews on a wide variety of commodities
are available on the Web, such as books (amazon.com),
hotels (tripadvisor.com), movies (imdb.com), automobiles
(caranddriver.com), and restaurants (yelp.com). It
is useful for both consumers and producers to know
what the general public thinks about a
particular product or service. Automatic document level
sentiment classification [1], [2] is the task of classifying
a given review with respect to the sentiment expressed
by the author of the review. For example, a sentiment
classifier might classify a user review about a movie
as positive or negative depending on the sentiment ex-
pressed in the review. Sentiment classification has been
applied in numerous tasks such as opinion mining [3],
opinion summarization [4], contextual advertising [5],
and market analysis [6]. For example, in an opinion
summarization system it is useful to first classify all
reviews into positive or negative sentiments and then
create a summary for each sentiment type for a particular
product. A contextual advert placement system might
decide to display an advert for a particular product if a
positive sentiment is expressed in a blog post.
D. Bollegala is with the University of Tokyo,
danushka@iba.t.u-tokyo.ac.jp
D. Weir and J. Carroll are with the University of Sussex,
{j.a.carroll,d.j.weir}@sussex.ac.uk
Supervised learning algorithms that require labeled
data have been successfully used to build sentiment
classifiers for a given domain [1]. However, sentiment
is expressed differently in different domains, and it is
costly to annotate data for each new domain in which we
would like to apply a sentiment classifier. For example,
in the electronics domain the words “durable” and “light”
are used to express positive sentiment, whereas “expen-
sive” and “short battery life” often indicate negative sen-
timent. On the other hand, if we consider the books
domain, the words “exciting” and “thriller” express positive
sentiment, whereas the words “boring” and “lengthy”
usually express negative sentiment. A classifier trained
on one domain might not perform well on a different
domain because it fails to learn the sentiment of the
unseen words.
The cross-domain sentiment classification problem [7], [8]
focuses on the challenge of training a classifier from one
or more domains (source domains) and applying the
trained classifier on a different domain (target domain).
A cross-domain sentiment classification system must
overcome two main challenges. First, we must iden-
tify which source domain features are related to which
target domain features. Second, we require a learning
framework to incorporate the information regarding the
relatedness of source and target domain features. In this
paper, we propose a cross-domain sentiment classifica-
tion method that overcomes both those challenges.
We model the cross-domain sentiment classification
problem as one of feature expansion, where we append
additional related features to feature vectors that repre-
sent source and target domain reviews in order to reduce
the mismatch of features between the two domains.
Methods that use related features have been successfully
used in numerous tasks such as query expansion [9] in
information retrieval [10], and document classification
[11]. For example, in query expansion, a user query
containing the word car might be expanded to car OR
automobile, thereby retrieving documents that contain
either the term car or the term automobile. However, to
the best of our knowledge, feature expansion techniques
have not previously been applied to the task of cross-
domain sentiment classification.
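The feature expansion idea above can be sketched as follows. The thesaurus, the down-weighting scheme, and the `expand_features` helper are illustrative assumptions for this sketch, not the paper's actual expansion procedure (which is described later and uses a ranking score over thesaurus candidates).

```python
# Sketch of feature expansion: a review's bag-of-words vector is augmented
# with related terms, so a model trained on "automobile" can still fire on
# a review that only says "car". Thesaurus and weight are toy assumptions.

def expand_features(features, thesaurus, weight=0.5):
    """Append related terms as down-weighted extra features."""
    expanded = dict(features)
    for term, value in features.items():
        for related in thesaurus.get(term, []):
            # Down-weight expansion terms so original features dominate.
            expanded[related] = expanded.get(related, 0.0) + weight * value
    return expanded

thesaurus = {"car": ["automobile"]}       # toy single-entry thesaurus
review = {"car": 1.0, "fast": 1.0}
print(expand_features(review, thesaurus))
# {'car': 1.0, 'fast': 1.0, 'automobile': 0.5}
```

The same expansion is applied to both training and test vectors, which is what reduces the feature mismatch between domains.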
We create a sentiment sensitive thesaurus that aligns
different words that express the same sentiment in
different domains. We use labeled data from multiple
source domains and unlabeled data from source and
target domains to represent the distribution of features.
We use lexical elements (unigrams and bigrams of word
lemmas) and sentiment elements (rating information) to
represent a user review. Next, for each lexical element
we measure its relatedness to other lexical elements
and group related lexical elements to create a sentiment
sensitive thesaurus. The thesaurus captures the related-
ness among lexical elements that appear in source and
target domains based on the contexts in which the lexical
elements appear (its distributional context). A distinctive
aspect of our approach is that, in addition to the usual
co-occurrence features typically used in characterizing a
word’s distributional context, we make use, where possi-
ble, of the sentiment label of a document: i.e. sentiment
labels form part of our context features. This is what
makes the distributional thesaurus sentiment sensitive.
Unlabeled data is cheaper to collect compared to labeled
data and is often available in large quantities. The use
of unlabeled data enables us to accurately estimate the
distribution of words in source and target domains.
The proposed method can learn from a large amount
of unlabeled data to produce a robust cross-domain
sentiment classifier.
In our proposed method, we use the automatically
created thesaurus to expand feature vectors in a binary
classifier at train and test times by introducing related
lexical elements from the thesaurus. We use L1 regu-
larized logistic regression as the classification algorithm.
However, the proposed method is agnostic to the prop-
erties of the classifier and can be used to expand feature
vectors for any binary classifier. As shown later in the
experiments, L1 regularization enables us to select a
small subset of features for the classifier.
Our contributions in this work can be summarized as follows.
- We propose a fully automatic method to create a thesaurus that is sensitive to the sentiment of words expressed in different domains. We utilize both labeled and unlabeled data available for the source domains and unlabeled data from the target domain.
- We propose a method to use the created thesaurus to expand feature vectors at train and test times in a binary classifier.
- We compare the sentiment classification accuracy of our proposed method against numerous baselines and previously proposed cross-domain sentiment classification methods for both single-source and multi-source adaptation settings.
- We conduct a series of experiments to evaluate the potential applicability of the proposed method in real-world domain adaptation settings. The performance of the proposed method directly depends on the sentiment sensitive thesaurus we use for feature expansion. In Section 6.3, we create multiple thesauri using different relatedness measures and study the level of performance achieved by the proposed method. In real-world settings we usually have numerous domains at our disposal that can be used as sources to adapt to a novel target domain. Therefore, it is important to study how the performance of the proposed method varies when we have multiple source domains. We study this effect experimentally in Section 6.4. The amount of training data required by a domain adaptation method to achieve an acceptable level of performance on a target domain is an important factor. In Section 6.5, we experimentally study the effect of source/target labeled/unlabeled dataset sizes on the proposed method.
- We study the ability of our method to accurately predict the polarity of words using SentiWordNet, a lexical resource in which each WordNet synset is associated with a polarity score.
2 PROBLEM SETTING
We define a domain D as a class of entities in the world
or a semantic concept. For example, different types of
products such as books, DVDs, or automobiles are con-
sidered as different domains. Given a review written by
a user on a product that belongs to a particular domain,
the objective is to predict the sentiment expressed by
the author in the review about the product. We limit
ourselves to binary sentiment classification of entire
reviews.
We denote a source domain by D_src and a target domain by D_tar. The set of labeled instances from the source domain, L(D_src), contains pairs (t, c) where a review, t, is assigned a sentiment label, c. Here, c ∈ {+1, −1}, and the sentiment labels +1 and −1 respectively denote positive and negative sentiments. In addition to positive and negative sentiment reviews, there can also be neutral and mixed reviews in practical applications. If a review discusses both positive and negative aspects of a particular product, then such a review is considered a mixed sentiment review. On the other hand, if a review contains neither positive nor negative sentiment regarding a particular product, then it is considered neutral. Although this paper focuses only on positive and negative sentiment reviews, it is not hard to extend the proposed method to address multi-category sentiment classification problems.
In addition to the labeled data from the source domain, we assume the availability of unlabeled data from both source and target domains. We denote the set of unlabeled data in the source domain by U(D_src), and the set of unlabeled data in the target domain by U(D_tar). We define cross-domain sentiment classification as the task of learning a binary classifier, F, using L(D_src), U(D_src), and U(D_tar) to predict the sentiment label of a review t in the target domain. Unlike previous work, which attempts to learn a cross-domain classifier using a single source domain, we use data from multiple source domains to learn a robust classifier that generalizes across multiple domains.
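As a minimal illustration of this data setting (the review snippets and variable names below are invented for the example):

```python
# L(D_src): labeled source-domain reviews as (text, label) pairs,
# with labels in {+1, -1}; U(D_src) and U(D_tar): unlabeled reviews.
L_src = [
    ("an interesting and well researched book", +1),
    ("sorely disappointed with this novel", -1),
]
U_src = ["a broad survey of the development of civilization"]
U_tar = ["these knives are already showing spots of rust"]

# The goal: learn a classifier F from L_src, U_src, and U_tar that
# predicts +1 or -1 for an unseen target-domain review.
labels = {c for _, c in L_src}
print(sorted(labels))   # [-1, 1]
```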
3 A MOTIVATING EXAMPLE
Let us consider the reviews shown in Table 1 for the two
domains: books and kitchen appliances. Table 1 shows two
positive reviews and one negative review from each domain.
We have emphasized the words that express the sen-
timent of the author in a review using boldface. From
Table 1 we see that the words excellent, broad, high
quality, interesting, and well researched are used to
express a positive sentiment on books, whereas the word
disappointed indicates a negative sentiment. On the
other hand, in the kitchen appliances domain the words
thrilled, high quality, professional, energy saving, lean,
and delicious express a positive sentiment, whereas
the words rust and disappointed express a negative
sentiment. Although words such as high quality would
express a positive sentiment in both domains, and dis-
appointed a negative sentiment, it is unlikely that we
would encounter words such as well researched for
kitchen appliances or rust or delicious in reviews on
books. Therefore, a model that is trained only using
reviews on books might not have any weights learnt for
delicious or rust, which makes it difficult to accurately
classify reviews on kitchen appliances using this model.
One solution to this feature mismatch problem is
to use a thesaurus that groups different words that
express the same sentiment. For example, if we know
that both excellent and delicious are positive sentiment
words, then we can use this knowledge to expand a
feature vector that contains the word delicious using the
word excellent, thereby reducing the mismatch between
features in a test instance and a trained model. There
are two important questions that must be addressed in
this approach: how to automatically construct a thesaurus
that is sensitive to the sentiments expressed by words?, and
how to use the thesaurus to expand feature vectors during
training and classification?. The first question is discussed
in Section 4, where we propose a distributional approach
to construct a sentiment sensitive thesaurus using both
labeled and unlabeled data from multiple domains. The
second question is addressed in Section 5, where we
propose a ranking score to select the candidates from
the thesaurus to expand a given feature vector.
4 SENTIMENT SENSITIVE THESAURUS
As we saw in our example in Section 3, a fundamental
problem when applying a sentiment classifier trained on
a particular domain to classify reviews on a different
domain is that words (hence features) that appear in the
reviews in the target domain do not always appear in the
trained model. To overcome this feature mismatch prob-
lem, we construct a sentiment sensitive thesaurus that
captures the relatedness of words as used in different
domains. Next, we describe the procedure to construct
our sentiment sensitive thesaurus.
Given a labeled or an unlabeled review, we first
split the review into individual sentences and conduct
part-of-speech (POS) tagging and lemmatization using
the RASP system [12]. Lemmatization is the process of
normalizing the inflected forms of a word to its lemma.
For example, both singular and plural versions of a noun
are lemmatized to the same base form. Lemmatization
reduces feature sparseness and has been shown to be
effective in text classification tasks [13].
We then apply a simple word filter based on POS
tags to filter out function words, retaining only nouns,
verbs, adjectives, and adverbs. In particular, adjectives
have been identified as good indicators of sentiment in
previous work [14], [15]. Following previous work
in cross-domain sentiment classification, we model a
review as a bag of words. We then select unigrams
and bigrams from each sentence. For the remainder of
this paper, we will refer to both unigrams and bigrams
collectively as lexical elements. In previous work on
sentiment classification, it has been shown that the use
of both unigrams and bigrams is useful for training a
sentiment classifier [7]. We note that it is possible to
create lexical elements from both source domain labeled
reviews (L(D_src)) as well as unlabeled reviews from the
source and target domains (U(D_src) and U(D_tar)).
Next, from each source domain labeled review we
create sentiment elements by appending the label of the
review to each lexical element we generate from that
review. For example, consider the sentence selected from
a positive review on a book shown in Table 2. In Table 2,
we use the notation “*P” to indicate positive sentiment
elements and “*N” to indicate negative sentiment ele-
ments. The example sentence shown in Table 2 is selected
from a positively labeled review, and generates positive
sentiment elements as shown in Table 2. Sentiment ele-
ments, extracted only using labeled reviews in the source
domain, encode the sentiment information for lexical
elements extracted from source and target domains.
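The element generation step can be sketched as follows. The helper functions are hypothetical, but their outputs mirror Table 2 (assuming the tokens have already been POS-filtered and lemmatized):

```python
# From a POS-filtered, lemmatized sentence we take unigrams and bigrams
# as lexical elements; for labeled reviews we append the review label
# ("*P" for positive, "*N" for negative) to form sentiment elements.

def lexical_elements(tokens):
    unigrams = list(tokens)
    bigrams = [a + "+" + b for a, b in zip(tokens, tokens[1:])]
    return unigrams + bigrams

def sentiment_elements(tokens, label):
    suffix = "*P" if label == +1 else "*N"
    return [e + suffix for e in lexical_elements(tokens)]

tokens = ["excellent", "broad", "survey"]
print(lexical_elements(tokens))
# ['excellent', 'broad', 'survey', 'excellent+broad', 'broad+survey']
print(sentiment_elements(tokens, +1))
# ['excellent*P', 'broad*P', 'survey*P', 'excellent+broad*P', 'broad+survey*P']
```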
We represent a lexical or sentiment element u by a
feature vector u, where each lexical or sentiment element
w that co-occurs with u in a review sentence contributes
a feature to u. Moreover, the value of the feature w in
vector u is denoted by f(u, w). The vector u can be
seen as a compact representation of the distribution of an
element u over the set of elements that co-occur with u
in the reviews. The distributional hypothesis states that
words that have similar distributions are semantically
similar [16].

TABLE 1
Positive (+) and negative (-) sentiment reviews in two different domains: books and kitchen appliances.

books:
+ Excellent and broad survey of the development of civilization with all the punch of high quality fiction.
+ This is an interesting and well researched book.
- Whenever a new book by Philippa Gregory comes out, I buy it hoping to have the same experience, and lately have been sorely disappointed.

kitchen appliances:
+ I was so thrilled when I unpack my processor. It is so high quality and professional in both looks and performance.
+ Energy saving grill. My husband loves the burgers that I make from this grill. They are lean and delicious.
- These knives are already showing spots of rust despite washing by hand and drying. Very disappointed.

TABLE 2
Generating lexical and sentiment elements from a positive review sentence.

sentence: Excellent and broad survey of the development of civilization.
POS tags: Excellent/JJ and/CC broad/JJ survey/NN1 of/IO the/AT development/NN1 of/IO civilization/NN1
lexical elements (unigrams): excellent, broad, survey, development, civilization
lexical elements (bigrams): excellent+broad, broad+survey, survey+development, development+civilization
sentiment elements: excellent*P, broad*P, survey*P, development*P, civilization*P, excellent+broad*P, broad+survey*P, survey+development*P, development+civilization*P
We compute f(u, w) as the pointwise mutual infor-
mation between a lexical element u and a feature w as
follows:
f(u, w) = \log \left( \frac{c(u,w)/N}{\left( \sum_{i=1}^{n} c(i,w)/N \right) \times \left( \sum_{j=1}^{m} c(u,j)/N \right)} \right).    (1)
Here, c(u, w) denotes the number of review sentences in
which a lexical element u and a feature w co-occur, n
and m respectively denote the total number of lexical
elements and the total number of features, and N =
P
n
i=1
P
m
j=1
c(i, j). Using pointwise mutual information
to weight features has been shown to be useful in numer-
ous tasks in natural language processing such as similar-
ity measurement [17], word classification [18], and word
clustering [19]. However, pointwise mutual information
is known to be biased towards infrequent elements and
features. We follow the discounting approach proposed
by Pantel & Ravichandran [18] to overcome this bias.
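Equation 1 can be sketched on a toy co-occurrence matrix as follows. The counts are invented for the example, and this sketch omits the Pantel & Ravichandran discounting step mentioned above:

```python
# Pointwise mutual information (Equation 1) over a co-occurrence matrix
# c[u][w]: the number of review sentences where element u and feature w
# co-occur. Discounting for rare events is omitted in this sketch.
import math

def pmi(c, u, w):
    N = sum(sum(row.values()) for row in c.values())        # total count
    p_uw = c[u].get(w, 0) / N                               # c(u,w) / N
    p_w = sum(row.get(w, 0) for row in c.values()) / N      # sum_i c(i,w) / N
    p_u = sum(c[u].values()) / N                            # sum_j c(u,j) / N
    if p_uw == 0:
        return 0.0
    return math.log(p_uw / (p_w * p_u))

c = {"excellent": {"book": 4, "grill": 1},
     "delicious": {"book": 0, "grill": 3}}
print(round(pmi(c, "excellent", "book"), 3))   # 0.47, i.e. log(1.6)
```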
Next, for two lexical or sentiment elements u and v
(represented by feature vectors u and v, respectively),
we compute the relatedness τ(v, u) of the element v to
the element u as follows:
\tau(v, u) = \frac{\sum_{w \in \{x \mid f(v,x) > 0\}} f(u, w)}{\sum_{w \in \{x \mid f(u,x) > 0\}} f(u, w)}.    (2)
The relatedness score τ (v, u) can be interpreted as the
proportion of pmi-weighted features of the element u
that are shared with element v. Note that pointwise mu-
tual information values can become negative in practice
even after discounting for rare occurrences. To avoid
considering negative pointwise mutual information val-
ues, we only consider positive weights in Equation 2.
Note that relatedness is an asymmetric measure accord-
ing to the definition given in Equation 2, and the related-
ness τ (v, u) of an element v to another element u is not
necessarily equal to τ(u, v), the relatedness of u to v.
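Equation 2 and its asymmetry can be illustrated with a small sketch (the feature vectors below are invented):

```python
# Relatedness tau(v, u) from Equation 2: the proportion of u's positive
# pmi-weighted feature mass that is shared with v's positive features.
# Vectors are dicts mapping feature names to pmi weights.

def tau(v, u):
    shared = sum(w for f, w in u.items() if w > 0 and v.get(f, 0) > 0)
    total = sum(w for w in u.values() if w > 0)
    return shared / total if total > 0 else 0.0

u = {"tasty": 2.0, "meal": 1.0, "novel": 1.0}
v = {"tasty": 0.5, "meal": 0.2}
print(tau(v, u))   # 0.75: shared features cover 3.0 of u's total mass 4.0
# Asymmetry: tau(u, v) is 1.0, since every feature of v also occurs in u.
```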
In cross-domain sentiment classification the source
and target domains are not symmetric. For example,
consider the two domains shown in Table 1. Given
the target domain (kitchen appliances) and the lexi-
cal element “energy saving”, we must identify that it
is similar in sentiment (positive) to a source domain
(books) lexical element such as “well researched” and
expand “energy saving” by “well researched”, when we
must classify a review in the target (kitchen appliances)
domain. Conversely, let us assume that “energy saving”
also appears in the books domain (e.g. a book about
ecological systems that attempt to minimize the use of
energy) but “well researched” does not appear in the
kitchen appliances domain. Under such circumstances,
we must not expand “well researched” by “energy sav-
ing” when we must classify a review in the target (books)
domain using a model trained on the source (kitchen
appliances) domain reviews.
The relatedness measure defined in Equation 2 can be
further explained as the co-occurrences of u that can be
recalled using v according to the co-occurrence retrieval
framework proposed by Weeds and Weir [20]. In Section
6.3, we empirically compare the proposed relatedness
measure with several other popular relatedness mea-
sures in a cross-domain sentiment classification task.
We use the relatedness measure defined in Equation
2 to construct a sentiment sensitive thesaurus in which,
for each lexical element u, we list the lexical elements v
that co-occur with u (i.e. f(u, v) > 0) in descending
order of the relatedness values τ(v, u). For example,
for the word excellent the sentiment sensitive thesaurus
would list awesome and delicious as related words. In the
remainder of the paper, we use the term base entry to
refer to a lexical element u (e.g. excellent in the previous

References
Book

Introduction to Modern Information Retrieval

Book

Foundations of Statistical Natural Language Processing

TL;DR: This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear and provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations.
Book ChapterDOI

Text Categorization with Support Vector Machines: Learning with Many Relevant Features

TL;DR: This paper explores the use of Support Vector Machines for learning text classifiers from examples and analyzes the particular properties of learning with text data and identifies why SVMs are appropriate for this task.
Book

Opinion Mining and Sentiment Analysis

TL;DR: This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems and focuses on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis.
Proceedings ArticleDOI

Mining and summarizing customer reviews

TL;DR: This research aims to mine and to summarize all the customer reviews of a product, and proposes several novel techniques to perform these tasks.