
Cross-domain sentiment classification using a sentiment
sensitive thesaurus
Article (Accepted Version)
http://sro.sussex.ac.uk
Bollegala, Danushka, Weir, David and Carroll, John (2013) Cross-domain sentiment classification
using a sentiment sensitive thesaurus. IEEE Transactions on Knowledge and Data Engineering,
25 (8). pp. 1719-1731. ISSN 1041-4347
This version is available from Sussex Research Online: http://sro.sussex.ac.uk/id/eprint/43452/
This document is made available in accordance with publisher policies and may differ from the
published version or from the version of record. If you wish to cite this item you are advised to
consult the publisher’s version. Please see the URL above for details on accessing the published
version.

Cross-Domain Sentiment Classification
using a Sentiment Sensitive Thesaurus
Danushka Bollegala, Member, IEEE, David Weir and John Carroll
Abstract—Automatic classification of sentiment is important for numerous applications such as opinion mining, opinion summarization,
contextual advertising, and market analysis. Typically, sentiment classification has been modeled as the problem of training a binary
classifier using reviews annotated for positive or negative sentiment. However, sentiment is expressed differently in different domains,
and annotating corpora for every possible domain of interest is costly. Applying a sentiment classifier trained using labeled data for a
particular domain to classify sentiment of user reviews on a different domain often results in poor performance because words that
occur in the train (source) domain might not appear in the test (target) domain. We propose a method to overcome this problem in
cross-domain sentiment classification. First, we create a sentiment sensitive distributional thesaurus using labeled data for the source
domains and unlabeled data for both source and target domains. Sentiment sensitivity is achieved in the thesaurus by incorporating
document level sentiment labels in the context vectors used as the basis for measuring the distributional similarity between words.
Next, we use the created thesaurus to expand feature vectors during train and test times in a binary classifier. The proposed method
significantly outperforms numerous baselines and returns results that are comparable with previously proposed cross-domain sentiment
classification methods on a benchmark dataset containing Amazon user reviews for different types of products. We conduct an
extensive empirical analysis of the proposed method on single and multi-source domain adaptation, unsupervised and supervised
domain adaptation, and numerous similarity measures for creating the sentiment sensitive thesaurus. Moreover, our comparisons
against SentiWordNet, a lexical resource for word polarity, show that the created sentiment-sensitive thesaurus accurately captures
words that express similar sentiments.
Index Terms—Cross-Domain Sentiment Classification, Domain Adaptation, Thesauri Creation
1 INTRODUCTION
Users express their opinions about products or services
they consume in blog posts, shopping sites, or
review sites. Reviews on a wide variety of commodities
are available on the Web, such as books (amazon.com),
hotels (tripadvisor.com), movies (imdb.com), automobiles
(caranddriver.com), and restaurants (yelp.com). It
is useful for both consumers and producers to know
what the general public thinks about a
particular product or service. Automatic document level
sentiment classification [1], [2] is the task of classifying
a given review with respect to the sentiment expressed
by the author of the review. For example, a sentiment
classifier might classify a user review about a movie
as positive or negative depending on the sentiment ex-
pressed in the review. Sentiment classification has been
applied in numerous tasks such as opinion mining [3],
opinion summarization [4], contextual advertising [5],
and market analysis [6]. For example, in an opinion
summarization system it is useful to first classify all
reviews into positive or negative sentiments and then
create a summary for each sentiment type for a particular
product. A contextual advert placement system might
decide to display an advert for a particular product if a
positive sentiment is expressed in a blog post.
D. Bollegala is with the University of Tokyo,
danushka@iba.t.u-tokyo.ac.jp
D. Weir and J. Carroll are with the University of Sussex,
{j.a.carroll,d.j.weir}@sussex.ac.uk
Supervised learning algorithms that require labeled
data have been successfully used to build sentiment
classifiers for a given domain [1]. However, sentiment
is expressed differently in different domains, and it is
costly to annotate data for each new domain in which we
would like to apply a sentiment classifier. For example,
in the electronics domain the words “durable” and “light”
are used to express positive sentiment, whereas “expen-
sive” and “short battery life” often indicate negative sen-
timent. On the other hand, if we consider the books
domain, the words “exciting” and “thriller” express positive
sentiment, whereas the words “boring” and “lengthy”
usually express negative sentiment. A classifier trained
on one domain might not perform well on a different
domain because it fails to learn the sentiment of the
unseen words.
The cross-domain sentiment classification problem [7], [8]
focuses on the challenge of training a classifier from one
or more domains (source domains) and applying the
trained classifier on a different domain (target domain).
A cross-domain sentiment classification system must
overcome two main challenges. First, we must iden-
tify which source domain features are related to which
target domain features. Second, we require a learning
framework to incorporate the information regarding the
relatedness of source and target domain features. In this
paper, we propose a cross-domain sentiment classifica-
tion method that overcomes both those challenges.
We model the cross-domain sentiment classification
problem as one of feature expansion, where we append
additional related features to feature vectors that repre-
sent source and target domain reviews in order to reduce
the mismatch of features between the two domains.
Methods that use related features have been successfully
used in numerous tasks such as query expansion [9] in
information retrieval [10], and document classification
[11]. For example, in query expansion, a user query
containing the word car might be expanded to car OR
automobile, thereby retrieving documents that contain
either the term car or the term automobile. However, to
the best of our knowledge, feature expansion techniques
have not previously been applied to the task of cross-
domain sentiment classification.
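The feature expansion idea above can be sketched as follows. The thesaurus, the down-weighting scheme, and the `expand_features` helper are illustrative assumptions for this sketch, not the paper's actual expansion procedure (which is described later and uses a ranking score over thesaurus candidates).

```python
# Sketch of feature expansion: a review's bag-of-words vector is augmented
# with related terms, so a model trained on "automobile" can still fire on
# a review that only says "car". Thesaurus and weight are toy assumptions.

def expand_features(features, thesaurus, weight=0.5):
    """Append related terms as down-weighted extra features."""
    expanded = dict(features)
    for term, value in features.items():
        for related in thesaurus.get(term, []):
            # Down-weight expansion terms so original features dominate.
            expanded[related] = expanded.get(related, 0.0) + weight * value
    return expanded

thesaurus = {"car": ["automobile"]}       # toy single-entry thesaurus
review = {"car": 1.0, "fast": 1.0}
print(expand_features(review, thesaurus))
# {'car': 1.0, 'fast': 1.0, 'automobile': 0.5}
```

The same expansion is applied to both training and test vectors, which is what reduces the feature mismatch between domains.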
We create a sentiment sensitive thesaurus that aligns
different words that express the same sentiment in
different domains. We use labeled data from multiple
source domains and unlabeled data from source and
target domains to represent the distribution of features.
We use lexical elements (unigrams and bigrams of word
lemmas) and sentiment elements (rating information) to
represent a user review. Next, for each lexical element
we measure its relatedness to other lexical elements
and group related lexical elements to create a sentiment
sensitive thesaurus. The thesaurus captures the related-
ness among lexical elements that appear in source and
target domains based on the contexts in which the lexical
elements appear (its distributional context). A distinctive
aspect of our approach is that, in addition to the usual
co-occurrence features typically used in characterizing a
word’s distributional context, we make use, where possi-
ble, of the sentiment label of a document: i.e. sentiment
labels form part of our context features. This is what
makes the distributional thesaurus sentiment sensitive.
Unlabeled data is cheaper to collect compared to labeled
data and is often available in large quantities. The use
of unlabeled data enables us to accurately estimate the
distribution of words in source and target domains.
The proposed method can learn from a large amount
of unlabeled data to produce a robust cross-domain
sentiment classifier.
In our proposed method, we use the automatically
created thesaurus to expand feature vectors in a binary
classifier at train and test times by introducing related
lexical elements from the thesaurus. We use L1 regu-
larized logistic regression as the classification algorithm.
However, the proposed method is agnostic to the prop-
erties of the classifier and can be used to expand feature
vectors for any binary classifier. As shown later in the
experiments, L1 regularization enables us to select a
small subset of features for the classifier.
Our contributions in this work can be summarized as follows.
- We propose a fully automatic method to create a thesaurus that is sensitive to the sentiment of words expressed in different domains. We utilize both labeled and unlabeled data available for the source domains and unlabeled data from the target domain.
- We propose a method to use the created thesaurus to expand feature vectors at train and test times in a binary classifier.
- We compare the sentiment classification accuracy of our proposed method against numerous baselines and previously proposed cross-domain sentiment classification methods for both single-source and multi-source adaptation settings.
- We conduct a series of experiments to evaluate the potential applicability of the proposed method in real-world domain adaptation settings. The performance of the proposed method directly depends on the sentiment sensitive thesaurus we use for feature expansion. In Section 6.3, we create multiple thesauri using different relatedness measures and study the level of performance achieved by the proposed method. In real-world settings we usually have numerous domains at our disposal that can be used as sources to adapt to a novel target domain. Therefore, it is important to study how the performance of the proposed method varies when we have multiple source domains. We study this effect experimentally in Section 6.4. The amount of training data required by a domain adaptation method to achieve an acceptable level of performance on a target domain is an important factor. In Section 6.5, we experimentally study the effect of source/target labeled/unlabeled dataset sizes on the proposed method.
- We study the ability of our method to accurately predict the polarity of words using SentiWordNet, a lexical resource in which each WordNet synset is associated with a polarity score.
2 PROBLEM SETTING
We define a domain D as a class of entities in the world
or a semantic concept. For example, different types of
products such as books, DVDs, or automobiles are con-
sidered as different domains. Given a review written by
a user on a product that belongs to a particular domain,
the objective is to predict the sentiment expressed by
the author in the review about the product. We limit
ourselves to binary sentiment classification of entire
reviews.
We denote a source domain by D_src and a target domain by D_tar. The set of labeled instances from the source domain, L(D_src), contains pairs (t, c) where a review, t, is assigned a sentiment label, c. Here, c ∈ {+1, −1}, and the sentiment labels +1 and −1 respectively denote positive and negative sentiments. In addition to positive and negative sentiment reviews, there can also be neutral and mixed reviews in practical applications. If a review discusses both positive and negative aspects of a particular product, then such a review is considered a mixed sentiment review. On the other hand, if a review contains neither positive nor negative sentiment regarding a particular product, then it is considered neutral. Although this paper focuses only on positive and negative sentiment reviews, it is not hard to extend the proposed method to address multi-category sentiment classification problems.
In addition to the labeled data from the source domain, we assume the availability of unlabeled data from both source and target domains. We denote the set of unlabeled data in the source domain by U(D_src), and the set of unlabeled data in the target domain by U(D_tar). We define cross-domain sentiment classification as the task of learning a binary classifier, F, using L(D_src), U(D_src), and U(D_tar) to predict the sentiment label of a review t in the target domain. Unlike previous work, which attempts to learn a cross-domain classifier using a single source domain, we use data from multiple source domains to learn a robust classifier that generalizes across multiple domains.
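As a minimal illustration of this data setting (the review snippets and variable names below are invented for the example):

```python
# L(D_src): labeled source-domain reviews as (text, label) pairs,
# with labels in {+1, -1}; U(D_src) and U(D_tar): unlabeled reviews.
L_src = [
    ("an interesting and well researched book", +1),
    ("sorely disappointed with this novel", -1),
]
U_src = ["a broad survey of the development of civilization"]
U_tar = ["these knives are already showing spots of rust"]

# The goal: learn a classifier F from L_src, U_src, and U_tar that
# predicts +1 or -1 for an unseen target-domain review.
labels = {c for _, c in L_src}
print(sorted(labels))   # [-1, 1]
```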
3 A MOTIVATING EXAMPLE
Let us consider the reviews shown in Table 1 for the two
domains: books and kitchen appliances. Table 1 shows two
positive reviews and one negative review from each domain.
We have emphasized the words that express the sen-
timent of the author in a review using boldface. From
Table 1 we see that the words excellent, broad, high
quality, interesting, and well researched are used to
express a positive sentiment on books, whereas the word
disappointed indicates a negative sentiment. On the
other hand, in the kitchen appliances domain the words
thrilled, high quality, professional, energy saving, lean,
and delicious express a positive sentiment, whereas
the words rust and disappointed express a negative
sentiment. Although words such as high quality would
express a positive sentiment in both domains, and dis-
appointed a negative sentiment, it is unlikely that we
would encounter words such as well researched for
kitchen appliances or rust or delicious in reviews on
books. Therefore, a model that is trained only using
reviews on books might not have any weights learnt for
delicious or rust, which makes it difficult to accurately
classify reviews on kitchen appliances using this model.
One solution to this feature mismatch problem is
to use a thesaurus that groups different words that
express the same sentiment. For example, if we know
that both excellent and delicious are positive sentiment
words, then we can use this knowledge to expand a
feature vector that contains the word delicious using the
word excellent, thereby reducing the mismatch between
features in a test instance and a trained model. There
are two important questions that must be addressed in
this approach: how to automatically construct a thesaurus
that is sensitive to the sentiments expressed by words?, and
how to use the thesaurus to expand feature vectors during
training and classification?. The first question is discussed
in Section 4, where we propose a distributional approach
to construct a sentiment sensitive thesaurus using both
labeled and unlabeled data from multiple domains. The
second question is addressed in Section 5, where we
propose a ranking score to select the candidates from
the thesaurus to expand a given feature vector.
4 SENTIMENT SENSITIVE THESAURUS
As we saw in our example in Section 3, a fundamental
problem when applying a sentiment classifier trained on
a particular domain to classify reviews on a different
domain is that words (hence features) that appear in the
reviews in the target domain do not always appear in the
trained model. To overcome this feature mismatch prob-
lem, we construct a sentiment sensitive thesaurus that
captures the relatedness of words as used in different
domains. Next, we describe the procedure to construct
our sentiment sensitive thesaurus.
Given a labeled or an unlabeled review, we first
split the review into individual sentences and conduct
part-of-speech (POS) tagging and lemmatization using
the RASP system [12]. Lemmatization is the process of
normalizing the inflected forms of a word to its lemma.
For example, both singular and plural versions of a noun
are lemmatized to the same base form. Lemmatization
reduces feature sparseness and has been shown to be
effective in text classification tasks [13].
We then apply a simple word filter based on POS
tags to filter out function words, retaining only nouns,
verbs, adjectives, and adverbs. In particular, adjectives
have been identified as good indicators of sentiment in
previous work [14], [15]. Following previous work
in cross-domain sentiment classification, we model a
review as a bag of words. We then select unigrams
and bigrams from each sentence. For the remainder of
this paper, we will refer to both unigrams and bigrams
collectively as lexical elements. In previous work on
sentiment classification, it has been shown that the use
of both unigrams and bigrams is useful for training a
sentiment classifier [7]. We note that it is possible to
create lexical elements from both source domain labeled
reviews (L(D_src)) as well as unlabeled reviews from the
source and target domains (U(D_src) and U(D_tar)).
Next, from each source domain labeled review we
create sentiment elements by appending the label of the
review to each lexical element we generate from that
review. For example, consider the sentence selected from
a positive review on a book shown in Table 2. In Table 2,
we use the notation “*P” to indicate positive sentiment
elements and “*N” to indicate negative sentiment ele-
ments. The example sentence shown in Table 2 is selected
from a positively labeled review, and generates positive
sentiment elements as shown in Table 2. Sentiment ele-
ments, extracted only using labeled reviews in the source
domain, encode the sentiment information for lexical
elements extracted from source and target domains.
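The element generation step can be sketched as follows. The helper functions are hypothetical, but their outputs mirror Table 2 (assuming the tokens have already been POS-filtered and lemmatized):

```python
# From a POS-filtered, lemmatized sentence we take unigrams and bigrams
# as lexical elements; for labeled reviews we append the review label
# ("*P" for positive, "*N" for negative) to form sentiment elements.

def lexical_elements(tokens):
    unigrams = list(tokens)
    bigrams = [a + "+" + b for a, b in zip(tokens, tokens[1:])]
    return unigrams + bigrams

def sentiment_elements(tokens, label):
    suffix = "*P" if label == +1 else "*N"
    return [e + suffix for e in lexical_elements(tokens)]

tokens = ["excellent", "broad", "survey"]
print(lexical_elements(tokens))
# ['excellent', 'broad', 'survey', 'excellent+broad', 'broad+survey']
print(sentiment_elements(tokens, +1))
# ['excellent*P', 'broad*P', 'survey*P', 'excellent+broad*P', 'broad+survey*P']
```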
We represent a lexical or sentiment element u by a
feature vector u, where each lexical or sentiment element
w that co-occurs with u in a review sentence contributes
a feature to u. Moreover, the value of the feature w in
vector u is denoted by f(u, w). The vector u can be
seen as a compact representation of the distribution of an
element u over the set of elements that co-occur with u
in the reviews. The distributional hypothesis states that
words that have similar distributions are semantically
similar [16].

TABLE 1
Positive (+) and negative (-) sentiment reviews in two different domains: books and kitchen appliances.

books:
+ Excellent and broad survey of the development of civilization with all the punch of high quality fiction.
+ This is an interesting and well researched book.
- Whenever a new book by Philippa Gregory comes out, I buy it hoping to have the same experience, and lately have been sorely disappointed.

kitchen appliances:
+ I was so thrilled when I unpack my processor. It is so high quality and professional in both looks and performance.
+ Energy saving grill. My husband loves the burgers that I make from this grill. They are lean and delicious.
- These knives are already showing spots of rust despite washing by hand and drying. Very disappointed.

TABLE 2
Generating lexical and sentiment elements from a positive review sentence.

sentence: Excellent and broad survey of the development of civilization.
POS tags: Excellent/JJ and/CC broad/JJ survey/NN1 of/IO the/AT development/NN1 of/IO civilization/NN1
lexical elements (unigrams): excellent, broad, survey, development, civilization
lexical elements (bigrams): excellent+broad, broad+survey, survey+development, development+civilization
sentiment elements: excellent*P, broad*P, survey*P, development*P, civilization*P, excellent+broad*P, broad+survey*P, survey+development*P, development+civilization*P
We compute f(u, w) as the pointwise mutual infor-
mation between a lexical element u and a feature w as
follows:
f(u, w) = \log \left( \frac{c(u,w)/N}{\left( \sum_{i=1}^{n} c(i,w)/N \right) \times \left( \sum_{j=1}^{m} c(u,j)/N \right)} \right).    (1)
Here, c(u, w) denotes the number of review sentences in
which a lexical element u and a feature w co-occur, n
and m respectively denote the total number of lexical
elements and the total number of features, and N =
P
n
i=1
P
m
j=1
c(i, j). Using pointwise mutual information
to weight features has been shown to be useful in numer-
ous tasks in natural language processing such as similar-
ity measurement [17], word classification [18], and word
clustering [19]. However, pointwise mutual information
is known to be biased towards infrequent elements and
features. We follow the discounting approach proposed
by Pantel & Ravichandran [18] to overcome this bias.
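Equation 1 can be sketched on a toy co-occurrence matrix as follows. The counts are invented for the example, and this sketch omits the Pantel & Ravichandran discounting step mentioned above:

```python
# Pointwise mutual information (Equation 1) over a co-occurrence matrix
# c[u][w]: the number of review sentences where element u and feature w
# co-occur. Discounting for rare events is omitted in this sketch.
import math

def pmi(c, u, w):
    N = sum(sum(row.values()) for row in c.values())        # total count
    p_uw = c[u].get(w, 0) / N                               # c(u,w) / N
    p_w = sum(row.get(w, 0) for row in c.values()) / N      # sum_i c(i,w) / N
    p_u = sum(c[u].values()) / N                            # sum_j c(u,j) / N
    if p_uw == 0:
        return 0.0
    return math.log(p_uw / (p_w * p_u))

c = {"excellent": {"book": 4, "grill": 1},
     "delicious": {"book": 0, "grill": 3}}
print(round(pmi(c, "excellent", "book"), 3))   # 0.47, i.e. log(1.6)
```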
Next, for two lexical or sentiment elements u and v
(represented by feature vectors u and v, respectively),
we compute the relatedness τ(v, u) of the element v to
the element u as follows:
\tau(v, u) = \frac{\sum_{w \in \{x \mid f(v,x) > 0\}} f(u, w)}{\sum_{w \in \{x \mid f(u,x) > 0\}} f(u, w)}.    (2)
The relatedness score τ (v, u) can be interpreted as the
proportion of pmi-weighted features of the element u
that are shared with element v. Note that pointwise mu-
tual information values can become negative in practice
even after discounting for rare occurrences. To avoid
considering negative pointwise mutual information val-
ues, we only consider positive weights in Equation 2.
Note that relatedness is an asymmetric measure accord-
ing to the definition given in Equation 2, and the related-
ness τ (v, u) of an element v to another element u is not
necessarily equal to τ(u, v), the relatedness of u to v.
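Equation 2 and its asymmetry can be illustrated with a small sketch (the feature vectors below are invented):

```python
# Relatedness tau(v, u) from Equation 2: the proportion of u's positive
# pmi-weighted feature mass that is shared with v's positive features.
# Vectors are dicts mapping feature names to pmi weights.

def tau(v, u):
    shared = sum(w for f, w in u.items() if w > 0 and v.get(f, 0) > 0)
    total = sum(w for w in u.values() if w > 0)
    return shared / total if total > 0 else 0.0

u = {"tasty": 2.0, "meal": 1.0, "novel": 1.0}
v = {"tasty": 0.5, "meal": 0.2}
print(tau(v, u))   # 0.75: shared features cover 3.0 of u's total mass 4.0
# Asymmetry: tau(u, v) is 1.0, since every feature of v also occurs in u.
```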
In cross-domain sentiment classification the source
and target domains are not symmetric. For example,
consider the two domains shown in Table 1. Given
the target domain (kitchen appliances) and the lexi-
cal element “energy saving”, we must identify that it
is similar in sentiment (positive) to a source domain
(books) lexical element such as “well researched” and
expand “energy saving” by “well researched”, when we
must classify a review in the target (kitchen appliances)
domain. Conversely, let us assume that “energy saving”
also appears in the books domain (e.g. a book about
ecological systems that attempt to minimize the use of
energy) but “well researched” does not appear in the
kitchen appliances domain. Under such circumstances,
we must not expand “well researched” by “energy sav-
ing” when we must classify a review in the target (books)
domain using a model trained on the source (kitchen
appliances) domain reviews.
The relatedness measure defined in Equation 2 can be
further explained as the co-occurrences of u that can be
recalled using v according to the co-occurrence retrieval
framework proposed by Weeds and Weir [20]. In Section
6.3, we empirically compare the proposed relatedness
measure with several other popular relatedness mea-
sures in a cross-domain sentiment classification task.
We use the relatedness measure defined in Equation
2 to construct a sentiment sensitive thesaurus in which,
for each lexical element u, we list the lexical elements v
that co-occur with u (i.e. f(u, v) > 0) in descending
order of the relatedness values τ(v, u). For example,
for the word excellent the sentiment sensitive thesaurus
would list awesome and delicious as related words. In the
remainder of the paper, we use the term base entry to
refer to a lexical element u (e.g. excellent in the previous

References
Book

Introduction to Modern Information Retrieval

Book

Foundations of Statistical Natural Language Processing

TL;DR: This foundational text is the first comprehensive introduction to statistical natural language processing (NLP) to appear and provides broad but rigorous coverage of mathematical and linguistic foundations, as well as detailed discussion of statistical methods, allowing students and researchers to construct their own implementations.
Book ChapterDOI

Text Categorization with Support Vector Machines: Learning with Many Relevant Features

TL;DR: This paper explores the use of Support Vector Machines for learning text classifiers from examples and analyzes the particular properties of learning with text data and identifies why SVMs are appropriate for this task.
Book

Opinion Mining and Sentiment Analysis

TL;DR: This survey covers techniques and approaches that promise to directly enable opinion-oriented information-seeking systems and focuses on methods that seek to address the new challenges raised by sentiment-aware applications, as compared to those that are already present in more traditional fact-based analysis.
Proceedings ArticleDOI

Mining and summarizing customer reviews

TL;DR: This research aims to mine and to summarize all the customer reviews of a product, and proposes several novel techniques to perform these tasks.