Mining and Summarizing Customer Reviews
Minqing Hu and Bing Liu
Department of Computer Science
University of Illinois at Chicago
851 South Morgan Street
Chicago, IL 60607-7053
{mhu1, liub}@cs.uic.edu
ABSTRACT
Merchants selling products on the Web often ask their customers
to review the products that they have purchased and the
associated services. As e-commerce is becoming more and more
popular, the number of customer reviews that a product receives
grows rapidly. For a popular product, the number of reviews can
be in hundreds or even thousands. This makes it difficult for a
potential customer to read them to make an informed decision on
whether to purchase the product. It also makes it difficult for the
manufacturer of the product to keep track and to manage customer
opinions. For the manufacturer, there are additional difficulties
because many merchant sites may sell the same product and the
manufacturer normally produces many kinds of products. In this
research, we aim to mine and to summarize all the customer
reviews of a product. This summarization task is different from
traditional text summarization because we only mine the features
of the product on which the customers have expressed their
opinions and whether the opinions are positive or negative. We do
not summarize the reviews by selecting a subset or rewriting some
of the original sentences from the reviews to capture the main
points as in the classic text summarization. Our task is performed
in three steps: (1) mining product features that have been
commented on by customers; (2) identifying opinion sentences in
each review and deciding whether each opinion sentence is
positive or negative; (3) summarizing the results. This paper
proposes several novel techniques to perform these tasks. Our
experimental results using reviews of a number of products sold
online demonstrate the effectiveness of the techniques.
Categories and Subject Descriptors
H.2.8 [Database Management]: Database Applications – data
mining. I.2.7 [Artificial Intelligence]: Natural Language
Processing – text analysis.
General Terms
Algorithms, Experimentation, Human Factors.
Keywords
Text mining, sentiment classification, summarization, reviews.
1. INTRODUCTION
With the rapid expansion of e-commerce, more and more products
are sold on the Web, and more and more people are also buying
products online. In order to enhance customer satisfaction and
shopping experience, it has become a common practice for online
merchants to enable their customers to review or to express
opinions on the products that they have purchased. With more and
more common users becoming comfortable with the Web, an
increasing number of people are writing reviews. As a result, the
number of reviews that a product receives grows rapidly. Some
popular products can get hundreds of reviews at some large
merchant sites. Furthermore, many reviews are long and have
only a few sentences containing opinions on the product. This
makes it hard for a potential customer to read them to make an
informed decision on whether to purchase the product. If he/she
only reads a few reviews, he/she may get a biased view. The large
number of reviews also makes it hard for product manufacturers
to keep track of customer opinions of their products. For a product
manufacturer, there are additional difficulties because many
merchant sites may sell its products, and the manufacturer may
(almost always) produce many kinds of products.
In this research, we study the problem of generating feature-based
summaries of customer reviews of products sold online. Here,
features broadly mean product features (or attributes) and
functions. Given a set of customer reviews of a particular product,
the task involves three subtasks: (1) identifying features of the
product that customers have expressed their opinions on (called
product features); (2) for each feature, identifying review
sentences that give positive or negative opinions; and (3)
producing a summary using the discovered information.
Let us use an example to illustrate a feature-based summary.
Assume that we summarize the reviews of a particular digital
camera, digital_camera_1. The summary looks like the following:
Digital_camera_1:
    Feature: picture quality
        Positive: 253
            <individual review sentences>
        Negative: 6
            <individual review sentences>
    Feature: size
        Positive: 134
            <individual review sentences>
        Negative: 10
            <individual review sentences>
Figure 1: An example summary

In Figure 1, picture quality and (camera) size are the product
features. There are 253 customer reviews that express positive
opinions about the picture quality, and only 6 that express
negative opinions. The <individual review sentences> link points
to the specific sentences and/or the whole reviews that give
positive or negative comments about the feature.
With such a feature-based summary, a potential customer can
easily see how the existing customers feel about the digital
camera. If he/she is very interested in a particular feature, he/she
can drill down by following the <individual review sentences>
link to see why existing customers like it and/or what they
complain about. For a manufacturer, it is possible to combine
summaries from multiple merchant sites to produce a single report
for each of its products.
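To make the summary format concrete, the following is a minimal sketch (ours, not part of the paper's FBS system) of how sentence-level results could be aggregated into the structure of Figure 1; the example triples and function names are illustrative.

from collections import defaultdict

# Hypothetical per-sentence results from steps (1) and (2):
# (feature, opinion sentence, orientation) triples.
results = [
    ("picture quality", "The pictures are very clear.", "positive"),
    ("size", "While light, it will not easily fit in pockets.", "negative"),
]

def build_summary(triples):
    """Group opinion sentences by product feature and orientation."""
    summary = defaultdict(lambda: {"positive": [], "negative": []})
    for feature, sentence, orientation in triples:
        summary[feature][orientation].append(sentence)
    return summary

def print_summary(product, summary):
    """Render the summary in the layout of Figure 1."""
    print(product + ":")
    for feature, opinions in summary.items():
        print("  Feature: " + feature)
        for orientation in ("positive", "negative"):
            sentences = opinions[orientation]
            print("    %s: %d" % (orientation.capitalize(), len(sentences)))
            for s in sentences:
                print("      <" + s + ">")

print_summary("digital_camera_1", build_summary(results))
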
Our task is different from traditional text summarization [15, 39,
36] in a number of ways. First of all, a summary in our case is
structured rather than another (but shorter) free text document as
produced by most text summarization systems. Second, we are
only interested in features of the product that customers have
opinions on and also whether the opinions are positive or
negative. We do not summarize the reviews by selecting or
rewriting a subset of the original sentences from the reviews to
capture their main points as in traditional text summarization.
As indicated above, our task is performed in three main steps:
(1) Mining product features that have been commented on by
customers. We make use of both data mining and natural
language processing techniques to perform this task. This
part of the study has been reported in [19]. However, for
completeness, we will summarize its techniques in this paper
and also present a comparative evaluation.
(2) Identifying opinion sentences in each review and deciding
whether each opinion sentence is positive or negative. Note
that these opinion sentences must contain one or more
product features identified above. To decide the opinion
orientation of each sentence (whether the opinion expressed
in the sentence is positive or negative), we perform three
subtasks. First, a set of adjective words (which are normally
used to express opinions) is identified using a natural
language processing method. These words are also called
opinion words in this paper. Second, for each opinion word,
we determine its semantic orientation, e.g., positive or
negative. A bootstrapping technique is proposed to perform
this task using WordNet [29, 12]. Finally, we decide the
opinion orientation of each sentence. An effective algorithm
is also given for this purpose.
(3) Summarizing the results. This step aggregates the results of
previous steps and presents them in the format of Figure 1.
Section 3 presents the detailed techniques for performing these
tasks. A system, called FBS (Feature-Based Summarization), has
also been implemented. Our experimental results with a large
number of customer reviews of 5 products sold online show that
FBS and its techniques are highly effective.
2. RELATED WORK
Our work is closely related to Dave, Lawrence and Pennock’s
work in [9] on semantic classification of reviews. Using available
training corpus from some Web sites, where each review already
has a class (e.g., thumbs-up and thumbs-downs, or some other
quantitative or binary ratings), they designed and experimented with a
number of methods for building sentiment classifiers. They show
that such classifiers perform quite well with test reviews. They
also used their classifiers to classify sentences obtained from Web
search results returned by a search engine with a product name as
the search query. However, the performance was
limited because a sentence contains much less information than a
review. Our work differs from theirs in three main aspects: (1)
Our focus is not on classifying each review as a whole but on
classifying each sentence in a review. Within a review some
sentences may express positive opinions about certain product
features while some other sentences may express negative
opinions about some other product features. (2) The work in [9]
does not mine product features from reviews on which the
reviewers have expressed their opinions. (3) Our method does not
need a corpus to perform the task.
In [30], Morinaga et al. compare reviews of different products in
one category to find the reputation of the target product.
However, their work does not summarize reviews, and it does not mine
product features on which the reviewers have expressed their
opinions. Although they do find some frequent phrases indicating
reputations, these phrases may not be product features (e.g.,
“doesn’t work”, “benchmark result” and “no problem(s)”). In [5],
Cardie et al. discuss opinion-oriented information extraction. They
aim to create summary representations of opinions to perform
question answering. They propose to use opinion-oriented
“scenario templates” to act as summary representations of the
opinions expressed in a document, or a set of documents. Our task
is different. We aim to identify product features and user opinions
on these features to automatically produce a summary. Also, no
template is used in our summary generation.
Our work is also related to but different from subjective genre
classification, sentiment classification, text summarization and
terminology finding. We discuss each of them below.
2.1 Subjective Genre Classification
Genre classification classifies texts into different styles, e.g.,
“editorial”, “novel”, “news”, “poem” etc. Although some
techniques for genre classification can recognize documents that
express opinions [23, 24, 14], they do not tell whether the
opinions are positive or negative. In our work, we need to
determine whether an opinion is positive or negative and to
perform opinion classification at the sentence level rather than at
the document level.
A more closely related work is [17], in which the authors
investigate sentence subjectivity classification and conclude that
the presence and type of adjectives in a sentence are indicative of
whether the sentence is subjective or objective. However, their
work does not address our specific task of determining the
semantic orientations of those subjective sentences. Neither do
they find features on which opinions have been expressed.
2.2 Sentiment Classification
The works of Hearst [18] and Sack [35] on sentiment-based
classification of entire documents use models inspired by
cognitive linguistics. Das and Chen [8] use a manually crafted
lexicon in conjunction with several scoring methods to classify
stock postings on an investor bulletin. Huettner and Subasic [20]
also manually construct a discriminant-word lexicon and use
fuzzy logic to classify sentiments. Tong [41] generates sentiment
timelines. It tracks online discussions about movies and displays a
plot of the number of positive and negative sentiment messages
over time. Messages are classified by looking for specific phrases
that indicate the author’s sentiment towards the movie (e.g.,
“great acting”, “wonderful visuals”, “uneven editing”). Each
phrase must be manually added to a special lexicon and manually
tagged as indicating positive or negative sentiment. The lexicon is
domain dependent (e.g., movies) and must be rebuilt for each new
domain. In contrast, in our work, we only manually create a small
list of seed adjectives tagged with positive or negative labels. Our
seed adjective list is also domain independent. An effective
technique is proposed to grow this list using WordNet.
Turney’s work in [42] applies a specific unsupervised learning
technique based on the mutual information between document
phrases and the words “excellent” and “poor”, where the mutual
information is computed using statistics gathered by a search
engine. Pang et al. [33] examine several supervised machine
learning methods for sentiment classification of movie reviews
and conclude that machine learning techniques outperform the
method that is based on human-tagged features, although none of
the existing methods could handle sentiment classification with
reasonable accuracy. Our work differs from these works on
sentiment classification in that we perform classification at the
sentence level while they determine the sentiment of each
document. They also do not find features on which opinions have
been expressed, which is very important in practice.
2.3 Text Summarization
Existing text summarization techniques mainly fall into one of two
categories: template instantiation and passage extraction. Work in
the former framework includes [10, 39], which emphasizes the
identification and extraction of certain core entities and facts in
a document, which are packaged in a template. This framework
requires background knowledge in order to instantiate a template
to a suitable level of detail. Therefore, it is not domain or genre
independent [37, 38]. This is different from our work as our
techniques do not fill any template and are domain independent.
The passage extraction framework [e.g., 32, 25, 36] identifies
certain segments of the text (typically sentences) that are the most
representative of the document’s content. Our work is different in
that we do not extract representative sentences, but identify and
extract those specific product features and the opinions related to
them.
Boguraev and Kennedy [2] propose to find a few very prominent
expressions, objects or events in a document and use them to help
summarize the document. Our work is again different as we find
all product features in a set of customer reviews regardless of
whether they are prominent or not. Thus, our summary is not a
traditional text summary.
Most existing works on text summarization focus on a single
document. Some researchers also studied summarization of
multiple documents covering similar information. Their main
purpose is to summarize the similarities and differences in the
information content among these documents [27]. Our work is
related but quite different because we aim to find the key features
that are talked about in multiple reviews. We do not summarize
similarities and differences of reviews.
2.4 Terminology Finding
In terminology finding, there are basically two techniques for
discovering terms in corpora: symbolic approaches that rely on
syntactic description of terms, namely noun phrases, and
statistical approaches that exploit the fact that the words
composing a term tend to be found close to each other and
reoccurring [21, 22, 7, 6]. However, using noun phrases tends to
produce too many non-terms (low precision), while using
reoccurring phrases misses many low frequency terms, terms with
variations, and terms with only one word. Our association mining
based technique does not have these problems, and we can also
find infrequent features by exploiting the fact that we are only
interested in features that the users have expressed opinions on.
3. THE PROPOSED TECHNIQUES
Figure 2 gives the architectural overview of our opinion
summarization system.
The inputs to the system are a product name and an entry Web
page for all the reviews of the product. The output is the summary
of the reviews, like the one shown in the introduction section.
The system performs the summarization in three main steps (as
discussed before): (1) mining product features that have been
commented on by customers; (2) identifying opinion sentences in
each review and deciding whether each opinion sentence is
positive or negative; (3) summarizing the results. These steps are
performed in multiple sub-steps.
Given the inputs, the system first downloads (or crawls) all the
reviews and puts them in the review database. It then finds the
"hot" (or frequent) features on which many people have expressed
their opinions. After that, the opinion words are extracted using
the resulting frequent features, and the semantic orientations of
the opinion words are identified with the help of WordNet. Using
the extracted opinion words, the system then finds the infrequent
features. In the last two steps, the orientation of each opinion
sentence is identified and a final summary is produced. Note that
POS tagging is the part-of-speech tagging [28] from natural
language processing, which helps us to find opinion features.

Figure 2: Feature-based opinion summarization. (Pipeline: Crawl
Reviews → Review Database → POS Tagging → Frequent Feature
Identification → Feature Pruning → Frequent Features → Opinion Word
Extraction → Opinion Words → Opinion Orientation Identification and
Infrequent Feature Identification → Infrequent Features; Opinion
Sentence Orientation Identification → Summary Generation → Summary.)
Below, we discuss each of the sub-steps in turn.
3.1 Part-of-Speech Tagging (POS)
Product features are usually nouns or noun phrases in review
sentences. Thus the part-of-speech tagging is crucial. We used the
NLProcessor linguistic parser [31] to parse each review to split
text into sentences and to produce the part-of-speech tag for each
word (whether the word is a noun, verb, adjective, etc). The
process also identifies simple noun and verb groups (syntactic
chunking). The following shows a sentence with POS tags.
<S> <NG><W C='PRP' L='SS' T='w' S='Y'> I </W> </NG>
<VG> <W C='VBP'> am </W><W C='RB'> absolutely
</W></VG> <W C='IN'> in </W> <NG> <W C='NN'> awe
</W> </NG> <W C='IN'> of </W> <NG> <W C='DT'> this
</W> <W C='NN'> camera </W></NG><W C='.'> .
</W></S>
NLProcessor generates XML output. For instance, <W C=‘NN’>
indicates a noun and <NG> indicates a noun group/noun phrase.
Each sentence is saved in the review database along with the POS
tag information of each word in the sentence. A transaction file is
then created for the generation of frequent features in the next
step. In this file, each line contains words from one sentence,
which includes only the identified nouns and noun phrases of the
sentence. Other components of the sentence are unlikely to be
product features. Some pre-processing of words is also performed,
which includes removal of stopwords, stemming and fuzzy
matching. Fuzzy matching is used to deal with word variants and
misspellings [19].
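As an illustrative stand-in for the NLProcessor pipeline described above (not the authors' implementation), the sketch below uses NLTK to split sentences, tag parts of speech, chunk noun groups, and emit the per-sentence transactions of stemmed nouns; the fuzzy matching of word variants from [19] is omitted.

import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer

# Assumes the punkt, averaged_perceptron_tagger and stopwords NLTK data
# packages are installed (via nltk.download).
stemmer = PorterStemmer()
stop = set(stopwords.words("english"))

# Noun groups: one or more consecutive nouns, optionally preceded by adjectives.
grammar = "NG: {<JJ>*<NN.*>+}"
chunker = nltk.RegexpParser(grammar)

def sentence_transactions(review_text):
    """Yield, per sentence, the set of (stemmed) nouns/noun phrases --
    one transaction line for the frequent-feature mining step."""
    for sent in nltk.sent_tokenize(review_text):
        tagged = nltk.pos_tag(nltk.word_tokenize(sent))
        tree = chunker.parse(tagged)
        items = set()
        for subtree in tree.subtrees(filter=lambda t: t.label() == "NG"):
            words = [w.lower() for w, tag in subtree.leaves()
                     if tag.startswith("NN") and w.lower() not in stop]
            if words:
                items.add(" ".join(stemmer.stem(w) for w in words))
        yield items

text = "The pictures are very clear. The battery life is too short."
for transaction in sentence_transactions(text):
    print(transaction)
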
3.2 Frequent Features Identification
This sub-step identifies product features on which many people
have expressed their opinions. Before discussing frequent feature
identification, we first give some example sentences from the
reviews to describe what kinds of opinions we will be handling.
Since our system aims to find what people like and
dislike about a given product, how to find the product features
that people talk about is the crucial step. However, due to the
difficulty of natural language understanding, some types of
sentences are hard to deal with. Let us see an easy and a hard
sentence from the reviews of a digital camera:
“The pictures are very clear.”
In this sentence, the user is satisfied with the picture quality of
the camera; picture is the feature that the user talks about. While
the feature in this sentence is mentioned explicitly, some features
are implicit and hard to find. For example,
“While light, it will not easily fit in pockets.”
This customer is talking about the size of the camera, but the word
size does not appear in the sentence. In this work, we focus on
finding features that appear explicitly as nouns or noun phrases in
the reviews. We leave finding implicit features to our future work.
Here, we focus on finding frequent features, i.e., those features
that are talked about by many customers (finding infrequent
features will be discussed later). For this purpose, we use
association mining [1] to find all frequent itemsets. In our context,
an itemset is simply a set of words or a phrase that occurs together
in some sentences.
The main reason for using association mining is the
following observation. It is common that a customer review
contains many things that are not directly related to product
features. Different customers usually have different stories.
However, when they comment on product features, the words that
they use converge. Thus using association mining to find frequent
itemsets is appropriate because those frequent itemsets are likely
to be product features. Those noun/noun phrases that are
infrequent are likely to be non-product features.
We run the association miner CBA [26], which is based on the
Apriori algorithm in [1], on the transaction set of noun/noun
phrases produced in the previous step. Each resulting frequent
itemset is a possible feature. In our work, we define an itemset as
frequent if it appears in more than 1% (minimum support) of the
review sentences. The generated frequent itemsets are also called
candidate frequent features in this paper.
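As a hedged illustration of this step, the sketch below mines candidate frequent features with a plain Apriori implementation from mlxtend as a stand-in for the CBA miner, using the paper's 1% minimum support; the example transactions are made up.

import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori

# Illustrative transactions: the noun words of each review sentence,
# as produced by the POS-tagging step (Section 3.1).
transactions = [
    ["picture", "quality"],
    ["picture"],
    ["battery", "life"],
    ["picture", "quality", "camera"],
    ["size"],
]

# One-hot encode the transactions and run Apriori.
te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(transactions).transform(transactions),
                      columns=te.columns_)

# An itemset is frequent if it appears in more than 1% (minimum support)
# of the review sentences, as in the paper.
candidate_features = apriori(onehot, min_support=0.01, use_colnames=True)
print(candidate_features.sort_values("support", ascending=False))
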
However, not all candidate frequent features generated by
association mining are genuine features. Two types of pruning are
used to remove those unlikely features.
Compactness pruning: This method checks features that contain
at least two words, which we call feature phrases, and removes
those that are likely to be meaningless.
The association mining algorithm does not consider the position
of an item (or word) in a sentence. However, in a sentence, words
that appear together in a specific order are more likely to be
meaningful phrases. Therefore, some of the frequent feature
phrases generated by association mining may not be genuine
features. Compactness pruning aims to prune those candidate
features whose words do not appear together in a specific order.
See [19] for the detailed definition of compactness and also the
pruning procedure.
Redundancy pruning: In this step, we focus on removing
redundant features that contain single words. To describe the
meaning of redundant features, we use the concept of p-support
(pure support). p-support of feature ftr is the number of sentences
that ftr appears in as a noun or noun phrase, and these sentences
must contain no feature phrase that is a superset of ftr.
We use a minimum p-support value to prune those redundant
features. If a feature has a p-support lower than the minimum p-
support (in our system, we set it to 3) and the feature is a subset of
another feature phrase (which suggests that the feature alone may
not be interesting), it is pruned. For instance, life by itself is not a
useful feature while battery life is a meaningful feature phrase.
See [19] for more explanations.
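Based only on the definition of p-support given above, a minimal sketch of redundancy pruning might look like the following; the sentence representation (each sentence as the set of nouns and noun phrases it contains) and the helper names are our assumptions.

def p_support(feature, sentences, feature_phrases):
    """p-support of `feature`: the number of sentences in which it appears
    as a noun/noun phrase and which contain no candidate feature phrase
    that is a superset of it."""
    supersets = [p for p in feature_phrases
                 if p != feature and set(feature.split()) < set(p.split())]
    return sum(1 for nps in sentences
               if feature in nps and not any(p in nps for p in supersets))

def redundancy_prune(single_word_features, feature_phrases, sentences,
                     min_p_support=3):
    """Prune a single-word feature if its p-support is below the threshold
    and it is a subset of some multi-word feature phrase."""
    kept = []
    for f in single_word_features:
        subsumed = any(f in p.split() for p in feature_phrases)
        low = p_support(f, sentences, feature_phrases) < min_p_support
        if not (low and subsumed):
            kept.append(f)
    return kept

# Each sentence is represented by the set of nouns and noun phrases it contains.
sentences = [
    {"battery", "life", "battery life"},
    {"battery", "life", "battery life", "camera"},
    {"life"},
    {"picture", "quality", "picture quality"},
    {"picture"},
    {"picture", "camera"},
    {"picture"},
]
# "life" is pruned (low p-support and subsumed by "battery life"); "picture" is kept.
print(redundancy_prune(["life", "picture"],
                       ["battery life", "picture quality"], sentences))
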
3.3 Opinion Words Extraction
We now identify opinion words. These are words that are
primarily used to express subjective opinions. Clearly, this is
related to existing work on distinguishing sentences used to
express subjective opinions from sentences used to objectively
describe some factual information [43]. Previous work on
subjectivity [44, 4] has established a positive statistically
significant correlation with the presence of adjectives. Thus the
presence of adjectives is useful for predicting whether a sentence
is subjective, i.e., expressing an opinion. This paper uses
adjectives as opinion words. We also limit the opinion words
extraction to those sentences that contain one or more product
features, as we are only interested in customers’ opinions on these
product features. Let us first define an opinion sentence.
Definition: opinion sentence
If a sentence contains one or more product features and one or
more opinion words, then the sentence is called an opinion
sentence.
We extract opinion words in the following manner (Figure 3):
for each sentence in the review database
    if it contains a frequent feature, extract all the adjective
        words as opinion words
    for each feature in the sentence
        the nearby adjective is recorded as its effective opinion
        /* a nearby adjective is the adjacent adjective that modifies
           the noun/noun phrase that is the frequent feature */
Figure 3: Opinion word extraction
For example, horrible is the effective opinion of strap in “The
strap is horrible and gets in the way of parts of the camera you
need access to.” Effective opinions will be useful when we
predict the orientation of opinion sentences.
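A rough rendering of the Figure 3 procedure in code, assuming sentences are already POS-tagged, is sketched below; using the nearest adjective as the effective opinion is a simplification of the paper's "adjacent adjective that modifies the feature", and all names are ours.

def extract_opinions(tagged_sentence, frequent_features):
    """For one POS-tagged sentence (list of (word, tag) pairs), return the
    adjectives it contains (opinion words) and, for each frequent feature
    present, the nearest adjective as its effective opinion."""
    words = [w.lower() for w, _ in tagged_sentence]
    adjectives = [(i, w.lower()) for i, (w, t) in enumerate(tagged_sentence)
                  if t.startswith("JJ")]
    opinion_words = [w for _, w in adjectives]

    effective = {}
    for feature in frequent_features:
        head = feature.split()[-1]          # head noun of the feature phrase
        if head in words and adjectives:
            pos = words.index(head)
            _, nearest = min(adjectives, key=lambda a: abs(a[0] - pos))
            effective[feature] = nearest
    return opinion_words, effective

tagged = [("The", "DT"), ("strap", "NN"), ("is", "VBZ"), ("horrible", "JJ")]
print(extract_opinions(tagged, ["strap"]))
# (['horrible'], {'strap': 'horrible'})
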
3.4 Orientation Identification for Opinion
Words
For each opinion word, we need to identify its semantic
orientation, which will be used to predict the semantic orientation
of each opinion sentence. The semantic orientation of a word
indicates the direction that the word deviates from the norm for its
semantic group. Words that encode a desirable state (e.g.,
beautiful, awesome) have a positive orientation, while words that
represent undesirable states have a negative orientation (e.g.,
disappointing). While orientations apply to many adjectives, there
are also those adjectives that have no orientation (e.g., external,
digital) [17]. In this work, we are interested in only positive and
negative orientations.
Unfortunately, dictionaries and similar sources, e.g., WordNet
[29], do not include semantic orientation information for each
word. Hatzivassiloglou and McKeown [16] use a supervised
learning algorithm to infer the semantic orientation of adjectives
from constraints on conjunctions. Although their method achieves
high precision, it relies on a large corpus, and needs a large
amount of manually tagged training data. In Turney’s work [42],
the semantic orientation of a phrase is calculated as the mutual
information between the given phrase and the word “excellent”
minus the mutual information between the given phrase and the
word “poor”. The mutual information is estimated by issuing
queries to a search engine and noting the number of hits. The
paper [42], however, does not report the results of semantic
orientations of individual words/phrases. Instead it only gives the
classification results of reviews. We do not use these techniques
in this paper as both works rely on statistical information from a
rather big corpus. Their methods are also inefficient. For example,
in [42], for each word or phrase, a Web search and a substantial
processing of the returned results are needed.
In this research, we propose a simple and yet effective method by
utilizing the adjective synonym set and antonym set in WordNet
[29] to predict the semantic orientations of adjectives.
In WordNet, adjectives are organized into bipolar clusters, as
illustrated in Figure 4. The cluster for fast/slow consists of two
half clusters, one for senses of fast and one for senses of slow.
Each half cluster is headed by a head synset, in this case fast and
its antonym slow. Following the head synset are the satellite
synsets, which represent senses that are similar to the sense of the
head adjective. The other half cluster is headed by the reverse
antonymous pair slow/fast, followed by satellite synsets for senses
of slow [12].
In general, adjectives share the same orientation as their
synonyms and opposite orientations as their antonyms. We use
this idea to predict the orientation of an adjective. To do this, the
synset of the given adjective and the antonym set are searched. If
a synonym/antonym has a known orientation, then the orientation
of the given adjective can be set accordingly. As the synset
of an adjective always contains a sense that links to the head
synset, the search range is rather large. Given enough seed
adjectives with known orientations, we can predict the orientations
of almost all the adjective words in the review collection.
Thus, our strategy is to use a set of seed adjectives whose
orientations we know, and then grow this set by searching WordNet.
To have a reasonably broad range of adjectives, we first manually
come up with a set of very common adjectives (in our experiment, we
used 30) as the seed list, e.g., positive adjectives: great,
fantastic, nice, cool; and negative adjectives: bad, dull.
Then we resort to WordNet to predict the orientations of all the
adjectives in the opinion word list. Once an adjective’s orientation
is predicted, it is added to the seed list. Therefore, the list grows
in the process.
The complete procedure for predicting semantic orientations for
all the adjectives in the opinion list is shown in Figure 5.
Procedure OrientationPrediction takes the adjective seed list and
a set of opinion words whose orientations need to be determined.
Figure 4: Bipolar adjective structure (links denote similarity
within a half cluster and antonymy between the head adjectives).
Fast half cluster: fast, swift, prompt, alacritous, quick, rapid.
Slow half cluster: slow, dilatory, sluggish, leisurely, tardy,
laggard.
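Figure 5 itself is not reproduced in this excerpt. The sketch below illustrates the kind of seed-growing prediction described above, using NLTK's WordNet interface; the seed list shown is only a small subset of the 30 seeds used in the paper, and the search covers only direct synonyms and antonyms, a narrower range than the full bipolar-cluster traversal.

from nltk.corpus import wordnet as wn
# Assumes the WordNet corpus is installed (nltk.download('wordnet')).

# Illustrative seed adjectives with known orientations.
seeds = {"great": "positive", "fantastic": "positive", "nice": "positive",
         "cool": "positive", "bad": "negative", "dull": "negative"}

def predict_orientations(opinion_words, seeds):
    """Grow the seed list through WordNet synonym/antonym links:
    synonyms share a seed's orientation, antonyms take the opposite."""
    orientations = dict(seeds)
    opposite = {"positive": "negative", "negative": "positive"}
    remaining = set(w for w in opinion_words if w not in orientations)

    changed = True
    while changed and remaining:
        changed = False
        for word in list(remaining):
            for synset in wn.synsets(word, pos=wn.ADJ):
                for lemma in synset.lemmas():
                    name = lemma.name().lower()
                    if name in orientations:
                        orientations[word] = orientations[name]
                    for ant in lemma.antonyms():
                        ant_name = ant.name().lower()
                        if ant_name in orientations:
                            orientations[word] = opposite[orientations[ant_name]]
                if word in orientations:
                    break
            if word in orientations:
                remaining.discard(word)
                changed = True   # newly labelled words can help label the rest
    return orientations

print(predict_orientations(["horrible", "amazing", "awful"], seeds))
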

The authors extract infrequent features using the procedure in Figure 6:The authors use the nearest noun/noun phrase as the noun/noun phrase that the opinion word modifies because that is what happens most of the time.