Showing papers on "Sentiment analysis" published in 2004


Proceedings ArticleDOI
22 Aug 2004
TL;DR: This research aims to mine and to summarize all the customer reviews of a product, and proposes several novel techniques to perform these tasks.
Abstract: Merchants selling products on the Web often ask their customers to review the products that they have purchased and the associated services. As e-commerce is becoming more and more popular, the number of customer reviews that a product receives grows rapidly. For a popular product, the number of reviews can be in hundreds or even thousands. This makes it difficult for a potential customer to read them to make an informed decision on whether to purchase the product. It also makes it difficult for the manufacturer of the product to keep track and to manage customer opinions. For the manufacturer, there are additional difficulties because many merchant sites may sell the same product and the manufacturer normally produces many kinds of products. In this research, we aim to mine and to summarize all the customer reviews of a product. This summarization task is different from traditional text summarization because we only mine the features of the product on which the customers have expressed their opinions and whether the opinions are positive or negative. We do not summarize the reviews by selecting a subset or rewrite some of the original sentences from the reviews to capture the main points as in the classic text summarization. Our task is performed in three steps: (1) mining product features that have been commented on by customers; (2) identifying opinion sentences in each review and deciding whether each opinion sentence is positive or negative; (3) summarizing the results. This paper proposes several novel techniques to perform these tasks. Our experimental results using reviews of a number of products sold online demonstrate the effectiveness of the techniques.

7,330 citations
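
The three steps listed in the abstract above (mining features, labeling opinion sentences, summarizing) can be illustrated with a minimal sketch. The abstract does not say which techniques the paper uses, so everything concrete here is an assumption made for illustration: the candidate feature list, the tiny opinion-word lexicons, the `min_support` threshold, and the function names are all hypothetical stand-ins, not the authors' method.

```python
import re
from collections import Counter, defaultdict

# Hypothetical lexicons and candidate features, for illustration only.
POSITIVE = {"great", "good", "excellent", "sharp"}
NEGATIVE = {"poor", "bad", "short", "blurry"}
CANDIDATE_FEATURES = {"battery", "screen", "lens", "price"}


def tokenize(sentence):
    """Return the set of lowercase words in a sentence."""
    return set(re.findall(r"[a-z]+", sentence.lower()))


def summarize(reviews, min_support=2):
    # Split every review into sentences, each represented as a set of words.
    sentences = [
        tokenize(s)
        for review in reviews
        for s in re.split(r"[.!?]+", review)
        if s.strip()
    ]

    # Step 1: keep candidate features mentioned in at least min_support sentences.
    counts = Counter(f for words in sentences for f in CANDIDATE_FEATURES & words)
    features = {f for f, c in counts.items() if c >= min_support}

    # Step 2: label each sentence by counting opinion words.
    # Step 3: aggregate the labels per feature.
    summary = defaultdict(lambda: {"positive": 0, "negative": 0})
    for words in sentences:
        score = len(words & POSITIVE) - len(words & NEGATIVE)
        if score == 0:
            continue  # no opinion detected (or a tie); skip the sentence
        label = "positive" if score > 0 else "negative"
        for feature in features & words:
            summary[feature][label] += 1
    return {feature: dict(c) for feature, c in summary.items()}


reviews = [
    "The battery is great. The screen is blurry.",
    "Battery life is poor. Great screen!",
    "I like the price.",
]
print(summarize(reviews))
```

With these three toy reviews, "price" is dropped because it appears in only one sentence, and "battery" and "screen" each receive one positive and one negative opinion sentence.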


Proceedings ArticleDOI
Bo Pang1, Lillian Lee1
21 Jul 2004
TL;DR: This paper proposes a machine-learning method that applies text-categorization techniques to just the subjective portions of the document. Extracting these portions can be implemented using efficient techniques for finding minimum cuts in graphs, which greatly facilitates incorporation of cross-sentence contextual constraints.
Abstract: Sentiment analysis seeks to identify the viewpoint(s) underlying a text span; an example application is classifying a movie review as "thumbs up" or "thumbs down". To determine this sentiment polarity, we propose a novel machine-learning method that applies text-categorization techniques to just the subjective portions of the document. Extracting these portions can be implemented using efficient techniques for finding minimum cuts in graphs; this greatly facilitates incorporation of cross-sentence contextual constraints.

3,459 citations
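
The abstract above mentions extracting subjective portions by finding minimum cuts in graphs but does not describe the graph itself. The sketch below shows one way such a formulation can be set up, under stated assumptions: each sentence gets a per-sentence subjectivity score in [0, 1] (here supplied by hand), adjacent sentences may share an association weight that penalizes putting them in different classes, and the sentences left on the source side of a minimum cut are treated as subjective. The function names, the scores, and the weights are hypothetical; the max-flow routine is a plain Edmonds-Karp implementation.

```python
from collections import deque


def min_cut_source_side(capacity, source, sink):
    """Run Edmonds-Karp max-flow and return the nodes on the source side of a minimum cut."""
    residual = {u: dict(edges) for u, edges in capacity.items()}
    for u, edges in capacity.items():
        for v in edges:
            residual.setdefault(v, {}).setdefault(u, 0.0)  # reverse edges

    while True:
        # Breadth-first search for a shortest augmenting path.
        parents = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 1e-12 and v not in parents:
                    parents[v] = u
                    queue.append(v)
        if sink not in parents:
            return set(parents)  # nodes still reachable from the source

        # Collect the path edges, then push the bottleneck flow along them.
        path, v = [], sink
        while parents[v] is not None:
            path.append((parents[v], v))
            v = parents[v]
        bottleneck = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck


def extract_subjective(sub_scores, assoc):
    """Return indices of sentences assigned to the subjective side of the cut."""
    capacity = {"s": {}, "t": {}}
    for i, p in enumerate(sub_scores):
        capacity["s"][i] = p                       # cut if sentence i is called objective
        capacity.setdefault(i, {})["t"] = 1.0 - p  # cut if sentence i is called subjective
    for (i, j), weight in assoc.items():
        capacity.setdefault(i, {})[j] = weight     # cut if i and j are separated
        capacity.setdefault(j, {})[i] = weight
    side = min_cut_source_side(capacity, "s", "t")
    return sorted(i for i in side if i != "s")


scores = [0.9, 0.4, 0.8]  # assumed per-sentence subjectivity estimates
print(extract_subjective(scores, {}))                        # [0, 2]
print(extract_subjective(scores, {(0, 1): 0.5, (1, 2): 0.5}))  # [0, 1, 2]
```

In the second call the middle sentence, which looks objective on its own, is pulled onto the subjective side because separating it from both neighbours would cost more than relabeling it. This is the kind of cross-sentence contextual constraint the abstract refers to.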


Proceedings ArticleDOI
23 Aug 2004
TL;DR: A system is presented that, given a topic, automatically finds the people who hold opinions about that topic and the sentiment of each opinion. It contains one module for determining word sentiment and another for combining sentiments within a sentence.
Abstract: Identifying sentiments (the affective parts of opinions) is a challenging problem. We present a system that, given a topic, automatically finds the people who hold opinions about that topic and the sentiment of each opinion. The system contains a module for determining word sentiment and another for combining sentiments within a sentence. We experiment with various models of classifying and combining sentiment at word and sentence levels, with promising results.

1,541 citations


Proceedings Article
01 Jan 2004
TL;DR: This paper introduces an approach to sentiment analysis that uses support vector machines (SVMs) to bring together diverse sources of potentially pertinent information, including several favorability measures for phrases and adjectives and, where available, knowledge of the topic of the text.
Abstract: This paper introduces an approach to sentiment analysis which uses support vector machines (SVMs) to bring together diverse sources of potentially pertinent information, including several favorability measures for phrases and adjectives and, where available, knowledge of the topic of the text. Models using the features introduced are further combined with unigram models which have been shown to be effective in the past (Pang et al., 2002) and lemmatized versions of the unigram models. Experiments on movie review data from the Internet Movie Database demonstrate that hybrid SVMs which combine unigram-style feature-based SVMs with those based on real-valued favorability measures obtain superior performance, producing the best results yet published using this data. Further experiments using a feature set enriched with topic information on a smaller dataset of music reviews hand-annotated for topic are also reported, the results of which suggest that incorporating topic information into such models may also yield improvement.

729 citations


Posted Content
Bo Pang1, Lillian Lee1
TL;DR: A novel machine-learning method is proposed that applies text-categorization techniques to just the subjective portions of the document, which greatly facilitates incorporation of cross-sentence contextual constraints.
Abstract: Sentiment analysis seeks to identify the viewpoint(s) underlying a text span; an example application is classifying a movie review as "thumbs up" or "thumbs down". To determine this sentiment polarity, we propose a novel machine-learning method that applies text-categorization techniques to just the subjective portions of the document. Extracting these portions can be implemented using efficient techniques for finding minimum cuts in graphs; this greatly facilitates incorporation of cross-sentence contextual constraints.

399 citations


Proceedings ArticleDOI
23 Aug 2004
TL;DR: A high-precision sentiment analysis system is developed at a low development cost, by making use of an existing transfer-based machine translation engine.
Abstract: This paper proposes a new paradigm for sentiment analysis: translation from text documents to a set of sentiment units. The techniques of deep language analysis for machine translation are applicable also to this kind of text mining task. We developed a high-precision sentiment analysis system at a low development cost, by making use of an existing transfer-based machine translation engine.

141 citations


01 Jan 2004
TL;DR: Error analysis suggests various approaches for improving classification accuracy: use of negation phrase, making inference from superficial words, and solving the problem of comments on parts.
Abstract: This paper reports a study in automatic sentiment classification, i.e., automatically classifying documents as expressing positive or negative sentiments/opinions. The study investigates the effectiveness of using SVM (Support Vector Machine) on various text features to classify product reviews into recommended (positive sentiment) and not recommended (negative sentiment). Compared with traditional topical classification, it was hypothesized that syntactic and semantic processing of text would be more important for sentiment classification. In the first part of this study, several different approaches, unigrams (individual words), selected words (such as verb, adjective, and adverb), and words labeled with part-of-speech tags were investigated. A sample of 1,800 various product reviews was retrieved from Review Centre (www.reviewcentre.com) for the study. 1,200 reviews were used for training, and 600 for testing. Using SVM, the baseline unigram approach obtained an accuracy rate of around 76%. The use of selected words obtained a marginally better result of 77.33%. Error analysis suggests various approaches for improving classification accuracy: use of negation phrase, making inference from superficial words, and solving the problem of comments on parts. The second part of the study that is in progress investigates the use of negation phrase through simple linguistic processing to improve classification accuracy. This approach increased the accuracy rate up to 79.33%.

85 citations
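
The abstract above reports that handling negation phrases through simple linguistic processing raised accuracy, but it does not describe the processing. One widely used way to expose negation to a bag-of-words classifier is to prefix every word between a negation word and the next punctuation mark with a marker. The sketch below illustrates that idea only; the negation list, the `NOT_` prefix, and the function name are assumptions, not the procedure used in the study.

```python
import re

# Assumed negation cues; contractions ending in "n't" are also treated as negations.
NEGATIONS = {"not", "no", "never", "cannot"}
CLAUSE_END = re.compile(r"[.,;:!?]")


def mark_negation(text):
    """Prefix tokens that follow a negation cue, up to the next punctuation mark."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?|[.,;:!?]", text.lower())
    marked, negated = [], False
    for token in tokens:
        if CLAUSE_END.fullmatch(token):
            negated = False          # punctuation closes the negation scope
            marked.append(token)
        elif token in NEGATIONS or token.endswith("n't"):
            negated = True           # open a negation scope
            marked.append(token)
        else:
            marked.append("NOT_" + token if negated else token)
    return marked


print(mark_negation("I do not like it, but it works."))
# ['i', 'do', 'not', 'NOT_like', 'NOT_it', ',', 'but', 'it', 'works', '.']
```

The marked tokens can then be fed to a unigram feature extractor, so that "like" and "NOT_like" become distinct features.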


Proceedings ArticleDOI
14 Sep 2004
TL;DR: A phrase pattern-based method for classifying the sentiment orientation of text, that is, for deciding whether the text expresses a favorable or unfavorable sentiment toward a specific subject. The method achieves an accuracy rate of 86% when used to evaluate sports reviews from some websites.
Abstract: This paper presents a phrase pattern-based method in classifying sentiment orientation of text. That is to analyze whether the text expresses a favorable or unfavorable sentiment for a specific subject. In our method, we construct some phrase patterns and calculate their sentiment orientation by unsupervised learning algorithm. When we classify a document, we first add special tags to some words in the text, then match the tags within a sentence with some phrase patterns to get the sentiment orientation of the sentence. At last, we add up the sentiment orientation of each sentence. We classify the text according to this summation. The method achieves an accuracy rate of 86% when used to evaluate sports reviews from some Websites.

45 citations
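
The procedure described in the abstract above (tag words, match the tags in a sentence against phrase patterns, sum the sentence orientations, classify by the total) can be sketched as follows. The abstract says the pattern orientations are obtained by unsupervised learning; this sketch does not implement that step and instead uses hand-assigned scores. The tag names, the word list, the patterns, and their scores are all hypothetical.

```python
import re

# Hypothetical word tags and phrase patterns with hand-assigned orientations.
WORD_TAGS = {
    "great": "POS_ADJ", "brilliant": "POS_ADJ",
    "poor": "NEG_ADJ", "dull": "NEG_ADJ",
    "not": "NOT", "very": "INTENS",
}
PATTERNS = {
    ("NOT", "POS_ADJ"): -1.0,
    ("NOT", "NEG_ADJ"): 1.0,
    ("INTENS", "POS_ADJ"): 2.0,
    ("INTENS", "NEG_ADJ"): -2.0,
    ("POS_ADJ",): 1.0,
    ("NEG_ADJ",): -1.0,
}


def sentence_orientation(sentence):
    """Tag known words, then greedily match two-tag patterns before one-tag patterns."""
    # Simplification: untagged words are dropped, so tags separated only by
    # untagged words are treated as adjacent.
    tags = [WORD_TAGS[w] for w in re.findall(r"[a-z]+", sentence.lower()) if w in WORD_TAGS]
    score, i = 0.0, 0
    while i < len(tags):
        pair = tuple(tags[i:i + 2])
        if len(pair) == 2 and pair in PATTERNS:
            score += PATTERNS[pair]
            i += 2
        else:
            score += PATTERNS.get((tags[i],), 0.0)
            i += 1
    return score


def classify(text):
    """Sum sentence orientations and classify the text by the sign of the total."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    total = sum(sentence_orientation(s) for s in sentences)
    if total > 0:
        return "favorable"
    if total < 0:
        return "unfavorable"
    return "neutral"


print(classify("The match was very great. The defence was not great."))  # favorable
print(classify("The game was dull. Not great."))                         # unfavorable
```

In the first example the sentence scores are 2.0 and -1.0, so the total is positive; in the second both sentences score -1.0.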


Proceedings ArticleDOI
21 Jul 2004
TL;DR: The results of these experiments suggest ways in which incorporating topic information into such models may yield improvement over models which do not use topic information.
Abstract: This paper reports experiments in classifying texts based upon their favorability towards the subject of the text using a feature set enriched with topic information on a small dataset of music reviews hand-annotated for topic. The results of these experiments suggest ways in which incorporating topic information into such models may yield improvement over models which do not use topic information.

14 citations



01 Jan 2004
TL;DR: This paper outlines the landscape of technologies for assessing academic texts as a cube, in which a particular application is identified by its position along three dimensions: the purpose of assessment, the kind of feedback, and the type of contribution.
Abstract: In this paper, we use the term academic text to refer to any free text composed in an academic setting, covering the whole spectrum from first year students’ reviews of available scientific texts up to a scientist’s struggling, but still fragile text concerning his or her new discoveries. These academic texts can be processed by various information technologies. We can now outline the landscape of applying technologies for assessing academic texts as a cube (see Figure 1) where a particular application can be identified as its position indicated by the three dimensions of the purpose of assessment, the kind of feedback, and the type of contribution.