Self-training from labeled features for sentiment analysis
Citations
Sentiment analysis algorithms and applications: A survey
Sentiment analysis
Arabic sentiment analysis: Lexicon-based and corpus-based
A survey on classification techniques for opinion mining and sentiment analysis
A comprehensive survey on sentiment analysis: Approaches, challenges and trends
References
Thumbs up? Sentiment Classification using Machine Learning Techniques
Thumbs Up or Thumbs Down? Semantic Orientation Applied to Unsupervised Classification of Reviews
A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts
Frequently Asked Questions (13)
Q2. What does the rapid evolution of user-generated content demand?
The rapid evolution of user-generated content demands sentiment classifiers that can easily adapt to new domains with minimal supervision.
Q3. How many polarity words are matched in the MR dataset?
By filtering out polarity words that occurred fewer than 5 times in the corpus, the number of matched polarity words drops dramatically, to only about 500 matched words for Books and DVDs and 160 for Electronics and Kitchen.
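The frequency cutoff described above can be sketched as a simple corpus filter. This is an illustrative implementation only; the function and variable names (`filter_polarity_words`, `lexicon`, `documents`) are assumptions, not the paper's code.

```python
from collections import Counter

def filter_polarity_words(lexicon, documents, min_count=5):
    """Keep only lexicon words that occur at least `min_count` times
    in the corpus. `lexicon` maps word -> polarity; `documents` is a
    list of token lists. (Hypothetical names for illustration.)"""
    counts = Counter(tok for doc in documents for tok in doc)
    return {w: p for w, p in lexicon.items() if counts[w] >= min_count}

docs = [["good", "plot", "good"], ["bad", "ending"], ["good", "acting"]]
lex = {"good": "positive", "bad": "negative", "superb": "positive"}
print(filter_polarity_words(lex, docs, min_count=2))  # only "good" survives
```

With a real corpus, raising `min_count` shrinks the matched lexicon in exactly the way the answer describes.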
Q4. What is the purpose of sentiment analysis?
With the explosion of people’s attitudes and opinions expressed in social media, including blogs, discussion forums, tweets, etc., detecting sentiment or opinion on the Web is becoming an increasingly popular way of interpreting data.
Q5. What is Turney’s work on sentiment classification that does not require labeled data?
The pioneering work on sentiment classification that does not require labeled data is Turney’s (2002), which classifies a document as positive or negative by the average semantic orientation of the phrases in the document that contain adjectives or adverbs.
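Turney's method scores each extracted phrase by pointwise mutual information with the seed words "excellent" and "poor" (estimated from co-occurrence counts) and averages the scores. A minimal sketch, where the `hits` lookup table and its numbers are hypothetical stand-ins for search-engine hit counts:

```python
import math

def semantic_orientation(phrase, hits):
    """SO(phrase) = PMI(phrase, "excellent") - PMI(phrase, "poor"),
    computed from co-occurrence counts in `hits` (illustrative data)."""
    return math.log2(
        (hits[(phrase, "excellent")] * hits["poor"]) /
        (hits[(phrase, "poor")] * hits["excellent"])
    )

def classify(phrases, hits):
    """A review is positive if the average SO of its phrases is > 0."""
    avg = sum(semantic_orientation(p, hits) for p in phrases) / len(phrases)
    return "positive" if avg > 0 else "negative"

hits = {"excellent": 100, "poor": 100,
        ("great plot", "excellent"): 8, ("great plot", "poor"): 2}
print(classify(["great plot"], hits))  # -> positive
```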
Q6. What are some other domain-specific terms for the MR dataset?
The authors also observe other domain-specific terms in the MR dataset, such as the actress name winslet (Kate Winslet) with positive polarity and the movie name batman bearing negative polarity.
Q7. How many polarity words are in the MR dataset?
As mentioned earlier, Books and DVDs are larger corpora and thus the number of matched polarity words without filter is about 2000.
Q8. What is the definition of a criterion that penalizes the divergence?
By adding a normalization term z_k = \sum_{d=1}^{D} \delta(k \in w_d) into f_{jk}, the feature expectation becomes the predicted label distribution on the set of instances containing feature k, i.e.

\tilde{P}(j|k; \Lambda) = \frac{\sum_{d=1}^{D} \delta(s_d = j)\,\delta(k \in w_d)}{z_k}    (4)

The authors define a criterion that minimizes the KL divergence between the expected label distribution and a target expectation \hat{f}, which is essentially an instance of generalized expectation criteria that penalizes the divergence of a specific model expectation from a target value.
Q9. What is the target expectation of a feature having its prior polarity?
Since the authors are dealing with the binary classification problem here, the target expectation of a feature having its prior polarity (or associated class label) is 0.9, and 0.1 for its non-associated class.
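The two answers above combine into a small computation: estimate the per-feature label distribution of Eq. (4) from the model's current predictions, then penalize its KL divergence from the target (0.1, 0.9). A minimal sketch, assuming documents are represented as token sets and predicted labels as 0/1 (function names are illustrative):

```python
import math

def predicted_label_dist(preds, docs, feature):
    """Eq. (4): P~(j|k) = sum_d d(s_d = j) d(k in w_d) / z_k, where z_k
    counts documents containing feature k and s_d is the predicted label."""
    containing = [s for s, w in zip(preds, docs) if feature in w]
    z_k = len(containing)
    return [containing.count(j) / z_k for j in (0, 1)]

def ge_penalty(preds, docs, feature, target):
    """KL divergence of the target expectation from the model expectation."""
    p = predicted_label_dist(preds, docs, feature)
    return sum(t * math.log(t / q) for t, q in zip(target, p) if t > 0)

# A positive-lexicon word gets target (0.1, 0.9): 0.9 on its own class.
docs = [{"good", "film"}, {"good", "plot"}, {"bad"}, {"good"}]
preds = [1, 1, 0, 0]  # the model's current predicted labels
print(ge_penalty(preds, docs, "good", target=(0.1, 0.9)))
```

The penalty shrinks toward zero as the model's predictions on documents containing "good" approach the 0.9/0.1 target distribution.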
Q10. What is the sentiment classification algorithm?
Instead of incorporating prior information into model learning through sentiment lexicons, Dasgupta and Ng (2009) proposed an unsupervised sentiment classification algorithm in which user feedback is provided interactively during the spectral clustering process to ensure that texts are clustered along the sentiment dimension.
Q11. What is the proposed framework for sentiment classifier learning?
An initial classifier is trained by incorporating prior information from the sentiment lexicon which consists of a list of words marked with their respective polarity.
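One simple way to picture the lexicon prior is pseudo-labeling by polarity-word counts, which could seed such an initial classifier. This is a hedged sketch of the idea only, not the paper's actual GE-based training; all names are illustrative:

```python
def lexicon_label(doc, lexicon):
    """Assign an initial pseudo-label by counting matched lexicon words.
    Documents with no match or a tie stay unlabeled (None)."""
    pos = sum(1 for tok in doc if lexicon.get(tok) == "positive")
    neg = sum(1 for tok in doc if lexicon.get(tok) == "negative")
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return None

lex = {"great": "positive", "awful": "negative"}
print(lexicon_label(["a", "great", "film"], lex))  # -> positive
```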
Q12. How does the polarity word frequency cutoff work?
It performs fairly stably and only drops dramatically when too few polarity words are incorporated as prior knowledge, for example, when only 23 polarity words were selected at the cutoff point of 40 for the Kitchen dataset.
Q13. What weakly-supervised approach did Dasgupta and Ng (2009) propose?
More recently, Dasgupta and Ng (2009) proposed a weakly-supervised sentiment classification algorithm that integrates user feedback into a spectral clustering algorithm.