Open Access Book

Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data

TLDR
The authors provide an in-depth examination of core text mining and link detection algorithms and operations, and examine advanced pre-processing techniques, knowledge representation considerations, and visualization approaches for text mining.
Abstract
Providing an in-depth examination of core text mining and link detection algorithms and operations, this text examines advanced pre-processing techniques, knowledge representation considerations, and visualization approaches.

Citations
Proceedings Article

Detecting large-scale system problems by mining console logs

TL;DR: This article proposes a general methodology for mining console logs to automatically detect system runtime problems, combining source code analysis with information retrieval to create composite features and then analyzing these features with machine learning to detect operational problems.
Proceedings Article

From throw-away traffic to bots: detecting the rise of DGA-based malware

TL;DR: A new technique to detect randomly generated domains without reversing is presented, based on the finding that most of the DGA-generated domains that a bot queries result in Non-Existent Domain (NXDomain) responses, and that bots from the same botnet (with the same DGA) generate similar NXDomain traffic.
Journal Article

The power of social media analytics

TL;DR: How to use, and influence, consumer social communications to improve business performance, reputation, and profit.
Journal Article

Friendship prediction and homophily in social media

TL;DR: This analysis suggests that users with similar interests are more likely to be friends, and therefore topical similarity measures among users based solely on their annotation metadata should be predictive of social links.
Journal Article

The MADlib analytics library: or MAD skills, the SQL

TL;DR: The MADlib project is introduced, including the background that led to its beginnings and the motivation for its open-source nature. An overview of the library's architecture and design patterns is provided, along with a description of various statistical methods in that context.