Open Access Book
Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data
Ronen Feldman, James Sanger
TLDR
The authors provide an in-depth examination of core text mining and link detection algorithms and operations, and examine advanced pre-processing techniques, knowledge representation considerations, and visualization approaches for text mining.
Abstract:
Providing an in-depth examination of core text mining and link detection algorithms and operations, this text examines advanced pre-processing techniques, knowledge representation considerations, and visualization approaches.
Citations
Proceedings Article
Detecting large-scale system problems by mining console logs
TL;DR: This paper proposes a general methodology for mining console logs to automatically detect system runtime problems, combining source code analysis with information retrieval to create composite features and then analyzing these features with machine learning to detect operational problems.
Proceedings Article
From throw-away traffic to bots: detecting the rise of DGA-based malware
Manos Antonakakis, Roberto Perdisci, Yacin Nadji, Nikolaos Vasiloglou, Saeed Abu-Nimeh, Wenke Lee, David Dagon
TL;DR: This paper presents a new technique for detecting randomly generated domains without reverse engineering, based on two observations: most of the DGA-generated domains that a bot queries result in Non-Existent Domain (NXDomain) responses, and bots from the same botnet (with the same DGA) generate similar NXDomain traffic.
Journal Article
The power of social media analytics
Weiguo Fan, Michael D. Gordon
TL;DR: How to use, and influence, consumer social communications to improve business performance, reputation, and profit.
Journal Article
Friendship prediction and homophily in social media
Luca Maria Aiello, Alain Barrat, Rossano Schifanella, Ciro Cattuto, Benjamin Markines, Filippo Menczer
TL;DR: This analysis suggests that users with similar interests are more likely to be friends, and therefore topical similarity measures among users based solely on their annotation metadata should be predictive of social links.
Journal Article
The MADlib analytics library: or MAD skills, the SQL
Joseph M. Hellerstein, Christopher Ré, Florian Schoppmann, Daisy Zhe Wang, Eugene Fratkin, Aleksander Gorajek, Kee Siong Ng, Caleb E. Welton, Xixuan Feng, Kun Li, Arun Kumar
TL;DR: This paper introduces the MADlib project, including the background that led to its beginnings and the motivation for its open-source nature, and provides an overview of the library's architecture and design patterns along with a description of various statistical methods in that context.