Author

Qian Xu

Bio: Qian Xu is an academic researcher from Baidu. The author has contributed to research in topics: Spambot & Spamming. The author has an h-index of 1 and has co-authored 1 publication receiving 87 citations.

Topics: Spambot, Spamming, Electronic mail, Short Message Service

Papers

Journal Article

SMS Spam Detection Using Noncontent Features

Qian Xu¹, Evan Wei Xiang¹, Qiang Yang², Jiachun Du², Jieping Zhong²

Baidu¹, Huawei²

01 Nov 2012-IEEE Intelligent Systems

TL;DR: This service-side solution uses graph data mining to distinguish spammers from nonspammers and detect spam without checking a message's contents.

Abstract: Short Message Service text messages are indispensable, but they face a serious problem from spamming. This service-side solution uses graph data mining to distinguish spammers from nonspammers and detect spam without checking a message's contents.

90 citations

Cited by

Journal Article

Suspicious Behavior Detection: Current Trends and Future Directions

Meng Jiang¹, Peng Cui¹, Christos Faloutsos²

Tsinghua University¹, Carnegie Mellon University²

01 Jan 2016-IEEE Intelligent Systems

TL;DR: Different real-world applications have varying definitions of suspicious behaviors, and detection methods often look for the most suspicious parts of the data by optimizing scores, but quantifying the suspiciousness of a behavioral pattern is still an open issue.

Abstract: Different real-world applications have varying definitions of suspicious behaviors. Detection methods often look for the most suspicious parts of the data by optimizing scores, but quantifying the suspiciousness of a behavioral pattern is still an open issue.

116 citations

Proceedings Article

Discovering spammers in social networks

Yin Zhu¹, Xiao Wang, Erheng Zhong¹, Nathan Nan Liu¹, He Li, Qiang Yang¹

Hong Kong University of Science and Technology¹

22 Jul 2012

TL;DR: A Supervised Matrix Factorization method with Social Regularization (SMFSR) for spammer detection in social networks that exploits both social activities as well as users' social relations in an innovative and highly scalable manner is proposed.

Abstract: As the popularity of the social media increases, as evidenced in Twitter, Facebook and China's Renren, spamming activities also picked up in numbers and variety. On social network sites, spammers often disguise themselves by creating fake accounts and hijacking normal users' accounts for personal gains. Different from the spammers in traditional systems such as SMS and email, spammers in social media behave like normal users and they continue to change their spamming strategies to fool anti-spamming systems. However, due to the privacy and resource concerns, many social media websites cannot fully monitor all the contents of users, making many of the previous approaches, such as topology-based and content-classification-based methods, infeasible to use. In this paper, we propose a Supervised Matrix Factorization method with Social Regularization (SMFSR) for spammer detection in social networks that exploits both social activities as well as users' social relations in an innovative and highly scalable manner. The proposed method detects spammers collectively based on users' social actions and social relations. We have empirically tested our method on data from Renren.com, which is one of the largest social networks in China, and demonstrated that our new method can improve the detection performance significantly.

109 citations

Journal Article

Finding influential users of online health communities: a new metric based on sentiment influence

Kang Zhao¹, John Yen², Greta E. Greer³, Baojun Qiu⁴, Prasenjit Mitra², Kenneth M. Portier³

University of Iowa¹, Penn State College of Information Sciences and Technology², American Cancer Society³, eBay⁴

01 Oct 2014-Journal of the American Medical Informatics Association

TL;DR: Using the dataset from a popular OHC, the research demonstrated that the proposed metric is highly effective in identifying influential users and combining the metric with other traditional measures further improves the identification of influential users.

106 citations

Towards SMS Spam Filtering: Results under a New Dataset

Federal University of São Carlos¹

31 Mar 2013

TL;DR: The results indicate that the procedure followed to build the collection does not lead to near-duplicates and, regarding the classifiers, the Support Vector Machines outperforms other evaluated techniques and, hence, it can be used as a good baseline for further comparison.

Abstract: The growth of mobile phone users has led to a dramatic increase in SMS spam messages. Recent reports clearly indicate that the volume of mobile phone spam is dramatically increasing year by year. In practice, fighting such a plague is made difficult by several factors, including the lower rate of SMS that has allowed many users and service providers to ignore the issue, and the limited availability of mobile phone spam-filtering software. Probably, one of the major concerns in academic settings is the scarcity of public SMS spam datasets, which are sorely needed for validation and comparison of different classifiers. Moreover, traditional content-based filters may have their performance seriously degraded since SMS messages are fairly short and their text is generally rife with idioms and abbreviations. In this paper, we present details about a new real, public and non-encoded SMS spam collection that is the largest one as far as we know. Moreover, we offer a comprehensive analysis of such dataset in order to ensure that there are no duplicated messages coming from previously existing datasets, since it may ease the task of learning SMS spam classifiers and could compromise the evaluation of methods. Additionally, we compare the performance achieved by several established machine learning techniques. In summary, the results indicate that the procedure followed to build the collection does not lead to near-duplicates and, regarding the classifiers, the Support Vector Machines outperforms other evaluated techniques and, hence, it can be used as a good baseline for further comparison.

84 citations

Journal Article

Text normalization and semantic indexing to enhance Instant Messaging and SMS spam filtering

Tiago A. Almeida¹, Tiago Pinho da Silva¹, Igor Santos², José María Gómez Hidalgo

Federal University of São Carlos¹, University of Deusto²

15 Sep 2016-Knowledge Based Systems

TL;DR: The proposed text processing approach is based on lexicographic and semantic dictionaries along with state-of-the-art techniques for semantic analysis and context detection and aims to alleviate factors that can degrade the algorithms performance, such as redundancies and inconsistencies.

Abstract: The rapid popularization of smartphones has contributed to the growth of online Instant Messaging and SMS usage as an alternative way of communication. The increasing number of users, along with the trust they inherently have in their devices, makes such messages a propitious environment for spammers. In fact, reports clearly indicate that volume of spam over Instant Messaging and SMS is dramatically increasing year by year. It represents a challenging problem for traditional filtering methods nowadays, since such messages are usually fairly short and normally rife with slangs, idioms, symbols and acronyms that make even tokenization a difficult task. In this scenario, this paper proposes and then evaluates a method to normalize and expand original short and messy text messages in order to acquire better attributes and enhance the classification performance. The proposed text processing approach is based on lexicographic and semantic dictionaries along with state-of-the-art techniques for semantic analysis and context detection. This technique is used to normalize terms and create new attributes in order to change and expand original text samples aiming to alleviate factors that can degrade the algorithms performance, such as redundancies and inconsistencies. We have evaluated our approach with a public, real and non-encoded data-set along with several established machine learning methods. Our experiments were diligently designed to ensure statistically sound results which indicate that the proposed text processing techniques can in fact enhance Instant Messaging and SMS spam filtering.

80 citations
