Showing papers by "Ling Huang" published in 2018


Proceedings Article
Guosai Wang, Shiyang Xiang, Yitao Duan, Ling Huang, Wei Xu
27 Jun 2018
TL;DR: This work defines the "anti-data-reselling" problem and proposes a new systematic method that combines feature engineering and machine learning models to provide a solution.
Abstract: Data providers contribute profoundly to many fields such as finance, economics, and academia by offering both web-based and API-based query services for specialized data. Among the data users are data resellers who abuse the query APIs to retrieve and resell the data for profit, which harms the data provider's interests and infringes copyright. In this work, we define the "anti-data-reselling" problem and propose a new systematic method that combines feature engineering and machine learning models to provide a solution. We apply our method to a real query log of over 9,000 users, with limited labels, provided by a large financial data provider, and obtain reasonable results, insightful observations, and real deployments.

1 citation


Posted Content
TL;DR: This work builds an extensible fraud detection framework, BadLink, to support multimodal datasets with different data types and distributions in a scalable way, and demonstrates the state-of-the-art performance of BadLink, even with sophisticated camouflage traffic.
Abstract: Fraud severely hurts many kinds of Internet businesses. Group-based fraud detection is a popular methodology for catching fraudsters, who unavoidably exhibit synchronized behaviors. We combine both graph-based features (e.g., cluster density) and information-theoretical features (e.g., probability for the similarity) of fraud groups into two intuitive metrics. Based on these metrics, we build an extensible fraud detection framework, BadLink, to support multimodal datasets with different data types and distributions in a scalable way. Experiments on a real production workload, as well as extensive comparisons with existing solutions, demonstrate the state-of-the-art performance of BadLink, even with sophisticated camouflage traffic.

1 citation