Justin Ma
Researcher at University of California, San Diego
Publications - 11
Citations - 2431
Justin Ma is an academic researcher from the University of California, San Diego. The author has contributed to research on topics including Semantic URL and the Internet. The author has an h-index of 10 and has co-authored 11 publications receiving 2281 citations. Previous affiliations of Justin Ma include the University of California, Berkeley.
Papers
Proceedings Article
Beyond blacklists: learning to detect malicious web sites from suspicious URLs
TL;DR: This paper describes an approach to this problem based on automated URL classification, using statistical methods to discover the tell-tale lexical and host-based properties of malicious Web site URLs.
Proceedings Article
Identifying suspicious URLs: an application of large-scale online learning
TL;DR: It is demonstrated that recently developed online algorithms can be as accurate as batch techniques, achieving classification accuracies up to 99% over a balanced data set.
Journal Article
Scalability, fidelity, and containment in the Potemkin virtual honeyfarm
Michael Vrable, Justin Ma, Jay Chen, David Moore, Erik Vandekieft, Alex C. Snoeren, Geoffrey M. Voelker, Stefan Savage
TL;DR: This paper presents a prototype honeyfarm system, called Potemkin, that exploits virtual machines, aggressive memory sharing, and late binding of resources to improve honeypot scalability while still closely emulating the execution behavior of individual Internet hosts.
Journal Article
Learning to detect malicious URLs
TL;DR: This article develops a real-time system for gathering URL features and trains an online classifier that detects malicious Web sites with 99% accuracy over a balanced dataset.
Proceedings Article
Unexpected means of protocol inference
TL;DR: This work analyzes three alternative mechanisms using statistical and structural content models for automatically identifying traffic that uses the same application-layer protocol, relying solely on flow content, and evaluates each mechanism's classification performance using real-world traffic traces from multiple sites.