Proceedings Article

ACAS: automated construction of application signatures

TLDR
This paper applies three statistical machine learning algorithms to automatically identify signatures for a range of applications and finds that this approach is highly accurate and scales to allow online application identification on high-speed links.
Abstract
An accurate mapping of traffic to applications is important for a broad range of network management and measurement tasks. Internet applications have traditionally been identified using well-known default server network-port numbers in the TCP or UDP headers. However, this approach has become increasingly inaccurate. An alternative, more accurate technique is to use specific application-level features in the protocol exchange to guide the identification. Unfortunately, deriving the signatures manually is very time-consuming and difficult. In this paper, we explore automatically extracting application signatures from IP traffic payload content. In particular, we apply three statistical machine learning algorithms to automatically identify signatures for a range of applications. The results indicate that this approach is highly accurate and scales to allow online application identification on high-speed links. We also discovered that content signatures still work in the presence of encryption. In these cases, we were able to derive content signatures for the unencrypted handshakes that negotiate the encryption parameters of a particular connection.
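As a rough illustration of learning signatures from payload content, the sketch below trains a small naive Bayes classifier on the leading bytes of a flow. It is not the authors' implementation. The abstract does not name the three algorithms or the feature encoding, so the choice of naive Bayes, the (offset, byte value) features, the `N_BYTES` cutoff, the class name `PayloadNaiveBayes`, and the toy training payloads are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict

N_BYTES = 8  # assumption: only the first few payload bytes are used as features


def features(payload, n=N_BYTES):
    """Encode a payload as (offset, byte value) pairs for its first n bytes."""
    return [(i, b) for i, b in enumerate(payload[:n])]


class PayloadNaiveBayes:
    """Naive Bayes over (offset, byte) features with add-one smoothing."""

    def __init__(self):
        self.class_counts = Counter()                # training flows per label
        self.feature_counts = defaultdict(Counter)   # label -> feature -> count

    def fit(self, samples):
        for payload, label in samples:
            self.class_counts[label] += 1
            self.feature_counts[label].update(features(payload))
        return self

    def predict(self, payload):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, count in self.class_counts.items():
            # Log prior plus the sum of per-offset log likelihoods.
            score = math.log(count / total)
            for feat in features(payload):
                # Add-one smoothing over the 256 possible byte values.
                score += math.log((self.feature_counts[label][feat] + 1) / (count + 256))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical, hand-written training payloads (not taken from the paper).
TRAINING = [
    (b"GET /index.html HTTP/1.1\r\n", "http"),
    (b"POST /login HTTP/1.1\r\n", "http"),
    (b"SSH-2.0-OpenSSH_8.9\r\n", "ssh"),
    (b"SSH-2.0-libssh_0.9\r\n", "ssh"),
    (b"220 mail.example.org ESMTP\r\n", "smtp"),
    (b"220 smtp.example.net ESMTP\r\n", "smtp"),
]

model = PayloadNaiveBayes().fit(TRAINING)
print(model.predict(b"SSH-2.0-dropbear\r\n"))
```

The SSH case mirrors the abstract's point about encryption: the version banner is exchanged in the clear before the encrypted session starts, so a content signature can still be learned from it. Classification here looks only at a fixed number of leading bytes per flow, which keeps the per-flow cost small; whether that matches the paper's design is not stated in the abstract.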


Citations
Journal Article

A survey of techniques for internet traffic classification using machine learning

TL;DR: This survey paper looks at emerging research into the application of Machine Learning techniques to IP traffic classification - an inter-disciplinary blend of IP networking and data mining techniques.
Proceedings Article

Traffic classification using clustering algorithms

TL;DR: This work considers two unsupervised clustering algorithms, K-Means and DBSCAN, that had not previously been used for network traffic classification, and compares them to the previously used AutoClass algorithm using empirical Internet traces.
Journal Article

A comprehensive survey on machine learning for networking: evolution, applications and research opportunities

TL;DR: This survey jointly presents the application of diverse ML techniques in various key areas of networking across different network technologies, delineates their limitations, and gives insights into research challenges and future opportunities to advance ML in networking.
Proceedings Article

Internet traffic classification demystified: myths, caveats, and the best practices

TL;DR: This work critically revisits traffic classification by conducting a thorough evaluation of three classification approaches, based on transport layer ports, host behavior, and flow features, and extracts insights and recommendations for both the study and practical application of traffic classification.
Proceedings Article

Polyglot: automatic extraction of protocol message format using dynamic binary analysis

TL;DR: This paper proposes shadowing, a new approach to protocol reverse engineering that uses program binaries and dynamic analysis. It is based on a unique intuition: the way that an implementation of the protocol processes the received application data reveals a wealth of information about the protocol message format.
References
Journal Article

A maximum entropy approach to natural language processing

TL;DR: A maximum-likelihood approach for automatically constructing maximum entropy models is presented and how to implement this approach efficiently is described, using as examples several problems in natural language processing.

An empirical study of the naive Bayes classifier

Irina Rish
TL;DR: This work analyzes the impact of the distribution entropy on the classification error, showing that low-entropy feature distributions yield good performance of naive Bayes, and demonstrates that naive Bayes works well for certain nearly-functional feature dependencies.
Book Chapter

The Boosting Approach to Machine Learning: An Overview

TL;DR: This chapter overviews some of the recent work on boosting, including analyses of AdaBoost's training error and generalization error; boosting's connection to game theory and linear programming; the relationship between boosting and logistic regression; extensions of AdaBoost for multiclass classification problems; methods of incorporating human knowledge into boosting; and experimental and applied work using boosting.
Proceedings Article

Accurate, scalable in-network identification of p2p traffic using application signatures

TL;DR: In this article, the authors identify application-level signatures by examining available documentation and packet-level traces, and then use the identified signatures to develop online filters that can efficiently and accurately track P2P traffic even on high-speed network links.
Book Chapter

Toward the accurate identification of network applications

TL;DR: This work uses a full-payload packet trace collected from an Internet site to identify the types of errors that may result from port-based classification, quantifies them for the specific trace under study, and devises a classification methodology that relies on the full packet payload.