Open Access, Journal Article

Improving K-NN Internet Traffic Classification Using Clustering and Principle Component Analysis

Adi Suryaputra Paramita
01 Jun 2017, Vol. 6, Iss. 2, pp. 159-165
TLDR
The combination of classification, clustering and feature selection on an internet traffic dataset successfully modeled an internet traffic classification method with higher accuracy and faster performance.
Abstract
K-Nearest Neighbour (K-NN) is a popular classification algorithm, and this research uses it to classify internet traffic. K-NN is appropriate for large amounts of data and gives accurate classification, but it is computationally expensive because it calculates the distance to every existing record in the dataset. Clustering is one way to overcome this weakness: it is performed before the K-NN classification step and does not need much computing time to group data that share the same characteristics. Fuzzy C-Means is the clustering algorithm used in this research. The Fuzzy C-Means algorithm does not need the initial number of clusters to be specified; the clusters form naturally from the input dataset. A weakness of Fuzzy C-Means is that its clustering results frequently differ even when the input dataset is the same, because the initial dataset given to the algorithm is less than optimal; optimizing it requires a feature selection algorithm. Feature selection is a method for producing an optimal initial dataset for Fuzzy C-Means, and the feature selection algorithm used in this research is Principal Component Analysis (PCA). PCA can remove non-significant attributes or features to create an optimal dataset, and can improve the performance of the clustering and classification algorithms. The result of this research is that the combined method of classification, clustering and feature selection successfully modeled an internet traffic classification method with higher accuracy and faster performance.
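
The three-stage pipeline described in the abstract (PCA to reduce features, Fuzzy C-Means to group records, K-NN restricted to the nearest group) can be sketched roughly as below. This is a minimal illustration, not the paper's implementation: the function names, the synthetic two-class data, and all parameter values (`n_components=2`, `m=2.0`, `k=3`) are assumptions made for the example. It also uses the textbook Fuzzy C-Means formulation, which takes the number of clusters as an input.

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X onto its top principal components (via SVD)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    return Xc @ components.T, mean, components

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-5, seed=0):
    """Textbook Fuzzy C-Means; returns cluster centers and memberships U."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)          # each row sums to 1
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)            # avoid division by zero
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U

def knn_in_cluster(x, X_train, y_train, train_cluster, centers, k=3):
    """K-NN vote using only training points in x's nearest cluster."""
    c = np.argmin(np.linalg.norm(centers - x, axis=1))
    idx = np.flatnonzero(train_cluster == c)
    if idx.size == 0:                          # empty cluster: use all points
        idx = np.arange(len(X_train))
    d = np.linalg.norm(X_train[idx] - x, axis=1)
    nearest = idx[np.argsort(d)[:k]]
    labels, counts = np.unique(y_train[nearest], return_counts=True)
    return labels[np.argmax(counts)]

# Synthetic stand-in for a traffic dataset: two classes, five features.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 0.5, size=(50, 5)),
               rng.normal(5.0, 0.5, size=(50, 5))])
y = np.array([0] * 50 + [1] * 50)

Z, mean, comps = pca_reduce(X, n_components=2)   # step 1: feature reduction
centers, U = fuzzy_c_means(Z, n_clusters=2)      # step 2: clustering
hard = U.argmax(axis=1)                          # hard cluster assignment

def classify(x_raw, k=3):
    """Step 3: project a raw record, then run cluster-restricted K-NN."""
    z = (x_raw - mean) @ comps.T
    return knn_in_cluster(z, Z, y, hard, centers, k=k)
```

Restricting the neighbour search to one cluster is what cuts the distance computations: each query is compared against the cluster centers and the members of a single cluster rather than against the whole training set.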


Citations
Journal Article

Sequential Message Characterization for Early Classification of Encrypted Internet Traffic

TL;DR: This paper first collects extensive traffic flows at the exit router of a university and finds that each application type has distinct sequential message features, and develops a system, named SMC (Sequential Message Characterization), which can perform online traffic classification with the sequential size information of a few message segments.
Proceedings Article

Leveraging Inner-Connection of Message Sequence for Traffic Classification: A Deep Learning Approach

TL;DR: The proposed traffic classification algorithm adopts a Long Short-Term Memory (LSTM) neural network, trained on the message sequence vector of a traffic flow (not necessarily covering all messages), to produce a classifier for online traffic flow classification.