Home
/
Authors
/
Dhruba K. Bhattacharyya

Author

Dhruba K. Bhattacharyya

Other affiliations: Jorhat Engineering College

Bio: Dhruba K. Bhattacharyya is an academic researcher from Tezpur University. The author has contributed to research in topics: Cluster analysis & Biclustering. The author has an hindex of 31, co-authored 212 publications receiving 4516 citations. Previous affiliations of Dhruba K. Bhattacharyya include Jorhat Engineering College.

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2000
1999
1998

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Network Anomaly Detection: Methods, Systems and Tools

[...]

Monowar H. Bhuyan¹, Dhruba K. Bhattacharyya¹, Jugal Kalita²•Institutions (2)

Tezpur University¹, University of Colorado Colorado Springs²

21 Jan 2014-IEEE Communications Surveys and Tutorials

TL;DR: This paper provides a structured and comprehensive overview of various facets of network anomaly detection so that a researcher can become quickly familiar with every aspect of network anomalies detection.

...read moreread less

Abstract: Network anomaly detection is an important and dynamic research area. Many network intrusion detection methods and systems (NIDS) have been proposed in the literature. In this paper, we provide a structured and comprehensive overview of various facets of network anomaly detection so that a researcher can become quickly familiar with every aspect of network anomaly detection. We present attacks normally encountered by network intrusion detection systems. We categorize existing network anomaly detection methods and systems based on the underlying computational techniques used. Within this framework, we briefly describe and compare a large number of network anomaly detection methods and systems. In addition, we also discuss tools that can be used by network defenders and datasets that researchers in network anomaly detection can use. We also highlight research directions in network anomaly detection.

...read moreread less

971 citations

Journal Article•DOI•

MIFS-ND: A mutual information-based feature selection method

[...]

Nazrul Hoque¹, Dhruba K. Bhattacharyya¹, Jugal Kalita²•Institutions (2)

Tezpur University¹, University of Colorado Colorado Springs²

15 Oct 2014-Expert Systems With Applications

TL;DR: A greedy feature selection method using mutual information that combines both feature–feature mutual information and feature–class mutual information to find an optimal subset of features to minimize redundancy and to maximize relevance among features is introduced.

...read moreread less

Abstract: Feature selection is used to choose a subset of relevant features for effective classification of data. In high dimensional data classification, the performance of a classifier often depends on the feature subset used for classification. In this paper, we introduce a greedy feature selection method using mutual information. This method combines both feature–feature mutual information and feature–class mutual information to find an optimal subset of features to minimize redundancy and to maximize relevance among features. The effectiveness of the selected feature subset is evaluated using multiple classifiers on multiple datasets. The performance of our method both in terms of classification accuracy and execution time performance, has been found significantly high for twelve real-life datasets of varied dimensionality and number of instances when compared with several competing feature selection techniques.

...read moreread less

302 citations

Journal Article•DOI•

A Survey of Outlier Detection Methods in Network Anomaly Identification

[...]

Prasanta Gogoi¹, Dhruba K. Bhattacharyya¹, Bhogeswar Borah¹, Jugal Kalita²•Institutions (2)

Tezpur University¹, University of Colorado Colorado Springs²

01 Apr 2011-The Computer Journal

TL;DR: A comprehensive survey of well-known distance-based, density-based and other techniques for outlier detection and compare them is presented and definitions of outliers are provided and their detection based on supervised and unsupervised learning in the context of network anomaly detection are discussed.

...read moreread less

Abstract: The detection of outliers has gained considerable interest in data mining with the realization that outliers can be the key discovery to be made from very large databases. Outliers arise due to various reasons such as mechanical faults, changes in system behavior, fraudulent behavior, human error and instrument error. Indeed, for many applications the discovery of outliers leads to more interesting and useful results than the discovery of inliers. Detection of outliers can lead to identification of system faults so that administrators can take preventive measures before they escalate. It is possible that anomaly detection may enable detection of new attacks. Outlier detection is an important anomaly detection approach. In this paper, we present a comprehensive survey of well-known distance-based, density-based and other techniques for outlier detection and compare them. We provide definitions of outliers and discuss their detection based on supervised and unsupervised learning in the context of network anomaly detection.

...read moreread less

217 citations

Journal Article•DOI•

Botnet in DDoS Attacks: Trends and Challenges

[...]

Nazrul Hoque¹, Dhruba K. Bhattacharyya¹, Jugal Kalita²•Institutions (2)

Tezpur University¹, University of Colorado Boulder²

16 Jul 2015-IEEE Communications Surveys and Tutorials

TL;DR: This survey presents a comprehensive overview of DDoS attacks, their causes, types with a taxonomy, and technical details of various attack launching tools.

...read moreread less

Abstract: Threats of distributed denial of service (DDoS) attacks have been increasing day-by-day due to rapid development of computer networks and associated infrastructure, and millions of software applications, large and small, addressing all varieties of tasks. Botnets pose a major threat to network security as they are widely used for many Internet crimes such as DDoS attacks, identity theft, email spamming, and click fraud. Botnet based DDoS attacks are catastrophic to the victim network as they can exhaust both network bandwidth and resources of the victim machine. This survey presents a comprehensive overview of DDoS attacks, their causes, types with a taxonomy, and technical details of various attack launching tools. A detailed discussion of several botnet architectures, tools developed using botnet architectures, and pros and cons analysis are also included. Furthermore, a list of important issues and research challenges is also reported.

...read moreread less

206 citations

Proceedings Article•DOI•

An improved sampling-based DBSCAN for large spatial databases

[...]

Bhogeswar Borah¹, Dhruba K. Bhattacharyya•Institutions (1)

Tezpur University¹

24 Aug 2004

TL;DR: This paper presents an improved sampling-based DBSCAN which can cluster large-scale spatial databases effectively and outperforms DBS CAN as well as its other counterparts, in terms of execution time, without losing the quality of clustering.

...read moreread less

Abstract: Spatial data clustering is one of the important data mining techniques for extracting knowledge from large amount of spatial data collected in various applications, such as remote sensing, GIS, computer cartography, environmental assessment and planning, etc. Several useful and popular spatial data clustering algorithms have been proposed in the past decade. DBSCAN is one of them, which can discover clusters of any arbitrary shape and can handle the noise points effectively. However, DBSCAN requires large volume of memory support because it operates on the entire database. This paper presents an improved sampling-based DBSCAN which can cluster large-scale spatial databases effectively. Experimental results included to establish that the proposed sampling-based DBSCAN outperforms DBSCAN as well as its other counterparts, in terms of execution time, without losing the quality of clustering.

...read moreread less

182 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

Machine learning

[...]

Thomas G. Dietterich¹•Institutions (1)

Oregon State University¹

01 Dec 1996-ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

...read moreread less

Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

...read moreread less

13,246 citations

[신간의 별자리x] 우리/미술, 그리고 ‘슬픔의 박물관’

[...]

이화영

01 Jan 2015

12,972 citations

Pattern Recognition and Machine Learning

[...]

Christopher M. Bishop¹•Institutions (1)

Microsoft¹

01 Jan 2006

TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.

...read moreread less

Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

...read moreread less

10,141 citations

Data Mining - Concepts and Techniques.

[...]

Petra Perner

01 Jan 2002

9,314 citations

KEGG(Kyoto Encyclopedia of Genes and Genomes)〔和文〕 (特集ゲノム医学の現在と未来--基礎と臨床) -- (データベース)

[...]

光輝中尾, 實金久

01 Jan 2000

3,536 citations