Author

Chittaranjan Hota

Bio: Chittaranjan Hota is an academic researcher from the Birla Institute of Technology and Science. The author has contributed to research in topics: Overlay network & Botnet. The author has an h-index of 13 and has co-authored 105 publications receiving 830 citations.


Papers
Journal ArticleDOI
TL;DR: The authors build on the progress of open-source tools such as Hadoop, Hive, and Mahout to provide a scalable implementation of a quasi-real-time intrusion detection system that detects Peer-to-Peer botnet attacks using a machine learning approach.

216 citations

Proceedings ArticleDOI
01 Dec 2010
TL;DR: This work proposes a modified Diffie-Hellman key exchange protocol between the cloud service provider and the user for secretly sharing a symmetric key for secure data access, which alleviates the problem of key distribution and management at the cloud service provider.
Abstract: Data security and access control is one of the most challenging areas of ongoing research in cloud computing, because users outsource their sensitive data to cloud providers. Existing solutions that use purely cryptographic techniques to mitigate these security and access control problems suffer from heavy computational overhead on the data owner as well as the cloud service provider for key distribution and management. This paper addresses this challenging open problem using a capability-based access control technique that ensures only valid users will access the outsourced data. This work also proposes a modified Diffie-Hellman key exchange protocol between the cloud service provider and the user for secretly sharing a symmetric key for secure data access, which alleviates the problem of key distribution and management at the cloud service provider. The simulation runs and analysis show that the proposed approach is highly efficient and secure under existing security models.
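The core idea of deriving a shared symmetric key via Diffie-Hellman can be sketched as follows. The paper's *modified* protocol is not detailed in this abstract, so this is a plain textbook exchange between a user and a cloud service provider (CSP), using a standard RFC 3526 group; it illustrates the mechanism only and omits authentication.

```python
import hashlib
import secrets

# RFC 3526 Group 14 (2048-bit MODP) prime and generator.
P = int(
    "FFFFFFFFFFFFFFFFC90FDAA22168C234C4C6628B80DC1CD129024E08"
    "8A67CC74020BBEA63B139B22514A08798E3404DDEF9519B3CD3A431B"
    "302B0A6DF25F14374FE1356D6D51C245E485B576625E7EC6F44C42E9"
    "A637ED6B0BFF5CB6F406B7EDEE386BFB5A899FA5AE9F24117C4B1FE6"
    "49286651ECE45B3DC2007CB8A163BF0598DA48361C55D39A69163FA8"
    "FD24CF5F83655D23DCA3AD961C62F356208552BB9ED529077096966D"
    "670C354E4ABC9804F1746C08CA18217C32905E462E36CE3BE39E772C"
    "180E86039B2783A2EC07A28FB5C55DF06F4C52C9DE2BCBF695581718"
    "3995497CEA956AE515D2261898FA051015728E5A8AACAA68FFFFFFFF"
    "FFFFFFFF", 16)
G = 2

def dh_keypair():
    priv = secrets.randbelow(P - 2) + 1   # random private exponent
    pub = pow(G, priv, P)                 # public value g^x mod p
    return priv, pub

user_priv, user_pub = dh_keypair()        # user's key pair
csp_priv, csp_pub = dh_keypair()          # CSP's key pair

# Each party combines its own private exponent with the other's public value.
user_secret = pow(csp_pub, user_priv, P)
csp_secret = pow(user_pub, csp_priv, P)
assert user_secret == csp_secret          # both sides hold the same secret

# Derive a 256-bit symmetric key from the shared secret.
shared_bytes = user_secret.to_bytes((P.bit_length() + 7) // 8, "big")
key = hashlib.sha256(shared_bytes).digest()
print(len(key))  # 32 bytes, usable as an AES-256 key
```

Hashing the raw shared secret into a fixed-length key is a common simplification; a deployed protocol would use a proper KDF and authenticate both endpoints.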

80 citations

Proceedings ArticleDOI
17 May 2014
TL;DR: This paper proposes PeerShark, a novel methodology to detect P2P botnet traffic and differentiate it from benign P2P traffic in a network, which is port-oblivious, protocol-oblivious, and does not require Deep Packet Inspection.
Abstract: The decentralized nature of Peer-to-Peer (P2P) botnets makes them difficult to detect. Their distributed nature also exhibits resilience against take-down attempts. Moreover, smarter bots are stealthy in their communication patterns, and elude the standard discovery techniques which look for anomalous network or communication behavior. In this paper, we propose PeerShark, a novel methodology to detect P2P botnet traffic and differentiate it from benign P2P traffic in a network. Instead of the traditional 5-tuple 'flow-based' detection approach, we use a 2-tuple 'conversation-based' approach which is port-oblivious, protocol-oblivious and does not require Deep Packet Inspection. PeerShark could also classify different P2P applications with an accuracy of more than 95%.
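The 2-tuple "conversation" idea can be sketched as follows: packets are aggregated by the unordered pair of endpoint IPs, ignoring ports and transport protocol, and per-conversation features (packet count, byte volume, duration) are then fed to a classifier. Field names and the sample records below are illustrative, not PeerShark's actual implementation.

```python
from collections import defaultdict

# Illustrative packet records (real input would come from a packet capture).
packets = [
    {"src": "10.0.0.1", "dst": "10.0.0.2", "sport": 4444, "dport": 80,   "bytes": 1200, "ts": 0.0},
    {"src": "10.0.0.2", "dst": "10.0.0.1", "sport": 80,   "dport": 4444, "bytes": 600,  "ts": 0.4},
    {"src": "10.0.0.1", "dst": "10.0.0.3", "sport": 5555, "dport": 53,   "bytes": 90,   "ts": 1.1},
]

conversations = defaultdict(lambda: {"packets": 0, "bytes": 0, "first": None, "last": None})

for p in packets:
    # Port- and protocol-oblivious key: just the unordered pair of IPs.
    key = tuple(sorted((p["src"], p["dst"])))
    c = conversations[key]
    c["packets"] += 1
    c["bytes"] += p["bytes"]
    c["first"] = p["ts"] if c["first"] is None else min(c["first"], p["ts"])
    c["last"] = p["ts"] if c["last"] is None else max(c["last"], p["ts"])

# Derive a per-conversation feature: duration of the conversation.
for key, c in conversations.items():
    c["duration"] = c["last"] - c["first"]
```

Both directions of the exchange between 10.0.0.1 and 10.0.0.2 collapse into a single conversation here, which is exactly what makes the representation robust to bots that randomize ports.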

52 citations

Journal ArticleDOI
TL;DR: We live in an on-command, on-demand Big Data world in which more information is being created by individuals than by business houses.
Abstract: Information is increasingly becoming important in our daily lives. We have become information dependents of the twenty-first century, living in an on-command, on-demand Big data world. In this era, more information is being created by individuals than by business houses. In the past, we had to stand in a queue at a railway reservation counter to book our tickets, had to visit a cash counter in a bank to do our transactions, had to arrange a get together at a physical location within our town to meet and socialize with our friends, had to visit a theater to watch a movie, and so on. We now have Information and Communication Technology (ICT) to help us do all these by sitting in front of a computer and with a few mouse clicks. Also, all these advancements are possible because we are drenched in a flood of data today. It is important to distinguish between data, information and knowledge. Data is a set of facts about events.

30 citations

Proceedings ArticleDOI
22 Aug 2013
TL;DR: This research work presents preliminary results comparing the performance of three different feature selection algorithms - Correlation-based feature selection, Consistency-based subset evaluation, and Principal component analysis - on three different machine learning techniques - namely Decision trees, Naïve Bayes classifier, and Bayesian Network classifier - for the detection of Peer-to-Peer based botnet traffic.
Abstract: The use of anomaly-based classification of intrusions has increased significantly for Intrusion Detection Systems. A large number of training data samples and a good 'feature set' are two primary requirements to build effective classification models with machine learning algorithms. Since the amount of data available for malicious traffic will often be small compared to the available traces of benign traffic, extraction of 'good' features which enable detection of malicious traffic is a challenging area of work. This research work presents preliminary results of a comparison of the performance of three different feature selection algorithms - Correlation-based feature selection, Consistency-based subset evaluation, and Principal component analysis - on three different machine learning techniques - namely Decision trees, Naive Bayes classifier, and Bayesian Network classifier. These algorithms are evaluated for the detection of Peer-to-Peer (P2P) based botnet traffic.
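The relevance component of correlation-based feature selection can be sketched in a few lines: rank each feature by the absolute value of its Pearson correlation with the class label, and keep the top-ranked subset. The data below is synthetic (the paper used real P2P traffic traces), and full CFS also penalizes inter-feature correlation, which this sketch omits.

```python
import math
import random

random.seed(0)

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

# Synthetic dataset: feature 0 tracks the label; features 1 and 2 are noise.
labels = [random.randint(0, 1) for _ in range(200)]
features = [
    [lab + random.gauss(0, 0.3) for lab in labels],  # informative feature
    [random.gauss(0, 1) for _ in labels],            # noise
    [random.gauss(0, 1) for _ in labels],            # noise
]

# Rank features by |correlation| with the class label (relevance only).
ranked = sorted(range(len(features)),
                key=lambda i: abs(pearson(features[i], labels)),
                reverse=True)
print(ranked)
```

On this synthetic data the informative feature (index 0) ranks first; the selected subset would then be handed to each classifier under comparison.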

30 citations


Cited by
Journal ArticleDOI
TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.
Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. 
Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

13,246 citations

Journal ArticleDOI
TL;DR: The definition, characteristics, and classification of big data along with some discussions on cloud computing are introduced, and research challenges are investigated, with focus on scalability, availability, data integrity, data transformation, data quality, data heterogeneity, privacy, legal and regulatory issues, and governance.

2,141 citations

01 Jan 2003
TL;DR: A super-peer is a node in a peer-to-peer network that operates both as a server to a set of clients, and as an equal in a network of super-peers.
Abstract: A super-peer is a node in a peer-to-peer network that operates both as a server to a set of clients, and as an equal in a network of super-peers. Super-peer networks strike a balance between the efficiency of centralized search, and the autonomy, load balancing and robustness to attacks provided by distributed search. Furthermore, they take advantage of the heterogeneity of capabilities (e.g., bandwidth, processing power) across peers, which recent studies have shown to be enormous. Hence, new and old P2P systems like KaZaA and Gnutella are adopting super-peers in their design. Despite their growing popularity, the behavior of super-peer networks is not well understood. For example, what are the potential drawbacks of super-peer networks? How can super-peers be made more reliable? How many clients should a super-peer take on to maximize efficiency? We examine super-peer networks in detail, gaining an understanding of their fundamental characteristics and performance tradeoffs. We also present practical guidelines and a general procedure for the design of an efficient super-peer network.
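The two roles described above (server to clients, equal among super-peers) can be modeled in a short sketch: each super-peer indexes its clients' content and a search floods only the super-peer layer, with each super-peer answering on behalf of its own clients. The class, names, and flooding strategy here are illustrative, not taken from the paper.

```python
class SuperPeer:
    """Toy super-peer: serves a set of clients, peers with other super-peers."""

    def __init__(self, name):
        self.name = name
        self.clients = {}       # client name -> set of shared file names
        self.neighbors = []     # links to other SuperPeer nodes

    def index(self, client, files):
        # Acting as a server: hold the index for a client's shared files.
        self.clients[client] = set(files)

    def search(self, filename, visited=None):
        # Acting as an equal: flood the query through the super-peer layer.
        visited = visited if visited is not None else set()
        visited.add(self.name)
        hits = [c for c, files in self.clients.items() if filename in files]
        for sp in self.neighbors:
            if sp.name not in visited:
                hits += sp.search(filename, visited)
        return hits

# Two super-peers, each with one client; queries never reach clients directly.
a, b = SuperPeer("A"), SuperPeer("B")
a.neighbors.append(b)
b.neighbors.append(a)
a.index("alice", ["song.mp3"])
b.index("bob", ["song.mp3", "doc.pdf"])
print(a.search("song.mp3"))  # ['alice', 'bob']
```

Because only super-peers exchange queries, search cost scales with the (small) super-peer layer rather than the full client population, which is the efficiency/autonomy trade-off the abstract describes.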

916 citations