Journal•ISSN: 2005-4270

International journal of database theory and application

NADIA

About: International journal of database theory and application is an academic journal. The journal publishes mainly in the areas of cluster analysis and cloud computing. Its ISSN is 2005-4270. Over its lifetime, 568 publications have been published, receiving 2530 citations.

Topics: Cluster analysis, Cloud computing, Support vector machine, Big data, Canopy clustering algorithm

Papers

Journal Article

KNN based Machine Learning Approach for Text and Document Mining

Vishwanath Bijalwan, Vinay Kumar, Pinki Kumari, Jordan Pascual

28 Feb 2014-International journal of database theory and application

TL;DR: This paper first categorizes the documents using a KNN-based machine learning approach and then returns the most relevant documents to solve the text categorization problem.

Abstract: Text Categorization (TC), also known as Text Classification, is the task of automatically classifying a set of text documents into different categories from a predefined set. If a document belongs to exactly one of the categories, it is a single-label classification task; otherwise, it is a multi-label classification task. TC uses several tools from Information Retrieval (IR) and Machine Learning (ML) and has received much attention in recent years from both academic researchers and industry developers. In this paper, we first categorize the documents using a KNN-based machine learning approach and then return the most relevant documents.

197 citations

Journal Article

Mining Educational Data to Predict Student’s academic Performance using Ensemble Methods

Elaf Abu Amrieh, Thair Hamtini¹, Ibrahim Aljarah²

University of Jordan¹, Jacksonville University²

31 Aug 2016-International journal of database theory and application

TL;DR: This paper proposes a student performance prediction model based on data mining techniques with new data attributes/features, called student's behavioral features, and finds a strong relationship between learners' behaviors and their academic achievement.

Abstract: Educational data mining has received considerable attention in the last few years. Many data mining techniques have been proposed to extract hidden knowledge from educational data. The extracted knowledge helps institutions improve their teaching methods and learning process. All these improvements enhance the performance of the students and the overall educational outputs. In this paper, we propose a new student performance prediction model based on data mining techniques with new data attributes/features, which are called student's behavioral features. These types of features are related to the learner's interactivity with the e-learning management system. The performance of the student predictive model is evaluated by a set of classifiers, namely Artificial Neural Network, Naive Bayesian and Decision tree. In addition, we applied ensemble methods to improve the performance of these classifiers. We used Bagging, Boosting and Random Forest (RF), which are the common ensemble methods used in the literature. The obtained results reveal that there is a strong relationship between learners' behaviors and their academic achievement. The accuracy of the proposed model using behavioral features achieved up to 22.1% improvement compared to the results when removing such features, and it achieved up to 25.8% accuracy improvement using ensemble methods. When the model was tested on newcomer students, the achieved accuracy was more than 80%. This result proves the reliability of the proposed model.

195 citations

Journal Article

A MapReduce Implementation of C4.5 Decision Tree Algorithm

Wei Dai, Wei Ji

28 Feb 2014-International journal of database theory and application

TL;DR: This work proposes to implement a typical decision tree algorithm, C4.5, using the MapReduce programming model, and transforms the traditional algorithm into a series of Map and Reduce procedures, showing both time efficiency and scalability.

Abstract: Recent years have witnessed the development of cloud computing and the big data era, which brings challenges to traditional decision tree algorithms. First, as the size of the dataset becomes extremely big, the process of building a decision tree can be quite time consuming. Second, because the data can no longer fit in memory, some computation must be moved to external storage, which increases the I/O cost. To this end, we propose to implement a typical decision tree algorithm, C4.5, using the MapReduce programming model. Specifically, we transform the traditional algorithm into a series of Map and Reduce procedures. In addition, we design some data structures to minimize the communication cost. We also conduct extensive experiments on a massive dataset. The results indicate that our algorithm exhibits both time efficiency and scalability.

145 citations

Journal Article

A Comprehensive Survey on Support Vector Machine in Data Mining Tasks: Applications & Challenges

Janmenjoy Nayak, Bighnaraj Naik, Himansu Sekhar Behera

28 Feb 2015-International journal of database theory and application

TL;DR: The main aim of this paper is to extrapolate the various areas of SVM with a basis of understanding the technique and a comprehensive survey, while offering researchers a modernized picture of the depth and breadth in both the theory and applications.

Abstract: During the last two decades, a substantial amount of research effort has been devoted to applying support vector machines to various data mining tasks. Data Mining is a pioneering and attractive research area due to its huge application areas and task primitives. Support Vector Machine (SVM) is playing a decisive role as it provides techniques that are especially well suited to obtain results in an efficient way and with a good level of quality. In this paper, we survey the role of SVM in various data mining tasks like classification, clustering, prediction, forecasting and other applications. From a broader point of view, we have reviewed the number of research publications that have been contributed in various internationally reputed journals for the data mining applications and also suggested a number of possible issues of SVM. The main aim of this paper is to extrapolate the various areas of SVM with a basis of understanding the technique and a comprehensive survey, while offering researchers a modernized picture of the depth and breadth in both the theory and applications.

107 citations

Journal Article

Analysis of KDD CUP 99 Dataset using Clustering based Data Mining

Mohammad Khubeb Siddiqui, Shams Naahid

31 Oct 2013-International journal of database theory and application

TL;DR: An analysis of 10% of KDD cup’99 training dataset based on intrusion detection establishes a relationship between the attack types and the protocol used by the hackers, using clustered data.

Abstract: The KDD Cup 99 dataset has been the point of attraction for many researchers in the field of intrusion detection over the last decade. Many researchers have contributed their efforts to analyze the dataset by different techniques. Analysis can be used in any type of industry that produces and consumes data, and that includes security. This paper is an analysis of 10% of KDD cup’99 training dataset based on intrusion detection. We have focused on establishing a relationship between the attack types and the protocol used by the hackers, using clustered data. Analysis of data is performed using k-means clustering; we have used the Oracle 10g data miner as a tool for the analysis of the dataset and built 1000 clusters to segment the 494,020 records. The investigation revealed many interesting results about the protocols and attack types preferred by the hackers for intruding the networks.

Keywords: KDD 99 dataset, clustering, k-means, intrusion detection

93 citations

Network Information

Related Journals (5)

Multimedia Tools and Applications

16K papers, 185.7K citations

9K papers, 53.8K citations

76% related

Journal of Software

6.7K papers, 42.4K citations

26.6K papers, 157.4K citations

2.8K papers, 23.5K citations

75% related

Performance

Metrics

Papers: 568

Citations: 2,949

No. of papers from the Journal in previous years
Year	Papers
2017	26
2016	240
2015	159
2014	105
2013	31
2012	2