Open Access | Book Chapter

Weka4WS: a WSRF-enabled weka toolkit for distributed data mining on grids

TLDR
The paper describes the design and the implementation of Weka4WS, a framework that extends the Weka toolkit for supporting distributed data mining on Grid environments using a first release of the WSRF library.
Abstract
This paper presents Weka4WS, a framework that extends the Weka toolkit for supporting distributed data mining on Grid environments. Weka4WS adopts the emerging Web Services Resource Framework (WSRF) for accessing remote data mining algorithms and managing distributed computations. The Weka4WS user interface is a modified Weka Explorer environment that supports the execution of both local and remote data mining tasks. On every computing node, a WSRF-compliant Web Service is used to expose all the data mining algorithms provided by the Weka library. The paper describes the design and the implementation of Weka4WS using a first release of the WSRF library. To evaluate the efficiency of the proposed system, a performance analysis of Weka4WS for executing distributed data mining tasks in different network scenarios is presented.

