Open Access

Understanding System and Architecture for Big Data

TLDR
Preliminary infrastructure tuning yields a 1 TB sort in 14 minutes on 10 Power 730 machines running IBM InfoSphere BigInsights; further improvement is expected from, among other factors, the new IBM PowerLinux™ 7R2 systems.
Abstract
The use of Big Data underpins critical activities in all sectors of our society. Achieving the full transformative potential of Big Data in this increasingly digital world requires both new data analysis algorithms and a new class of systems to handle the dramatic data growth, the demand to integrate structured and unstructured data analytics, and the increasing computing needs of massive-scale analytics. In this paper, we discuss several Big Data research activities at IBM Research: (1) Big Data benchmarking and methodology; (2) workload-optimized systems for Big Data; (3) a case study of Big Data workloads on IBM Power systems. In (3), we show that preliminary infrastructure tuning yields a 1 TB sort in 14 minutes on 10 Power 730 machines running IBM InfoSphere BigInsights. Further improvement is expected from, among other factors, the new IBM PowerLinux™ 7R2 systems.
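To put the reported sort result in perspective, the following minimal sketch converts it into an approximate throughput. It assumes 1 TB means 10^12 bytes and that the load is spread evenly across the 10 machines; neither assumption is stated in the abstract, and the function name `sort_throughput` is hypothetical.

```python
def sort_throughput(total_bytes, minutes, nodes):
    """Return (aggregate, per-node) throughput in bytes per second.

    Assumes the work is divided evenly across nodes, which the
    abstract does not confirm.
    """
    seconds = minutes * 60
    aggregate = total_bytes / seconds
    return aggregate, aggregate / nodes


# Figures from the abstract: 1 TB in 14 minutes on 10 machines.
# 1 TB is taken as 10**12 bytes (an assumption).
aggregate, per_node = sort_throughput(1e12, 14, 10)
print(f"{aggregate / 1e6:.0f} MB/s aggregate, {per_node / 1e6:.0f} MB/s per node")
# prints "1190 MB/s aggregate, 119 MB/s per node"
```

Under these assumptions the cluster sustains roughly 1.2 GB/s of sorted data overall, or about 119 MB/s per machine.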


Citations
Proceedings Article

Strategic Alignment of Cloud-Based Architectures for Big Data

TL;DR: A framework is developed that enumerates the alternatives for implementing Big Data applications using cloud services and identifies the strategic goals supported by these alternatives. This clarifies the options for Big Data initiatives using cloud computing and thus improves the strategic alignment of Big Data applications.
Proceedings Article

Guiding the Introduction of Big Data in Organizations: A Methodology with Business- and Data-Driven Ideation and Enterprise Architecture Management-Based Implementation

TL;DR: A methodology based on IT value theory and workgroup ideation is described that guides big data idea generation, idea assessment, and implementation management.
Proceedings Article

Memory system characterization of big data workloads

TL;DR: This paper develops an analysis methodology to understand how conventional optimizations such as caching, prediction, and prefetching may apply to Hadoop and NoSQL big data workloads, and discusses the implications for software and system design.

Towards a big data reference architecture

TL;DR: The proposed reference architecture and a survey of the current state of the art in ‘big data’ technologies guide designers in creating systems that derive new value from existing but previously under-used data.
Proceedings Article

Towards a Framework for Enterprise Architecture Analytics

TL;DR: This work introduces an approach that complements the existing top-down approach to creating enterprise architecture with a bottom-up approach, using the information contained in many infrastructures to provide architectural information.
References
Proceedings Article

The HiBench benchmark suite: Characterization of the MapReduce-based data analysis

TL;DR: This paper presents the benchmarking, evaluation, and characterization of Hadoop, an open-source implementation of MapReduce, and introduces HiBench, a new benchmark suite for Hadoop that evaluates and characterizes the Hadoop framework in terms of speed, throughput, and system resource utilization.