Book Chapter

Technologies for Big Data

TLDR
This chapter reviews and analyzes several key Big Data technologies: Map-Reduce, NoSQL, MPP (Massively Parallel Processing), and In-Memory Databases.
Abstract
This chapter provides a review and analysis of several key Big Data technologies. Many Big Data technologies are currently in development and implementation, so a comprehensive review of all of them is beyond the scope of this chapter. Instead, the chapter focuses on the most widely accepted technologies: Map-Reduce, NoSQL, MPP (Massively Parallel Processing), and In-Memory Databases. For each technology, the following subtopics are discussed: its history and genesis; the problem set it solves for Big Data analytics; and the details of the technology, including components, technical architecture, and theory of operations. This is followed by technical operation and infrastructure (compute, storage, and network), design considerations, and performance benchmarks. Finally, the chapter provides an integrated approach to the above-mentioned Big Data technologies.

Kapil Bakshi, Cisco Systems Inc., USA

INTRODUCTION: THE CHALLENGE OF BIG DATA

Data is being collected and stored at unprecedented rates. An IDC study (Gantz & Reinsel, 2011) indicates that the world’s information is doubling every two years. The same study reports that the world created a staggering 1.8 zettabytes of information (a zettabyte is 1,000 exabytes), and projections suggest that by 2020 we will generate 50 times that amount. Big Data has been defined as the situation in which data sets get so large that traditional technologies ...

