Book Chapter

Big Data Ingestion and Streaming Patterns

Nitin Sawant, +1 more
- pp 29-42
TLDR
This chapter covers how multistructured data, such as social media and audio/video, is ingested, preprocessed, validated and/or cleansed, and integrated or correlated with nontextual formats.
Abstract
Traditional business intelligence (BI) and data warehouse (DW) solutions use structured data extensively. Database platforms such as Oracle, Informatica, and others had limited capabilities to handle and manage unstructured data such as text, media, video, and so forth, although they had data types called CLOB and BLOB, which were used to store large amounts of text, and accessing data from these platforms was a problem. With the advent of multistructured (aka unstructured) data in the form of social media and audio/video, there has to be a change in the way data is ingested, preprocessed, validated and/or cleansed, and integrated or correlated with nontextual formats. This chapter deals with the following topics:

