Journal ArticleDOI

Big-Data Processing Techniques and Their Challenges in Transport Domain

TL;DR: The strengths and weaknesses of various big-data cloud processing techniques are highlighted in order to help the big-data community select the appropriate processing technique.
Abstract: This paper describes the fundamentals of cloud computing and current big-data key technologies. We categorize big-data processing as batch-based, stream-based, graph-based, DAG-based, interactive-based, or visual-based according to the processing technique. We highlight the strengths and weaknesses of the various big-data cloud processing techniques to help the big-data community select the appropriate technique. We also present big-data research challenges and future directions with respect to transportation management systems.
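To make the batch/stream distinction from the abstract concrete, here is a minimal, illustrative sketch in plain Python (not taken from the paper; the sensor records and function names are hypothetical): a batch job sees the whole dataset before it starts, while a stream job folds each arriving record into a running state.

```python
# Hypothetical sketch contrasting batch- and stream-based processing
# of vehicle-speed records in a transport setting (illustrative only).

records = [("sensor_a", 55), ("sensor_b", 62), ("sensor_a", 49), ("sensor_b", 70)]

# Batch-based: the full dataset is available before processing starts.
def batch_average(dataset):
    total = sum(speed for _, speed in dataset)
    return total / len(dataset)

# Stream-based: records arrive one at a time; only running state is kept.
def stream_averages(stream):
    count, total = 0, 0
    for _, speed in stream:
        count += 1
        total += speed
        yield total / count  # the result is refined incrementally

print("batch:", batch_average(records))
print("stream:", list(stream_averages(iter(records))))
```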
Citations
Journal ArticleDOI
TL;DR: This research paper investigates the current trends and identifies the existing challenges in the development of a big scholarly data platform, maps them to the different phases of the big data lifecycle, and focuses on directions for future research.
Abstract: Highlights: surveys big scholarly data with respect to the different phases of the big data lifecycle; identifies the big data tools and technologies that can be used for the development of scholarly applications; investigates research challenges and limitations specific to big scholarly data and its applications; provides research directions and paves the way towards the development of a generic and comprehensive big scholarly data platform. Recently, there has been a shift in the focus of organizations and governments towards digitization of academic and technical documents, adding a new facet to the concept of digital libraries. The volume, variety and velocity of this generated data satisfy the big data definition, as a result of which this scholarly reserve is popularly referred to as big scholarly data. To facilitate data analytics for big scholarly data, architectures and services for it need to be developed. The evolving nature of research problems has made them essentially interdisciplinary. As a result, there is a growing demand for scholarly applications such as collaborator discovery, expert finding and research recommendation systems, among several others. This research paper investigates the current trends and identifies the existing challenges in the development of a big scholarly data platform, maps them to the different phases of the big data lifecycle, and focuses on directions for future research.

104 citations

Journal ArticleDOI
TL;DR: An overview of Big Data technologies in the context of transportation, with a specific focus on railways, is given, together with insight into how the existing data modules from the transport authority can be combined with Big Data and incorporated into maintenance decision making.

102 citations

Journal ArticleDOI
TL;DR: The paper presents a Big Data smart library system that has the potential to create new values and data-driven decisions by incorporating multiple sources of differential data.
Abstract: Purpose: With the exponential growth of the amount of data, even the most sophisticated systems of traditional libraries are not able to fulfill the demands of modern business and user needs. The purpose of this paper is to present the possibility of creating a Big Data smart library as an integral and enhanced part of the educational system that will improve user service and increase motivation in the continuous learning process through content-aware recommendations. Design/methodology/approach: This paper presents an approach to the design of a Big Data system for collecting, analyzing, processing and visualizing data from different sources to a smart library specifically suitable for application in educational institutions. Findings: As an integrated recommender system of the educational institution, the practical application of the Big Data smart library meets user needs and assists in finding personalized content from several sources, resulting in economic benefits for the institution and long-term user satisfaction. Practical implications: The need for continuous education alters business processes in libraries, with requirements to adopt new technologies, business demands, and interactions with users. To be able to engage in a new era of business in the Big Data environment, librarians need to modernize their infrastructure for data collection, data analysis, and data visualization. Originality/value: A unique value of this paper is its perspective on the implementation of a Big Data solution for smart libraries as a part of a continuous learning process, with the aim of improving the results of library operations by integrating traditional systems with Big Data technology. The paper presents a Big Data smart library system that has the potential to create new values and data-driven decisions by incorporating multiple sources of differential data.
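The "content-aware recommendations" mentioned above can be illustrated with a minimal bag-of-words cosine-similarity sketch in plain Python. This is illustrative only; the catalog titles, descriptions, and scoring are hypothetical, not the system described in the paper.

```python
# Hypothetical sketch of content-aware recommendation via bag-of-words
# cosine similarity (illustrative; not the paper's actual system).
import math
from collections import Counter

catalog = {
    "Intro to Data Mining": "data mining clustering classification",
    "Stream Processing Basics": "stream processing real time data",
    "Library Management 101": "library catalog circulation users",
}

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def recommend(user_history: str, top_n: int = 2):
    # Build a profile from the user's reading history, then rank the
    # catalog by similarity between profile and item description.
    profile = Counter(user_history.split())
    scored = [(cosine(profile, Counter(desc.split())), title)
              for title, desc in catalog.items()]
    return [title for _, title in sorted(scored, reverse=True)[:top_n]]

print(recommend("data processing stream"))
```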

47 citations

Book ChapterDOI
28 Jun 2021
TL;DR: In this paper, the authors highlight the impact of big data on decision-making and discuss applications of big-data-influenced decision making, along with state-of-the-art big data techniques and technologies.
Abstract: Big Data (BD) has shifted the paradigm of conventional data analysis through the exploitation of emerging technologies. Analysis using BD helps foresee and extract value from large data, expose hidden information, and expedite the decision-making process. This study highlights the impact and effect of BD on decision-making. The investigation's rationale is to provide deep insight into the buzzword so that stakeholders can understand the challenges and opportunities that BD has brought to current business scenarios. It also discusses applications of BD-influenced decision-making, along with state-of-the-art BD techniques and technologies. The study is a review article based on the research articles, conference proceedings, books, and web articles available on Google Scholar and Google from the period 2010 to 2020. Given BD's extreme importance, the available techniques and technologies should facilitate effective data collection, storage, analysis, and visualization. Every opportunity comes with greater challenges; this paper summarizes the strengths and weaknesses of the different tools associated with three broad categories of BD technologies, enabling researchers to glance quickly at the available tools' pros and cons in one place. This emerging field is still very young and immature. Various techniques and technologies have been designed to deal with such humongous data, but they still offer only limited efficacy in dealing with BD problems completely. It is high time that technologists, researchers, and governments pay significant attention to this vast and evolving field by investing their time and money in developing efficient tools that maximize the value extracted from it. BD also means big opportunities, big challenges, and big systems; therefore, it also requires big attention from researchers to close the research gaps that exist in this big field.

42 citations

Journal ArticleDOI
01 Jun 2019
TL;DR: The principles of Industry 4.0 are presented, new directions for research in system modeling, big data analysis, health management, cyber-physical systems, human-machine interaction, uncertainty, joint optimization, communication, and interfaces are proposed, and some of the resulting challenges and opportunities for reliability engineering are discussed.
Abstract: With the development of Industry 4.0 and the increasing integration of the digital, physical and human worlds, reliability engineering must evolve to address the existing and future challenges this brings. In this paper, the principles of Industry 4.0 are presented and some of these challenges and opportunities for reliability engineering are discussed. New directions for research in system modeling, big data analysis, health management, cyber-physical systems, human-machine interaction, uncertainty, joint optimization, communication, and interfaces are proposed. Each topic can be investigated individually, but this paper summarizes them and prepares a vision of reliability engineering for consideration and discussion by the interested scientific community.

24 citations


Cites background or methods from "Big-Data Processing Techniques and ..."

  • ...Cloud-based big-data processing techniques have been investigated as an interesting topic over the past decades, and different computing models have been proposed based on different platforms and focuses, such as batch-based, stream-based, graph-based, directed-acyclic-graph-based, interactive-based, and visual-based processing [43]....


  • ...Neural networks are a useful tool in reliability engineering, especially for remaining-useful-lifetime prediction [43]; Farsi and Hosseini used an ANN to reduce noise effects and estimate a bearing's lifetime [44].... (a minimal illustrative sketch of this idea follows these excerpts)

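As a hedged illustration of the remaining-useful-lifetime idea mentioned in the excerpt above, the sketch below fits a small neural network to synthetic degradation data. It assumes numpy and scikit-learn are available; the data, network size, and numbers are all hypothetical, not taken from [43] or [44].

```python
# Hypothetical sketch: a small neural-network regressor mapping a noisy
# degradation signal to remaining useful lifetime (RUL). Illustrative only.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
t = np.linspace(0, 100, 200)                          # operating hours
health = 1.0 - t / 100 + rng.normal(0, 0.05, t.size)  # noisy health index
rul = 100 - t                                         # ground-truth RUL

model = MLPRegressor(hidden_layer_sizes=(16, 16), max_iter=2000, random_state=0)
model.fit(health.reshape(-1, 1), rul)

# Predict RUL for a freshly observed (noisy) health reading.
print(model.predict(np.array([[0.4]])))  # roughly 40 hours for health ~ 0.4
```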

References
Journal ArticleDOI
Jeffrey Dean, Sanjay Ghemawat
06 Dec 2004
TL;DR: This paper presents MapReduce, a programming model and an associated implementation for processing and generating large data sets; the implementation runs on large clusters of commodity machines and is highly scalable.
Abstract: MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key. Many real world tasks are expressible in this model, as shown in the paper. Programs written in this functional style are automatically parallelized and executed on a large cluster of commodity machines. The run-time system takes care of the details of partitioning the input data, scheduling the program's execution across a set of machines, handling machine failures, and managing the required inter-machine communication. This allows programmers without any experience with parallel and distributed systems to easily utilize the resources of a large distributed system. Our implementation of MapReduce runs on a large cluster of commodity machines and is highly scalable: a typical MapReduce computation processes many terabytes of data on thousands of machines. Programmers find the system easy to use: hundreds of MapReduce programs have been implemented and upwards of one thousand MapReduce jobs are executed on Google's clusters every day.
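The map/reduce contract described above can be sketched in a few lines of plain Python. This is a minimal, single-machine illustration of the model, not Google's distributed implementation: the map function emits intermediate key/value pairs, a shuffle groups them by key, and the reduce function merges the values for each key.

```python
# Minimal single-machine sketch of the MapReduce model (illustrative only):
# word count, the canonical example.
from itertools import groupby

def map_fn(document):
    for word in document.split():
        yield word, 1                # emit intermediate key/value pairs

def reduce_fn(word, counts):
    return word, sum(counts)         # merge all values for one key

documents = ["the quick brown fox", "the lazy dog", "the fox"]

# Map phase: apply map_fn to every input record.
intermediate = [pair for doc in documents for pair in map_fn(doc)]

# Shuffle phase: group intermediate pairs by key.
intermediate.sort(key=lambda kv: kv[0])
grouped = groupby(intermediate, key=lambda kv: kv[0])

# Reduce phase: merge all values associated with the same key.
result = [reduce_fn(word, (v for _, v in pairs)) for word, pairs in grouped]
print(result)  # e.g. [('brown', 1), ('dog', 1), ('fox', 2), ...]
```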

20,309 citations

Journal ArticleDOI
Jeffrey Dean, Sanjay Ghemawat
TL;DR: This presentation explains how the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks.
Abstract: MapReduce is a programming model and an associated implementation for processing and generating large datasets that is amenable to a broad variety of real-world tasks. Users specify the computation in terms of a map and a reduce function, and the underlying runtime system automatically parallelizes the computation across large-scale clusters of machines, handles machine failures, and schedules inter-machine communication to make efficient use of the network and disks. Programmers find the system easy to use: more than ten thousand distinct MapReduce programs have been implemented internally at Google over the past four years, and an average of one hundred thousand MapReduce jobs are executed on Google's clusters every day, processing a total of more than twenty petabytes of data per day.

17,663 citations

ReportDOI
28 Sep 2011
TL;DR: This cloud model promotes availability and is composed of five essential characteristics, three service models, and four deployment models.
Abstract: Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model promotes availability and is composed of five essential characteristics, three service models, and four deployment models.

15,145 citations

Journal ArticleDOI
TL;DR: This paper defines Cloud computing and provides an architecture for creating Clouds with market-oriented resource allocation by leveraging technologies such as Virtual Machines (VMs), and it provides insights into market-based resource management strategies that encompass both customer-driven service management and computational risk management to sustain Service Level Agreement (SLA)-oriented resource allocation.

5,850 citations


"Big-Data Processing Techniques and ..." refers background in this paper

  • ...SOA services are flexible, scalable, and loosely coupled [17]....


  • ...In an SOA, services are interoperable, which means that distributed systems can communicate and exchange data with one another [17].... (a minimal sketch follows these excerpts)

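To illustrate the loose coupling and interoperability mentioned in the excerpts above, here is a minimal hedged sketch; the service name and message fields are hypothetical. The point is that two components agree only on a language-neutral JSON contract, not on each other's internal types.

```python
# Hypothetical sketch of SOA-style loose coupling: services agree only on a
# JSON message contract, not on each other's internal representations.
import json

def traffic_service_response():
    """Producer side: serialize a result to the agreed JSON contract."""
    result = {"road_id": "A1", "avg_speed_kmh": 57.2, "congested": False}
    return json.dumps(result)

def consume(payload: str):
    """Consumer side: any client that speaks JSON can use the service."""
    msg = json.loads(payload)
    if msg["congested"]:
        print(f"Reroute around {msg['road_id']}")
    else:
        print(f"{msg['road_id']} flowing at {msg['avg_speed_kmh']} km/h")

consume(traffic_service_response())
```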

Journal ArticleDOI
TL;DR: The bulk-synchronous parallel (BSP) model is introduced as a candidate bridging model for parallel computation, with results quantifying its efficiency both in implementing high-level language features and algorithms and in being implemented in hardware.
Abstract: The success of the von Neumann model of sequential computation is attributable to the fact that it is an efficient bridge between software and hardware: high-level languages can be efficiently compiled onto this model, yet it can be efficiently implemented in hardware. The author argues that an analogous bridge between software and hardware is required if parallel computation is to become as widely used. This article introduces the bulk-synchronous parallel (BSP) model as a candidate for this role, and gives results quantifying its efficiency both in implementing high-level language features and algorithms and in being implemented in hardware.

3,885 citations


"Big-Data Processing Techniques and ..." refers methods in this paper

  • ...The Bulk Synchronous Parallel (BSP) computing paradigm was introduced by Leslie Valiant in [21]....


  • ...A BSP algorithm [21], [22] proceeds as a series of super-steps, in each of which a user-defined function is executed in parallel, performing its computations asynchronously.... (a minimal sketch follows these excerpts)

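As a hedged illustration of the super-step structure described in the excerpt above, the following sketch runs a toy BSP computation with Python threads (illustrative only; real BSP frameworks distribute workers across a cluster): in each super-step every worker computes locally and asynchronously, sends values to a neighbor, then waits at a barrier before the next super-step begins.

```python
# Hypothetical sketch of the BSP super-step pattern using threads
# (illustrative only; real BSP systems run workers across many machines).
import threading

NUM_WORKERS = 4
SUPERSTEPS = 3
values = [1, 2, 3, 4]                  # one local value per worker
inbox = [[] for _ in range(NUM_WORKERS)]
barrier = threading.Barrier(NUM_WORKERS)

def worker(wid: int):
    for step in range(SUPERSTEPS):
        # 1. Local computation, performed asynchronously by each worker.
        local = values[wid] + step
        # 2. Communication: send the result to the next worker's inbox.
        inbox[(wid + 1) % NUM_WORKERS].append(local)
        # 3. Barrier synchronization: the super-step ends for all workers.
        barrier.wait()
        values[wid] = sum(inbox[wid])   # consume received messages
        inbox[wid].clear()
        barrier.wait()                  # ensure all inboxes are drained

threads = [threading.Thread(target=worker, args=(w,)) for w in range(NUM_WORKERS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(values)
```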