
Showing papers presented at "Testbeds and Research Infrastructures for the DEvelopment of NeTworks and COMmunities in 2018"


Book ChapterDOI
16 Nov 2018
TL;DR: The design and implementation of Indriya2, an upgraded version of Indriya, is presented, with the following improvements: support for heterogeneous sensor devices, support for higher data rates through the infrastructure, and support for multiple users scheduling jobs over non-overlapping sets of heterogeneous nodes at the same time.
Abstract: Wireless sensor network testbeds are important elements of sensor network/IoT research. The Indriya testbed has been serving the sensor network community for the last 8 years, and researchers from more than a hundred institutions around the world have been actively using it in their work. However, having been deployed for over 8 years, Indriya has a number of limitations. For example, it lacks support for heterogeneous devices and cannot handle the data generated by the testbed without loss, even at relatively low sampling rates. In this paper, we present the design and implementation of an upgraded version of Indriya, Indriya2, with the following improvements: (1) support for heterogeneous sensor devices, (2) support for higher data rates through the infrastructure, (3) support for multiple users scheduling jobs over non-overlapping sets of heterogeneous nodes at the same time, and (4) a real-time publish/subscribe architecture to send/receive data to/from the testbed nodes.

25 citations


Book ChapterDOI
16 Nov 2018
TL;DR: A DETERlab-based IoT botnet testbed is presented, built in a secure, contained environment, including ancillary services such as DHCP and DNS as well as botnet infrastructure including CnC and scanListen/loading servers.
Abstract: Many security issues have come to the fore with the increasingly widespread adoption of Internet-of-Things (IoT) devices. The Mirai attack on the Dyn DNS service, in which vulnerable IoT devices such as IP cameras, DVRs and routers were infected and used to propagate large-scale DDoS attacks, is one of the more prominent recent examples. IoT botnets consisting of hundreds of thousands of bots are currently present “in the wild” and are only expected to grow, with the potential to cause significant network downtime and financial losses to network companies. We therefore propose to build testbeds for evaluating IoT botnets and designing suitable mitigation techniques against them. A DETERlab-based IoT botnet testbed is presented in this work. The testbed is built in a secure contained environment and includes ancillary services such as DHCP and DNS as well as botnet infrastructure including CnC and scanListen/loading servers. Developing an IoT botnet testbed presented us with some unique challenges, different from those encountered in non-IoT botnet testbeds, and we highlight them in this paper. Further, we point out the important features of our testbed and illustrate some of its capabilities through experimental results.

8 citations


Book ChapterDOI
16 Nov 2018
TL;DR: This work proposes regression-based throughput profiles, built by aggregating measurements from sites of the infrastructure with RTT as the independent variable, and presents projection and difference operators and coefficients of throughput profiles to characterize the performance of the infrastructure and its parts, including sites and file transfer tools.
Abstract: To support increasingly distributed scientific and big-data applications, powerful data transfer infrastructures are being built with dedicated networks and software frameworks customized to distributed file systems and data transfer nodes. The data transfer performance of such infrastructures critically depends on the combined choices of file, disk, and host systems as well as network protocols and file transfer software, all of which may vary across sites. The randomness of throughput measurements makes it challenging to assess the impact of these choices on the performance of the infrastructure or its parts. We propose regression-based throughput profiles, built by aggregating measurements from sites of the infrastructure with RTT as the independent variable. The peak values and convex-concave shape of a profile together determine the overall throughput performance of memory and file transfers, and its variations show the performance differences among the sites. We then present projection and difference operators and coefficients of throughput profiles to characterize the performance of the infrastructure and its parts, including sites and file transfer tools. In particular, the utilization-concavity coefficient provides a value in the range [0, 1] that reflects overall transfer effectiveness. We present results of measurements collected using (i) testbed experiments over dedicated 0–366 ms 10 Gbps connections with combinations of TCP versions, file systems, host systems and transfer tools, and (ii) Globus GridFTP transfers over production infrastructure with varying site configurations.
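
The profile construction described above can be sketched in a few lines. This is illustrative only: the regression model (a quadratic least-squares fit), the sample RTT/throughput points, and the simple effectiveness coefficient below are assumptions, not the paper's exact formulas (its utilization-concavity coefficient is defined in the full text).

```python
import numpy as np

# Hypothetical aggregated measurements: RTT (ms) vs. achieved throughput (Gbps).
rtt_ms = np.array([1.0, 10.0, 50.0, 100.0, 200.0, 366.0])
gbps   = np.array([9.4, 9.1, 7.8, 6.0, 3.9, 2.2])

# Regression-based throughput profile with RTT as the independent variable;
# a quadratic is assumed here purely for illustration.
a, b, c = np.polyfit(rtt_ms, gbps, deg=2)
profile = lambda r: a * r ** 2 + b * r + c

# A toy effectiveness coefficient in (0, 1): mean of the fitted profile
# relative to an ideal flat profile at peak throughput.
grid = np.linspace(rtt_ms.min(), rtt_ms.max(), 200)
coeff = profile(grid).mean() / gbps.max()
```

A value of `coeff` near 1 would indicate throughput that stays close to its peak across the whole RTT range.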

5 citations


Book ChapterDOI
16 Nov 2018
TL;DR: In this article, the authors propose a formalization of the elements of the DIKW architecture and a solution framework for security concerns centering on Type transitions in the Data Graph, Information Graph and Knowledge Graph.
Abstract: The content requiring security protection has expanded to multiple sources. Protecting content from multiple sources, especially implicit content, poses new challenges to collection, identification, customization of protection strategies, modeling, etc. We are encouraged by the potential of the DIKW (Data, Information, Knowledge, Wisdom) architecture to express the semantics of natural-language content and human intention. However, the DIKW architecture itself currently lacks formalized semantics, which poses a challenge for building conceptual models on top of it. We propose a formalization of the elements of DIKW. The formalization centers on modeling Data as multi-dimensional hierarchical Types related to observable existence of Sameness, Information as identification of Data with explicit Difference, Knowledge as applying Completeness of the Type, and Wisdom as variability prediction. Based on this formalization, we propose a solution framework for security concerns centering on Type transitions in the Data Graph, Information Graph and Knowledge Graph.

5 citations


Proceedings ArticleDOI
08 Jan 2018
TL;DR: This paper studies the technical development trend of software-defined wireless networking toward the design and implementation of a smart home router based on the Intel Galileo Gen 2 programmable platform, and presents some preliminary results from the smart home network testbed.
Abstract: The emerging software-defined networking (SDN) paradigm has great potential for enabling novel networking solutions that improve the performance and management of distributed systems such as smart homes. In this paper, we study the possible technical development trend of software-defined wireless networking (SDWN) technologies toward the design and implementation of a smart home router based on the Intel Galileo Gen 2 programmable platform. We instrumented this platform by integrating various open-source software projects, such as OpenWrt, into a home router to support intelligent home wireless networking and provide low-cost connectivity solutions for the Internet-of-Things using WiFi and another cutting-edge wireless communication technology, “Bluetooth Low Energy”. We conducted a series of experiments on our router and present some preliminary results from our smart home network testbed. Our experimental study may provide empirical insights into constructing evolvable and cost-effective software-defined smart home routers with a good trade-off between performance, flexibility and cost.

3 citations


Book ChapterDOI
16 Nov 2018
TL;DR: The fundamental goal of the Extensible Components Framework is to capture the knowledge of domain experts and turn this knowledge into off-the-shelf models that end-users can easily utilize as first-class testbed objects.
Abstract: Recreating real-world network scenarios on testbeds is common in validating security solutions, but modeling networks correctly requires a good deal of expertise in multiple domains. A testbed user must understand the solution being validated and the real-world deployment environments, in addition to understanding which features in these environments matter and how to model those features correctly on a testbed. As real-world scenarios and the security solutions we design become more diverse and complex, it becomes less likely that the testbed user can be a domain expert in their technology, a field expert in its deployment environments, and an expert in how to model those environments on the testbed. Without the proper expertise from multiple domains, testbed users produce overly simplified and inappropriate test environments, which do not provide adequate validation. To address this pressing need to share domain knowledge in the testbed community, we introduce our Extensible Components Framework for testbed network modeling. Our framework enables multiple experts to contribute to a complex network model without needing to explicitly collaborate or translate between domains. The fundamental goal of our Extensible Components is to capture the knowledge of domain experts and turn this knowledge into off-the-shelf models that end-users can easily utilize as first-class testbed objects. We demonstrate the design and use of our Extensible Components Framework by implementing Click Modular Router [10] based Extensible Components on the DETER testbed, and argue that our framework can be applied to other environments. We focus on wired network models, but outline how Extensible Components can be used to model other types of networks, such as wireless. (This material is based on research sponsored by DARPA under agreement number HR0011-15-C-0096. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright notation thereon. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of DARPA or the U.S. Government.)

3 citations


Book ChapterDOI
16 Nov 2018
TL;DR: An energy-efficient computation offloading method for multimedia workflows with multi-objective optimization is proposed; a cloudlet-based offloading method using the Differential Evolution (DE) algorithm optimizes the energy consumption of mobile devices under time constraints.
Abstract: In recent years, mobile cloud computing (MCC) has been utilized to process multimedia workflows, because the limited battery capacity of mobile devices degrades the experience of multimedia applications on those devices. Cloudlet-based computation offloading has been introduced as a novel paradigm to relieve the high latency caused by offloading computation to a remote cloud. However, it remains a challenge for mobile devices to offload the computation of multimedia workflows in a cloudlet-based cloud computing environment so as to reduce energy consumption while meeting time constraints. In view of this challenge, an energy-efficient computation offloading method for multimedia workflows with multi-objective optimization is proposed in this paper. Technically, a cloudlet-based offloading method using the Differential Evolution (DE) algorithm is proposed to optimize the energy consumption of mobile devices under time constraints. Finally, extensive experimental evaluations and comparison analyses validate the efficiency of the proposed method.
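
As a rough illustration of how a DE search can choose per-task offloading decisions that minimize device energy under a deadline, here is a minimal DE/rand/1/bin sketch. The task costs, the penalty weight, and the DE parameters are all invented, and the paper's actual multi-objective formulation is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 5-task workflow: per-task (energy, time) if run locally on the
# device vs. offloaded to the cloudlet. All numbers are made up.
local_e = np.array([4.0, 3.0, 5.0, 2.0, 6.0]); local_t = np.array([2.0, 1.5, 2.5, 1.0, 3.0])
cloud_e = np.array([1.0, 0.8, 1.2, 0.6, 1.5]); cloud_t = np.array([3.0, 2.5, 3.5, 2.0, 4.0])
deadline = 14.0

def fitness(x):
    # x in [0,1]^n is thresholded to a binary offloading decision per task;
    # deadline violations are penalized so DE favors feasible plans.
    d = (x > 0.5).astype(float)
    energy = np.sum((1 - d) * local_e + d * cloud_e)
    total_t = np.sum((1 - d) * local_t + d * cloud_t)
    return energy + 100.0 * max(0.0, total_t - deadline)

# Plain DE/rand/1/bin over the relaxed decision vector.
NP, F, CR, dims = 20, 0.5, 0.9, 5
pop = rng.random((NP, dims))
fit = np.array([fitness(p) for p in pop])
for _ in range(100):
    for i in range(NP):
        a, b, c = pop[rng.choice([j for j in range(NP) if j != i], 3, replace=False)]
        trial = np.where(rng.random(dims) < CR, np.clip(a + F * (b - c), 0, 1), pop[i])
        f = fitness(trial)
        if f <= fit[i]:           # greedy selection keeps the better vector
            pop[i], fit[i] = trial, f

best = pop[fit.argmin()]
best_energy = fit.min()
```

For these toy numbers the optimum is to offload every task except the one with the smallest energy saving, keeping the total time at the deadline.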

2 citations


Proceedings ArticleDOI
08 Jan 2018
TL;DR: Virtualization is applied to optimize an existing network testbed, adopting VMNet to emulate wireless sensors in IoT experiments and improving experiment scalability to support large-scale IoT assessments.
Abstract: With the rapid development of information technology, the Internet of Things (IoT) is attracting growing attention and has promoted a new wave of information and industrial development. To reduce the CAPEX, OPEX and TIEX of IoT systems, people often evaluate IoT architectures, protocols and configurations on testbeds before deployment. As the physical resources of any testbed are limited, it is challenging to conduct large-scale IoT experiments. In this paper, we apply virtualization technology to optimize an existing network testbed, adopt VMNet to emulate wireless sensors in the IoT experiment, and improve the scalability of the experiment to support large-scale IoT assessments. Our scheme leverages a multi-host collaborative architecture with static multi-sinks to solve the bottleneck problem of large-scale IoT emulation, and multiple interpolation algorithms to supplement the temporal continuity and spatial integrity of the sensed data for enhanced fidelity of the IoT experiment. According to our experiments, the virtualized IoT testbed not only reduced the TIEX of IoT emulation sharply, but also enhanced the scalability of IoT experiments.

2 citations


Proceedings ArticleDOI
08 Jan 2018
TL;DR: A new type of drone is developed that combines flight with ground-vehicle movement at lower power consumption, extending the drone's range of mobility.
Abstract: In recent years, unmanned aerial vehicle (UAV) technologies have been developing rapidly. The drone, one type of UAV, is used in many industrial fields, such as photography, delivery and agriculture. However, a commercial drone can fly for only about 20 minutes on one charge. Furthermore, drones are prohibited from flying in restricted areas, and they cannot operate in bad weather. To advance drone technologies, we must reduce energy consumption and realize long-range movement. In order to overcome these limitations, we develop a new type of drone that combines flight with ground-vehicle movement at lower power consumption. This extends the drone's range of mobility. Moreover, it can be used to pass through restricted areas or bad weather conditions by sliding on the ground.

2 citations


Book ChapterDOI
16 Nov 2018
TL;DR: This paper investigates the impact of live service migration within a Vehicular Ad-hoc Network environment by making use of the results collected from a real experimental test-bed, and introduces a new proactive service migration model which considers both the mobility of the user and the service migration time for different services.
Abstract: Mobile edge clouds have great potential to address the challenges in vehicular networks by transferring storage and computing functions to the cloud. This brings many advantages of the cloud closer to the mobile user, by installing small cloud infrastructures at the network edge. However, it is still a challenge to efficiently utilize heterogeneous communication and edge computing architectures. In this paper, we investigate the impact of live service migration within a Vehicular Ad-hoc Network environment, making use of results collected from a real experimental test-bed. A new proactive service migration model is introduced that considers both the mobility of the user and the service migration time for different services. Results collected from a real experimental test-bed of connected vehicles show that there is a need to explore proactive service migration based on the mobility of users, which can result in better resource usage and better Quality of Service for the mobile user. Additionally, a study on the performance of the transport protocol and its impact on live service migration in highly mobile environments is presented, with results in terms of latency, bandwidth and burstiness, and their potential effect on the time it takes to migrate services.

2 citations


Book ChapterDOI
16 Nov 2018
TL;DR: A balanced cloudlet management method, named BCM, is proposed to address the above challenge and the Simple Additive Weighting (SAW) and Multiple Criteria Decision Making (MCDM) techniques are applied to optimize virtual machine scheduling strategy.
Abstract: With the rapid development of wireless communication technology, the cloudlet-based wireless metropolitan area network, which provides people with more convenient network services, has become an effective paradigm to meet the growing demand for wireless cloud computing. Currently, the energy consumption of cloudlets can be reduced by migrating tasks, but how to jointly optimize the time consumption and energy consumption during migrations is still a significant problem. In this paper, a balanced cloudlet management method, named BCM, is proposed to address this challenge. Technically, the Simple Additive Weighting (SAW) and Multiple Criteria Decision Making (MCDM) techniques are applied to optimize the virtual machine scheduling strategy. Finally, simulation results demonstrate the effectiveness of the proposed method.
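
Simple Additive Weighting itself is straightforward to sketch: normalize each cost criterion, take a weighted sum, and pick the highest-scoring placement. The candidate cloudlets, cost figures and equal weights below are made up for illustration and are not the paper's scheduling model.

```python
# Hypothetical candidate cloudlets with (migration time, energy) costs.
candidates = {"cloudlet-A": (12.0, 30.0), "cloudlet-B": (8.0, 45.0), "cloudlet-C": (15.0, 25.0)}
weights = (0.5, 0.5)  # assumed equal weighting of time and energy

def saw_score(costs, all_costs, w):
    # Cost criteria are normalized as min/value, so a lower cost yields a
    # higher (at most 1.0) per-criterion score.
    score = 0.0
    for j, c in enumerate(costs):
        best = min(other[j] for other in all_costs)
        score += w[j] * (best / c)
    return score

all_costs = list(candidates.values())
scores = {name: saw_score(c, all_costs, weights) for name, c in candidates.items()}
chosen = max(scores, key=scores.get)  # → "cloudlet-B" for these numbers
```

With equal weights, cloudlet-B wins because its large time advantage outweighs its higher energy cost; shifting the weights toward energy would favor cloudlet-C.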

Book ChapterDOI
16 Nov 2018
TL;DR: A formal approach is presented that discovers Flow-table misconfigurations using inference systems, together with an automatic method to handle the set-field action of flow entries.
Abstract: Software-Defined Networking (SDN) brings significant flexibility and visibility to networking, but at the same time creates new security challenges. SDN allows networks to keep pace with the speed of change by facilitating frequent modifications to the network configuration. However, these changes may introduce misconfigurations through inconsistent Flow-table rules. Misconfigurations can also arise between firewalls and Flow-tables in OpenFlow-based networks. Problems arising from these misconfigurations are common and have dramatic consequences for network operations. Therefore, automatic methods are needed to detect and fix these misconfigurations. Some methods have been proposed for these issues; although useful for managing Flow-table rules, they are limited by their low granularity and their lack of precise details about the analyzed flow entries. To address these challenges, we present in this paper a formal approach that discovers Flow-table misconfigurations using inference systems. The contributions of our work are the following: automatically identifying Flow-table anomalies, using the firewall to bring out real misconfigurations, and proposing an automatic method to deal with the set-field action of flow entries.
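
The inference systems themselves are not given in the abstract; as a flavor of the Flow-table anomalies they target, here is a toy check for rules fully "shadowed" by a higher-priority rule with a conflicting action, so they can never fire. The rule format and match fields are assumptions for illustration.

```python
def covers(outer, inner):
    # A match field of None is a wildcard; outer covers inner if every field
    # outer constrains is constrained identically in inner.
    return all(v is None or inner.get(f) == v for f, v in outer.items())

def shadowed_rules(table):
    # table: list of (priority, match_dict, action); higher priority wins.
    anomalies = []
    ordered = sorted(table, key=lambda r: -r[0])
    for idx, (_prio, match, action) in enumerate(ordered):
        for _hp_prio, hp_match, hp_action in ordered[:idx]:
            if covers(hp_match, match) and hp_action != action:
                anomalies.append((match, hp_match))  # (shadowed, shadowing)
    return anomalies

table = [
    (300, {"ip_src": None, "tcp_dst": 80}, "drop"),            # wildcard source
    (100, {"ip_src": "10.0.0.5", "tcp_dst": 80}, "forward"),   # can never match
    (200, {"ip_src": "10.0.0.7", "tcp_dst": 22}, "forward"),
]
found = shadowed_rules(table)
```

Here the priority-100 rule is reported: every packet it could match is already dropped by the wildcard priority-300 rule.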

Proceedings ArticleDOI
08 Jan 2018
TL;DR: The paper explores recommendation based on large-scale implicit feedback, where only positive feedback is available, and carries out empirical research on the Implicit Feedback Recommendation (IFR) model.
Abstract: In the internet age, the problem of information overload is imminent. Most current recommendation models use explicit feedback, while large amounts of implicit feedback data go unused. This paper explores recommendation based on large-scale implicit feedback, where only positive feedback is available. Further, the paper carries out empirical research on the Implicit Feedback Recommendation (IFR) model. By maximizing the probability of the user's choices, IFR casts the recommendation task as an optimization problem. The experimental results confirm the superiority of the model. However, the model has not been evaluated online and lacks implementation details.
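
The abstract does not spell out the IFR model; one standard way to "maximize the probability of the user's choices" from positive-only data is pairwise (BPR-style) learning, sketched below with made-up interaction data and hyperparameters. This is an illustrative stand-in, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
n_users, n_items, k = 4, 6, 3
# Positive-only feedback: the items each user has chosen (toy data).
positives = {0: {1, 2}, 1: {2, 3}, 2: {0, 5}, 3: {4}}

U = rng.normal(0, 0.1, (n_users, k))   # user latent factors
V = rng.normal(0, 0.1, (n_items, k))   # item latent factors
lr = 0.05
for _ in range(3000):
    u = int(rng.integers(n_users))
    i = int(rng.choice(list(positives[u])))     # an observed (positive) item
    j = int(rng.integers(n_items))
    while j in positives[u]:                    # sample an unobserved item
        j = int(rng.integers(n_items))
    x = U[u] @ (V[i] - V[j])
    g = 1.0 / (1.0 + np.exp(x))                 # gradient scale of -log(sigmoid(x))
    u_old = U[u].copy()
    U[u] += lr * g * (V[i] - V[j])              # push chosen items above unchosen
    V[i] += lr * g * u_old
    V[j] -= lr * g * u_old

score = U @ V.T                                 # higher score = more likely choice
```

After training, each user's observed items should on average outscore the items they never chose, which is exactly the ranking signal an implicit-feedback recommender needs.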

Proceedings ArticleDOI
08 Jan 2018
TL;DR: The proposed data quality assessment framework consists of feature extraction based on multi-sensor data fusion and multi-level wavelet transform, together with a semi-supervised learning based classification algorithm, which can extract feasible features and solve the unbalanced-label problem.
Abstract: Big data techniques are considered a powerful tool for exploiting the full potential of the Internet of Things and smart cities. The development of the Internet of Vehicles (IoV) and wireless communication technologies has boosted diverse applications related to smart cities and Cyber-Physical Systems, but the data quality of vehicular sensors is an important issue due to the high-speed mobile wireless communication environment and physical sensor noise. This paper presents our experiences with big data analytics on a vehicular network testbed, in terms of sensor data management, multi-dimensional data fusion and data quality assessment for vehicular sensor data. The proposed data quality assessment framework consists of feature extraction based on multi-sensor data fusion and multi-level wavelet transform, as well as a semi-supervised learning based classification algorithm. The comparison experiment shows that the proposed framework and approaches can extract feasible features and solve the unbalanced-label problem, achieving better assessment results.
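
Neither the wavelet basis nor the classifier is specified in the abstract; as an illustration of the multi-level decomposition step, here is a minimal Haar wavelet feature extractor on a toy sensor trace. The choice of Haar and of per-band detail energy as the feature is an assumption.

```python
import numpy as np

def haar_features(signal, levels=3):
    # One Haar level splits the trace into a half-length approximation
    # (pairwise means) and detail (pairwise half-differences); the energy of
    # each detail band is kept as a feature.
    s = np.asarray(signal, dtype=float)
    feats = []
    for _ in range(levels):
        approx = (s[0::2] + s[1::2]) / 2.0
        detail = (s[0::2] - s[1::2]) / 2.0
        feats.append(float(np.sum(detail ** 2)))
        s = approx
    return feats

clean = haar_features([5.0] * 8)                   # constant trace: no detail energy
noisy = haar_features([5, 9, 5, 9, 5, 9, 5, 9])    # alternating "sensor noise"
```

The alternating trace concentrates all its energy in the finest detail band, which is the kind of signature a downstream classifier could use to flag noisy sensor data.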

Book ChapterDOI
16 Nov 2018
TL;DR: Gauge density has an important influence on interpolation accuracy, with higher gauge density yielding higher accuracy, and the uncertainty becomes progressively more stable as rainfall intensity increases.
Abstract: Uncertainty analysis has attracted increasing attention, in both theory and application, over the last decades. Owing to the complexity of the urban environment, uncertainty analysis of rainfall in urban areas is rare, and the existing literature on uncertainty analysis has paid little attention to gauge density and rainfall intensity. Therefore, this study focuses on an urban area, as a good complement to uncertainty research. In this study, gauge density was investigated by carefully selecting gauges for even coverage. Rainfall intensity data were extracted from one rainfall event at the beginning, peak and ending phases of the rainfall process. Three traditional methods (Ordinary Kriging, RBF and IDW) and three machine learning methods (RF, ANN and SVM) were investigated for the uncertainty analysis. The results show that (1) gauge density has an important influence on interpolation accuracy, and higher gauge density means higher accuracy; (2) the uncertainty becomes progressively more stable as rainfall intensity increases; (3) geostatistical methods give better results than IDW and RBF owing to their consideration of spatial variability. The selected machine learning methods perform better than the traditional methods; however, their complex training process and their disregard of spatial variability may reduce their practicability in modern flood management. Therefore, combining traditional methods and machine learning will be a good paradigm for spatial interpolation and uncertainty analysis.
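
Of the traditional interpolators compared, IDW is the simplest to sketch: the estimate at an ungauged point is a distance-weighted average of nearby gauge readings. The gauge locations, rainfall depths and power parameter below are invented for illustration.

```python
import math

# Toy gauges: ((x, y) coordinates, rainfall depth in mm).
gauges = [((0.0, 0.0), 12.0), ((1.0, 0.0), 18.0), ((0.0, 1.0), 15.0), ((1.0, 1.0), 21.0)]

def idw(x, y, power=2.0):
    # Inverse-distance weighting: weight each gauge by 1/d^power, so nearer
    # gauges dominate; an exact hit on a gauge returns its reading.
    num = den = 0.0
    for (gx, gy), depth in gauges:
        d = math.hypot(x - gx, y - gy)
        if d == 0.0:
            return depth
        w = 1.0 / d ** power
        num += w * depth
        den += w
    return num / den

estimate = idw(0.5, 0.5)   # equidistant from all four gauges → their mean
```

Because IDW ignores directional (spatial) correlation, two gauges at the same distance always get the same weight — which is exactly the limitation the abstract attributes to it relative to Ordinary Kriging.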

Proceedings ArticleDOI
08 Jan 2018
TL;DR: This work proposes a novel group sequential recommender system, called “DineTogether”, for a group of people dining together; it is able to capture comprehensive contextual and social factors when making recommendations.
Abstract: Group recommendations are important in many practical scenarios; e.g., couples would like to share movies, families dine together in a restaurant, and groups of friends plan to spend vacations at points of interest. Although some previous research efforts have addressed this area, most consider only a small portion of the contextual factors and ignore the time-series features of users' historical behavior and of the next recommended location/event, which makes system performance less satisfactory than expected. Here, we propose a novel group sequential recommender system, called “DineTogether”, for a group of people dining together. It is able to capture comprehensive contextual and social factors when making recommendations. We design a computational model, called the “Social-and-Time-Aware (STA)” model, and a novel algorithm, Generalized Random Walk with Restart (GRWR). Experiment results show that our approach outperforms the state-of-the-art group recommendation approaches.
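
The GRWR generalization and its graph construction are not described in the abstract, so the sketch below only illustrates the base Random Walk with Restart step it builds on, on a toy user–restaurant graph with invented edges.

```python
import numpy as np

# Adjacency of a tiny bipartite graph: nodes 0-2 are users, 3-5 restaurants.
A = np.array([
    [0, 0, 0, 1, 1, 0],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 0],
], dtype=float)
P = A / A.sum(axis=1, keepdims=True)   # row-stochastic transition matrix

restart = np.zeros(6); restart[0] = 1.0  # the walk restarts at user 0
alpha = 0.15                              # restart probability (assumed)
r = restart.copy()
for _ in range(100):                      # power iteration to the RWR fixpoint
    r = (1 - alpha) * P.T @ r + alpha * restart

# r[3:] ranks restaurants by their proximity to user 0 in the graph.
```

Restaurants directly visited by user 0 (nodes 3 and 4) end up with higher scores than the one reachable only through other users (node 5), which is the ranking signal RWR-style recommenders exploit.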

Proceedings ArticleDOI
08 Jan 2018
TL;DR: Simulation results show that edge caching and local D2D cooperative caching can achieve a higher hit rate, effectively reduce the delay of users' access to content, and reduce 5G core network traffic.
Abstract: The ever increasing demand for content is straining operators' networks, which encourages the development of content delivery mechanisms. Mobile edge caching and device-to-device (D2D) technologies constitute a promising solution for reducing the effects of demand growth by placing popular content files in proximity to users. Mobile edge caching enables users to obtain requested content from small cells or other user equipment, rather than from the content server through the mobile core network, so as to alleviate the backhaul bottleneck in 5G networks and reduce delay. In this paper, we simulate the caching scenario in 5G networks, fused with a content-centric network, using OPNET Modeler. The obtained simulation results show that edge caching and local D2D cooperative caching can achieve a higher hit rate, effectively reduce the delay of users' access to content, and reduce 5G core network traffic.
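
A toy trace-driven sketch shows the effect the simulation measures: with Zipf-skewed content popularity, even a small edge cache absorbs a large share of requests. The catalogue size, cache size, eviction policy (LRU) and Zipf exponent are all assumptions, not the paper's OPNET model.

```python
import random
from itertools import accumulate
from collections import OrderedDict

random.seed(42)
CATALOGUE, CACHE_SIZE, REQUESTS, ZIPF_S = 1000, 100, 20000, 1.0

# Zipf-like popularity: rank r is requested with probability proportional to 1/r^s.
cum_weights = list(accumulate(1.0 / r ** ZIPF_S for r in range(1, CATALOGUE + 1)))
cache, hits = OrderedDict(), 0
for _ in range(REQUESTS):
    item = random.choices(range(CATALOGUE), cum_weights=cum_weights)[0]
    if item in cache:
        hits += 1
        cache.move_to_end(item)            # refresh recency on a hit
    else:
        cache[item] = True
        if len(cache) > CACHE_SIZE:
            cache.popitem(last=False)      # evict the least recently used item

hit_rate = hits / REQUESTS                 # requests served at the edge
```

Every hit is a request that never crosses the backhaul to the core network, which is why a well-placed edge cache both cuts delay and relieves 5G core traffic.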

Proceedings ArticleDOI
08 Jan 2018
TL;DR: This paper mainly discusses the emotion communication problem in the spiritual world and introduces Smart Clothing, a comfortable and intelligent wearable device used to collect the raw data closest to the human body, together with emotion-care robots.
Abstract: The functions of multi-networked machines far exceed those of independent, isolated ones. The machines upload the human-related and environmental signals they have collected to the cloud through the Internet; the cloud offers powerful intelligent processing and sends the processed data back to another machine (i.e. a robot), and the robot serves the people, thus forming a circuit. If the robot is able to understand people's emotions with the help of emotional big data analysis, emotion communication becomes possible. Briefly, our core concept is to go from people to Smart Clothing, from Smart Clothing to the cloud, from the cloud to the robot, and finally from the robot back to people. The important elements of this system are therefore Smart Clothing, a Health Big Data Cloud and robots. The Internet of Things interconnects things with things, realizing communication in the physical world; this paper mainly discusses the emotion communication problem in the spiritual world. To address this challenging problem, we first take the application of big data in the healthcare field as an example and put forward four strategies and steps to solve the problem through big data. Then we introduce Smart Clothing, a comfortable and intelligent wearable device used to collect the raw data closest to the human body. Finally, we introduce the emotion-care robots.

1 The Internet of Things: Communications among Things

Our world is interconnected with countless things through the Internet and mobile phones, so the demand for seamless integration between the physical world and the information world keeps growing. Meanwhile, we can predict that the interconnection of human society with the physical and spiritual worlds will become the development trend of future networks [1]. When we talk about emotion communication, we first need to understand how connected machines communicate with each other [2, 3, 4]. The communication and interconnection among large numbers of connected machines is also called the Internet of Things, which encompasses technologies such as wireless sensor networks, body area networks, pervasive computing, M2M and cloud computing [5, 6, 7]. However, the Internet of Things is not a static frame or concept; it has evolved into various kinds of advanced technology, such as 5G networks and big data [8, 9]. Many people divide the architecture of the Internet of Things into four layers, namely the sensing layer, network layer, analysis layer and application layer. We think there are different interactions among these layers. For example, the interaction between the analysis layer and the sensing layer can realize energy-consumption optimization of the sensor network based on cloud integration; that is to say, the wisdom gained from the cloud makes the wireless sensor network more intelligent. Sometimes, if users interact with the lower layers, they get user-centered data analysis, a personalized sensing Internet of Things, etc. [10]. Generally, the Internet of Things includes four logic chains: different sensors, equipment and machines perceive the data; the data in the sensors is collected to the cloud through various kinds of wireless networks and heterogeneous network technology; and finally the data is analyzed in the cloud [11, 12]. Through big data analysis, we can learn knowledge and find optimized solutions to change the environment or predict what will happen. The whole procedure involves machines, people, cloud computing, big data, and various kinds of learning and analysis technology. Based on the framework of the Internet of Things, we can dig into deeper-level problems: how can we design plenty of powerful sensors on a single piece of equipment, how can we monitor any system anytime and anywhere, and how can we find valuable information and make predictions in real time? To realize this dream, we need the cloud, big data, cloud analysis, machine learning, inference and control.

2 Emotion Communications in the Spiritual World

Emotion communication means that a system is able to collect humans' emotion data, understand humans' emotions, care for people's feelings and interact effectively with the users. Emotion care is a specific field in the healthcare domain, which faces many challenging problems. The emotion communication system put forward in this paper involves medical big data, Smart Clothing and emotion-care robot technology. The Smart Clothing performs the front-end raw data collection and uploads the data to the data center through mobile phones for processing, and then the robots realize the action feedback.

(TRIDENTCOM 2017, September 28-29, Dalian, People's Republic of China. Copyright © 2018 EAI. DOI 10.4108/eai.28-9-2017.2273366. Jeungeun Song, et al.)

Proceedings ArticleDOI
08 Jan 2018
TL;DR: The experimental results show that the proposed algorithm can efficiently improve recommendation quality; a limitation is that different users' forgetting patterns are not the same.
Abstract: With the rapid development of the Internet, recommender systems have emerged as a personalized information filtering technology to solve the problem of information overload. Collaborative Filtering is the most successful and widely used such technology to date. However, users' interests are not static; they vary over time, as does the social network. This paper selects a single-forgetting-function algorithm as the basis for improvement; a limitation of this algorithm is that different users' forgetting patterns are not the same. In addition, the selection of experimental data is not targeted and cannot fully account for the characteristics of the social platform across all users. The experimental results show that the proposed algorithm can efficiently improve recommendation quality.
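
The abstract does not give the single forgetting function it builds on; a common choice in time-aware collaborative filtering is an exponential (Ebbinghaus-style) decay of a rating's weight with its age, sketched here. The half-life is an assumed parameter, not the paper's.

```python
import math

HALF_LIFE_DAYS = 30.0  # assumption: a rating loses half its weight in 30 days

def forgetting_weight(age_days):
    # Exponential forgetting curve: weight 1.0 for a fresh rating, 0.5 after
    # one half-life, 0.25 after two, and so on.
    return math.exp(-math.log(2) * age_days / HALF_LIFE_DAYS)

def decayed_profile(ratings):
    # ratings: list of (value, age_days); a time-weighted average, so recent
    # interests dominate the user profile used for similarity computation.
    num = sum(v * forgetting_weight(a) for v, a in ratings)
    den = sum(forgetting_weight(a) for _, a in ratings)
    return num / den

# A recent 5-star rating outweighs an old 1-star one.
profile = decayed_profile([(5.0, 1.0), (1.0, 120.0)])
```

Using one global half-life for everyone is exactly the limitation the paper points out: real users forget at different rates.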

Proceedings ArticleDOI
08 Jan 2018
TL;DR: An opportunistic computation offloading (OPPOCO) scheme is proposed to enable a more energy-efficient and intelligent strategy for computation offloading, improving mobile users' Quality of Service (QoS) and Quality of Experience (QoE).
Abstract: Nowadays, advanced mobile devices provide considerable computation capacity. However, due to the intrinsic limitation of their resources, mobile devices have to offload computation to Mobile Cloud Computing (MCC) and ad-hoc cloudlets to improve performance and prolong battery life. An efficient model for ad-hoc cloudlet-assisted computation offloading nevertheless remains an open issue. In this article, we provide an overview of existing computation offloading modes, e.g. the remote cloud service mode and the ad-hoc cloudlet-assisted service mode, and propose an opportunistic computation offloading (OPPOCO) scheme to enable a more energy-efficient and intelligent strategy for computation offloading. Moreover, simulation with OPNET verifies that our proposal is feasible and practical for improving mobile users' Quality of Service (QoS) and Quality of Experience (QoE).
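
The OPPOCO strategy itself is not detailed in the abstract; the basic trade-off it exploits can be sketched as a per-job energy comparison, with all energy coefficients invented for illustration (this is a generic offloading rule, not the paper's algorithm).

```python
def should_offload(cycles, data_bits, e_per_cycle=1e-9, e_per_bit=5e-8,
                   cloudlet_available=True):
    # Offload opportunistically: only when a cloudlet is reachable AND the
    # radio energy to ship the job's input undercuts computing it locally.
    e_local = cycles * e_per_cycle       # energy to compute on the device
    e_offload = data_bits * e_per_bit    # energy to transmit the input
    return cloudlet_available and e_offload < e_local

# A compute-heavy job with a small input favours the cloudlet...
heavy = should_offload(cycles=5e9, data_bits=1e6)
# ...while a data-heavy but light job stays on the device.
light = should_offload(cycles=1e7, data_bits=1e7)
```

In an ad-hoc setting `cloudlet_available` fluctuates with user encounters, which is what makes the offloading "opportunistic" rather than a fixed assignment.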

Proceedings ArticleDOI
08 Jan 2018
TL;DR: A Spark platform based transcoding system is proposed, along with an RDD programming framework based distributed transcoding scheme, to conduct fast conversion of video files so as to adapt them to HTML5 video tags.
Abstract: HTML5-based videos play an important role in promoting communication on national culture with the rapid development of the mobile internet. However, considering that HTML5-based videos support only the Theora, H.264 and MPEG4 video coding formats, while existing videos on national culture come in various formats, fast conversion of video files is needed to adapt them to HTML5 video tags. Therefore, a Spark platform based transcoding system is proposed in this article. HDFS is adopted for storage, and the RDD (Resilient Distributed Dataset) abstraction of Spark together with FFMPEG is utilized for distributed transcoding. We discuss in detail the segmentation strategy for the distributed storage of videos and compare the MapReduce approach with the RDD approach. In addition, we propose an RDD programming framework based distributed transcoding scheme. In comparisons of transcoding time between the MapReduce framework and the Spark framework, with the same file block size and cluster, the Spark framework reduces transcoding time by 25% relative to MapReduce transcoding.