
Showing papers on "Scalability published in 2017"


Journal ArticleDOI
TL;DR: A five-video playlist demonstrates proof-of-concept implementations for three tasks: assembling 2D Lego models, freehand sketching, and playing Ping-Pong.
Abstract: Industry investment and research interest in edge computing, in which computing and storage nodes are placed at the Internet's edge in close proximity to mobile devices or sensors, have grown dramatically in recent years. This emerging technology promises to deliver highly responsive cloud services for mobile computing, scalability and privacy-policy enforcement for the Internet of Things, and the ability to mask transient cloud outages. The web extra at www.youtube.com/playlist?list=PLmrZVvFtthdP3fwHPy_4d61oDvQY_RBgS includes a five-video playlist demonstrating proof-of-concept implementations for three tasks: assembling 2D Lego models, freehand sketching, and playing Ping-Pong.

1,690 citations


Posted Content
TL;DR: This paper proposes to search for an architectural building block on a small dataset and then transfer the block to a larger dataset and introduces a new regularization technique called ScheduledDropPath that significantly improves generalization in the NASNet models.
Abstract: Developing neural network image classification models often requires significant architecture engineering. In this paper, we study a method to learn the model architectures directly on the dataset of interest. As this approach is expensive when the dataset is large, we propose to search for an architectural building block on a small dataset and then transfer the block to a larger dataset. The key contribution of this work is the design of a new search space (the "NASNet search space") which enables transferability. In our experiments, we search for the best convolutional layer (or "cell") on the CIFAR-10 dataset and then apply this cell to the ImageNet dataset by stacking together more copies of this cell, each with their own parameters to design a convolutional architecture, named "NASNet architecture". We also introduce a new regularization technique called ScheduledDropPath that significantly improves generalization in the NASNet models. On CIFAR-10 itself, NASNet achieves 2.4% error rate, which is state-of-the-art. On ImageNet, NASNet achieves, among the published works, state-of-the-art accuracy of 82.7% top-1 and 96.2% top-5 on ImageNet. Our model is 1.2% better in top-1 accuracy than the best human-invented architectures while having 9 billion fewer FLOPS - a reduction of 28% in computational demand from the previous state-of-the-art model. When evaluated at different levels of computational cost, accuracies of NASNets exceed those of the state-of-the-art human-designed models. For instance, a small version of NASNet also achieves 74% top-1 accuracy, which is 3.1% better than equivalently-sized, state-of-the-art models for mobile platforms. Finally, the learned features by NASNet used with the Faster-RCNN framework surpass state-of-the-art by 4.0% achieving 43.1% mAP on the COCO dataset.
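The transfer recipe described in the abstract, searching for a cell once and then stacking more copies of it (each with its own parameters) for a larger dataset, can be illustrated with a toy PyTorch sketch. The `TinyCell` module below is a hypothetical placeholder, not the actual NASNet cell.

```python
# Minimal sketch of the "stack a searched cell" idea (hypothetical cell module,
# not the actual NASNet cell): each repeat gets its own parameters.
import torch
import torch.nn as nn

class TinyCell(nn.Module):
    """Placeholder for a searched convolutional cell (illustrative only)."""
    def __init__(self, channels):
        super().__init__()
        self.op = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return x + self.op(x)  # residual combination, as a stand-in

def stack_cells(channels=32, repeats=6):
    # Stacking more copies of the same cell architecture, with separate
    # parameters per copy, is how the searched block is transferred from
    # CIFAR-10 to ImageNet in the paper.
    return nn.Sequential(*[TinyCell(channels) for _ in range(repeats)])

model = stack_cells()
out = model(torch.randn(1, 32, 32, 32))
print(out.shape)  # torch.Size([1, 32, 32, 32])
```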

1,342 citations


Journal ArticleDOI
TL;DR: In this paper, the authors propose a simulator, called iFogSim, to model IoT and fog environments and measure the impact of resource management techniques in latency, network congestion, energy consumption, and cost.
Abstract: Internet of Things (IoT) aims to bring every object (e.g., smart cameras, wearables, environmental sensors, home appliances, and vehicles) online, hence generating a massive volume of data that can overwhelm storage systems and data analytics applications. Cloud computing offers services at the infrastructure level that can scale to IoT storage and processing requirements. However, there are applications such as health monitoring and emergency response that require low latency, and the delay caused by transferring data to the cloud and back to the application can seriously impact their performance. To overcome this limitation, the Fog computing paradigm has been proposed, where cloud services are extended to the edge of the network to decrease latency and network congestion. To realize the full potential of Fog and IoT paradigms for real-time analytics, several challenges need to be addressed. The first and most critical problem is designing resource management techniques that determine which modules of analytics applications are pushed to each edge device to minimize latency and maximize throughput. To this end, we need an evaluation platform that enables the quantification of performance of resource management policies on an IoT or Fog computing infrastructure in a repeatable manner. In this paper we propose a simulator, called iFogSim, to model IoT and Fog environments and measure the impact of resource management techniques on latency, network congestion, energy consumption, and cost. We describe two case studies to demonstrate modeling of an IoT environment and comparison of resource management policies. Moreover, the scalability of the simulation toolkit in terms of RAM consumption and execution time is verified under different circumstances.

1,085 citations


Journal ArticleDOI
TL;DR: In this paper, the covariance function is expressed as a mixture of complex exponentials, without requiring evenly spaced observations or uniform noise, which can be used for probabilistic inference of stellar rotation periods, asteroseismic oscillation spectra and transiting planet parameters.
Abstract: The growing field of large-scale time domain astronomy requires methods for probabilistic data analysis that are computationally tractable, even with large data sets. Gaussian processes (GPs) are a popular class of models used for this purpose, but since the computational cost scales, in general, as the cube of the number of data points, their application has been limited to small data sets. In this paper, we present a novel method for GP modeling in one dimension where the computational requirements scale linearly with the size of the data set. We demonstrate the method by applying it to simulated and real astronomical time series data sets. These demonstrations are examples of probabilistic inference of stellar rotation periods, asteroseismic oscillation spectra, and transiting planet parameters. The method exploits structure in the problem when the covariance function is expressed as a mixture of complex exponentials, without requiring evenly spaced observations or uniform noise. This form of covariance arises naturally when the process is a mixture of stochastically driven damped harmonic oscillators, providing a physical motivation for and interpretation of this choice, but we also demonstrate that it can be a useful effective model in some other cases. We present a mathematical description of the method and compare it to existing scalable GP methods. The method is fast and interpretable, with a range of potential applications within astronomical data analysis and beyond. We provide well-tested and documented open-source implementations of this method in C++, Python, and Julia.
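A minimal usage sketch of this approach, assuming the authors' open-source Python implementation (the celerite package, one of the implementations mentioned in the abstract); the data and parameter values are illustrative.

```python
import numpy as np
import celerite
from celerite import terms

# Toy, unevenly sampled time series (illustrative only)
t = np.sort(np.random.uniform(0, 10, 200))
yerr = 0.1 * np.ones_like(t)
y = np.sin(t) + yerr * np.random.randn(len(t))

# A single stochastically driven, damped harmonic oscillator term,
# i.e. one component of the "mixture of complex exponentials" covariance.
kernel = terms.SHOTerm(log_S0=0.0, log_Q=np.log(2.0), log_omega0=np.log(3.0))
gp = celerite.GP(kernel, mean=0.0)
gp.compute(t, yerr)           # factorization that scales linearly with N
print(gp.log_likelihood(y))   # scalable marginal likelihood evaluation
```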

611 citations


Journal ArticleDOI
TL;DR: An optimization model is proposed to characterize the behavior of one type of FDI attack that compromises the limited number of state measurements of the power system for electricity theft and achieves high accuracy.
Abstract: Application of computing and communications intelligence effectively improves the quality of monitoring and control of smart grids. However, the dependence on information technology also increases vulnerability to malicious attacks. False data injection (FDI), an attack on the integrity of data, is emerging as a severe threat to the supervisory control and data acquisition system. In this paper, we exploit deep learning techniques to recognize the behavior features of FDI attacks from historical measurement data and employ the captured features to detect FDI attacks in real time. By doing so, our proposed detection mechanism effectively relaxes the assumptions on the potential attack scenarios and achieves high accuracy. Furthermore, we propose an optimization model to characterize the behavior of one type of FDI attack that compromises a limited number of state measurements of the power system for electricity theft. We illustrate the performance of the proposed strategy through simulations using the IEEE 118-bus test system. We also evaluate the scalability of our proposed detection mechanism using the IEEE 300-bus test system.
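As an illustration of the detection idea (a learned classifier over historical measurement vectors labeled normal vs. injected), here is a hedged scikit-learn sketch on synthetic data; it is not the authors' exact deep architecture, and the measurement dimension and injected bias are assumptions.

```python
# Generic neural classifier over measurement vectors (synthetic data).
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_meas = 180                       # assumed measurement vector length
X_normal = rng.normal(0.0, 1.0, size=(5000, n_meas))
X_attack = X_normal[:2000] + rng.normal(0.5, 0.2, size=(2000, n_meas))  # injected bias
X = np.vstack([X_normal, X_attack])
y = np.hstack([np.zeros(len(X_normal)), np.ones(len(X_attack))])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=50).fit(X_tr, y_tr)
print("detection accuracy:", clf.score(X_te, y_te))
```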

574 citations


Posted Content
TL;DR: In this article, the authors present new and efficient protocols for privacy preserving machine learning for linear regression, logistic regression and neural network training using the stochastic gradient descent method, where data owners distribute their private data among two non-colluding servers who train various models on the joint data using secure two-party computation.
Abstract: Machine learning is widely used in practice to produce predictive models for applications such as image processing, speech and text recognition. These models are more accurate when trained on large amount of data collected from different sources. However, the massive data collection raises privacy concerns. In this paper, we present new and efficient protocols for privacy preserving machine learning for linear regression, logistic regression and neural network training using the stochastic gradient descent method. Our protocols fall in the two-server model where data owners distribute their private data among two non-colluding servers who train various models on the joint data using secure two-party computation (2PC). We develop new techniques to support secure arithmetic operations on shared decimal numbers, and propose MPC-friendly alternatives to non-linear functions such as sigmoid and softmax that are superior to prior work. We implement our system in C++. Our experiments validate that our protocols are several orders of magnitude faster than the state of the art implementations for privacy preserving linear and logistic regressions, and scale to millions of data samples with thousands of features. We also implement the first privacy preserving system for training neural networks.
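A toy sketch of the additive secret sharing of fixed-point ("decimal") values that underlies the two-server model; the ring size and precision below are assumed for illustration, and the paper's actual protocols additionally provide secure multiplication, truncation tricks, and MPC-friendly activations on top of this.

```python
# Additive secret sharing of fixed-point values between two non-colluding servers.
import secrets

MOD = 2 ** 64          # ring size (assumed for illustration)
FRAC_BITS = 13         # fixed-point precision (assumed)

def encode(x: float) -> int:
    return int(round(x * (1 << FRAC_BITS))) % MOD

def decode(v: int) -> float:
    if v >= MOD // 2:  # interpret as signed
        v -= MOD
    return v / (1 << FRAC_BITS)

def share(x: float):
    """Split x into two additive shares; each share alone reveals nothing."""
    s0 = secrets.randbelow(MOD)
    s1 = (encode(x) - s0) % MOD
    return s0, s1

def reconstruct(s0: int, s1: int) -> float:
    return decode((s0 + s1) % MOD)

a0, a1 = share(3.25)
b0, b1 = share(-1.5)
# Addition of shared values is purely local on each server.
print(reconstruct((a0 + b0) % MOD, (a1 + b1) % MOD))  # ~1.75
```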

568 citations


Journal ArticleDOI
Orestis Georgiou1, Usman Raza1
TL;DR: In this paper, the authors provide a stochastic geometry framework for modeling the performance of a single gateway LoRa network, a leading LPWA technology, and show that the coverage probability drops exponentially as the number of end-devices grows due to interfering signals using the same spreading sequence.
Abstract: Low power wide area (LPWA) networks are making spectacular progress from design, standardization, to commercialization. At this time of fast-paced adoption, it is of utmost importance to analyze how well these technologies will scale as the number of devices connected to the Internet of Things inevitably grows. In this letter, we provide a stochastic geometry framework for modeling the performance of a single gateway LoRa network, a leading LPWA technology. Our analysis formulates the unique peculiarities of LoRa, including its chirp spread-spectrum modulation technique, regulatory limitations on radio duty cycle, and use of ALOHA protocol on top, all of which are not as common in today’s commercial cellular networks. We show that the coverage probability drops exponentially as the number of end-devices grows due to interfering signals using the same spreading sequence. We conclude that this fundamental limiting factor is perhaps more significant toward LoRa scalability than for instance spectrum restrictions. Our derivations for co-spreading factor interference found in LoRa networks enable rigorous scalability analysis of such networks.

562 citations


Proceedings ArticleDOI
05 Jun 2017
TL;DR: In this paper, the authors proposed distributed deep neural networks (DDNNs) over distributed computing hierarchies, consisting of the cloud, the edge (fog) and end devices.
Abstract: We propose distributed deep neural networks (DDNNs) over distributed computing hierarchies, consisting of the cloud, the edge (fog) and end devices. While being able to accommodate inference of a deep neural network (DNN) in the cloud, a DDNN also allows fast and localized inference using shallow portions of the neural network at the edge and end devices. When supported by a scalable distributed computing hierarchy, a DDNN can scale up in neural network size and scale out in geographical span. Due to its distributed nature, DDNNs enhance sensor fusion, system fault tolerance and data privacy for DNN applications. In implementing a DDNN, we map sections of a DNN onto a distributed computing hierarchy. By jointly training these sections, we minimize communication and resource usage for devices and maximize usefulness of extracted features which are utilized in the cloud. The resulting system has built-in support for automatic sensor fusion and fault tolerance. As a proof of concept, we show a DDNN can exploit geographical diversity of sensors to improve object recognition accuracy and reduce communication cost. In our experiment, compared with the traditional method of offloading raw sensor data to be processed in the cloud, DDNN locally processes most sensor data on end devices while achieving high accuracy and is able to reduce the communication cost by a factor of over 20x.
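A minimal PyTorch sketch of the early-exit behavior described above: a shallow edge portion classifies locally and only forwards compact features to the cloud portion when local confidence is low. The layer sizes and the max-softmax threshold are stand-ins, not the paper's exact exit criterion.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

edge = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64), nn.ReLU())
edge_exit = nn.Linear(64, 10)                 # local classifier at the edge
cloud = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 10))

def infer(x, threshold=0.9):
    feat = edge(x)
    local_probs = F.softmax(edge_exit(feat), dim=1)
    conf, pred = local_probs.max(dim=1)
    if conf.item() >= threshold:
        return pred.item(), "exited at edge"
    # otherwise offload the compact intermediate features, not the raw input
    return cloud(feat).argmax(dim=1).item(), "resolved in cloud"

print(infer(torch.randn(1, 1, 28, 28)))
```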

489 citations


Journal ArticleDOI
TL;DR: In this paper, the authors argue for network slicing as an efficient solution that addresses the diverse requirements of 5G mobile networks, thus providing the necessary flexibility and scalability associated with future network implementations.
Abstract: We argue for network slicing as an efficient solution that addresses the diverse requirements of 5G mobile networks, thus providing the necessary flexibility and scalability associated with future network implementations. We elaborate on the challenges that emerge when designing 5G networks based on network slicing. We focus on the architectural aspects associated with the coexistence of dedicated as well as shared slices in the network. In particular, we analyze the realization options of a flexible radio access network with focus on network slicing and their impact on the design of 5G mobile networks. In addition to the technical study, this article provides an investigation of the revenue potential of network slicing, where the applications that originate from this concept and the profit capabilities from the network operator's perspective are put forward.

457 citations


Journal ArticleDOI
TL;DR: This survey paper comprehensively surveys and summarizes the characterizations and taxonomy of state-of-the-art studies on SDN control plane scalability, and outlines the potential challenges and open problems that need to be addressed for more scalable SDN control planes.

438 citations


Journal ArticleDOI
TL;DR: This paper presents a digital twin architecture reference model for the cloud-based CPS, C2PS, where the model helps in identifying various degrees of basic and hybrid computation-interaction modes in this paradigm.
Abstract: Cyber-physical system (CPS) is a new trend in Internet-of-Things related research, where physical systems act as sensors to collect real-world information and communicate it to the computation modules (i.e. the cyber layer), which further analyze and notify the findings to the corresponding physical systems through a feedback loop. Contemporary researchers recommend integrating cloud technologies in the CPS cyber layer to ensure the scalability of storage, computation, and cross-domain communication capabilities. Though there exist a few descriptive models of the cloud-based CPS architecture, it is important to analytically describe the key CPS properties: computation, control, and communication. In this paper, we present a digital twin architecture reference model for the cloud-based CPS, C2PS, where we analytically describe the key properties of the C2PS. The model helps in identifying various degrees of basic and hybrid computation-interaction modes in this paradigm. We have designed a C2PS smart interaction controller using a Bayesian belief network, so that the system dynamically considers current contexts. The composition of a fuzzy rule base with the Bayes network further enables the system with reconfiguration capability. We also describe analytically how C2PS subsystem communications can generate even more complex system-of-systems. Later, we present a telematics-based prototype driving assistance application for the vehicular domain of C2PS, VCPS, to demonstrate the efficacy of the architecture reference model.

Posted Content
TL;DR: This work argues for distributing RL components in a composable way by adapting algorithms for top-down hierarchical control, thereby encapsulating parallelism and resource requirements within short-running compute tasks, through RLlib: a library that provides scalable software primitives for RL.
Abstract: Reinforcement learning (RL) algorithms involve the deep nesting of highly irregular computation patterns, each of which typically exhibits opportunities for distributed computation. We argue for distributing RL components in a composable way by adapting algorithms for top-down hierarchical control, thereby encapsulating parallelism and resource requirements within short-running compute tasks. We demonstrate the benefits of this principle through RLlib: a library that provides scalable software primitives for RL. These primitives enable a broad range of algorithms to be implemented with high performance, scalability, and substantial code reuse. RLlib is available at this https URL.
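A hedged usage sketch of RLlib's high-level trainer API; import paths and configuration keys vary across Ray releases, and this follows the older 1.x-style "agents" API rather than any specific version described in the paper.

```python
import ray
from ray.rllib.agents.ppo import PPOTrainer

ray.init()
# Environment name and worker count are illustrative choices.
trainer = PPOTrainer(config={"env": "CartPole-v0", "num_workers": 2})
for _ in range(3):
    result = trainer.train()          # one distributed training iteration
    print(result["episode_reward_mean"])
ray.shutdown()
```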

Journal ArticleDOI
TL;DR: A new IoT architecture is proposed to store and process scalable sensor data (big data) for health care applications, using a MapReduce-based prediction model to predict heart disease.

Book ChapterDOI
06 Jan 2017
TL;DR: Decentralized data fusion (DDF) is one of the most important areas of research in the field of control and estimation; as discussed in this chapter, it provides a degree of scalability and robustness that cannot be achieved using traditional centralized architectures.
Abstract: Decentralized (or distributed) data fusion (DDF) is one of the most important areas of research in the field of control and estimation. The motivation for decentralization is that it provides a degree of scalability and robustness that cannot be achieved using traditional centralized architectures. In industrial applications, decentralization offers the possibility of producing plug-and-play systems in which sensors of different types and capabilities can be slotted in and out, optimizing trade-offs such as price, power consumption, and performance. This has significant implications for military systems as well because it can dramatically reduce the time required to incorporate new computational and sensing components into fighter aircraft, ships, and other types of platforms.

Journal ArticleDOI
TL;DR: This paper proposes a heuristic offloading decision algorithm (HODA), which is semidistributed and jointly optimizes the offload decision, and communication and computation resources to maximize system utility, a measure of quality of experience based on task completion time and energy consumption of a mobile device.
Abstract: Proximate cloud computing enables computationally intensive applications on mobile devices, providing a rich user experience. However, remote resource bottlenecks limit the scalability of offloading, requiring optimization of the offloading decision and resource utilization. To this end, in this paper, we leverage the variability in capabilities of mobile devices and user preferences. Our system utility metric is a measure of quality of experience (QoE) based on task completion time and energy consumption of a mobile device. We propose a heuristic offloading decision algorithm (HODA), which is semidistributed and jointly optimizes the offloading decision, and communication and computation resources to maximize system utility. Our main contribution is to reduce the problem to a submodular maximization problem and prove its NP-hardness by decomposing it into two subproblems: 1) optimization of communication and computation resources solved by quasiconvex and convex optimization and 2) offloading decision solved by submodular set function optimization. HODA reduces the complexity of finding the local optimum to $O(K^{3})$ , where $K$ is the number of mobile users. Simulation results show that HODA performs within 5% of the optimal on average. Compared with other solutions, HODA's performance is significantly superior as the number of users increases.

Journal ArticleDOI
TL;DR: The results of the evaluation show that DistBlockNet is capable of detecting attacks in the IoT network in real time with low performance overheads and satisfying the design principles required for the future IoT network.
Abstract: The rapid increase in the number and diversity of smart devices connected to the Internet has raised the issues of flexibility, efficiency, availability, security, and scalability within the current IoT network. These issues are caused by key mechanisms being distributed to the IoT network on a large scale, which is why a distributed secure SDN architecture for IoT using the blockchain technique (DistBlockNet) is proposed in this research. It follows the principles required for designing a secure, scalable, and efficient network architecture. The DistBlockNet model of IoT architecture combines the advantages of two emerging technologies: SDN and blockchain technology. Blockchains allow us to have a distributed peer-to-peer network in which members that do not trust each other can interact in a verifiable manner without a trusted intermediary. A new scheme for updating a flow rule table using a blockchain technique is proposed to securely verify a version of the flow rule table, validate the flow rule table, and download the latest flow rule table for the IoT forwarding devices. In our proposed architecture, security must automatically adapt to the threat landscape, without requiring an administrator to review and apply thousands of recommendations and opinions manually. We have evaluated the performance of our proposed model architecture and compared it to the existing model with respect to various metrics. The results of our evaluation show that DistBlockNet is capable of detecting attacks in the IoT network in real time with low performance overheads and satisfying the design principles required for the future IoT network.

Journal ArticleDOI
TL;DR: This article first proposes a transparent computing based IoT architecture, and clearly identifies its advantages and associated challenges, and presents a case study to clearly show how to build scalable lightweight wearables with the proposed architecture.
Abstract: By moving service provisioning from the cloud to the edge, edge computing becomes a promising solution in the era of IoT to meet the delay requirements of IoT applications, enhance the scalability and energy efficiency of lightweight IoT devices, provide contextual information processing, and mitigate the traffic burdens of the backbone network. However, as an emerging field of study, edge computing is still in its infancy and faces many challenges in its implementation and standardization. In this article, we study an implementation of edge computing, which exploits transparent computing to build scalable IoT platforms. Specifically, we first propose a transparent computing based IoT architecture, and clearly identify its advantages and associated challenges. Then, we present a case study to clearly show how to build scalable lightweight wearables with the proposed architecture. Some future directions are finally pointed out to foster continued research efforts.

Journal ArticleDOI
TL;DR: The results show that the newly designed single‐thread‐based TENG, with the advantage of interactive, responsive, sewable, and conformal features, can meet application needs of a vast variety of fields, ranging from wearable and stretchable energy harvesters to smart cloth‐based articles.
Abstract: The development of wearable and large-area fabric energy harvesters and sensors has received great attention due to their promising applications in next-generation autonomous and wearable healthcare technologies. Here, a new type of “single” thread-based triboelectric nanogenerator (TENG) and its uses in elastically textile-based energy harvesting and sensing have been demonstrated. The energy-harvesting thread, composed of one silicone-rubber-coated stainless-steel thread, can extract energy during contact with skin. By sewing the energy-harvesting thread into a serpentine shape on an elastic textile, a highly stretchable and scalable TENG textile is realized to scavenge various kinds of human-motion energy. The collected energy is capable of sustainably powering a commercial smart watch. Moreover, the simplified single triboelectric thread can be applied in a wide range of thread-based self-powered and active sensing uses, including gesture sensing, human-interactive interfaces, and human physiological signal monitoring. After integration with microcontrollers, more complicated systems, such as wireless wearable keyboards and smart beds, are demonstrated. These results show that the newly designed single-thread-based TENG, with the advantage of interactive, responsive, sewable, and conformal features, can meet the application needs of a vast variety of fields, ranging from wearable and stretchable energy harvesters to smart cloth-based articles.

Journal ArticleDOI
TL;DR: In this paper, a Parallel Random Forest (PRF) algorithm for big data on the Apache Spark platform is presented. And the PRF algorithm is optimized based on a hybrid approach combining dataparallel and task-parallel optimization, and a dual parallel approach is carried out in the training process of RF and a task Directed Acyclic Graph (DAG) is created according to the parallel training process.
Abstract: With the emergence of the big data age, the issue of how to obtain valuable knowledge from a dataset efficiently and accurately has attracted increasing attention from both academia and industry. This paper presents a Parallel Random Forest (PRF) algorithm for big data on the Apache Spark platform. The PRF algorithm is optimized based on a hybrid approach combining data-parallel and task-parallel optimization. From the perspective of data-parallel optimization, a vertical data-partitioning method is performed to reduce the data communication cost effectively, and a data-multiplexing method is performed to allow the training dataset to be reused and diminish the volume of data. From the perspective of task-parallel optimization, a dual parallel approach is carried out in the training process of RF, and a task Directed Acyclic Graph (DAG) is created according to the parallel training process of PRF and the dependence of the Resilient Distributed Datasets (RDD) objects. Then, different task schedulers are invoked for the tasks in the DAG. Moreover, to improve the algorithm's accuracy for large, high-dimensional, and noisy data, we perform a dimension-reduction approach in the training process and a weighted voting approach in the prediction process prior to parallelization. Extensive experimental results indicate the superiority and notable advantages of the PRF algorithm over the relevant algorithms implemented by Spark MLlib and other studies in terms of classification accuracy, performance, and scalability. With the expansion of the scale of the random forest model and the Spark cluster, the advantage of the PRF algorithm is more obvious.
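For context, the Spark MLlib random forest that PRF is benchmarked against can be invoked roughly as below (PySpark ML API); the input path and column names are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.ml.classification import RandomForestClassifier
from pyspark.ml.feature import VectorAssembler

spark = SparkSession.builder.appName("rf-baseline").getOrCreate()
df = spark.read.csv("data.csv", header=True, inferSchema=True)  # hypothetical input

# Assemble all non-label columns into a single feature vector.
features = [c for c in df.columns if c != "label"]
assembled = VectorAssembler(inputCols=features, outputCol="features").transform(df)

rf = RandomForestClassifier(labelCol="label", featuresCol="features", numTrees=100)
model = rf.fit(assembled)            # training is parallelized across the cluster
print(model.featureImportances)
spark.stop()
```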

Proceedings ArticleDOI
Wenyan Lu1, Guihai Yan1, Jiajun Li1, Shijun Gong1, Yinhe Han1, Xiaowei Li1 
01 Feb 2017
TL;DR: This paper proposes a flexible dataflow architecture (FlexFlow) that can leverage the complementary effects among feature map, neuron, and synapse parallelism to mitigate the mismatch between the parallel types supported by the computing engine and the dominant parallel types of CNN workloads.
Abstract: Convolutional Neural Networks (CNNs) are very computation-intensive. Recently, many CNN accelerators based on the intrinsic parallelism of CNNs have been proposed. However, we observed that there is a big mismatch between the parallel types supported by the computing engine and the dominant parallel types of CNN workloads. This mismatch seriously degrades resource utilization of existing accelerators. In this paper, we propose a flexible dataflow architecture (FlexFlow) that can leverage the complementary effects among feature map, neuron, and synapse parallelism to mitigate the mismatch. We evaluated our design with six typical practical workloads; it achieves a 2-10x performance speedup and a 2.5-10x power efficiency improvement compared with three state-of-the-art accelerator architectures. Meanwhile, FlexFlow is highly scalable with growing computing engine scale.

Journal ArticleDOI
TL;DR: An SDN-enabled network architecture assisted by MEC, which integrates different types of access technologies, is proposed, which can decrease data transmission time and enhance quality of user experience in latency-sensitive applications.
Abstract: Connected vehicles provide advanced transformations and attractive business opportunities in the automotive industry. Presently, IEEE 802.11p and evolving 5G are the mainstream radio access technologies in the vehicular industry, but neither of them can meet all requirements of vehicle communication. In order to provide low-latency and high-reliability communication, an SDN-enabled network architecture assisted by MEC, which integrates different types of access technologies, is proposed. MEC technology with its on-premises feature can decrease data transmission time and enhance quality of user experience in latency-sensitive applications. Therefore, MEC plays as important a role in the proposed architecture as SDN technology. The proposed architecture was validated by a practical use case, and the obtained results have shown that it meets application- specific requirements and maintains good scalability and responsiveness.

Journal ArticleDOI
TL;DR: In this article, a LoRa error model is constructed from extensive complex baseband bit error rate simulations and used as an interference model in an ns-3 module that enables the study of multichannel, multi-spreading-factor, multi-gateway, bi-directional LoRaWAN networks with thousands of end devices.
Abstract: As LoRaWAN networks are actively being deployed in the field, it is important to comprehend the limitations of this low power wide area network technology. Previous work has raised questions in terms of the scalability and capacity of LoRaWAN networks as the number of end devices grows to hundreds or thousands per gateway. Some works have modeled LoRaWAN networks as pure ALOHA networks, which fails to capture important characteristics such as the capture effect and the effects of interference. Other works provide a more comprehensive model by relying on empirical and stochastic techniques. This paper uses a different approach where a LoRa error model is constructed from extensive complex baseband bit error rate simulations and used as an interference model. The error model is combined with the LoRaWAN MAC protocol in an ns-3 module that enables the study of multichannel, multi-spreading-factor, multi-gateway, bi-directional LoRaWAN networks with thousands of end devices. Using the LoRaWAN ns-3 module, a scalability analysis of LoRaWAN shows the detrimental impact downstream traffic has on the delivery ratio of confirmed upstream traffic. The analysis shows that increasing gateway density can ameliorate but not eliminate this effect, as stringent duty cycle requirements for gateways continue to limit downstream opportunities.

Journal ArticleDOI
23 May 2017-Sensors
TL;DR: This study investigates the scalability in terms of the number of end devices per gateway of single-gateway LoRaWAN deployments, and determines the intra-technology interference behavior with two physical end nodes, by checking the impact of an interfering node on a transmitting node.
Abstract: LoRa is a long-range, low power, low bit rate, and single-hop wireless communication technology. It is intended to be used in Internet of Things (IoT) applications involving battery-powered devices with low throughput requirements. A LoRaWAN network consists of multiple end nodes that communicate with one or more gateways. These gateways act like a transparent bridge towards a common network server. The number of end devices and their throughput requirements will have an impact on the performance of the LoRaWAN network. This study investigates the scalability in terms of the number of end devices per gateway of single-gateway LoRaWAN deployments. First, we determine the intra-technology interference behavior with two physical end nodes, by checking the impact of an interfering node on a transmitting node. Measurements show that even under concurrent transmission, one of the packets can be received under certain conditions. Based on these measurements, we create a simulation model for assessing the scalability of a single gateway LoRaWAN network. We show that when the number of nodes increases up to 1000 per gateway, the losses will be up to 32%. In such a case, pure ALOHA will have around 90% losses. However, when the duty cycle of the application layer becomes lower than the allowed radio duty cycle of 1%, losses will be even lower. We also show network scalability simulation results for some IoT use cases based on real data.
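The pure-ALOHA comparison quoted above follows from the textbook success probability exp(-2G) for normalized offered load G; the sketch below reproduces losses of that order, with the airtime and reporting interval chosen as assumed, illustrative values.

```python
# Back-of-the-envelope pure-ALOHA loss estimate for periodic senders.
import math

def pure_aloha_loss(n_nodes, airtime_s=1.0, interval_s=900.0):
    """Fraction of packets lost under pure ALOHA for n_nodes periodic senders."""
    g = n_nodes * airtime_s / interval_s      # normalized offered load
    return 1.0 - math.exp(-2.0 * g)

for n in (100, 500, 1000):
    print(n, f"{pure_aloha_loss(n):.0%}")
# With these assumed values (~1 s airtime every 15 min), 1000 nodes gives
# losses on the order of 90%, the magnitude cited in the abstract.
```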

Journal ArticleDOI
TL;DR: This paper proposes a cloud-supported cyber–physical localization system for patient monitoring using smartphones to acquire voice and electroencephalogram signals in a scalable, real-time, and efficient manner and uses Gaussian mixture modeling for localization to outperform other similar methods in terms of error estimation.
Abstract: The potential of cloud-supported cyber–physical systems (CCPSs) has drawn a great deal of interest from academia and industry. CCPSs facilitate the seamless integration of devices in the physical world (e.g., sensors, cameras, microphones, speakers, and GPS devices) with cyberspace. This enables a range of emerging applications or systems such as patient or health monitoring, which require patient locations to be tracked. These systems integrate a large number of physical devices such as sensors with localization technologies (e.g., GPS and wireless local area networks) to generate, sense, analyze, and share huge quantities of medical and user-location data for complex processing. However, there are a number of challenges regarding these systems in terms of the positioning of patients, ubiquitous access, large-scale computation, and communication. Hence, there is a need for an infrastructure or system that can provide scalability and ubiquity in terms of huge real-time data processing and communications in the cyber or cloud space. To this end, this paper proposes a cloud-supported cyber–physical localization system for patient monitoring using smartphones to acquire voice and electroencephalogram signals in a scalable, real-time, and efficient manner. The proposed approach uses Gaussian mixture modeling for localization and is shown to outperform other similar methods in terms of error estimation.
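A hedged sketch of the Gaussian-mixture-based localization step using scikit-learn: fit one GMM per location and classify new samples by maximum log-likelihood. The features and data here are placeholders, not the paper's voice/EEG-derived features.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
# Synthetic 4-dimensional feature vectors per location (placeholders).
locations = {"room_a": rng.normal(0, 1, (300, 4)),
             "room_b": rng.normal(3, 1, (300, 4))}

# One Gaussian mixture model per location.
models = {loc: GaussianMixture(n_components=2, random_state=0).fit(X)
          for loc, X in locations.items()}

def localize(sample):
    # Pick the location whose model assigns the highest log-likelihood.
    scores = {loc: gmm.score(sample.reshape(1, -1)) for loc, gmm in models.items()}
    return max(scores, key=scores.get)

print(localize(rng.normal(3, 1, 4)))   # expected: "room_b"
```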

Journal ArticleDOI
01 Aug 2017
TL;DR: The experiments show that Samza handles state efficiently, improving latency and throughput by more than 100X compared to using a remote storage; provides recovery time independent of state size; scales performance linearly with number of containers; and supports reprocessing of the data stream quickly and with minimal interference on real-time traffic.
Abstract: Distributed stream processing systems need to support stateful processing, recover quickly from failures to resume such processing, and reprocess an entire data stream quickly. We present Apache Samza, a distributed system for stateful and fault-tolerant stream processing. Samza utilizes a partitioned local state along with a low-overhead background changelog mechanism, allowing it to scale to massive state sizes (hundreds of TB) per application. Recovery from failures is sped up by re-scheduling based on Host Affinity. In addition to processing infinite streams of events, Samza supports processing a finite dataset as a stream, from either a streaming source (e.g., Kafka), a database snapshot (e.g., Databus), or a file system (e.g., HDFS), without having to change the application code (unlike the popular Lambda-based architectures which necessitate maintenance of separate code bases for batch and stream path processing). Samza is currently in use at LinkedIn by hundreds of production applications with more than 10,000 containers. Samza is an open-source Apache project adopted by many top-tier companies (e.g., LinkedIn, Uber, Netflix, TripAdvisor). Our experiments show that Samza: a) handles state efficiently, improving latency and throughput by more than 100X compared to using a remote storage; b) provides recovery time independent of state size; c) scales performance linearly with number of containers; and d) supports reprocessing of the data stream quickly and with minimal interference on real-time traffic.

Journal ArticleDOI
TL;DR: This paper employs the Gale–Shapley algorithm to match D2D pairs with cellular UEs, which is proved to be stable and weak Pareto optimal, and extends the algorithm to address scalability issues in large-scale networks by developing tie-breaking and preference-deletion-based matching rules.
Abstract: Energy-efficiency (EE) is critical for device-to-device (D2D) enabled cellular networks due to limited battery capacity and severe cochannel interference. In this paper, we address the EE optimization problem by adopting a stable matching approach. The NP-hard joint resource allocation problem is formulated as a one-to-one matching problem under two-sided preferences, which vary dynamically with channel states and interference levels. A game-theoretic approach is employed to analyze the interactions and correlations among user equipments (UEs), and an iterative power allocation algorithm is developed to establish mutual preferences based on nonlinear fractional programing. We then employ the Gale–Shapley algorithm to match D2D pairs with cellular UEs, which is proved to be stable and weak Pareto optimal. We provide a theoretical analysis and description for implementation details and algorithmic complexity. We also extend the algorithm to address scalability issues in large-scale networks by developing tie-breaking and preference-deletion-based matching rules. Simulation results validate the theoretical analysis and demonstrate that significant performance gains of average EE and matching satisfactions can be achieved by the proposed algorithm.
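For intuition, a textbook Gale–Shapley (deferred acceptance) sketch matching D2D pairs to cellular UEs; the preference lists here are hypothetical, whereas in the paper they are built by the iterative power allocation step based on nonlinear fractional programming.

```python
def gale_shapley(proposer_prefs, reviewer_prefs):
    """One-to-one stable matching: proposers (D2D pairs) propose to reviewers (UEs)."""
    rank = {r: {p: i for i, p in enumerate(prefs)} for r, prefs in reviewer_prefs.items()}
    free = list(proposer_prefs)
    next_choice = {p: 0 for p in proposer_prefs}
    match = {}                                   # reviewer -> proposer
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        if r not in match:
            match[r] = p
        elif rank[r][p] < rank[r][match[r]]:     # r prefers the new proposer
            free.append(match[r])
            match[r] = p
        else:
            free.append(p)                       # rejected, will propose again
    return match

# Hypothetical preference lists for two D2D pairs and two cellular UEs.
d2d_prefs = {"d1": ["c1", "c2"], "d2": ["c1", "c2"]}
cell_prefs = {"c1": ["d2", "d1"], "c2": ["d1", "d2"]}
print(gale_shapley(d2d_prefs, cell_prefs))       # {'c1': 'd2', 'c2': 'd1'}
```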

Proceedings ArticleDOI
24 Jun 2017
TL;DR: SCALEDEEP is a dense, scalable server architecture, whose processing, memory and interconnect subsystems are specialized to leverage the compute and communication characteristics of DNNs, and primarily targets DNN training, as opposed to only inference or evaluation.
Abstract: Deep Neural Networks (DNNs) have demonstrated state-of-the-art performance on a broad range of tasks involving natural language, speech, image, and video processing, and are deployed in many real world applications. However, DNNs impose significant computational challenges owing to the complexity of the networks and the amount of data they process, both of which are projected to grow in the future. To improve the efficiency of DNNs, we propose ScaleDeep, a dense, scalable server architecture, whose processing, memory and interconnect subsystems are specialized to leverage the compute and communication characteristics of DNNs. While several DNN accelerator designs have been proposed in recent years, the key difference is that ScaleDeep primarily targets DNN training, as opposed to only inference or evaluation. The key architectural features from which ScaleDeep derives its efficiency are: (i) heterogeneous processing tiles and chips to match the wide diversity in computational characteristics (FLOPs and Bytes/FLOP ratio) that manifest at different levels of granularity in DNNs, (ii) a memory hierarchy and 3-tiered interconnect topology that is suited to the memory access and communication patterns in DNNs, (iii) a low-overhead synchronization mechanism based on hardware data-flow trackers, and (iv) methods to map DNNs to the proposed architecture that minimize data movement and improve core utilization through nested pipelining. We have developed a compiler to allow any DNN topology to be programmed onto ScaleDeep, and a detailed architectural simulator to estimate performance and energy. The simulator incorporates timing and power models of ScaleDeep's components based on synthesis to Intel's 14nm technology. We evaluate an embodiment of ScaleDeep with 7032 processing tiles that operates at 600 MHz and has a peak performance of 680 TFLOPs (single precision) and 1.35 PFLOPs (half-precision) at 1.4KW. Across 11 state-of-the-art DNNs containing 0.65M-14.9M neurons and 6.8M-145.9M weights, including winners from 5 years of the ImageNet competition, ScaleDeep demonstrates 6x-28x speedup at iso-power over the state-of-the-art performance on GPUs.

Journal ArticleDOI
TL;DR: A regional cooperative fog-computing-based intelligent vehicular network (CFC-IoV) architecture for dealing with big IoV data in the smart city is proposed, including mobility control, multi-source data acquisition, distributed computation and storage, and multi-path data transmission.
Abstract: As vehicle applications, mobile devices, and the Internet of Things are growing fast, developing an efficient architecture to deal with the big data in the Internet of Vehicles (IoV) has become an important concern for the future smart city. To overcome the inherent defect of centralized data processing in cloud computing, fog computing has been proposed by offloading computation tasks to local fog servers (LFSs). By considering factors like latency, mobility, localization, and scalability, this article proposes a regional cooperative fog-computing-based intelligent vehicular network (CFC-IoV) architecture for dealing with big IoV data in the smart city. Possible services for IoV applications are discussed, including mobility control, multi-source data acquisition, distributed computation and storage, and multi-path data transmission. A hierarchical model with intra-fog and inter-fog resource management is presented, and the energy efficiency and packet dropping rates of LFSs in CFC-IoV are optimized.

Journal ArticleDOI
TL;DR: Results show that as the number of applications demanding real-time service increases, the MIST fog-based scheme outperforms traditional cloud computing.

Journal ArticleDOI
TL;DR: The proposed framework provides a more secure and controlled way to share knowledge and services, thereby supporting companies in developing scalable and flexible businesses at a lower cost, and ultimately improving the overall quality, efficiency, and effectiveness of manufacturing services.
Abstract: Purpose The purpose of this paper is to propose a cross-enterprise framework to achieve a higher level of sharing of knowledge and services in manufacturing ecosystems. Design/methodology/approach The authors describe the development of the emerging open manufacturing and discuss the model of knowledge creation processes of manufacturers. The authors present a decentralized framework based on blockchain and edge computing technologies, which consists of a customer layer, an enterprise layer, an application layer, an intelligence layer, a data layer, and an infrastructure layer. A case study is provided to illustrate the effectiveness of the framework. Findings The authors discuss that the manufacturing ecosystem is changing from integrated and centralized systems to shared and distributed systems. The proposed framework incorporates recent developments in blockchain and edge computing that can meet the security and distribution requirements for the sharing of knowledge and services in manufacturing ecosystems. Practical implications The proposed framework provides a more secure and controlled way to share knowledge and services, thereby supporting companies in developing scalable and flexible businesses at a lower cost, and ultimately improving the overall quality, efficiency, and effectiveness of manufacturing services. Originality/value The proposed framework incorporates recent developments in edge computing technologies to achieve a flexible and distributed network. With blockchain technology, it provides standards and protocols for implementing the framework and addresses security issues. Not only can information be shared, but the framework also supports the exchange of knowledge and services so that all parties can contribute their parts.