
Showing papers by "Albert Y. Zomaya published in 2016"


Journal ArticleDOI
TL;DR: Challenges and issues faced in virtualization of CPU, memory, I/O, interrupt, and network interfaces are highlighted and various performance parameters are presented in a detailed comparative analysis to quantify the efficiency of mobile virtualization techniques and solutions.
Abstract: Recent growth in the processing and memory resources of mobile devices has fueled research within the field of mobile virtualization. Mobile virtualization enables multiple personas on a single mobile device by hosting heterogeneous operating systems (OSs) concurrently. However, adding a virtualization layer to resource-constrained mobile devices with real-time requirements can lead to intolerable performance overheads. Hardware virtualization extensions that support efficient virtualization have been incorporated in recent mobile processors. Prior to hardware virtualization extensions, virtualization techniques that are enabled by performance-prohibitive and resource-consuming software were adopted for mobile devices. Moreover, mobile virtualization solutions lack standard procedures for device component sharing and interfacing between multiple OSs. The objective of this article is to survey software- and hardware-based mobile virtualization techniques in light of the recent advancements fueled by the hardware support for mobile virtualization. Challenges and issues faced in virtualization of CPU, memory, I/O, interrupt, and network interfaces are highlighted. Moreover, various performance parameters are presented in a detailed comparative analysis to quantify the efficiency of mobile virtualization techniques and solutions.

407 citations


Journal ArticleDOI
TL;DR: The main aim of this paper is to identify open challenges associated with energy-efficient resource allocation; the study outlines the problem and the existing hardware- and software-based techniques available for this purpose, and summarizes them according to the energy-efficient research dimension taxonomy.
Abstract: In a cloud computing paradigm, energy-efficient allocation of different virtualized ICT resources (servers, storage disks, networks, and the like) is a complex problem due to the presence of heterogeneous application workloads (e.g., content delivery networks, MapReduce, web applications, and the like) having contending allocation requirements in terms of ICT resource capacities (e.g., network bandwidth, processing speed, response time, etc.). Several recent papers have tried to address the issue of improving energy efficiency in allocating cloud resources to applications with varying degrees of success. However, to the best of our knowledge there is no published literature on this subject that clearly articulates the research problem and provides a research taxonomy for succinct classification of existing techniques. Hence, the main aim of this paper is to identify open challenges associated with energy-efficient resource allocation. In this regard, the study first outlines the problem and existing hardware- and software-based techniques available for this purpose. Furthermore, available techniques already presented in the literature are summarized based on the energy-efficient research dimension taxonomy. The advantages and disadvantages of the existing techniques are comprehensively analyzed against the proposed research dimension taxonomy, namely: resource adaption policy, objective function, allocation method, allocation operation, and interoperability.

303 citations


Journal ArticleDOI
TL;DR: A hierarchical clustering method is proposed that clusters the documents based on a minimum relevance threshold and then partitions the resulting clusters into sub-clusters until the constraint on the maximum cluster size is met; the method has an advantage over the traditional method in the rank privacy and relevance of retrieved documents.
Abstract: Cloud data owners prefer to outsource documents in an encrypted form for the purpose of privacy preserving. Therefore it is essential to develop efficient and reliable ciphertext search techniques. One challenge is that the relationship between documents will normally be concealed in the process of encryption, which will lead to significant degradation of search accuracy. Also, the volume of data in data centers has experienced a dramatic growth. This makes it even more challenging to design ciphertext search schemes that can provide efficient and reliable online information retrieval over large volumes of encrypted data. In this paper, a hierarchical clustering method is proposed to support more search semantics and also to meet the demand for fast ciphertext search within a big data environment. The proposed hierarchical approach clusters the documents based on the minimum relevance threshold, and then partitions the resulting clusters into sub-clusters until the constraint on the maximum cluster size is reached. In the search phase, this approach can reach a linear computational complexity against an exponential size increase of the document collection. In order to verify the authenticity of search results, a structure called minimum hash sub-tree is designed in this paper. Experiments have been conducted using the collection set built from IEEE Xplore. The results show that with a sharp increase of documents in the dataset the search time of the proposed method increases linearly whereas the search time of the traditional method increases exponentially. Furthermore, the proposed method has an advantage over the traditional method in the rank privacy and relevance of retrieved documents.
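
The splitting rule described in this abstract (group documents by a minimum relevance threshold, then keep partitioning any cluster that exceeds a maximum size) can be pictured with a short sketch. This is only a toy reading under stated assumptions: the relevance function, the seed-based grouping, and the halving split below are illustrative placeholders, not the paper's construction.

```python
from typing import Callable, List

def hierarchical_cluster(docs: List[str],
                         relevance: Callable[[str, str], float],
                         min_relevance: float,
                         max_cluster_size: int) -> List[List[str]]:
    """Toy sketch: group documents whose relevance to a cluster seed exceeds
    min_relevance, then recursively split clusters that exceed the size limit."""
    clusters: List[List[str]] = []
    for doc in docs:
        for cluster in clusters:
            if relevance(doc, cluster[0]) >= min_relevance:
                cluster.append(doc)
                break
        else:
            clusters.append([doc])          # no sufficiently relevant cluster: new seed

    result: List[List[str]] = []
    for cluster in clusters:
        if len(cluster) <= max_cluster_size or len(cluster) == 1:
            result.append(cluster)
        else:
            mid = len(cluster) // 2         # placeholder split; the paper refines by relevance
            result.extend(hierarchical_cluster(cluster[:mid], relevance,
                                               min_relevance, max_cluster_size))
            result.extend(hierarchical_cluster(cluster[mid:], relevance,
                                               min_relevance, max_cluster_size))
    return result
```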

133 citations


Journal ArticleDOI
01 Mar 2016
TL;DR: A new communication-aware model of cloud computing applications, called CA-DAG, based on Directed Acyclic Graphs that in addition to computing vertices include separate vertices to represent communications, which allows making separate resource allocation decisions.
Abstract: This paper addresses performance issues of resource allocation in cloud computing. We review requirements of different cloud applications and identify the need to consider communication processes explicitly and equally to the computing tasks. Following this observation, we propose a new communication-aware model of cloud computing applications, called CA-DAG. This model is based on Directed Acyclic Graphs that, in addition to computing vertices, include separate vertices to represent communications. Such a representation allows making separate resource allocation decisions: assigning processors to handle computing jobs, and network resources for information transmissions. The proposed CA-DAG model creates space for optimization of a number of existing solutions to resource allocation and for developing novel scheduling schemes of improved efficiency.
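
A minimal way to picture the CA-DAG representation is a directed acyclic graph whose vertices are tagged as either computation or communication, so that a scheduler can assign processors to the former and network resources to the latter. The data structure and example weights below are illustrative assumptions, not the authors' implementation.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Vertex:
    name: str
    kind: str        # "compute" or "comm" -- the CA-DAG distinction
    weight: float    # e.g., instructions for compute vertices, bytes for comm vertices

@dataclass
class CADAG:
    vertices: Dict[str, Vertex] = field(default_factory=dict)
    edges: Dict[str, List[str]] = field(default_factory=dict)   # precedence constraints

    def add_vertex(self, v: Vertex) -> None:
        self.vertices[v.name] = v
        self.edges.setdefault(v.name, [])

    def add_edge(self, src: str, dst: str) -> None:
        self.edges[src].append(dst)

# Example: task A sends its output to task B over the network.
dag = CADAG()
dag.add_vertex(Vertex("A", "compute", weight=2e9))
dag.add_vertex(Vertex("A->B", "comm", weight=5e6))
dag.add_vertex(Vertex("B", "compute", weight=1e9))
dag.add_edge("A", "A->B")
dag.add_edge("A->B", "B")
```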

83 citations


Journal ArticleDOI
TL;DR: An innovative intrusion detection approach to detect SCADA tailored attacks is presented based on a data-driven clustering technique of process parameters, which automatically identifies the normal and critical states of a given system and extracts proximity-based detection rules from the identified states for monitoring purposes.
Abstract: Supervisory control and data acquisition (SCADA) systems have become a salient part in controlling critical infrastructures, such as power plants, energy grids, and water distribution systems. In the past decades, these systems were isolated and used proprietary software, operating systems, and protocols. In recent years, SCADA systems have been interfaced with enterprise systems, which has therefore exposed them to the vulnerabilities of the Internet and to security threats. Traditional security solutions (e.g., firewalls, antivirus software, and intrusion detection systems) cannot fully protect SCADA systems, because they have different requirements. This paper presents an innovative intrusion detection approach to detect SCADA-tailored attacks. It is based on a data-driven clustering technique of process parameters, which automatically identifies the normal and critical states of a given system. Later, it extracts proximity-based detection rules from the identified states for monitoring purposes. The effectiveness of the proposed approach is tested by conducting experiments on eight data sets that consist of process parameters’ values. The empirical results demonstrated an average accuracy of 98% in automatically identifying the critical states, while facilitating the monitoring of the SCADA system.
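
One way to read the proximity-based detection rules described above is: cluster historical process-parameter vectors from normal operation, keep a (centroid, radius) rule per cluster, and flag any observation that falls outside every rule. The choices below (plain k-means, Euclidean distance, a margin factor) are assumptions made for illustration, not the paper's exact technique.

```python
import numpy as np

def learn_rules(normal_states, n_clusters: int = 4, margin: float = 1.1):
    """Cluster normal process-parameter vectors with plain k-means and keep
    one (centroid, radius) proximity rule per cluster."""
    X = np.asarray(normal_states, dtype=float)
    rng = np.random.default_rng(0)
    centroids = X[rng.choice(len(X), n_clusters, replace=False)].copy()
    for _ in range(20):                                    # fixed-iteration k-means
        labels = np.argmin(np.linalg.norm(X[:, None] - centroids, axis=2), axis=1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centroids[k] = X[labels == k].mean(axis=0)
    labels = np.argmin(np.linalg.norm(X[:, None] - centroids, axis=2), axis=1)
    radii = np.array([
        margin * np.linalg.norm(X[labels == k] - centroids[k], axis=1).max()
        if np.any(labels == k) else 0.0
        for k in range(n_clusters)
    ])
    return centroids, radii

def is_critical(sample, centroids, radii) -> bool:
    """A sample is flagged as critical if it lies outside every proximity rule."""
    dists = np.linalg.norm(centroids - np.asarray(sample, dtype=float), axis=1)
    return bool(np.all(dists > radii))
```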

77 citations


Journal ArticleDOI
TL;DR: GA-ETI is shown to reduce the makespan of executing workflows by between 11% and 85% when compared to three up-to-date scheduling algorithms, without increasing the monetary cost.

75 citations


Journal ArticleDOI
TL;DR: This paper considers the problem from a game-theoretic perspective and formulates it as a non-cooperative game among multiple cloud users, in which each cloud user has incomplete information about the other users; for each user, a utility function is designed that combines net profit with time efficiency, and each user tries to maximize its value.
Abstract: In this paper, we focus on price bidding strategies of multiple users competing for resource usage in cloud computing. We consider the problem from a game-theoretic perspective and formulate it as a non-cooperative game among the multiple cloud users, in which each cloud user is informed with incomplete information about other users. For each user, we design a utility function which combines the net profit with time efficiency and try to maximize its value. We design a mechanism for the multiple users to evaluate their utilities and decide whether to use the cloud service. Furthermore, we propose a framework for each cloud user to compute an appropriate bidding price. At the beginning, by relaxing the problem so that the allocated number of servers can be fractional, we prove the existence of a Nash equilibrium solution set for the formulated game. Then, we propose an iterative algorithm (IA), which is designed to compute a Nash equilibrium solution. The convergence of the proposed algorithm is also analyzed and we find that it converges to a Nash equilibrium if several conditions are satisfied. Finally, we revise the obtained solution and propose a near-equilibrium price bidding algorithm (NPBA) to characterize the whole process of our proposed framework. The experimental results show that the obtained near-equilibrium solution is close to the equilibrium one.
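
A generic picture of such an iterative scheme is round-robin best response: each user, holding the others' bids fixed, picks the bid that maximizes its own utility, and the iteration stops when no bid changes. The function below is only that generic skeleton; the utility function, candidate bid grid, and stopping tolerance are assumptions, not the paper's IA.

```python
from typing import Callable, List

def iterative_best_response(n_users: int,
                            utility: Callable[[int, float, List[float]], float],
                            candidate_bids: List[float],
                            init_bid: float,
                            max_rounds: int = 100,
                            tol: float = 1e-6) -> List[float]:
    """Round-robin best response: utility(i, b, bids) is user i's utility when it
    bids b and everyone else keeps the bids currently stored in `bids`."""
    bids = [init_bid] * n_users
    for _ in range(max_rounds):
        max_change = 0.0
        for i in range(n_users):
            best = max(candidate_bids, key=lambda b: utility(i, b, bids))
            max_change = max(max_change, abs(best - bids[i]))
            bids[i] = best
        if max_change < tol:
            break          # no user wants to deviate: an approximate Nash equilibrium
    return bids
```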

75 citations


Journal ArticleDOI
TL;DR: The CLF security requirements, vulnerability points, and challenges are identified to tolerate different cloud log susceptibilities, and challenges and future directions are introduced to highlight open research areas of CLF, motivating investigators, academicians, and researchers to investigate them.
Abstract: Cloud log forensics (CLF) mitigates the investigation process by identifying the malicious behavior of attackers through profound cloud log analysis. However, the accessibility attributes of cloud logs obstruct accomplishment of the goal to investigate cloud logs for various susceptibilities. Accessibility involves the issues of cloud log access, selection of proper cloud log file, cloud log data integrity, and trustworthiness of cloud logs. Therefore, forensic investigators of cloud log files are dependent on cloud service providers (CSPs) to get access to different cloud logs. Accessing cloud logs from outside the cloud without depending on the CSP is a challenging research area, whereas the increase in cloud attacks has increased the need for CLF to investigate the malicious activities of attackers. This paper reviews the state of the art of CLF and highlights different challenges and issues involved in investigating cloud log data. The logging mode, the importance of CLF, and cloud log-as-a-service are introduced. Moreover, case studies related to CLF are explained to highlight the practical implementation of cloud log investigation for analyzing malicious behaviors. The CLF security requirements, vulnerability points, and challenges are identified to tolerate different cloud log susceptibilities. We identify and introduce challenges and future directions to highlight open research areas of CLF for motivating investigators, academicians, and researchers to investigate them.

67 citations


Journal ArticleDOI
TL;DR: This paper formulates a minimum weighted flow provisioning (MWFP) problem with an objective of minimizing the total cost of TCAM occupation and remote packet processing, and proposes two online algorithms with guaranteed competitive ratios.
Abstract: Software-defined networking (SDN) is an emerging network paradigm that simplifies network management by decoupling the control plane and data plane, such that switches become simple data forwarding devices and network management is controlled by logically centralized servers. In SDN-enabled networks, network flow is managed by a set of associated rules that are maintained by switches in their local Ternary Content Addressable Memories (TCAMs), which support high-speed parallel lookup on wildcard patterns. Since TCAM is expensive hardware and extremely power-hungry, each switch has only limited TCAM space and it is inefficient and even infeasible to maintain all rules at local switches. On the other hand, if we eliminate TCAM occupation by forwarding all packets to the centralized controller for processing, it results in a long delay and heavy processing burden on the controller. In this paper, we strive for the fine balance between rule caching and remote packet processing by formulating a minimum weighted flow provisioning (MWFP) problem with an objective of minimizing the total cost of TCAM occupation and remote packet processing. We propose an efficient offline algorithm if the network traffic is given; otherwise, we propose two online algorithms with guaranteed competitive ratios. Finally, we conduct extensive experiments by simulations using real network traffic traces. The simulation results demonstrate that our proposed algorithms can significantly reduce the total cost of remote controller processing and TCAM occupation, and the solutions obtained are nearly optimal.
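
The per-flow trade-off at the heart of the MWFP formulation (occupy a TCAM entry at a fixed cost, or keep forwarding the flow's packets to the controller at a per-packet cost) can be pictured with a toy break-even rule in the spirit of ski-rental online decisions. This is purely illustrative and is not the paper's offline or competitive online algorithms.

```python
def should_cache_rule(packets_seen: int,
                      tcam_cost_per_slot: float,
                      remote_cost_per_packet: float) -> bool:
    """Toy break-even rule: install the flow's rule in TCAM once the remote
    processing cost already paid matches the TCAM occupation cost."""
    return packets_seen * remote_cost_per_packet >= tcam_cost_per_slot

# Example: with TCAM cost 10 and remote cost 0.5 per packet, cache after 20 packets.
print(should_cache_rule(20, tcam_cost_per_slot=10.0, remote_cost_per_packet=0.5))
```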

57 citations


Journal ArticleDOI
TL;DR: The authors provide a comprehensive solution with a cryptographic role-based technique to distribute session keys to establish communications and information retrieval using the Kerberos protocol; location- and biometrics-based authentication to authorize users; and a wavelet-based steganographic technique to embed EHR data securely using ECG signals as the host in a trusted cloud storage.
Abstract: Cloud-based electronic health record (EHR) systems are next-generation big data systems for facilitating efficient and scalable storage and fostering collaborative care, clinical research, and development. Mobility and the use of multiple mobile devices in collaborative healthcare increases the need for robust privacy preservation. Thus, large-scale EHR systems require secure access to privacy-sensitive data, data storage, and management. The authors provide a comprehensive solution with a cryptographic role-based technique to distribute session keys to establish communications and information retrieval using the Kerberos protocol; location- and biometrics-based authentication to authorize users; and a wavelet-based steganographic technique to embed EHR data securely using electrocardiography (ECG) signals as the host in a trusted cloud storage. A comprehensive security analysis demonstrates that the model is scalable, secure, and reliable for accessing and managing EHR data.

51 citations


Journal ArticleDOI
TL;DR: A systematic literature survey on Shared Sensor Networks to provide the reader with the opportunity to understand what has been done and what remains as open issues in this field, as well as which are the pivotal factors of this evolutionary design and how this kind of design can be exploited by a wide range of WSN applications.
Abstract: While Wireless Sensor Networks (WSNs) have been traditionally tasked with single applications, in recent years we have witnessed the emergence of Shared Sensor Networks (SSNs) as integrated cyber-physical system infrastructures for a multitude of applications. Instead of assuming an application-specific network design, SSNs allow the underlying infrastructure to be shared among multiple applications that can potentially belong to different users. On one hand, a potential benefit of such a design approach is to increase the utilization of sensing and communication resources, whenever the underlying network infrastructure covers the same geographic area and the sensor nodes monitor the same physical variables of common interest for different applications. On the other hand, compared with the existing application-specific design, the SSNs approach poses several research challenges with regard to different aspects of WSNs. In this article, we present a systematic literature survey on SSNs. The main goal of the article is to provide the reader with the opportunity to understand what has been done and what remains as open issues in this field, as well as which are the pivotal factors of this evolutionary design and how this kind of design can be exploited by a wide range of WSN applications.

Proceedings ArticleDOI
01 Oct 2016
TL;DR: An online algorithm, unit-slot optimization, based on the technique of Lyapunov optimization is developed; it provides a quantified near-optimal solution and can adjust the tradeoff between average response time and average cost online.
Abstract: The steep rise of Internet of Things (IoT) applications, along with the limitations of Cloud Computing in addressing all IoT requirements, promotes a new distributed computing paradigm called Fog Computing, which aims to process data at the edge of the network. With the help of Fog Computing, the transmission latency and monetary spending caused by Cloud Computing can be effectively reduced. However, executing all applications in fog nodes will increase the average response time since the processing capabilities of fog nodes are not as powerful as those of the cloud. A tradeoff therefore needs to be addressed within such systems in terms of average response time and average cost. In this paper, we develop an online algorithm, unit-slot optimization, based on the technique of Lyapunov optimization. It provides a quantified near-optimal solution and can adjust the tradeoff between average response time and average cost online. We evaluate the performance of our proposed algorithm by a number of experiments. The experimental results not only match the theoretical analyses but also demonstrate that our proposed algorithm can provide cost-effective processing while guaranteeing average response time.
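
Lyapunov-based unit-slot schemes typically make a drift-plus-penalty decision every time slot: a virtual queue tracks the backlog of the constrained quantity, and a parameter V trades the time-average cost off against that backlog. The sketch below shows that generic structure only; the action set, cost, service, and arrival models are assumptions, not the paper's formulation.

```python
from typing import Callable, List, Tuple

def unit_slot_decision(queue: float,
                       actions: List[str],
                       cost: Callable[[str], float],
                       service: Callable[[str], float],
                       arrivals: float,
                       V: float) -> Tuple[str, float]:
    """One slot of drift-plus-penalty: pick the action minimizing
    V*cost(a) + Q(t)*(arrivals - service(a)), then update the virtual queue."""
    best = min(actions, key=lambda a: V * cost(a) + queue * (arrivals - service(a)))
    new_queue = max(queue + arrivals - service(best), 0.0)
    return best, new_queue

# Example: choose between processing in fog (cheap, slower) or cloud (costly, faster).
q = 0.0
for t in range(3):
    action, q = unit_slot_decision(
        q, ["fog", "cloud"],
        cost={"fog": 1.0, "cloud": 4.0}.get,
        service={"fog": 2.0, "cloud": 5.0}.get,
        arrivals=3.0, V=2.0)
    print(t, action, round(q, 2))
```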

Journal ArticleDOI
TL;DR: This paper surveys data management and replication approaches developed by both industrial and research communities from 2007 to 2011, discussing and characterizing how the existing approaches to data replication and management tackle resource usage and QoS provisioning with different levels of efficiency.
Abstract: As we delve deeper into the 'Digital Age', we witness an explosive growth in the volume, velocity, and variety of the data available on the Internet. For example, in 2012 about 2.5 quintillion bytes of data were created on a daily basis, originating from a myriad of sources and applications including mobile devices, sensors, individual archives, social networks, Internet of Things, enterprises, cameras, software logs, etc. Such 'data explosions' have led to one of the most challenging research issues of the current Information and Communication Technology era: how to optimally manage (e.g., store, replicate, filter, and the like) such large amounts of data and identify new ways to analyze large amounts of data for unlocking information. It is clear that such large data streams cannot be managed by setting up on-premises enterprise database systems, as this leads to a large up-front cost in buying and administering the hardware and software systems. Therefore, next-generation data management systems must be deployed on the cloud. The cloud computing paradigm provides scalable and elastic resources, such as data and services accessible over the Internet. Every cloud service provider must assure that data is efficiently processed and distributed in a way that does not compromise end-users' Quality of Service (QoS) in terms of data availability, data search delay, data analysis delay, and the like. In the aforementioned perspective, data replication is used in the cloud for improving the performance (e.g., read and write delay) of applications that access data. Through replication, a data-intensive application or system can achieve high availability, better fault tolerance, and data recovery. In this paper, we survey data management and replication approaches (from 2007 to 2011) that are developed by both industrial and research communities. The focus of the survey is to discuss and characterize the existing approaches of data replication and management that tackle resource usage and QoS provisioning with different levels of efficiency. Moreover, the breakdown of both influential expressions (data replication and management) to provide different QoS attributes is deliberated. Furthermore, the performance advantages and disadvantages of data replication and management approaches in the cloud computing environments are analyzed. Open issues and future challenges related to data consistency, scalability, load balancing, processing and placement are also reported.

Journal ArticleDOI
TL;DR: A novel collaboration- and fairness-aware big data management problem in distributed cloud environments is studied that aims to maximize the system throughput, while minimizing the operational cost of service providers to achieve that throughput, subject to resource capacity and user fairness constraints.
Abstract: With the advancement of information and communication technology, data are being generated at an exponential rate via various instruments and collected at an unprecedented scale. Such large volumes of generated data are referred to as big data, which is now revolutionizing all aspects of our life, ranging from enterprises to individuals and from science communities to governments, as it exhibits great potential to improve the efficiency of enterprises and the quality of life. To obtain nontrivial patterns and derive valuable information from big data, a fundamental problem is how to properly place the data collected by different users into distributed clouds and how to efficiently analyze the collected data to save user costs in data storage and processing, particularly the cost savings of users who share data. Doing so requires close collaboration among the users, by sharing and utilizing the big data in distributed clouds, due to the complexity and volume of big data. Since computing, storage and bandwidth resources in a distributed cloud usually are limited, and such resource provisioning typically is expensive, the collaborative users are required to make use of the resources fairly. In this paper, we study a novel collaboration- and fairness-aware big data management problem in distributed cloud environments that aims to maximize the system throughput, while minimizing the operational cost of service providers to achieve that throughput, subject to resource capacity and user fairness constraints. We first propose a novel optimization framework for the problem. We then devise a fast yet scalable approximation algorithm based on the built optimization framework. We also analyze the time complexity and approximation ratio of the proposed algorithm. We finally conduct experiments by simulations to evaluate the performance of the proposed algorithm. Experimental results demonstrate that the proposed algorithm is promising, and outperforms other heuristics.

Journal ArticleDOI
TL;DR: Important issues related to service computing on mobile and IoT devices are described, application patterns are proposed and the main challenges are analyzed from the perspectives of both service provision and service consumption.
Abstract: Service computing offers an exciting paradigm for service provision and consumption. The field is now embracing new opportunities in the mobile Internet era, which is characterized by ubiquitous wireless connectivity and powerful smart devices that let users consume or provide services anytime and anywhere. In this article, the authors describe important issues related to service computing on mobile and IoT devices. They then propose application patterns and analyze the main challenges from the perspectives of both service provision and service consumption.

Proceedings ArticleDOI
13 Oct 2016
TL;DR: This paper introduces a design and analysis process supported by a framework to assist IoT application engineers to precisely model IoT applications and verify their properties and analyzes the QoS property of reliability in a Building Energy Conservation (BEC) IoT application.
Abstract: The Internet of Things (IoT) is a new paradigm consisting of heterogeneous entities that communicate with each other by sending and receiving messages in heterogeneous formats through heterogeneous protocols to achieve a common goal. When designing IoT applications, there are two main challenges: the complexity to represent such heterogeneous entities, message formats, and protocols in an unambiguous manner, and the lack of methodologies to verify QoS (Quality of Service) properties. This paper introduces a design and analysis process supported by a framework to assist IoT application engineers to precisely model IoT applications and verify their properties. The framework is composed of the SysML4IoT, a SysML profile based on the IoT-A Reference Model, and the SysML2NuSMV, a model-to-text translator that converts the model and QoS properties specified on it to be executed by NuSMV, a mature model checker that allows entering a system model comprising a number of communicating Finite State Machines (FSM) and automatically checks its properties specified as Computational Tree Logic (CTL) or Linear Temporal Logic (LTL) formulas. Our approach is evaluated through a proof of concept implementation that analyzes the QoS property of reliability in a Building Energy Conservation (BEC) IoT application.

Posted Content
TL;DR: In this article, the coverage probability and the area spectral efficiency (ASE) for the uplink of dense small cell networks (SCNs) were analyzed, considering a practical path loss model incorporating both line-of-sight (LoS) and non-line-of-sight (NLoS) transmissions.
Abstract: In this paper, we analyse the coverage probability and the area spectral efficiency (ASE) for the uplink (UL) of dense small cell networks (SCNs) considering a practical path loss model incorporating both line-of-sight (LoS) and non-line-of-sight (NLoS) transmissions. Compared with the existing work, we adopt the following novel approaches in our study: (i) we assume a practical user association strategy (UAS) based on the smallest path loss, or equivalently the strongest received signal strength; (ii) we model the positions of both base stations (BSs) and the user equipments (UEs) as two independent Homogeneous Poisson point processes (HPPPs); and (iii) the correlation of BSs' and UEs' positions is considered, thus making our analytical results more accurate. The performance impact of LoS and NLoS transmissions on the ASE for the UL of dense SCNs is shown to be significant, both quantitatively and qualitatively, compared with existing work that does not differentiate LoS and NLoS transmissions. In particular, existing work predicted that a larger UL power compensation factor would always result in a better ASE in the practical range of BS density, i.e., 10^1-10^3 BSs/km^2. However, our results show that a smaller UL power compensation factor can greatly boost the ASE in the UL of dense SCNs, i.e., 10^2-10^3 BSs/km^2, while a larger UL power compensation factor is more suitable for sparse SCNs, i.e., 10^1-10^2 BSs/km^2.

Journal ArticleDOI
TL;DR: A genetic algorithm-based method, genetic algorithm for mashup creation (GA4MC), is proposed to select component services and deployment platforms in order to create service mashups with optimal cost performance.
Abstract: Service mashups are applications created by combining single-functional services (or APIs) dispersed over the web. With the development of cloud computing and web technologies, service mashups are becoming more and more widely used and a large number of mashup platforms have been produced. However, due to the proliferation of services on the web, how to select component services to create mashups has become a challenging issue. Most developers pay more attention to the quality of service (QoS) and cost of services. Besides service selection, mashup deployment is another pivotal process, as the platform can significantly affect the quality of mashups. In this paper, we focus on creating service mashups from the perspective of developers. A genetic algorithm-based method, genetic algorithm for mashup creation (GA4MC), is proposed to select component services and deployment platforms in order to create service mashups with optimal cost performance. A series of experiments are conducted to evaluate the performance of GA4MC. The results show that the GA4MC method can achieve mashups whose cost performance is extremely close to the optimal. Moreover, the execution time of GA4MC is of a low order of magnitude and the algorithm exhibits good scalability as the experimental scale increases.
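
A GA of this kind can be pictured as evolving chromosomes that encode one candidate service per component slot plus a deployment platform, with fitness defined as a cost-performance ratio. The encoding, fitness, and operators below are illustrative assumptions rather than the GA4MC specifics.

```python
import random
from typing import Dict, List, Tuple

def ga_mashup(candidates: List[List[str]],      # candidate services per component slot
              platforms: List[str],
              quality: Dict[str, float],        # QoS score per service/platform
              price: Dict[str, float],          # cost per service/platform
              pop_size: int = 30,
              generations: int = 50) -> Tuple[List[str], str]:
    """Tiny GA: chromosome = (one service per slot, one platform); fitness = QoS / cost."""
    def random_chrom() -> List[str]:
        return [random.choice(slot) for slot in candidates] + [random.choice(platforms)]

    def fitness(chrom: List[str]) -> float:
        total_cost = sum(price[g] for g in chrom)
        return sum(quality[g] for g in chrom) / total_cost if total_cost > 0 else 0.0

    pop = [random_chrom() for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elites = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(elites):
            a, b = random.sample(elites, 2)
            cut = random.randrange(1, len(a))
            child = a[:cut] + b[cut:]                    # one-point crossover
            if random.random() < 0.1:                    # mutate one service slot
                i = random.randrange(len(candidates))
                child[i] = random.choice(candidates[i])
            children.append(child)
        pop = elites + children
    best = max(pop, key=fitness)
    return best[:-1], best[-1]                           # (selected services, platform)
```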

Journal ArticleDOI
TL;DR: Simulation results demonstrate that the proposed hierarchical computing technique can effectively and efficiently detect cyberattacks, achieving the detection accuracy of above 98%, while improving the scalability.
Abstract: The concept of smart home has recently gained significant popularity. Despite the fact that it offers improved convenience and cost reduction, the prevailing smart home infrastructure suffers from vulnerability to cyberattacks. It is possible for hackers to launch cyberattacks at the community level while causing a large-area power system blackout through cascading effects. In this paper, the cascading impacts of two cyberattacks on the predicted dynamic electricity pricing are analyzed. In the first cyberattack, the hacker manipulates the electricity price to form peak energy loads such that some transmission lines are overloaded. Those transmission lines are then tripped and the power system is separated into isolated islands due to the cascading effect. In the second cyberattack, the hacker manipulates the electricity price to increase the fluctuation of the energy load to interfere with the frequency of the generators. The generators are then tripped by the protective procedures and cascading outages are induced in the transmission network. The existing technique only tackles the overloading cyberattack while still suffering from severe limitations in scalability. Therefore, based on partially observable Markov decision processes, a hierarchical detection framework exploring community decomposition and global policy optimization is proposed in this work. The simulation results demonstrate that our proposed hierarchical computing technique can effectively and efficiently detect those cyberattacks, achieving a detection accuracy of above 98%, while improving the scalability.

Journal ArticleDOI
TL;DR: The architecture of EvacSys is discussed, a scalable cloud-based emergency evacuation service that uses the power of cloud computing to process large volumes of real-time sensory data gathered during disaster; it then computes appropriate routes for evacuees, giving priority to emergency vehicles.
Abstract: Natural or man-made disasters wreak havoc, whether they're floods, earthquakes, tornados, hurricanes, or wild fires. One of the major challenges in emergency situations is to guide people through safe routes away from the disaster site on the basis of the available information. To do this, data from multiple sources-such as roadside sensory units, emergency vehicles, and satellite imagery-must be processed in real time to compute the appropriate routes. However, to process the huge volumes of sensory data in real time also requires higher computational resources. For many years, cloud computing has been well established as a reliable solution to meet higher data and computational demands. In this article, the authors discuss the architecture of EvacSys, a scalable cloud-based emergency evacuation service. EvacSys uses the power of cloud computing to process large volumes of real-time sensory data gathered during disaster; it then computes appropriate routes for evacuees, giving priority to emergency vehicles. The authors also present a case study testing the service on a real city transportation network.

Journal ArticleDOI
TL;DR: Simulation results indicate that the proposed CEVP algorithm can achieve energy savings, efficiently reduce the temperature cost, and significantly decrease the total number of hot spots in cloud systems, compared to the Ant Colony System-based algorithm.
Abstract: Big data trends have recently brought unrivalled opportunities to cloud systems. Numerous virtual machines (VMs) have been widely deployed to enable on-demand provisioning and pay-as-you-go services for customers. Due to the large complexity of current cloud systems, promising VM placement algorithms are highly desirable. This paper focuses on the energy efficiency and thermal stability issues of cloud systems. A Cross Entropy based VM Placement (CEVP) algorithm is proposed to simultaneously minimize the energy cost, the total thermal cost and the number of hot spots in the data center. Simulation results indicate that the proposed CEVP algorithm can (1) achieve energy savings of 26.2% on average, (2) efficiently reduce the temperature cost by up to 6.8% and (3) significantly decrease the total number of hot spots by 60.1% on average in cloud systems, compared to the Ant Colony System-based algorithm.
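
The cross-entropy method used by CEVP can be pictured as: repeatedly sample random VM-to-host placements from a probability matrix, score them with a combined energy/thermal/hot-spot cost, and re-fit the matrix to the best ('elite') samples until it concentrates on a good placement. The cost function, sample counts, and smoothing below are placeholders for illustration, not the paper's parameters.

```python
import numpy as np

def ce_vm_placement(n_vms: int, n_hosts: int,
                    cost,                         # cost(placement) -> float, placement[v] = host
                    samples: int = 200, elite_frac: float = 0.1,
                    iters: int = 30, smooth: float = 0.7):
    """Cross-entropy search over categorical placement distributions."""
    P = np.full((n_vms, n_hosts), 1.0 / n_hosts)  # P[v, h]: prob. of placing VM v on host h
    rng = np.random.default_rng(0)
    best, best_cost = None, float("inf")
    for _ in range(iters):
        placements = np.array([[rng.choice(n_hosts, p=P[v]) for v in range(n_vms)]
                               for _ in range(samples)])
        costs = np.array([cost(p) for p in placements])
        elite = placements[np.argsort(costs)[: max(1, int(elite_frac * samples))]]
        if costs.min() < best_cost:
            best_cost, best = costs.min(), placements[costs.argmin()].copy()
        freq = np.zeros_like(P)                   # re-fit the distribution to the elites
        for p in elite:
            for v, h in enumerate(p):
                freq[v, h] += 1
        P = smooth * (freq / len(elite)) + (1 - smooth) * P
    return best, best_cost
```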

Journal ArticleDOI
TL;DR: Two compilers are presented that transform any two-party PAKE protocol to a two-server PAKE protocol on the basis of identity-based cryptography, called the ID2S PAKE protocol, which can be proven to be secure without random oracles.
Abstract: In a two-server password-authenticated key exchange (PAKE) protocol, a client splits its password and stores two shares of its password in the two servers, respectively, and the two servers then cooperate to authenticate the client without knowing the password of the client. In case one server is compromised by an adversary, the password of the client is required to remain secure. In this paper, we present two compilers that transform any two-party PAKE protocol to a two-server PAKE protocol on the basis of identity-based cryptography, called an ID2S PAKE protocol. By the compilers, we can construct ID2S PAKE protocols which achieve implicit authentication. As long as the underlying two-party PAKE protocol and identity-based encryption or signature scheme have provable security without random oracles, the ID2S PAKE protocols constructed by the compilers can be proven to be secure without random oracles. Compared with Katz et al.'s two-server PAKE protocol with provable security without random oracles, our ID2S PAKE protocol can save from 22 to 66 percent of computation in each server.

Proceedings ArticleDOI
01 Oct 2016
TL;DR: This work proposes an advanced scheduler for Apache Storm that provides improved performance with highly dynamic behavior and increases the overall resource utilization by 31% on average compared to two other solutions, without significant negative impact on the QoS enforcement level.
Abstract: Apache Storm has recently emerged as an attractive fault-tolerant open-source distributed data processing platform that has been chosen by many industry leaders to develop real-time applications for processing a huge amount of data in a scalable manner. A key aspect to achieve the best performance in this system lies in the design of an efficient scheduler for component execution, called a topology, on the available computing resources. In response to workload fluctuations, we propose an advanced scheduler for Apache Storm that provides improved performance with highly dynamic behavior. While enforcing the required Quality-of-Service (QoS) of individual data streams, the controller allocates computing resources based on decisions that consider the future states of non-controllable disturbance parameters, e.g., the arrival rate of tuples or the resource utilization in each worker node. The performance evaluation is carried out by comparing the proposed solution with two well-known alternatives, namely Storm's default scheduler and the best-effort approach (i.e., the heuristic based on the first-fit decreasing approximation algorithm). Experimental results clearly show that the proposed controller increases the overall resource utilization by 31% on average compared to the two other solutions, without significant negative impact on the QoS enforcement level.
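
The best-effort baseline mentioned above, first-fit decreasing, is the classical bin-packing heuristic: sort executors by resource demand and place each one on the first worker node that still has enough spare capacity. A minimal sketch with made-up demands and capacities follows.

```python
from typing import Dict, List

def first_fit_decreasing(demands: Dict[str, float],
                         capacities: List[float]) -> Dict[str, int]:
    """Assign each executor (in decreasing order of demand) to the first worker
    node with enough remaining capacity."""
    remaining = list(capacities)
    assignment: Dict[str, int] = {}
    for executor, need in sorted(demands.items(), key=lambda kv: -kv[1]):
        for node, spare in enumerate(remaining):
            if spare >= need:
                assignment[executor] = node
                remaining[node] -= need
                break
        else:
            raise RuntimeError(f"no worker node can host {executor}")
    return assignment

# Example: three executors packed onto two worker nodes.
print(first_fit_decreasing({"spout": 0.4, "bolt-a": 0.7, "bolt-b": 0.5},
                           capacities=[1.0, 1.0]))
```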

Book ChapterDOI
13 May 2016
TL;DR: Recently, cloud computing has increasingly been employed for a wide range of applications in various research domains, such as agriculture, smart grids, e-commerce, scientific applications, healthcare, and nuclear science.
Abstract: As cloud computing systems continue to grow in scale and complexity, it is of critical importance to ensure the stability, availability, and reliability of such systems. However, cloud failure could be induced by varying execution environments, addition and removal of system components, frequent updates and upgrades, online repairs, and intensive workload on servers. The reliability and availability of a cloud could be compromised easily if proactive measures are not taken to tackle the possible failures. To achieve higher reliability, and as a countermeasure for faults and failures, fault tolerance in cloud computing is vital. This chapter discusses fault tolerance in the cloud and illustrates various fault-tolerance strategies. It also categorizes fault-tolerance measures on the basis of their methodology, programming framework, environment, and the fault types they can detect.

Journal ArticleDOI
TL;DR: This work devises a resource allocation mechanism by formulating it as a Mixed-Integer programming model representing an optimization problem that incorporates predictable average performance as a unified performance metric, and demonstrates that target performance is predictable and attainable for clusters of heterogeneous resources.
Abstract: Although most current cloud providers, such as Amazon Web Services (AWS) and Microsoft Azure offer different types of computing instances with different capacities, cloud users tend to hire a cluster of instances of particular type to ensure performance predictability for their applications. Nowadays, many large-scale applications including big data analytics applications feature workload patterns that have heterogeneous resource demands, for which, accounting for heterogeneity of virtual cloud instances to allocate would be highly advantageous to the application performance. However, performance predictability has been always an issue in such clusters of heterogeneous resources. In particular, to precisely decide on what instances from which types to enclose in a cluster, such that the desired performance is attained, remains an open question. To this end, we devise a resource allocation mechanism by formulating it as a Mixed-Integer programming model representing an optimization problem. Our resource allocation mechanism incorporates predictable average performance as a unified performance metric, which concerns two key performance-related issues: (a) Performance variation within same-type instances, and (b) Correlations of performance variabilities across different types. Our experimental results demonstrate that target performance is predictable and attainable for clusters of heterogeneous resources. Our mechanism constructs clusters whose performance is within 95 percent of the performance of optimal ones, hence deadlines are always met. By reoptimisation, our mechanism can react to performance mispredictions and support autoscaling for varying workloads. We experimentally verify our findings using clusters on Amazon EC2 with MapReduce workloads, and on a private cloud as well. We conduct comparison experiments with an existing recent resource allocation approach in literature.
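
A toy version of such a mixed-integer model is: integer counts of each instance type, an objective minimizing rental cost, and a constraint that the cluster's predictable average performance meets a target. The instance types, numbers, and the use of the PuLP modeling library below are assumptions for illustration only, not the paper's formulation.

```python
import pulp

# Hypothetical instance types: (hourly price, predictable average performance units).
types = {"small": (0.10, 1.0), "medium": (0.20, 2.3), "large": (0.40, 4.2)}
perf_target = 20.0   # aggregate performance required to meet the deadline

prob = pulp.LpProblem("heterogeneous_cluster", pulp.LpMinimize)
n = {t: pulp.LpVariable(f"n_{t}", lowBound=0, cat="Integer") for t in types}

# Objective: minimize the total hourly cost of the cluster.
prob += pulp.lpSum(types[t][0] * n[t] for t in types)
# Constraint: aggregate predictable performance must meet the target.
prob += pulp.lpSum(types[t][1] * n[t] for t in types) >= perf_target

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print({t: int(n[t].value()) for t in types}, "cost =", pulp.value(prob.objective))
```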

Journal ArticleDOI
TL;DR: This work presents a decentralized algorithm for detecting damage in structures by using a WSAN that makes use of cooperative information fusion for calculating a damage coefficient and finds that its collaborative and information fusion-based approach ensures the accuracy of the algorithm.
Abstract: The unprecedented capabilities of monitoring and responding to stimuli in the physical world of wireless sensor and actuator networks (WSAN) enable these networks to provide the underpinning for several Smart City applications, such as structural health monitoring (SHM). In such applications, civil structures, endowed with wireless smart devices, are able to self-monitor and autonomously respond to situations using computational intelligence. This work presents a decentralized algorithm for detecting damage in structures by using a WSAN. As key characteristics, beyond presenting a fully decentralized (in-network) and collaborative approach for detecting damage in structures, our algorithm makes use of cooperative information fusion for calculating a damage coefficient. We conducted experiments for evaluating the algorithm in terms of its accuracy and efficient use of the constrained WSAN resources. We found that our collaborative and information fusion-based approach ensures the accuracy of our algorithm and that it can respond promptly to stimuli (1.091 s), triggering actuators. Moreover, for 100 nodes or less in the WSAN, the communication overhead of our algorithm is tolerable and the WSAN running our algorithm, operating system and protocols can last as long as 468 days.
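
The cooperative information fusion step can be pictured as each node computing a local damage indicator from its own signal and the network fusing those indicators, weighted by confidence, into one damage coefficient that is compared against a threshold. The weighting scheme and threshold below are illustrative assumptions, not the paper's algorithm.

```python
from typing import List, Tuple

def fuse_damage_coefficient(local_indicators: List[Tuple[float, float]],
                            threshold: float = 0.5) -> Tuple[float, bool]:
    """Confidence-weighted fusion of per-node (indicator, confidence) pairs into a
    single damage coefficient; damage is declared when it exceeds the threshold."""
    total_weight = sum(conf for _, conf in local_indicators)
    if total_weight == 0:
        return 0.0, False
    coefficient = sum(ind * conf for ind, conf in local_indicators) / total_weight
    return coefficient, coefficient >= threshold

# Example: three sensor nodes report indicators with different confidences.
print(fuse_damage_coefficient([(0.8, 0.9), (0.6, 0.7), (0.2, 0.4)]))
```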

Book ChapterDOI
05 Sep 2016
TL;DR: vmBBThrPred, as discussed by the authors, is an application-oblivious approach to predict the performance of virtualized applications based on only basic hypervisor-level metrics, which is different from other approaches in the literature that usually either inject monitoring code into VMs or use peripheral devices to directly report their actual throughput.
Abstract: In today’s ever-computerized society, Cloud Data Centers are packed with numerous online services to promptly respond to users and provide services on demand. In such complex environments, guaranteeing the throughput of Virtual Machines (VMs) is crucial to minimize performance degradation for all applications. vmBBThrPred, our novel approach in this work, is an application-oblivious approach to predict the performance of virtualized applications based on only basic Hypervisor-level metrics. vmBBThrPred is different from other approaches in the literature that usually either inject monitoring code into VMs or use peripheral devices to directly report their actual throughput. vmBBThrPred, instead, uses sensitivity values of VMs to cloud resources (CPU, Mem, and Disk) to predict their throughput under various working scenarios (free or under contention); sensitivity values are calculated by vmBBProfiler, which also uses only Hypervisor-level metrics. We used a variety of resource-intensive benchmarks to gauge the efficiency of our approach in our VMware-vSphere based private cloud. Results showed an accuracy of 95% (on average) for predicting the throughput of 12 benchmarks over 1200 h of operation.
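
One simple way to read a sensitivity-based prediction is: scale the VM's contention-free throughput down according to how sensitive it is to each resource and how contended that resource currently is, using only hypervisor-level utilization figures. The linear degradation model below is an illustrative assumption, not vmBBThrPred's actual model.

```python
from typing import Dict

def predict_throughput(baseline: float,
                       sensitivity: Dict[str, float],   # per-resource sensitivity in [0, 1]
                       contention: Dict[str, float]) -> float:
    """Scale contention-free throughput by per-resource degradation, where each
    resource contributes sensitivity[r] * contention[r] of relative loss."""
    degradation = sum(sensitivity[r] * contention.get(r, 0.0) for r in sensitivity)
    return baseline * max(0.0, 1.0 - degradation)

# Example: a VM most sensitive to disk, under moderate CPU and heavy disk contention.
print(predict_throughput(1000.0,
                         sensitivity={"cpu": 0.2, "mem": 0.1, "disk": 0.6},
                         contention={"cpu": 0.5, "disk": 0.8}))
```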

Journal ArticleDOI
TL;DR: In this article, the authors address the potential threats to privacy posed by knowledge discovery activities in the IoT and cloud computing environments, and propose a solution to protect the privacy of users' data.
Abstract: Together, the Internet of Things (IoT) and cloud computing give us the ability to gather, process, and even trade data to better understand users' behaviors, habits, and preferences. However, future IoT applications must address the significant potential threats to privacy posed by such knowledge-discovery activities.

Journal ArticleDOI
TL;DR: Internet of Things (IoT), a part of the Future Internet, comprises many billions of Internet connected Objects (ICOs) or ‘things’, where things can sense, communicate, compute and potentially actuate, as well as have intelligence, multi-modal interfaces, and physical/virtual identities and attributes.
Abstract: Internet of Things (IoT), a part of the Future Internet, comprises many billions of Internet connected Objects (ICOs) or ‘things’, where things can sense, communicate, compute and potentially actuate, as well as have intelligence, multi-modal interfaces, and physical/virtual identities and attributes. The IoT vision has recently given rise to emerging IoT big data applications [2], [3], e.g., smart energy grids, syndromic biosurveillance, environmental monitoring, emergency situation awareness, digital agriculture, and smart manufacturing, that are capable of producing billions of data streams from geographically distributed data sources.

Journal ArticleDOI
TL;DR: A genetic algorithm (GA)‐based optimization technique, called GA‐ParFnt, is presented to find the Pareto frontier for optimizing data transfer versus job execution time in grids and provides invaluable insights into this formidable problem.
Abstract: This work presents a genetic algorithm (GA)-based optimization technique, called GA-ParFnt, to find the Pareto frontier for optimizing data transfer versus job execution time in grids. As the performance of a generic GA is not suitable for finding such a Pareto relationship, major modifications are applied to it so that it can efficiently discover such a relationship. The frontier curve representing this relationship is then matched against the performance of several scheduling techniques, for both data-intensive and computationally intensive applications, to measure their overall performance. Results show that a few of these algorithms are far from the Pareto front despite their claims of being efficient in optimizing their targeted objectives. Results also provide invaluable insights into this formidable problem and should aid in the design of future schedulers.
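
The Pareto frontier sought here is the set of non-dominated (data-transfer time, job-execution time) pairs: schedules that no other schedule beats on both objectives at once. The filter below shows that dominance test on made-up candidate points.

```python
from typing import List, Tuple

def pareto_front(points: List[Tuple[float, float]]) -> List[Tuple[float, float]]:
    """Keep the points not dominated by any other point, minimizing both
    objectives (data-transfer time, job-execution time)."""
    front = []
    for p in points:
        dominated = any(q != p and q[0] <= p[0] and q[1] <= p[1] for q in points)
        if not dominated:
            front.append(p)
    return sorted(front)

# Example: candidate (transfer time, execution time) pairs from different schedules.
print(pareto_front([(4.0, 9.0), (5.0, 7.0), (6.0, 6.5), (5.5, 7.5), (8.0, 5.0)]))
```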