
Showing papers in "The Journal of Supercomputing in 2013"


Journal ArticleDOI
TL;DR: The factors affecting Cloud computing adoption, vulnerabilities and attacks are surveyed, and relevant solution directives to strengthen security and privacy in the Cloud environment are identified.
Abstract: Cloud computing offers scalable on-demand services to consumers with greater flexibility and lesser infrastructure investment. Since Cloud services are delivered using classical network protocols and formats over the Internet, implicit vulnerabilities existing in these protocols as well as threats introduced by newer architectures raise many security and privacy concerns. In this paper, we survey the factors affecting Cloud computing adoption, vulnerabilities and attacks, and identify relevant solution directives to strengthen security and privacy in the Cloud environment.

376 citations


Journal ArticleDOI
TL;DR: This work is an overview of the preliminary experience in developing a high-performance iterative linear solver accelerated by GPU coprocessors and techniques for speeding up sparse matrix-vector product (SpMV) kernels and finding suitable preconditioning methods are discussed.
Abstract: This work is an overview of our preliminary experience in developing a high-performance iterative linear solver accelerated by GPU coprocessors. Our goal is to illustrate the advantages and difficulties encountered when deploying GPU technology to perform sparse linear algebra computations. Techniques for speeding up sparse matrix-vector product (SpMV) kernels and finding suitable preconditioning methods are discussed. Our experiments with an NVIDIA TESLA M2070 show that for unstructured matrices SpMV kernels can be up to 8 times faster on the GPU than the Intel MKL on the host Intel Xeon X5675 processor. The overall performance of the GPU-accelerated incomplete Cholesky (IC) factorization preconditioned CG method can outperform its CPU counterpart by a smaller factor, up to 3, and the GPU-accelerated incomplete LU (ILU) factorization preconditioned GMRES method can achieve a speed-up nearing 4. However, with preconditioning techniques better suited to GPUs, this performance can be further improved.
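For readers unfamiliar with the kernel being accelerated, the sketch below shows a sparse matrix-vector product in compressed sparse row (CSR) form, the loop structure that a GPU kernel typically parallelizes with one thread (or warp) per row. It is a minimal NumPy reference with a made-up 3x3 matrix, not the authors' CUDA implementation.

```python
import numpy as np

def spmv_csr(values, col_idx, row_ptr, x):
    """y = A @ x for a CSR matrix; the per-row loop is what a GPU typically parallelizes."""
    n_rows = len(row_ptr) - 1
    y = np.zeros(n_rows)
    for i in range(n_rows):                      # one thread/warp per row on the GPU
        start, end = row_ptr[i], row_ptr[i + 1]
        y[i] = np.dot(values[start:end], x[col_idx[start:end]])
    return y

# Tiny example: 3x3 sparse matrix [[4,0,1],[0,3,0],[2,0,5]]
values  = np.array([4.0, 1.0, 3.0, 2.0, 5.0])
col_idx = np.array([0, 2, 1, 0, 2])
row_ptr = np.array([0, 2, 3, 5])
x = np.array([1.0, 2.0, 3.0])
print(spmv_csr(values, col_idx, row_ptr, x))     # [ 7.  6. 17.]
```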

297 citations


Journal ArticleDOI
TL;DR: The hierarchical scheduling strategy has been implemented in the SwinDeW-C cloud workflow system and demonstrates satisfactory performance, and the experimental results show that the overall performance of the ACO-based scheduling algorithm is better than the others on three basic measurements: the optimisation rate on makespan, the optimisation rate on cost, and the CPU time.
Abstract: A cloud workflow system is a type of platform service which facilitates the automation of distributed applications based on the novel cloud infrastructure. One of the most important aspects which differentiate a cloud workflow system from its other counterparts is the market-oriented business model. This is a significant innovation which brings many challenges to conventional workflow scheduling strategies. To investigate such an issue, this paper proposes a market-oriented hierarchical scheduling strategy in cloud workflow systems. Specifically, the service-level scheduling deals with the Task-to-Service assignment where tasks of individual workflow instances are mapped to cloud services in the global cloud markets based on their functional and non-functional QoS requirements; the task-level scheduling deals with the optimisation of the Task-to-VM (virtual machine) assignment in local cloud data centres where the overall running cost of cloud workflow systems will be minimised given the satisfaction of QoS constraints for individual tasks. Based on our hierarchical scheduling strategy, a package based random scheduling algorithm is presented as the candidate service-level scheduling algorithm and three representative metaheuristic based scheduling algorithms including genetic algorithm (GA), ant colony optimisation (ACO), and particle swarm optimisation (PSO) are adapted, implemented and analysed as the candidate task-level scheduling algorithms. The hierarchical scheduling strategy has been implemented in our SwinDeW-C cloud workflow system and demonstrates satisfactory performance. Meanwhile, the experimental results show that the overall performance of the ACO-based scheduling algorithm is better than the others on three basic measurements: the optimisation rate on makespan, the optimisation rate on cost and the CPU time.

277 citations


Journal ArticleDOI
TL;DR: A model for task-oriented resource allocation in a cloud computing environment is proposed, in which an induced bias matrix is used to identify inconsistent elements and improve the consistency ratio when conflicting weights in various tasks are assigned.
Abstract: Resource allocation is a complicated task in a cloud computing environment because there are many alternative computers with varying capacities. The goal of this paper is to propose a model for task-oriented resource allocation in a cloud computing environment. Resource allocation tasks are ranked by the pairwise comparison matrix technique and the Analytic Hierarchy Process, given the available resources and user preferences. The computing resources can then be allocated according to the rank of tasks. Furthermore, an induced bias matrix is used to identify the inconsistent elements and improve the consistency ratio when conflicting weights in various tasks are assigned. Two illustrative examples are introduced to validate the proposed method.
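The consistency check referred to above can be made concrete with a small sketch that derives priority weights and the consistency ratio (CR) from a pairwise comparison matrix via its principal eigenvalue. This is the standard AHP computation with a hypothetical 3x3 matrix, not the paper's induced bias matrix procedure.

```python
import numpy as np

# Saaty's random index for matrices of order 1..9 (standard AHP constants)
RI = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}

def ahp_weights_and_cr(A):
    """Return priority weights and the consistency ratio of pairwise comparison matrix A."""
    n = A.shape[0]
    eigvals, eigvecs = np.linalg.eig(A)
    k = np.argmax(eigvals.real)                  # principal eigenvalue (Perron root)
    w = np.abs(eigvecs[:, k].real)
    w /= w.sum()                                 # normalized priority weights
    ci = (eigvals[k].real - n) / (n - 1)         # consistency index
    cr = ci / RI[n] if RI[n] else 0.0            # consistency ratio; < 0.1 is usually acceptable
    return w, cr

# Hypothetical pairwise comparison of three tasks by importance
A = np.array([[1.0, 3.0, 5.0],
              [1/3., 1.0, 2.0],
              [1/5., 1/2., 1.0]])
w, cr = ahp_weights_and_cr(A)
print("weights:", w, "CR:", cr)
```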

251 citations


Journal ArticleDOI
TL;DR: The failure rates of HPC systems are reviewed, rollback-recovery techniques which are most often used for long-running applications on HPC clusters are discussed, and a taxonomy is developed for over twenty popular checkpoint/restart solutions.
Abstract: In recent years, High Performance Computing (HPC) systems have been shifting from expensive massively parallel architectures to clusters of commodity PCs to take advantage of cost and performance benefits. Fault tolerance in such systems is a growing concern for long-running applications. In this paper, we briefly review the failure rates of HPC systems and also survey the fault tolerance approaches for HPC systems and the issues with these approaches. Rollback-recovery techniques are discussed in detail because they are the most widely used fault tolerance approach for long-running applications on HPC clusters. Specifically, the feature requirements of rollback-recovery are discussed and a taxonomy is developed for over twenty popular checkpoint/restart solutions. The intent of this paper is to aid researchers in the domain as well as to facilitate development of new checkpointing solutions.

238 citations


Journal ArticleDOI
TL;DR: This paper is devoted to the categorization of green computing performance metrics in data centers, covering basic metrics such as power and thermal metrics as well as extended performance metrics such as multiple data center indicators.
Abstract: Data centers now play an important role in modern IT infrastructures. Although much research effort has been made in the field of green data center computing, performance metrics for green data centers have been largely ignored. This paper is devoted to the categorization of green computing performance metrics in data centers, covering basic metrics such as power and thermal metrics as well as extended performance metrics such as multiple data center indicators. Based on a taxonomy of performance metrics, this paper summarizes the features of currently available metrics and presents insights for the study of green data center computing.
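As a concrete instance of a basic power metric of the kind the paper categorizes, the snippet below computes power usage effectiveness (PUE) and its reciprocal, DCiE, from facility-level and IT-level power readings; the readings themselves are hypothetical.

```python
def pue(total_facility_power_kw: float, it_equipment_power_kw: float) -> float:
    """Power Usage Effectiveness: total facility power / IT equipment power (ideal = 1.0)."""
    return total_facility_power_kw / it_equipment_power_kw

def dcie(total_facility_power_kw: float, it_equipment_power_kw: float) -> float:
    """Data Center infrastructure Efficiency: reciprocal of PUE, expressed as a percentage."""
    return 100.0 * it_equipment_power_kw / total_facility_power_kw

# Hypothetical readings: 1800 kW drawn by the whole facility, 1200 kW by IT equipment
print(pue(1800, 1200))    # 1.5
print(dcie(1800, 1200))   # ~66.7 %
```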

182 citations


Journal ArticleDOI
TL;DR: This paper proposes a new efficient and secure biometrics-based multi-server authentication with key agreement scheme for smart cards on elliptic curve cryptosystem (ECC) without verification table to minimize the complexity of hash operation among all users and fit multi-server communication environments.
Abstract: Conventional single-server authentication schemes suffer from a significant shortcoming. If a remote user wishes to use numerous network services, he/she must register his/her identity and password at each of these servers. It is extremely tedious for users to register at numerous servers. In order to resolve this problem, various multi-server authentication schemes have recently been proposed. However, these schemes are insecure against some cryptographic attacks or are inefficiently designed because of high computation costs. Moreover, these schemes do not provide a strong key agreement function with perfect forward secrecy. Based on these motivations, this paper proposes a new efficient and secure biometrics-based multi-server authentication with key agreement scheme for smart cards on the elliptic curve cryptosystem (ECC) without a verification table, to minimize the complexity of hash operations among all users and fit multi-server communication environments. By adopting the biometrics technique, the proposed scheme can provide a stronger user authentication function. By adopting the ECC technique, the proposed scheme can provide a strong key agreement function with the property of perfect forward secrecy and reduce the computation loads for smart cards. As a result, compared with related multi-server authentication schemes, the proposed scheme has stronger security and enhanced computational efficiency. Thus, the proposed scheme is extremely suitable for use in distributed multi-server network environments, such as the Internet, and in environments with limited computation and communication resources to access remote information systems, since it provides security, reliability, and efficiency.
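The ECC-based key agreement such a scheme builds on can be illustrated with a plain ECDH exchange followed by a hash-based key derivation, using the third-party Python cryptography package. This shows only the underlying primitive and why ephemeral keys give forward secrecy; it is not the paper's biometrics-based smart-card protocol.

```python
from cryptography.hazmat.primitives.asymmetric import ec
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

# Each party generates an ephemeral key pair on the P-256 curve
user_priv = ec.generate_private_key(ec.SECP256R1())
server_priv = ec.generate_private_key(ec.SECP256R1())

# Exchange public keys and derive the same shared secret (ECDH)
user_shared = user_priv.exchange(ec.ECDH(), server_priv.public_key())
server_shared = server_priv.exchange(ec.ECDH(), user_priv.public_key())
assert user_shared == server_shared

# Derive a 256-bit session key; fresh ephemeral keys per session provide forward secrecy
session_key = HKDF(algorithm=hashes.SHA256(), length=32, salt=None,
                   info=b"multi-server session").derive(user_shared)
print(session_key.hex())
```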

169 citations


Journal ArticleDOI
TL;DR: This paper studies state-of-the-art techniques and research related to power saving in the IaaS of a cloud computing system, which consumes a large part of the total energy in a cloud computing system.
Abstract: Although cloud computing has rapidly emerged as a widely accepted computing paradigm, the research on cloud computing is still at an early stage. Cloud computing suffers from different challenging issues related to security, software frameworks, quality of service, standardization, and power consumption. Efficient energy management is one of the most challenging research issues. The core services in cloud computing system are the SaaS (Software as a Service), PaaS (Platform as a Service), and IaaS (Infrastructure as a Service). In this paper, we study state-of-the-art techniques and research related to power saving in the IaaS of a cloud computing system, which consumes a huge part of total energy in a cloud computing system. At the end, some feasible solutions for building green cloud computing are proposed. Our aim is to provide a better understanding of the design challenges of energy management in the IaaS of a cloud computing system.

165 citations


Journal ArticleDOI
TL;DR: Detailed research is conducted on the performance evaluation of cloud services considering fault recovery; the commonly adopted assumption of Poisson arrivals of users' service requests is relaxed, so that the interarrival times of service requests can follow an arbitrary probability distribution.

Abstract: Cloud computing is a recent trend in IT which has attracted a lot of attention. In cloud computing, service reliability and service performance are two important issues. To improve cloud service reliability, fault tolerance techniques such as fault recovery may be used, which in turn have an impact on cloud service performance. Such impact deserves detailed research. Although there exists some research on cloud/grid service reliability and performance, very little of it addresses the issues of fault recovery and its impact on service performance. In this paper, we conduct detailed research on the performance evaluation of cloud services considering fault recovery. We consider recovery on both processing nodes and communication links. The commonly adopted assumption of Poisson arrivals of users' service requests is relaxed, and the interarrival times of service requests can follow an arbitrary probability distribution. The precedence constraints of subtasks are also considered. The probability distribution of the service response time is derived, and a numerical example is presented. The proposed cloud performance evaluation models and methods could yield realistic results, and thus are of practical value for related decision-making in cloud computing.

125 citations


Journal ArticleDOI
TL;DR: In this paper, the concept of Wisdom Web of Things (W2T) is proposed to realize the harmonious symbiosis of humans, computers, and things in the emerging hyper world.
Abstract: The rapid development of the Internet and the Internet of Things accelerates the emergence of the hyper world. It has become a pressing research issue to realize the organic amalgamation and harmonious symbiosis among humans, computers, and things in the hyper world, which consists of the social world, the physical world, and the information world (cyber world). In this paper, the notion of Wisdom Web of Things (W2T) is proposed in order to address this issue. As inspired by the material cycle in the physical world, the W2T focuses on the data cycle, namely "from things to data, information, knowledge, wisdom, services, humans, and then back to things." A W2T data cycle system is designed to implement such a cycle, which is, technologically speaking, a practical way to realize the harmonious symbiosis of humans, computers, and things in the emerging hyper world.

124 citations


Journal ArticleDOI
Zhikui Chen1, Feng Xia1, Tao Huang1, Fanyu Bu1, Haozhe Wang1 
TL;DR: A higher accuracy localization scheme is proposed which can effectively satisfy diverse requirements for many indoor and outdoor location services and experimental results show that the proposed scheme can improve the localization accuracy.
Abstract: Many localization algorithms and systems have been developed by means of wireless sensor networks for both indoor and outdoor environments. To achieve higher localization accuracy, extra hardware equipment is utilized by most of the existing localization solutions, which increases the cost and considerably limits location-based applications. The Internet of Things (IOT) integrates many technologies, such as the Internet, Zigbee, Bluetooth, infrared, WiFi, GPRS, 3G, etc., which can enable different ways to obtain the location information of various objects. Location-based service is a primary service of the IOT, while localization accuracy is a key issue. In this paper, a higher accuracy localization scheme is proposed which can effectively satisfy diverse requirements for many indoor and outdoor location services. The proposed scheme consists of two phases: (1) the partition phase, in which the target region is split into small grids; and (2) the localization refinement phase, in which a higher localization accuracy can be obtained by applying an algorithm designed in the paper. A trial system is set up to verify the correctness of the proposed scheme and, furthermore, to illustrate its feasibility and availability. The experimental results show that the proposed scheme can improve the localization accuracy.

Journal ArticleDOI
TL;DR: The Adaptive-Scheduling-with-QoS-Satisfaction algorithm, namely AsQ, is proposed for the hybrid cloud environment to raise the resource utilization rate of the private cloud and to diminish task response time as much as possible, achieving a total optimization regarding cost and deadline constraints.
Abstract: A hybrid cloud integrates private clouds and public clouds into one unified environment. For reasons of economy and efficiency, the hybrid cloud environment should be able to automatically maximize the utilization rate of the private cloud and minimize the cost of the public cloud when users submit their computing jobs to the environment. In this paper, we propose the Adaptive-Scheduling-with-QoS-Satisfaction algorithm, namely AsQ, for the hybrid cloud environment to raise the resource utilization rate of the private cloud and to diminish task response time as much as possible. We exploit runtime estimation and several fast scheduling strategies for near-optimal resource allocation, which results in a high resource utilization rate and low execution time in the private cloud. Moreover, the near-optimal allocation in the private cloud can reduce the number of tasks that need to be executed on the public cloud to satisfy their deadline. For the tasks that have to be dispatched to the public cloud, we choose the minimal-cost strategy to reduce the cost of using public clouds based on the characteristics of tasks such as workload size and data size. Therefore, the AsQ can achieve a total optimization regarding cost and deadline constraints. Many experiments have been conducted to evaluate the performance of the proposed AsQ. The results show that the performance of the proposed AsQ is superior to recent similar algorithms in terms of task waiting time, task execution time and task finish time. The results also show that the proposed algorithm achieves a better QoS satisfaction rate than other similar studies.
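A minimal sketch of the kind of dispatch decision described above: estimate a task's runtime, keep it on the private cloud when the deadline can still be met, and otherwise send it to the cheapest public VM that meets the deadline. The task model, VM catalogue, and cost rule are hypothetical simplifications, not the AsQ algorithm itself.

```python
from dataclasses import dataclass

@dataclass
class Task:
    workload: float      # e.g. millions of instructions
    data_size: float     # GB to transfer if sent to the public cloud
    deadline: float      # seconds

@dataclass
class PublicVM:
    name: str
    speed: float         # workload units per second
    price: float         # $ per second
    bandwidth: float     # GB per second for input transfer

def dispatch(task, private_speed, private_queue_wait, public_vms):
    """Return ('private', None) or ('public', vm) for one task (hypothetical policy)."""
    est_private = private_queue_wait + task.workload / private_speed
    if est_private <= task.deadline:
        return "private", None                   # keep private-cloud utilization high
    feasible = []
    for vm in public_vms:
        runtime = task.data_size / vm.bandwidth + task.workload / vm.speed
        if runtime <= task.deadline:
            feasible.append((runtime * vm.price, vm))
    if not feasible:
        return "private", None                   # nothing meets the deadline; best effort locally
    return "public", min(feasible, key=lambda t: t[0])[1]   # cheapest VM that meets the deadline

vms = [PublicVM("small", 50, 0.01, 0.1), PublicVM("large", 200, 0.05, 0.1)]
print(dispatch(Task(workload=10000, data_size=1.0, deadline=60), 100, 30, vms))
```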

Journal ArticleDOI
TL;DR: This paper describes the attacks against localization and location verification, surveys existing solutions, and implements and studies typical secure localization algorithms of one popular category through simulations.
Abstract: The locations of sensor nodes are very important to many wireless sensor networks (WSNs). When WSNs are deployed in hostile environments, two issues about sensors' locations need to be considered. First, attackers may attack the localization process to make estimated locations incorrect. Second, since sensor nodes may be compromised, the base station (BS) may not trust the locations reported by sensor nodes. Researchers have proposed two techniques, secure localization and location verification, to solve these two issues, respectively. In this paper, we present a survey of current work on both secure localization and location verification. We first describe the attacks against localization and location verification, and then we classify and describe existing solutions. We also implement typical secure localization algorithms of one popular category and study their performance by simulations.

Journal ArticleDOI
TL;DR: Three techniques to speedup fundamental problems in data mining algorithms on the CUDA platform are proposed: scalable thread scheduling scheme for irregular pattern, parallel distributed top-k scheme, and parallel high dimension reduction scheme.
Abstract: Recent development in Graphics Processing Units (GPUs) has enabled inexpensive high performance computing for general-purpose applications. The Compute Unified Device Architecture (CUDA) programming model provides programmers with adequate C-like APIs to better exploit the parallel power of the GPU. Data mining is widely used and has significant applications in various domains. However, current data mining toolkits cannot meet the requirement of applications with large-scale databases in terms of speed. In this paper, we propose three techniques to speed up fundamental problems in data mining algorithms on the CUDA platform: a scalable thread scheduling scheme for irregular patterns, a parallel distributed top-k scheme, and a parallel high dimension reduction scheme. They play a key role in our CUDA-based implementation of three representative data mining algorithms, CU-Apriori, CU-KNN, and CU-K-means. These parallel implementations significantly outperform other state-of-the-art implementations on an HP xw8600 workstation with a Tesla C1060 GPU and a quad-core Intel Xeon CPU. Our results have shown that the GPU + CUDA parallel architecture is feasible and promising for data mining applications.
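The "parallel distributed top-k" idea can be sketched on the CPU: each partition (a thread block in the CUDA setting) computes a local top-k, and the partial results are merged afterwards. The Python sketch below only illustrates this decomposition, not the CUDA kernels.

```python
import heapq
import random

def local_topk(chunk, k):
    """Top-k of one partition; on the GPU this would be computed per thread block."""
    return heapq.nlargest(k, chunk)

def distributed_topk(data, k, n_partitions):
    """Split the data, compute partial top-k lists independently, then merge them."""
    size = (len(data) + n_partitions - 1) // n_partitions
    partials = [local_topk(data[i:i + size], k) for i in range(0, len(data), size)]
    return heapq.nlargest(k, (x for part in partials for x in part))

random.seed(0)
data = [random.random() for _ in range(10_000)]
assert distributed_topk(data, 5, 8) == sorted(data, reverse=True)[:5]
print(distributed_topk(data, 5, 8))
```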

Journal ArticleDOI
TL;DR: An intrusion prevention system, VMFence, is proposed in a virtualization-based cloud computing environment; it is used to monitor network flow and file integrity in real time, and provides network defense and file integrity protection as well.
Abstract: With the development of information technology, cloud computing has become a new direction of grid computing. Cloud computing is user-centric and provides end users with leasing services. Guaranteeing the security of user data needs careful consideration before cloud computing is widely applied in business. Virtualization provides a new approach to solving traditional security problems and can be taken as the underlying infrastructure of cloud computing. In this paper, we propose an intrusion prevention system, VMFence, in a virtualization-based cloud computing environment, which is used to monitor network flow and file integrity in real time, and provides network defense and file integrity protection as well. Due to the dynamicity of the virtual machine, the detection process varies with the state of the virtual machine. The state transition of the virtual machine is described via Deterministic Finite Automata (DFA). We have implemented VMFence on an open-source virtual machine monitor platform, Xen. The experimental results show that our proposed method is effective and that it brings acceptable overhead.
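The DFA-based state modelling can be pictured with a small table-driven sketch; the states, events, and per-state monitoring actions below are hypothetical, since the abstract does not list the actual Xen-level states used by VMFence.

```python
# Hypothetical VM lifecycle DFA: (state, event) -> next state
TRANSITIONS = {
    ("created", "start"):    "running",
    ("running", "pause"):    "paused",
    ("paused",  "resume"):   "running",
    ("running", "migrate"):  "migrating",
    ("migrating", "done"):   "running",
    ("running", "shutdown"): "stopped",
}

# Illustrative detection activity the monitor could perform in each state
MONITORING = {"created": "baseline snapshot",
              "running": "full network-flow and file-integrity checks",
              "paused": "file-integrity checks only",
              "migrating": "suspend detection, re-attach after migration",
              "stopped": "none"}

def step(state, event):
    """Advance the DFA; unknown (state, event) pairs are rejected as invalid transitions."""
    nxt = TRANSITIONS.get((state, event))
    if nxt is None:
        raise ValueError(f"invalid transition: {event!r} in state {state!r}")
    return nxt

state = "created"
for event in ["start", "pause", "resume", "shutdown"]:
    state = step(state, event)
    print(event, "->", state, "|", MONITORING[state])
```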

Journal ArticleDOI
TL;DR: This paper proposes a lightweight and platform-independent ME framework called Mandi, which allows consumers and providers to trade computing resources according to their requirements, and evaluates the performance of the first prototype of “Mandi” in terms of its scalability.
Abstract: The recent development in Cloud computing has enabled the realization of delivering computing as a utility. Many industries such as Amazon and Google have started offering Cloud services on a "pay as you go" basis. These advances have led to the evolution of the market infrastructure in the form of a Market Exchange (ME) that facilitates the trading between consumers and Cloud providers. Such a market environment eases the trading process by aggregating IT services from a variety of sources and allows consumers to easily select them. In this paper, we propose a lightweight and platform-independent ME framework called "Mandi", which allows consumers and providers to trade computing resources according to their requirements. The novelty of Mandi is that it not only gives its users flexibility in terms of the negotiation protocol, but also allows the simultaneous coexistence of multiple trading negotiations. In this paper, we first present the requirements that motivated our design and discuss how these facilitate the trading of compute resources using multiple market models (also called negotiation protocols). Finally, we evaluate the performance of the first prototype of "Mandi" in terms of its scalability.

Journal ArticleDOI
TL;DR: Empirical results show that the proposed WLARA performs better than other related approaches in terms of SLA violations and the provider’s profits, and also show that using the automated SLA negotiation mechanism supports providers in earning higher profits.
Abstract: The number of cloud service users has increased worldwide, and cloud service providers have been deploying and operating data centers to serve the globally distributed cloud users. The resource capacity of a data center is limited, so distributing the load to global data centers will be effective in providing stable services. Another issue in cloud computing is the need for providers to guarantee the service level agreements (SLAs) established with consumers. Whereas various load balancing algorithms have been developed, it is necessary to avoid SLA violations (e.g., service response time) when a cloud provider allocates the load to data centers geographically distributed across the world. Considering load balancing and guaranteed SLA, therefore, this paper proposes an SLA-based cloud computing framework to facilitate resource allocation that takes into account the workload and geographical location of distributed data centers. The contributions of this paper include: (1) the design of a cloud computing framework that includes an automated SLA negotiation mechanism and a workload- and location-aware resource allocation scheme (WLARA), and (2) the implementation of an agent-based cloud testbed of the proposed framework. Using the testbed, experiments were conducted to compare the proposed schemes with related approaches. Empirical results show that the proposed WLARA performs better than other related approaches (e.g., round robin, greedy, and manual allocation) in terms of SLA violations and the provider's profits. We also show that using the automated SLA negotiation mechanism supports providers in earning higher profits.

Journal ArticleDOI
TL;DR: VM deployment is a heavyweight approach for process offloading on smart mobile devices and requires additional resources on the computing host, according to an analysis of the impact of VM deployment and management on the execution time of applications in different experiments.
Abstract: In mobile cloud computing, application offloading is implemented as a software-level solution for augmenting the computing potential of smart mobile devices. The VM is one of the prominent approaches for offloading computational load to cloud server nodes. A challenging aspect of such frameworks is the additional computing resources utilized in the deployment and management of a VM on a smartphone. The deployment of a Virtual Machine (VM) requires computing resources for VM creation and configuration. The management of a VM includes the computing resources utilized in monitoring the VM over its entire lifecycle and in managing physical resources for the VM on the smartphone. The objective of this work is to determine whether VM deployment and management require additional computing resources on the mobile device for application offloading. This paper analyzes the impact of VM deployment and management on the execution time of applications in different experiments. We investigate VM deployment and management for application processing in a simulation environment by using CloudSim, a simulation toolkit that provides an extensible simulation framework to model VM deployment and management for application processing in cloud-computing infrastructure. VM deployment and management in application processing are evaluated by analyzing VM deployment, the execution time of applications, and the total execution time of the simulation. The analysis concludes that VM deployment and management require additional resources on the computing host. Therefore, VM deployment is a heavyweight approach for process offloading on smart mobile devices.

Journal ArticleDOI
TL;DR: The proposed scheme has the following benefits: it complies with all the requirements for multi-server environments; it can withstand all the well-known attacks at the present time; it is equipped with a more secure key agreement procedure; and it is quite efficient in terms of the cost of computation and transmission.
Abstract: Two user authentication schemes for multi-server environments have been proposed by Tsai and Wang et al., respectively. However, there are some flaws existing in both schemes. Therefore, a new scheme for improving these drawbacks is proposed in this paper. The proposed scheme has the following benefits: (1) it complies with all the requirements for multi-server environments; (2) it can withstand all the well-known attacks at the present time; (3) it is equipped with a more secure key agreement procedure; and (4) it is quite efficient in terms of the cost of computation and transmission. In addition, the analysis and comparisons show that the proposed scheme outperforms the other related schemes in various aspects.

Journal ArticleDOI
TL;DR: A lightweight security scheme is proposed for mobile users in the cloud environment to protect the mobile user's identity with dynamic credentials; it offloads the frequently occurring dynamic credential generation operations to a trusted entity to keep the processing burden on the mobile device to a minimum.
Abstract: To improve on the resource limitations of mobile devices, mobile users may utilize cloud computational and storage services. Although the utilization of cloud services improves the processing and storage capacity of mobile devices, the migration of confidential information to an untrusted cloud raises security and privacy issues. Considering the security of mobile cloud computing subscribers' information, a mechanism to authenticate legitimate mobile users in the cloud environment is sought. Usually, mobile users are authenticated in the cloud environment through digital credential methods, such as passwords. Once a user's credential information is stolen, the adversary can use the hacked information to impersonate the mobile user later on. The alarming aspect is that the mobile user is unaware of the adversary's malicious activities. In this paper, a lightweight security scheme is proposed for mobile users in the cloud environment to protect the mobile user's identity with dynamic credentials. The proposed scheme offloads the frequently occurring dynamic credential generation operations to a trusted entity to keep the processing burden on the mobile device to a minimum. To enhance the security and reliability of the scheme, the credential information is updated frequently on the basis of mobile-cloud packet exchange. Furthermore, the proposed scheme is compared with an existing scheme on the basis of performance metrics, i.e., turnaround time and energy consumption. The experimental results for the proposed scheme showed a significant improvement in turnaround time and energy consumption compared to the existing scheme.

Journal ArticleDOI
TL;DR: Theoretical as well as experimental results conclusively demonstrate that the dynamic adaptive fault tolerance strategy DAFT has high potential, as it provides efficient fault tolerance enhancements, significant cloud serviceability improvement, and high SLO satisfaction.
Abstract: Failures are normal rather than exceptional in cloud computing environments, and the issue of high fault tolerance is one of the major obstacles to opening up a new era of high-serviceability cloud computing, as fault tolerance plays a key role in ensuring cloud serviceability. Fault-tolerant service is an essential part of Service Level Objectives (SLOs) in clouds. To achieve a high level of cloud serviceability and to meet high-level cloud SLOs, a foolproof fault tolerance strategy is needed. In this paper, the definitions of fault, error, and failure in a cloud are given, and the principles for high fault tolerance objectives are systematically analyzed by referring to fault tolerance theories suitable for large-scale distributed computing environments. Based on the principles and semantics of cloud fault tolerance, a dynamic adaptive fault tolerance strategy, DAFT, is put forward. It includes: (i) analyzing the mathematical relationship between different failure rates and two different fault tolerance strategies, namely a checkpointing fault tolerance strategy and a data replication fault tolerance strategy; (ii) building a dynamic adaptive checkpointing fault tolerance model and a dynamic adaptive replication fault tolerance model, and combining the two fault tolerance models to maximize serviceability and meet the SLOs; and (iii) evaluating the dynamic adaptive fault tolerance strategy under various conditions in large-scale cloud data centers, considering different system-centric parameters such as fault tolerance degree, fault tolerance overhead, and response time. Theoretical as well as experimental results conclusively demonstrate that the dynamic adaptive fault tolerance strategy DAFT has high potential, as it provides efficient fault tolerance enhancements, significant cloud serviceability improvement, and high SLO satisfaction. It efficiently and effectively achieves a trade-off among fault tolerance objectives in cloud computing environments.
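A simplified sketch of the kind of decision an adaptive strategy like DAFT makes: given a node's mean time between failures and the cost of a checkpoint, compare the expected overhead of periodic checkpointing against running a replica and pick the cheaper strategy. The formulas are generic back-of-the-envelope estimates (Young's checkpoint interval and a simple duplication cost), not the paper's models.

```python
import math

def checkpoint_overhead(runtime_s, checkpoint_cost_s, mtbf_s):
    """Approximate relative overhead of periodic checkpointing using Young's interval."""
    interval = math.sqrt(2.0 * checkpoint_cost_s * mtbf_s)    # rough optimal interval
    ckpt_time = runtime_s / interval * checkpoint_cost_s       # time spent writing checkpoints
    rework = runtime_s / mtbf_s * (interval / 2.0)             # expected recomputation after failures
    return (ckpt_time + rework) / runtime_s

def replication_overhead(replicas=2):
    """Running n replicas costs (n - 1) extra copies of the work."""
    return float(replicas - 1)

def choose_strategy(runtime_s, checkpoint_cost_s, mtbf_s):
    c = checkpoint_overhead(runtime_s, checkpoint_cost_s, mtbf_s)
    r = replication_overhead()
    return ("checkpointing", c) if c < r else ("replication", r)

# Long job, cheap checkpoints, moderately reliable nodes -> checkpointing wins
print(choose_strategy(runtime_s=36_000, checkpoint_cost_s=30, mtbf_s=86_400))
# Very failure-prone nodes with expensive checkpoints -> replication becomes competitive
print(choose_strategy(runtime_s=36_000, checkpoint_cost_s=600, mtbf_s=1_200))
```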

Journal ArticleDOI
TL;DR: A low-latency, high-throughput, and fault-tolerant routing algorithm named Look-Ahead-Fault-Tolerant (LAFT) is proposed that reduces the communication latency and enhances the system performance while maintaining a reasonable hardware complexity and ensuring fault tolerance.
Abstract: Despite the higher scalability and parallelism integration offered by Network-on-Chip (NoC) over traditional shared-bus based systems, it is still not an ideal solution for future large-scale Systems-on-Chip (SoCs), due to limitations such as high power consumption, high-cost communication, and low throughput. Recently, extending 2D-NoC to the third dimension (3D-NoC) has been proposed to deal with these problems; however, 3D-NoC systems are exposed to a variety of manufacturing and design factors making them vulnerable to different faults that cause corrupted message transfer or even catastrophic system failures. Therefore, a 3D-NoC system should be fault tolerant to transient malfunctions or permanent physical damage. In this paper, we propose a low-latency, high-throughput, and fault-tolerant routing algorithm named Look-Ahead-Fault-Tolerant (LAFT). LAFT reduces the communication latency and enhances the system performance while maintaining a reasonable hardware complexity and ensuring fault tolerance. We implemented the proposed algorithm on a real 3D-NoC architecture (3D-OASIS-NoC), prototyped it on FPGA, and evaluated its performance over various applications. Evaluation results show that the proposed algorithm efficiently reduces the communication latency, by an average of 38 % and 16 % when compared to the conventional XYZ and our earlier Look-Ahead-XYZ routing algorithms, respectively, and enhances the throughput by up to 46 % and 29 %.
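At a very high level, the routing decision can be sketched as: at each hop, among the minimal-path output ports, prefer one whose outgoing link is not marked faulty. The Python sketch below is only an illustrative software analogue with a hypothetical fault map, not the LAFT hardware algorithm or its look-ahead pipeline.

```python
def candidate_ports(cur, dst):
    """Minimal-path output directions in a 3D mesh from cur = (x, y, z) toward dst."""
    ports = []
    for axis, name in enumerate(("x", "y", "z")):
        if dst[axis] > cur[axis]:
            ports.append((name, +1))
        elif dst[axis] < cur[axis]:
            ports.append((name, -1))
    return ports

def next_hop(cur, dst, faulty_links):
    """Pick a minimal-path neighbor avoiding links marked faulty; None if all are blocked."""
    for name, step in candidate_ports(cur, dst):
        axis = "xyz".index(name)
        nxt = list(cur)
        nxt[axis] += step
        if (cur, tuple(nxt)) not in faulty_links:
            return tuple(nxt)
    return None   # all minimal-path links faulty; a real router would escalate or misroute

faulty = {((0, 0, 0), (1, 0, 0))}                 # hypothetical broken +x link
print(next_hop((0, 0, 0), (2, 1, 0), faulty))     # routes via +y instead: (0, 1, 0)
```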

Journal ArticleDOI
TL;DR: Investigation of how the mathematical languages used to describe and to observe automatic computations influence the accuracy of the obtained results focuses on single and multi-tape Turing machines.
Abstract: The paper investigates how the mathematical languages used to describe and to observe automatic computations influence the accuracy of the obtained results. In particular, we focus our attention on single and multi-tape Turing machines, which are described and observed through the lens of a new mathematical language, which is strongly based on three methodological ideas borrowed from physics and applied to mathematics, namely: the distinction between the object (we speak here about a mathematical object) of an observation and the instrument used for this observation; interrelations holding between the object and the tool used for the observation; the accuracy of the observation determined by the tool. Results of the observation executed by the traditional and new languages are compared and discussed.

Journal ArticleDOI
TL;DR: This paper describes a new parallel Frequent Itemset Mining algorithm called Frontier Expansion, an improved data-parallel algorithm derived from the Equivalent Class Clustering (Eclat) method, in which a partial breadth-first search is utilized to exploit maximum parallelism while being constrained by the available memory capacity.
Abstract: In this paper we describe a new parallel Frequent Itemset Mining algorithm called "Frontier Expansion." This implementation is optimized to achieve high performance on a heterogeneous platform consisting of a shared memory multiprocessor and multiple Graphics Processing Unit (GPU) coprocessors. Frontier Expansion is an improved data-parallel algorithm derived from the Equivalent Class Clustering (Eclat) method, in which a partial breadth-first search is utilized to exploit maximum parallelism while being constrained by the available memory capacity. In our approach, the vertical transaction lists are represented using a "bitset" representation and operated using wide bitwise operations across multiple threads on a GPU. We evaluate our approach using four NVIDIA Tesla GPUs and observed a 6–30× speedup relative to state-of-the-art sequential Eclat and FPGrowth implementations executed on a multicore CPU.
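The core Eclat operation, intersecting vertical transaction lists, can be illustrated with Python integers used as bitsets, where each bit marks a transaction containing an item; the GPU version performs the same bitwise ANDs with wide words across many threads. The toy dataset and single expansion level below are for illustration only.

```python
from itertools import combinations

transactions = [
    {"a", "b", "c"},
    {"a", "c"},
    {"b", "c", "d"},
    {"a", "b", "c", "d"},
]

# Vertical representation: item -> bitset of the transactions containing it
tidsets = {}
for tid, t in enumerate(transactions):
    for item in t:
        tidsets[item] = tidsets.get(item, 0) | (1 << tid)

def support(bits):
    return bin(bits).count("1")          # population count of the bitset

min_support = 2
frequent = {frozenset([i]): b for i, b in tidsets.items() if support(b) >= min_support}

# One level of candidate expansion: intersect tidsets of frequent 1-itemsets
for (i1, b1), (i2, b2) in combinations(sorted(frequent.items(), key=lambda kv: sorted(kv[0])), 2):
    inter = b1 & b2                      # the bitwise AND done with wide words on the GPU
    if support(inter) >= min_support:
        frequent[i1 | i2] = inter

for itemset, bits in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), "support =", support(bits))
```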

Journal ArticleDOI
TL;DR: An improved partitioning algorithm that improves load balancing and reduces memory consumption is proposed, via an improved sampling algorithm and partitioner; experiments show that the proposed algorithm is faster, more memory efficient, and more accurate than the current implementation.
Abstract: In the era of Big Data, huge amounts of structured and unstructured data are being produced daily by a myriad of ubiquitous sources. Big Data is difficult to work with and requires massively parallel software running on a large number of computers. MapReduce is a recent programming model that simplifies writing distributed applications that handle Big Data. In order for MapReduce to work, it has to divide the workload among the computers in a network. Consequently, the performance of MapReduce strongly depends on how evenly it distributes this workload. This can be a challenge, especially in the presence of data skew. In MapReduce, workload distribution depends on the algorithm that partitions the data. One way to avoid problems inherent to data skew is to use data sampling. How evenly the partitioner distributes the data depends on how large and representative the sample is and on how well the samples are analyzed by the partitioning mechanism. This paper proposes an improved partitioning algorithm that improves load balancing and memory consumption. This is done via an improved sampling algorithm and partitioner. To evaluate the proposed algorithm, its performance was compared against a state-of-the-art partitioning mechanism employed by TeraSort. Experiments show that the proposed algorithm is faster, more memory efficient, and more accurate than the current implementation.
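The sampling-based partitioning idea (as used by TeraSort's range partitioner) can be sketched as follows: draw a sample of keys, choose evenly spaced split points from the sorted sample, and route each key to the reducer whose range contains it. This is a generic illustration with synthetic skewed keys, not the paper's improved sampler or partitioner.

```python
import bisect
import random
from collections import Counter

def build_splits(keys, n_reducers, sample_size):
    """Sample the keys and choose n_reducers - 1 split points at even sample quantiles."""
    sample = sorted(random.sample(keys, min(sample_size, len(keys))))
    return [sample[(i * len(sample)) // n_reducers] for i in range(1, n_reducers)]

def partition(key, splits):
    """Index of the reducer responsible for key (binary search over the split points)."""
    return bisect.bisect_right(splits, key)

random.seed(42)
# Skewed synthetic keys: many small values, a long tail of large ones
keys = [int(random.paretovariate(1.5) * 10) for _ in range(100_000)]

splits = build_splits(keys, n_reducers=4, sample_size=1_000)
load = Counter(partition(k, splits) for k in keys)
print("splits:", splits)
print("records per reducer:", dict(sorted(load.items())))
```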

Journal ArticleDOI
TL;DR: Experimental results show that different CBR methods achieve the best results under different parameter settings, and that there is no single best method for software effort estimation among the six different CBR methods.
Abstract: Since software development has recently become an essential investment for many organizations, both the software industry and academic communities are increasingly concerned with reliable and accurate estimation of the software development effort. This study puts forward six widely used case-based reasoning (CBR) methods with optimized weights derived from the particle swarm optimization (PSO) method to estimate the software effort. Meanwhile, four combination methods are adopted to assemble the results of the independent CBR methods. The experiments are carried out using two datasets of software projects, the Desharnais dataset and the Miyazaki dataset. Experimental results show that different CBR methods achieve the best results under different parameter settings, and there is no single best method for software effort estimation among the six different CBR methods. Moreover, the combination methods proposed in this study outperform the independent methods, and the weighted mean combination (WMC) method achieves the best result.
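A minimal sketch of case-based effort estimation with the kind of feature weights PSO would tune: retrieve the k most similar past projects under a weighted Euclidean distance and average their efforts. The project features, weights, and k below are made up, and the PSO search itself is omitted.

```python
import math

# Hypothetical historical projects: (feature vector, actual effort in person-hours)
cases = [
    ([3.0, 120.0, 2.0], 4200.0),
    ([1.0,  40.0, 1.0],  900.0),
    ([5.0, 300.0, 3.0], 9800.0),
    ([2.0,  80.0, 2.0], 2500.0),
]

def weighted_distance(a, b, w):
    """Weighted Euclidean distance; the weights are what PSO would optimize."""
    return math.sqrt(sum(wi * (ai - bi) ** 2 for ai, bi, wi in zip(a, b, w)))

def cbr_estimate(target, cases, weights, k=2):
    """Mean effort of the k most similar past projects under the weighted distance."""
    ranked = sorted(cases, key=lambda c: weighted_distance(target, c[0], weights))
    return sum(effort for _, effort in ranked[:k]) / k

# Weights a PSO run might have produced (purely illustrative values)
weights = [0.5, 0.002, 0.3]
print(cbr_estimate([2.5, 100.0, 2.0], cases, weights))
```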

Journal ArticleDOI
TL;DR: A two-stage data hiding method with high capacity and good visual quality based on image interpolation and histogram modification techniques, which achieves a stego-image PSNR that is 43 % better on average than that of past key studies.
Abstract: Information hiding is an important research issue in digital life. In this paper, we propose a two-stage data hiding method with high capacity and good visual quality based on image interpolation and histogram modification techniques. In the first stage, we generate a high-quality cover image using the developed enhanced neighbor mean interpolation (ENMI) and then take the difference values between the input and cover pixels as a carrier to embed the secret data. In this stage, our proposed scheme raises the image quality considerably due to the ENMI method. In the second stage, a histogram modification method is applied to the difference image to further increase the embedding capacity and preserve the image quality without distortion. Experimental results indicate that the proposed method achieves a stego-image PSNR that is 43 % better on average than that of past key studies.
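The first-stage idea, interpolating a cover image and hiding bits in the interpolation differences, can be illustrated on a one-dimensional signal. The plain neighbor-mean interpolation and the embedding rule below are deliberately simplified stand-ins, not the enhanced NMI or the histogram-modification stage from the paper.

```python
def interpolate(samples):
    """Upscale by 2 with neighbor-mean interpolation: insert the mean between neighbors."""
    out = []
    for a, b in zip(samples, samples[1:]):
        out += [a, (a + b) // 2]
    out.append(samples[-1])
    return out

def embed(cover, original_indices, bits):
    """Hide bits in interpolated positions by adding the bit value to the pixel (toy rule)."""
    stego = list(cover)
    bit_iter = iter(bits)
    for i in range(len(cover)):
        if i not in original_indices:            # only interpolated pixels carry payload
            try:
                stego[i] += next(bit_iter)
            except StopIteration:
                break
    return stego

samples = [100, 110, 130, 90]
cover = interpolate(samples)                     # [100, 105, 110, 120, 130, 110, 90]
original = {0, 2, 4, 6}                          # positions holding the original samples
stego = embed(cover, original, bits=[1, 0, 1])
print(cover)
print(stego)                                     # differences at interpolated positions encode the bits
```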

Journal ArticleDOI
TL;DR: A novel methodology for evaluating association rules on graphics processing units (GPUs) and the use of GPUs and the compute unified device architecture (CUDA) programming model enables the rules mined to be evaluated in a massively parallel way, thus reducing the computational time required.
Abstract: Association rule mining is a well-known data mining task, but it requires much computational time and memory when mining large scale data sets of high dimensionality. This is mainly due to the evaluation process, where the antecedent and consequent in each rule mined are evaluated for each record. This paper presents a novel methodology for evaluating association rules on graphics processing units (GPUs). The evaluation model may be applied to any association rule mining algorithm. The use of GPUs and the compute unified device architecture (CUDA) programming model enables the rules mined to be evaluated in a massively parallel way, thus reducing the computational time required. This proposal takes advantage of concurrent kernel execution and asynchronous data transfers, which improves the efficiency of the model. In an experimental study, we evaluate interpreter performance and compare the execution time of the proposed model with regard to single-threaded, multi-threaded, and graphics processing unit implementations. The results obtained show an interpreter performance above 67 billion giga operations per second, and speed-up by a factor of up to 454 over the single-threaded CPU model, when using two NVIDIA 480 GTX GPUs. The evaluation model demonstrates its efficiency and scalability according to the problem complexity, number of instances, rules, and GPU devices.
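The per-record evaluation that the GPU kernels parallelize amounts to counting, for each rule, how many records satisfy the antecedent and how many also satisfy the consequent, and deriving support and confidence from those counts. The sketch below is a sequential reference over a toy dataset, not the CUDA evaluation model described in the paper.

```python
records = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"milk", "butter"},
    {"bread", "milk", "butter"},
]

rules = [
    (frozenset({"bread"}), frozenset({"milk"})),
    (frozenset({"milk", "butter"}), frozenset({"bread"})),
]

def evaluate(rule, records):
    """Support and confidence of an antecedent -> consequent rule."""
    antecedent, consequent = rule
    # The two counts below are the per-record work parallelized on the GPU
    covered = sum(1 for r in records if antecedent <= r)
    both = sum(1 for r in records if (antecedent | consequent) <= r)
    support = both / len(records)
    confidence = both / covered if covered else 0.0
    return support, confidence

for rule in rules:
    sup, conf = evaluate(rule, records)
    print(sorted(rule[0]), "->", sorted(rule[1]), f"support={sup:.2f} confidence={conf:.2f}")
```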

Journal ArticleDOI
TL;DR: An energy-efficient location-based service (LBS) and mobile cloud convergence framework based on the end-to-end architecture between server and a smartphone that is independent of the internal architecture of current 3G cellular networks is proposed.
Abstract: The increasing use of the wireless Internet and smartphones has accelerated the need for pervasive and ubiquitous computing (PUC). Smartphones stimulate the growth of location-based services and mobile cloud computing. However, smartphone mobile computing poses challenges because of the limited battery capacity, the constraints of wireless networks, and the limitations of the device. A fundamental challenge arises as a result of the power-inefficiency of location awareness. Location awareness is one of the smartphone's killer applications; it runs steadily and consumes a large amount of power. Another fundamental challenge stems from the fact that smartphone mobile devices are generally less powerful than other devices. Therefore, it is necessary to offload the computation-intensive part by careful partitioning of application functions across a cloud. In this paper, we propose an energy-efficient location-based service (LBS) and mobile cloud convergence framework. This framework reduces the power dissipation of LBSs by substituting less-power-intensive sensors for power-intensive sensors when the smartphone is in a static state, for example, when lying idle on a table in an office. The substitution is controlled by a finite state machine with a user-movement detection strategy. We also propose a seamless connection handover mechanism between different access networks. For convenient on-site establishment, our approach is based on an end-to-end architecture between the server and a smartphone that is independent of the internal architecture of current 3G cellular networks.

Journal ArticleDOI
TL;DR: A dynamic scheduling scheme for the selection of the device between the CPU and the GPU to execute the application based on the estimated-execution-time information is proposed, resulting in reduced execution time and enhanced energy efficiency of heterogeneous computing systems.
Abstract: Computing systems should be designed to exploit parallelism in order to improve performance. In general, a GPU (Graphics Processing Unit) can provide more parallelism than a CPU (Central Processing Unit), resulting in the wide usage of heterogeneous computing systems that utilize both the CPU and the GPU together. In the heterogeneous computing systems, the efficiency of the scheduling scheme, which selects the device to execute the application between the CPU and the GPU, is one of the most critical factors in determining the performance. This paper proposes a dynamic scheduling scheme for the selection of the device between the CPU and the GPU to execute the application based on the estimated-execution-time information. The proposed scheduling scheme enables the selection between the CPU and the GPU to minimize the completion time, resulting in a better system performance, even though it requires the training period to collect the execution history. According to our simulations, the proposed estimated-execution-time scheduling can improve the utilization of the CPU and the GPU compared to existing scheduling schemes, resulting in reduced execution time and enhanced energy efficiency of heterogeneous computing systems.
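A small sketch of estimated-execution-time device selection: keep a running average of observed runtimes per (application, device) pair and dispatch each job to whichever of the CPU and GPU is expected to finish it first given its current queue. The bookkeeping and numbers are illustrative, not the paper's scheduler.

```python
from collections import defaultdict

class EETScheduler:
    def __init__(self):
        self.history = defaultdict(list)             # (app, device) -> observed runtimes
        self.busy_until = {"cpu": 0.0, "gpu": 0.0}   # when each device becomes free

    def estimate(self, app, device, default=10.0):
        runs = self.history[(app, device)]
        return sum(runs) / len(runs) if runs else default   # training period falls back to a default

    def dispatch(self, app, now):
        """Pick the device with the earliest expected completion time."""
        best = min(("cpu", "gpu"),
                   key=lambda d: max(self.busy_until[d], now) + self.estimate(app, d))
        start = max(self.busy_until[best], now)
        finish = start + self.estimate(app, best)
        self.busy_until[best] = finish
        return best, finish

    def record(self, app, device, runtime):
        self.history[(app, device)].append(runtime)

sched = EETScheduler()
sched.record("matmul", "gpu", 2.0)     # execution history collected during the training period
sched.record("matmul", "cpu", 9.0)
print(sched.dispatch("matmul", now=0.0))   # ('gpu', 2.0)
print(sched.dispatch("matmul", now=0.0))   # GPU is busy until 2.0 but still finishes first: ('gpu', 4.0)
```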