
Showing papers in "Cluster Computing in 2015"


Journal ArticleDOI
TL;DR: A hybrid approach called FUGE, based on fuzzy theory and a genetic algorithm (GA), is presented that aims to perform optimal load balancing with respect to execution time and cost; experiments show the efficiency of the FUGE approach in terms of execution time, execution cost, and average degree of imbalance.
Abstract: Job scheduling is one of the most important research problems in distributed systems, particularly in cloud computing. The dynamic and heterogeneous nature of resources in such distributed systems makes optimum job scheduling a non-trivial task. Maximal resource utilization in cloud computing demands an algorithm that allocates resources to jobs with optimal execution time and cost. The critical issue for job scheduling is assigning jobs to the most suitable resources, considering user preferences and requirements. In this paper, we present a hybrid approach called FUGE, based on fuzzy theory and a genetic algorithm (GA), that aims to perform optimal load balancing with respect to execution time and cost. We modify the standard genetic algorithm (SGA) and use fuzzy theory to devise a fuzzy-based steady-state GA that improves SGA performance in terms of makespan. In detail, the FUGE algorithm assigns jobs to resources by considering virtual machine (VM) processing speed, VM memory, VM bandwidth, and job lengths. We mathematically prove that our optimization problem is convex and satisfies well-known analytical conditions (specifically, the Karush-Kuhn-Tucker conditions). We compare the performance of our approach to several other cloud scheduling models. The results of the experiments show the efficiency of the FUGE approach in terms of execution time, execution cost, and average degree of imbalance.
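The abstract does not give implementation details, but the steady-state GA idea it describes can be sketched as follows. This is an illustrative toy, not the authors' FUGE code: the fuzzy fitness is replaced here by a crisp makespan objective, and the job lengths and VM speeds are invented for the example.

```python
import random

def makespan(assignment, job_lengths, vm_speeds):
    """Completion time of the busiest VM under a job-to-VM assignment."""
    loads = [0.0] * len(vm_speeds)
    for job, vm in enumerate(assignment):
        loads[vm] += job_lengths[job] / vm_speeds[vm]
    return max(loads)

def steady_state_ga(job_lengths, vm_speeds, pop_size=20, generations=300, seed=0):
    """Steady-state GA: each step breeds one child, which replaces the worst
    individual only if the child is fitter (i.e., has a lower makespan)."""
    rng = random.Random(seed)
    n_jobs, n_vms = len(job_lengths), len(vm_speeds)
    fitness = lambda ind: makespan(ind, job_lengths, vm_speeds)
    pop = [[rng.randrange(n_vms) for _ in range(n_jobs)] for _ in range(pop_size)]
    for _ in range(generations):
        p1, p2 = rng.sample(pop, 2)                 # parent selection
        cut = rng.randrange(1, n_jobs)
        child = p1[:cut] + p2[cut:]                 # one-point crossover
        if rng.random() < 0.2:                      # point mutation
            child[rng.randrange(n_jobs)] = rng.randrange(n_vms)
        worst = max(pop, key=fitness)
        if fitness(child) < fitness(worst):         # steady-state replacement
            pop[pop.index(worst)] = child
    return min(pop, key=fitness)

# Five jobs (lengths) scheduled onto two VMs (relative speeds): both hypothetical.
best = steady_state_ga([40, 10, 30, 20, 50], [1.0, 2.0])
```

In FUGE the fitness would additionally pass VM memory, bandwidth, and cost through fuzzy membership functions; the steady-state replacement scheme is the part shown here.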

132 citations


Journal ArticleDOI
TL;DR: In this study, the factors that affect cloud adoption by higher education institutions were identified and tested using SmartPLS, a statistical analysis tool for structural equation modeling, and three factors were found significant in this context.
Abstract: Academic study of cloud computing is an emerging research field in Saudi Arabia. Saudi Arabia represents the largest economy in the Arab Gulf region, which makes it a potential market for cloud computing technologies. This cross-sectional, exploratory, empirical research is based on the technology-organization-environment (TOE) framework and targets higher education institutions. In this study, the factors that affect cloud adoption by higher education institutions were identified and tested using SmartPLS, a statistical analysis tool for structural equation modeling. Three factors were found significant in this context: relative advantage, complexity, and data concern. The model explained 47.9 % of the total adoption variance. The findings offer education institutions and cloud computing service providers a better understanding of the factors affecting the adoption of cloud computing.

124 citations


Journal ArticleDOI
TL;DR: This paper designs a novel task scheduling scheme based on reinforcement learning and queuing theory to optimize task scheduling under resource constraints, and state aggregation techniques are employed to accelerate the learning progress.
Abstract: Task scheduling is a necessary prerequisite for performance optimization and resource management in cloud computing systems. Focusing on accurately modeling the cloud computing environment and on efficient task scheduling under resource constraints, we introduce a fine-grained cloud computing system model and an optimized task scheduling scheme in this paper. The system model is composed of clearly defined, separate submodels, including a task schedule submodel, a task execution submodel, and a task transmission submodel, so that they can be accurately analyzed in the order in which user requests are processed. Moreover, the submodels are scalable enough to capture the flexibility of the cloud computing paradigm. By analyzing the submodels, iterating the analysis until sufficient accuracy is obtained, we design a novel task scheduling scheme based on reinforcement learning and queuing theory to optimize task scheduling under resource constraints; state aggregation techniques are employed to accelerate the learning progress. Our results, on the one hand, demonstrate the efficiency of the task scheduling scheme and, on the other hand, reveal the relationship between the arrival rate, the service rate, the number of VMs, and the buffer size.
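The abstract does not specify the learning algorithm beyond "reinforcement learning with state aggregation", but the general pattern can be sketched with tabular Q-learning over aggregated queue-length states. Everything concrete below (two VMs, the service rates, the bucketing width, the waiting-time cost) is an assumption for illustration, not the authors' model.

```python
import random

def bucket(queue_len, width=3):
    """State aggregation: collapse raw queue lengths into a few coarse buckets,
    shrinking the state space the learner must explore."""
    return min(queue_len // width, 3)

def schedule_q_learning(episodes=300, seed=1):
    rng = random.Random(seed)
    speeds = [1, 3]                          # tasks each VM finishes per step (assumed)
    n_vms = len(speeds)
    Q = {}                                   # (aggregated state, action) -> expected cost
    alpha, gamma, eps = 0.3, 0.9, 0.1
    for _ in range(episodes):
        queues = [0] * n_vms
        for _ in range(20):                  # 20 task arrivals per episode
            state = tuple(bucket(q) for q in queues)
            if rng.random() < eps:           # epsilon-greedy exploration
                action = rng.randrange(n_vms)
            else:                            # exploit: pick the lowest expected cost
                action = min(range(n_vms), key=lambda v: Q.get((state, v), 0.0))
            cost = queues[action] / speeds[action]   # waiting time of the new task
            queues[action] += 1
            queues = [max(q - s, 0) for q, s in zip(queues, speeds)]  # service step
            nxt = tuple(bucket(q) for q in queues)
            best_next = min(Q.get((nxt, v), 0.0) for v in range(n_vms))
            old = Q.get((state, action), 0.0)
            Q[(state, action)] = old + alpha * (cost + gamma * best_next - old)
    return Q

Q = schedule_q_learning()
```

Because states are bucketed, many raw queue configurations share one table entry, which is exactly how aggregation accelerates learning at the price of a coarser policy.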

103 citations


Journal ArticleDOI
TL;DR: A comprehensive survey of the state of the art in large scale graph processing platforms, together with an extensive experimental study of five popular systems in this domain: GraphChi, Apache Giraph, GPS, GraphLab, and GraphX.
Abstract: Graph is a fundamental data structure that captures relationships between different data entities. In practice, graphs are widely used for modeling complicated data in different application domains such as social networks, protein networks, transportation networks, bibliographical networks, knowledge bases and many more. Currently, graphs with millions and billions of nodes and edges have become very common. In principle, graph analytics is an important big data discovery technique. Therefore, with the increasing abundance of large graphs, designing scalable systems for processing and analyzing large scale graphs has become one of the most timely problems facing the big data research community. In general, scalable processing of big graphs is a challenging task due to their size and the inherent irregular structure of graph computations. Thus, in recent years, we have witnessed an unprecedented interest in building big graph processing systems that attempt to tackle these challenges. In this article, we provide a comprehensive survey of the state of the art in large scale graph processing platforms. In addition, we present an extensive experimental study of five popular systems in this domain, namely GraphChi, Apache Giraph, GPS, GraphLab and GraphX. In particular, we report and analyze the performance characteristics of these systems using five common graph processing algorithms and seven large graph datasets. Finally, we identify a set of the current open research challenges and discuss some promising directions for future research in the domain of large scale graph processing.

97 citations


Journal ArticleDOI
TL;DR: The results showed that telemedicine service for diabetes mellitus management should provide facilitating infrastructure such as continuous assistance service and service guideline education, and that the capacity of telemedicine service providers is more important for telemedicine success than the competence of the individuals receiving telemedicine care.
Abstract: Telemedicine service is an effective intervention for blood glucose management and for reducing the progression of diabetic complications. While telemedicine service for the enhanced management of diabetes has been known for its usefulness, there is little understanding of which factors should be considered when diabetic patients accept telemedicine. Thus, this study aimed to examine the factors that influence the acceptance of telemedicine service for the enhanced management of diabetes mellitus based on the Unified Theory of Acceptance and Use of Technology (UTAUT) model. Data were collected from a paper-based survey of 116 diabetic patients who were outpatients in six different university hospitals. This study used partial least squares regression to determine the causal relationships between the five variables. Demographic variables, such as age and gender, were analyzed as moderating variables for behavioral intention to use. The results indicate that facilitating factors affect the behavioral intention to use telemedicine service through performance expectancy (p < 0.05) and, likewise, through effort expectancy (p < 0.05). This study also found that performance expectancy, effort expectancy, and social influence have positive effects on behavioral intention to use telemedicine service, as predicted by the UTAUT model (p < 0.05). Finally, gender and age were found to moderate the relationship between performance expectancy and behavioral intention to use telemedicine service, as predicted by the UTAUT model. Our results showed that telemedicine service for diabetes mellitus management should provide facilitating infrastructure such as continuous assistance service and service guideline education.
Therefore, the capacity of telemedicine service providers is more important for telemedicine success than the competence of the individuals receiving telemedicine care. In addition, performance expectancy, effort expectancy, and social influence are influencing factors for the acceptance of telemedicine service for diabetes management. Accordingly, to raise service usage, it is important that telemedicine service providers offer varied forms of support.

92 citations


Journal ArticleDOI
TL;DR: The dynamic resources provisioning and monitoring (DRPM) system is presented: a multi-agent system that manages the cloud provider's resources while taking into account the customers' quality-of-service requirements as determined by the service-level agreement (SLA).
Abstract: The cloud computing paradigm provides a shared pool of resources and services with different models delivered to the customers through the Internet via an on-demand dynamically-scalable form charged using a pay-per-use model. The main problem we tackle in this paper is to optimize the resource provisioning task by shortening the completion time for the customers' tasks while minimizing the associated cost. This study presents the dynamic resources provisioning and monitoring (DRPM) system, a multi-agent system to manage the cloud provider's resources while taking into account the customers' quality of service requirements as determined by the service-level agreement (SLA). Moreover, DRPM includes a new virtual machine selection algorithm called the host fault detection algorithm. The proposed DRPM system is evaluated using the CloudSim tool. The results show that using the DRPM system increases resource utilization and decreases power consumption while avoiding SLA violations.

84 citations


Journal ArticleDOI
TL;DR: An efficient distributed frequent itemset mining algorithm (DFIMA) which can significantly reduce the amount of candidate itemsets by applying a matrix-based pruning approach is proposed.
Abstract: Frequent itemset mining is an essential step in the process of association rule mining. Conventional approaches for mining frequent itemsets in the big data era encounter significant challenges when computing power and memory space are limited. This paper proposes an efficient distributed frequent itemset mining algorithm (DFIMA) which can significantly reduce the number of candidate itemsets by applying a matrix-based pruning approach. The proposed algorithm has been implemented using Spark to further improve the efficiency of iterative computation. Experimental results on standard benchmark datasets, comparing the proposed algorithm with the existing parallel FP-growth algorithm, show that DFIMA has better efficiency and scalability. In addition, a case study has been carried out to validate the feasibility of DFIMA.
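The abstract does not detail the matrix-based pruning, but the general idea, representing each item as a boolean vector over transactions so that the support of a candidate pair is a single vectorized AND, and skipping pairs whose items are not individually frequent, can be sketched as follows. This is an illustrative single-machine sketch, not the Spark-based DFIMA implementation.

```python
from itertools import combinations

def frequent_two_itemsets(transactions, min_support):
    """Matrix-based candidate pruning for frequent 2-itemsets (illustrative).

    Builds a boolean item-by-transaction matrix, prunes items that are not
    individually frequent, and counts each surviving pair's support with a
    single AND over the two item rows."""
    items = sorted({i for t in transactions for i in t})
    matrix = {i: [i in t for t in transactions] for i in items}
    support = {i: sum(matrix[i]) for i in items}
    frequent_items = [i for i in items if support[i] >= min_support]  # prune
    result = {}
    for a, b in combinations(frequent_items, 2):
        count = sum(x and y for x, y in zip(matrix[a], matrix[b]))
        if count >= min_support:
            result[(a, b)] = count
    return result

pairs = frequent_two_itemsets(
    [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}],
    min_support=3,
)
```

In a distributed setting the item rows would be partitioned across workers; the pruning logic itself is unchanged.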

76 citations


Journal ArticleDOI
TL;DR: Context motion tracking provides an emergency situation monitoring service, with alert and symptom levels, when symptoms are detected through measurement and analysis, and it provides information necessary for chronic disease management by analyzing life habits.
Abstract: Nowadays, great attention is paid to studies on convergent health and medical care combining IT and biotechnology for chronic disease patients, whose numbers are growing due to the westernization of dietary habits, increased stress, decreased physical activity, and other factors. In reality, full recovery from a chronic disease is difficult to achieve because its causes are diverse and complex, so continuous management is needed rather than an approach aimed only at treating the disease. Countermeasures are urgent: the lengthening of life expectancy in an aging society increases the prevalence of chronic disease, and the resulting medical expense becomes a heavy socioeconomic burden. Companies are accordingly running pilot projects for chronic-disease health management alongside nationwide health management programs. In this study, we propose an emergency situation monitoring service using context motion tracking for chronic disease patients. The proposed service diagnoses the patient's current status based on collected contextual information and provides information necessary for chronic disease management by analyzing life habits. Bio-status recognition can provide the proper service through the extraction of contextual data relevant to chronic disease patients. Context motion tracking provides the emergency situation monitoring service, with alert and symptom levels, when symptoms are detected through measurement and analysis. A semantic inference engine for context awareness conducts active, intelligent analysis of health conditions and life patterns; since it can respond properly to extraordinary circumstances, it provides the service environment necessary for emergencies or symptoms. Cameras, speakers, and sensors are installed according to the structure of the user's indoor living space, and the contextual information is transmitted from them.
For user convenience, motion history images are used for motion recognition and continuous tracking from video. The system detects behavioral patterns and psychological state through life-log-based motion detection and provides the corresponding service. It offers health-related information and emergency situation monitoring to the user anytime and anywhere, and it is easy to use with simple handling. As a result, this system has the advantage of being able to detect emergency situations realistically and intuitively.

71 citations


Journal ArticleDOI
TL;DR: This paper proposes an upgraded version, MGMR++, that eliminates the GPU memory limitation, and a pipelined version, PMGMR, that handles the Big Data challenge through both CPU memory and hard disks, achieving about a 2.5-fold performance improvement.
Abstract: MapReduce is a popular data-parallel processing model encompassed with recent advances in computing technology and has been widely exploited for large-scale data analysis. The high demand on MapReduce has stimulated the investigation of MapReduce implementations with different architectural models and computing paradigms, such as multi-core clusters, Clouds, Cubieboards and GPUs. In particular, current GPU-based MapReduce approaches mainly focus on single-GPU algorithms and cannot handle large data sets, due to the limited GPU memory capacity. Based on the previous multi-GPU MapReduce version MGMR, this paper proposes an upgraded version, MGMR++, to eliminate the GPU memory limitation, and a pipelined version, PMGMR, to handle the Big Data challenge through both CPU memory and hard disks. MGMR++ extends MGMR with flexible C++ templates and CPU memory utilization, while PMGMR fine-tunes performance through the latest GPU features, such as streams and Hyper-Q, as well as hard disk utilization. Compared to MGMR (Jiang et al., Cluster Computing 2013), the proposed schemes achieve about a 2.5-fold performance improvement, increase system scalability, and allow programmers to write straightforward MapReduce code for Big Data.

65 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed IDP-based ASR system performs reasonably well even when the speech is transmitted via smart phones.
Abstract: Cloud computing brings several advantages, such as flexibility, scalability, and ubiquity, in terms of data acquisition, data storage, and data transmission. This can help remote healthcare, among other applications, a great deal. This paper proposes a cloud-based framework for speech-enabled healthcare. In the proposed framework, patients or any healthy person seeking medical assistance can send requests by speech commands. The commands are managed and processed in the cloud server. Any doctor with proper authentication can receive the request and, by analyzing it, assist the patient. This paper also proposes a new feature extraction technique, namely the interlaced derivative pattern (IDP), for the automatic speech recognition (ASR) system deployed in the cloud server. The IDP exploits the relative Mel-filter bank coefficients along different neighborhood directions of the speech signal. Experimental results show that the proposed IDP-based ASR system performs reasonably well even when the speech is transmitted via smartphones.

64 citations


Journal ArticleDOI
Ze Deng, Hu Yangyang, Mao Zhu, Xiaohui Huang, Du Bo
TL;DR: This study explores the feasibility of utilizing contemporary general-purpose computing on the graphics processing unit (GPGPU): the GPGPU-aided clustering approach parallelizes Tra-POPTICS with the Hyper-Q feature of Kepler GPUs and massive numbers of GPU threads.
Abstract: Clustering trajectory data is an important way to mine the hidden information behind moving-object sampling data, for example to understand trends in movement patterns, and it has gained high popularity in geographic information systems. In the era of `Big data', current approaches for clustering trajectory data generally do not apply to trajectory big data because of excessive costs in both scalability and computing performance. Aiming at these problems, this study first proposes a new clustering algorithm for trajectory big data, namely Tra-POPTICS, by modifying a scalable clustering algorithm for point data (POPTICS). Tra-POPTICS employs a spatiotemporal distance function and trajectory indexing to support trajectory data, and it can process trajectory big data in a distributed manner to achieve great scalability. Towards providing a fast solution to clustering trajectory big data, this study has explored the feasibility of utilizing contemporary general-purpose computing on the graphics processing unit (GPGPU). The GPGPU-aided clustering approach parallelizes Tra-POPTICS with the Hyper-Q feature of Kepler GPUs and massive GPU threads. The experimental results indicate that (1) the Tra-POPTICS algorithm has a clustering quality comparable to T-OPTICS (the state-of-the-art approach for clustering trajectories in a centralized fashion) and outperforms T-OPTICS by an average of four times in terms of scalability, and (2) G-Tra-POPTICS likewise has a comparable clustering quality and further gains about a 30-fold speedup on average for clustering trajectories compared to Tra-POPTICS with eight threads. The proposed algorithms exhibit great scalability and computing performance in clustering trajectory big data.

Journal ArticleDOI
TL;DR: A global architecture is proposed for QoS-based scheduling of big data applications to distributed cloud datacenters at two levels, coarse-grained and fine-grained; results indicated better QoS achievement and a 33.15 % cost gain of the proposed architecture over traditional Amazon methods.
Abstract: Big data is one of the major technologies behind business operations in today's competitive market. It provides organizations a powerful tool to analyze large volumes of unstructured data and make useful decisions. Result quality, time, and the price associated with big data analytics are very important aspects of its success. Selecting appropriate cloud infrastructure at coarse- and fine-grained levels ensures better results. In this paper, a global architecture is proposed for QoS-based scheduling of big data applications to distributed cloud datacenters at two levels: coarse-grained and fine-grained. At the coarse-grained level, an appropriate local datacenter is selected based on the network distance between user and datacenter, network throughput, and total available resources, using an adaptive K-nearest-neighbor algorithm. At the fine-grained level, a probability triplet (C, I, M) is predicted using a naive Bayes algorithm, giving the probability that a new application falls into the compute-intensive (C), input/output-intensive (I), or memory-intensive (M) category. Each datacenter is transformed into a pool of virtual clusters capable of executing a specific category of jobs with specific (C, I, M) requirements using self-organizing maps. The novelty of the study is to represent all datacenter resources in a predefined topological ordering and to execute new incoming jobs in their respective predefined virtual clusters based on their QoS requirements. The proposed architecture is tested on three different Amazon EMR datacenters for resource utilization, waiting time, availability, response time, and estimated time to complete the job. Results indicated better QoS achievement and a 33.15 % cost gain of the proposed architecture over traditional Amazon methods.
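The (C, I, M) prediction step can be sketched with a categorical naive Bayes classifier. The training examples and job features below are invented for illustration and are not from the paper; only the idea of returning a probability triplet over the three categories follows the abstract.

```python
from collections import Counter

def naive_bayes_cim(train, features):
    """Predict a (C, I, M) probability triplet for a new job (illustrative).

    `train` is a list of (feature_dict, label) pairs with labels in
    {"C", "I", "M"}; Laplace smoothing keeps unseen feature values from
    zeroing out a category."""
    labels = ["C", "I", "M"]
    label_counts = Counter(lab for _, lab in train)
    def likelihood(lab):
        p = label_counts[lab] / len(train)           # class prior
        for name, value in features.items():
            match = sum(1 for f, l in train if l == lab and f.get(name) == value)
            p *= (match + 1) / (label_counts[lab] + 2)   # Laplace smoothing
        return p
    raw = [likelihood(lab) for lab in labels]
    total = sum(raw)
    return {lab: r / total for lab, r in zip(labels, raw)}

# Hypothetical job profiles: framework used and I/O intensity.
train = [
    ({"lang": "spark", "io": "low"}, "C"),
    ({"lang": "spark", "io": "low"}, "C"),
    ({"lang": "hive", "io": "high"}, "I"),
    ({"lang": "pig", "io": "high"}, "I"),
    ({"lang": "spark", "io": "low"}, "M"),
]
triplet = naive_bayes_cim(train, {"lang": "spark", "io": "low"})
```

The resulting triplet would then route the job to the virtual cluster whose (C, I, M) profile it matches best.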

Journal ArticleDOI
TL;DR: Using AprioriAll-algorithm-based sequential pattern profile analysis, bio-detection can detect a user who is undergoing an emergency based on abnormal patterns; it detects the maximum sequence that can satisfy the minimum support in a given transaction.
Abstract: Due to the development of IT convergence technologies, increased attention has focused on smart health service platforms that detect emergency situations related to chronic disease, telemedicine, silver care, and wellness. Moreover, there is high demand for technologies that can properly judge a situation and provide suitable countermeasures or health information if an emergency situation occurs. In this paper, we propose sequential-pattern-analysis-based bio-detection for smart health services. A smart health service platform is able to save bio-images and the locations detected in a smart health surveillance area where CCD cameras are installed. When a person's figure is saved, route tracing detects any movement and then traces its location. In addition, the platform analyzes the perceived bio-images and sequential patterns in order to determine whether or not the situation is an emergency. Using AprioriAll-algorithm-based sequential pattern profile analysis, bio-detection can detect a user who is undergoing an emergency based on abnormal patterns. It performs this task by managing information obtained from data and trace analyses, and it starts bio-detection only when there are patterns not conforming to the sequential patterns. In other words, bio-detection detects the maximum sequence that can satisfy the minimum support in a given transaction. Sequential pattern profile analysis based on life-logs can analyze normal and abnormal profiles to provide health guidelines.
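The "maximum sequence satisfying the minimum support" idea can be sketched as follows: enumerate ordered sub-patterns of the observed life-log sequences, keep those meeting the support threshold, and retain only the maximal ones. This is a brute-force illustrative sketch of an AprioriAll-style analysis, not the paper's implementation, and the life-log events are invented for the example.

```python
from itertools import combinations

def is_subsequence(pattern, sequence):
    """True if `pattern` occurs in `sequence` in order (not necessarily contiguous)."""
    it = iter(sequence)
    return all(event in it for event in pattern)

def maximal_sequences(sequences, min_support, max_len=3):
    """AprioriAll-style sketch: find frequent ordered patterns, then keep only
    the maximal ones (those not contained in a longer frequent pattern)."""
    frequent = []
    for seq in sequences:
        for n in range(1, max_len + 1):
            for pat in combinations(seq, n):      # candidate ordered sub-patterns
                if pat in frequent:
                    continue
                support = sum(is_subsequence(pat, s) for s in sequences)
                if support >= min_support:
                    frequent.append(pat)
    return [p for p in frequent
            if not any(len(q) > len(p) and is_subsequence(p, q) for q in frequent)]

# Hypothetical daily life-logs; a day that breaks the maximal pattern is abnormal.
patterns = maximal_sequences(
    [["wake", "eat", "walk"], ["wake", "eat", "sleep"], ["wake", "eat", "walk"]],
    min_support=2,
)
```

A day whose event sequence does not contain any maximal frequent pattern would be flagged as an abnormal profile, triggering bio-detection.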

Journal ArticleDOI
TL;DR: NGEP divides individuals into several niches that evolve separately and fuses selected niches according to the similarities of their best individuals to ensure the dispersibility of chromosomes, and it adjusts the fitness function to adapt to the needs of the underlying applications.
Abstract: Analyses and applications of big data require special technologies to efficiently process large volumes of data. Mining association rules focuses on obtaining relations between data items. When mining association rules in big data, conventional methods encounter severe problems incurred by the tremendous cost of computing and their inefficiency in achieving the goal. This study proposes an evolutionary algorithm to address these problems, namely Niche-Aided Gene Expression Programming (NGEP). The NGEP algorithm (1) divides individuals into several niches that evolve separately and fuses selected niches according to the similarities of their best individuals to ensure the dispersibility of chromosomes, and (2) adjusts the fitness function to adapt to the needs of the underlying applications. A number of experiments have been performed comparing NGEP with the FP-Growth and Apriori algorithms to evaluate NGEP's performance in mining association rules on a dataset of environmental pressure measurements (the Iris dataset) and an Artificial Simulation Database (ASD). Experimental results indicate that NGEP can efficiently find more association rules (36 vs. 33 vs. 25 in the Iris experiments and 57 vs. 44 vs. 44 in the ASD experiments) with a higher accuracy rate (74.8 vs. 53.2 vs. 50.6 % in the Iris experiments and 95.8 vs. 77.4 vs. 80.3 % in the ASD experiments), and its computing time is also much lower than that of the other two methods.

Journal ArticleDOI
TL;DR: Experimental results show that the proposed MapReduce algorithm with a spatial grid index consumes less time than its peer without a spatial index, and the proposed algorithm achieves an increasing speed-up ratio as more nodes of the Hadoop framework are used.
Abstract: As one of the important operations in Geographic Information System (GIS) spatial analysis, polygon overlay processing is a time-consuming task in many big data cases. In this paper, a specially designed MapReduce algorithm with a grid index is proposed to decrease the running time. The proposed algorithm reduces the number of intersection computations with the aid of the grid index. The experiments are carried out on a cloud framework based on Hadoop that we built ourselves. Experimental results show that our algorithm with a spatial grid index consumes less time than its peer without a spatial index. Moreover, the proposed algorithm achieves an increasing speed-up ratio as more nodes of the Hadoop framework are used, although the upward trend of the speed-up ratio slows down as the number of nodes increases.
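The grid-index filtering step can be sketched as follows: assign each feature's bounding box to the grid cells it overlaps, and forward only the pairs of features that share a cell to the expensive exact intersection computation. This is an illustrative single-machine sketch (in MapReduce, the cell id would be the shuffle key); the bounding boxes and cell size are invented for the example.

```python
from collections import defaultdict

def cells(box, cell_size):
    """Grid cells overlapped by a bounding box (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    for gx in range(int(xmin // cell_size), int(xmax // cell_size) + 1):
        for gy in range(int(ymin // cell_size), int(ymax // cell_size) + 1):
            yield gx, gy

def candidate_pairs(layer_a, layer_b, cell_size=10):
    """Grid-index filter: only features whose bounding boxes share a grid cell
    are kept as candidates for the exact polygon intersection computation."""
    grid = defaultdict(list)
    for i, box in enumerate(layer_a):
        for cell in cells(box, cell_size):
            grid[cell].append(i)
    pairs = set()
    for j, box in enumerate(layer_b):
        for cell in cells(box, cell_size):
            for i in grid[cell]:
                pairs.add((i, j))
    return pairs

pairs = candidate_pairs(
    [(0, 0, 5, 5), (50, 50, 55, 55)],       # layer A bounding boxes
    [(3, 3, 8, 8), (90, 90, 95, 95)],       # layer B bounding boxes
)
```

Only the one pair of boxes sharing a cell survives the filter, so only one exact polygon intersection would need to be computed instead of four.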

Journal ArticleDOI
TL;DR: The results show that agents, through autonomous and dynamic collaboration, can efficiently balance loads in a distributed manner outperforming centralized approaches with a performance comparable to commercial solutions, namely Red Hat, while migrating fewer VMs.
Abstract: Cloud data centers are generally composed of heterogeneous commodity servers hosting multiple virtual machines (VMs) with potentially different specifications and fluctuating resource usages. This may cause a resource usage imbalance within servers that may result in performance degradation and violations to service level agreements. This work proposes a collaborative agent-based problem solving technique capable of balancing workloads across commodity, heterogeneous servers by making use of VM live migration. The agents are endowed with (i) migration heuristics to determine which VMs should be migrated and their destination hosts, (ii) migration policies to decide when VMs should be migrated, (iii) VM acceptance policies to determine which VMs should be hosted, and (iv) front-end load balancing heuristics. The results show that agents, through autonomous and dynamic collaboration, can efficiently balance loads in a distributed manner outperforming centralized approaches with a performance comparable to commercial solutions, namely Red Hat, while migrating fewer VMs.

Journal ArticleDOI
TL;DR: The following research investigates the development and application of ontology and rules to build an evidence-based, reusable and cross-domain knowledge base for collaborative patient care.
Abstract: The volume, velocity and variety of data generated today require special techniques and technologies for analysis and inferencing. These challenges are significantly pronounced within healthcare, where data is being generated exponentially from biomedical research and electronic patient records. Moreover, with the increasing importance of holistic care, it has become vital to analyse information from all the domains that affect patient health, such as medical and oral conditions. Many medical and oral conditions are interdependent and call for collaborative management; however, technical issues such as heterogeneous data collection and storage formats, limited sharing of patient information, and a lack of decision support over the shared information, among others, have seriously limited collaborative patient care. To address the above issues, the following research investigates the development and application of ontology and rules to build an evidence-based, reusable and cross-domain knowledge base. An example implementation of the knowledge base in Protege is also carried out to evaluate the effectiveness of the approach for reasoning and decision support over cross-domain patient information.

Journal ArticleDOI
TL;DR: Computational intelligence based on Dempster-Shafer theory is applied to prove, with digital evidence, the presence of a malicious insider in critical networks with utmost accuracy.
Abstract: The insider threat is minimally addressed by current information security practices, yet insiders pose the most serious threat to an organization through various malicious activities. Forensic investigation is a technique used to prove the presence of a malicious insider with digital evidence. The proposed surveillance mechanism for countering insider threats operates in two phases. In phase one, the network is monitored for incoming and outgoing packets; the packets are captured, their important features are extracted, and investigation of the captured packets yields information related to suspicious activities. In phase two, we mine various log files, which are considered to possess vital traces of information when an insider attack has been performed. The log files are analyzed in order to extract key patterns, which are then further processed. Suspicious data patterns are grouped into clusters to trace anomalies and are classified as legal or anomalous with the help of a KNN classifier. If an anomaly is traced, the user's past activities are consulted and cross-checked against the features of the captured packets. Finally, computational intelligence based on Dempster-Shafer theory is applied to prove, with digital evidence, the presence of the malicious insider in the critical network with utmost accuracy.
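The evidence-fusion step rests on Dempster's rule of combination, which merges two independent mass functions by multiplying masses on compatible hypotheses and renormalizing away the conflict. A minimal sketch follows; the two mass assignments (packet evidence and log evidence) are invented numbers for illustration, not values from the paper.

```python
def combine(m1, m2):
    """Dempster's rule of combination for two mass functions.

    Keys are frozensets of hypotheses, values are masses summing to 1.
    Mass landing on an empty intersection (conflict) is discarded and the
    remaining masses are renormalized."""
    combined, conflict = {}, 0.0
    for a, wa in m1.items():
        for b, wb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + wa * wb
            else:
                conflict += wa * wb
    norm = 1.0 - conflict
    return {k: v / norm for k, v in combined.items()}

MAL, BEN = frozenset({"malicious"}), frozenset({"benign"})
EITHER = MAL | BEN                       # ignorance: could be either
packets = {MAL: 0.6, EITHER: 0.4}        # hypothetical evidence from captured packets
logs = {MAL: 0.7, EITHER: 0.3}           # hypothetical evidence from mined log files
belief = combine(packets, logs)
```

Two moderately suspicious, independent sources reinforce each other: the combined mass on "malicious" exceeds either input, which is the behavior the surveillance mechanism relies on when corroborating packet captures with log analysis.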

Journal ArticleDOI
TL;DR: A novel distributed community structure mining framework that uses local information from the nodes and their neighbors, instead of PageRank, to calculate the probability distribution of the nodes.
Abstract: Social media networks are playing an increasingly prominent role in people's daily lives. Community structure is one of the salient features of social media networks and has been applied to practical applications such as recommendation systems and network marketing. With the rapid expansion of social media and the surge of tremendous amounts of information, how to identify communities in big data scenarios has become a challenge. Based on our previous work and the map equation (an equation from information theory for community mining), we develop a novel distributed community structure mining framework. In the framework, (1) we propose a new link-information update method that tries to avoid data-writing operations and thereby speed up the process; (2) we use local information from the nodes and their neighbors, instead of PageRank, to calculate the probability distribution of the nodes; and (3) we exclude the network partitioning process from our previous work and run the map equation directly on MapReduce. Empirical results on real-world social media networks and artificial networks show that the new framework outperforms our previous work and some well-known algorithms, such as Radetal and FastGN, in accuracy, speed, and scalability.

Journal ArticleDOI
TL;DR: DI-MMAP, a high-performance runtime that memory-maps large external data sets into an application’s address space and shows significantly better performance than the Linux mmap system call, is presented.
Abstract: We present DI-MMAP, a high-performance runtime that memory-maps large external data sets into an application's address space and shows significantly better performance than the Linux mmap system call. Our implementation is particularly effective when used with high performance locally attached Flash arrays on highly concurrent, latency-tolerant data-intensive HPC applications. We describe the kernel module and show performance results on a benchmark test suite, a new bioinformatics metagenomic classification application, and on a level-asynchronous Breadth-First Search (BFS) graph traversal algorithm. Using DI-MMAP, the metagenomics classification application performs up to 4× better than standard Linux mmap. A fully external memory configuration of BFS executes up to 7.44× faster than traditional mmap. Finally, we demonstrate that DI-MMAP shows scalable out-of-core performance for BFS traversal in main memory constrained scenarios. Such scalable memory constrained performance would allow a system with a fixed amount of memory to solve a larger problem as well as provide memory QoS guarantees for systems running multiple data-intensive applications.
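For readers unfamiliar with the memory-mapping pattern that DI-MMAP accelerates, a minimal sketch using Python's standard `mmap` module (not the DI-MMAP kernel module itself) shows how a file is accessed through the address space instead of explicit reads, with pages faulted in on demand:

```python
import mmap
import os
import tempfile

# Create a scratch file standing in for a large external data set
path = os.path.join(tempfile.mkdtemp(), "dataset.bin")
with open(path, "wb") as f:
    f.write(b"\x00" * 4096)

# Map the file into the address space and use it like a mutable buffer;
# the OS pages data in lazily rather than reading it all up front.
with open(path, "r+b") as f:
    with mmap.mmap(f.fileno(), 0) as mm:
        mm[0:5] = b"hello"   # write through the mapping
        mm.seek(0)
        head = mm.read(5)    # read back without an explicit file read()
```

DI-MMAP replaces the kernel's default page-fault handling behind exactly this kind of access pattern, which is why latency-tolerant, highly concurrent applications over Flash see the largest gains.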

Journal ArticleDOI
TL;DR: A new genetic algorithm (GA) is introduced, namely multi-level grouping GA (MLGGA), which is designed for multi- level bin packing problems such as that of carbon footprint reduction in a distributed cloud over a network of data centers.
Abstract: Global warming caused by greenhouse gas (GHG) emissions is one of the main concerns for both developed and developing countries. In the fast-growing Information and Communication Technology industry, current energy-efficiency methodologies are not sufficient for newly arising problems such as the optimization of complex distributed systems. Therefore, methodologies tailored to this type of system could significantly reduce their GHG emissions. In this paper, a new genetic algorithm (GA) is introduced, namely the multi-level grouping GA (MLGGA), which is designed for multi-level bin packing problems such as that of carbon footprint reduction in a distributed cloud over a network of data centers. The new MLGGA algorithm is tested on real data in a simulation platform, and its results are compared with other state-of-the-art methodologies. The results show a significant performance improvement achieved by the proposed algorithm.
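The MLGGA operators are not detailed in the abstract; as a hedged illustration of the general idea of attacking bin packing with a GA, the toy sketch below evolves item orderings that a first-fit heuristic decodes into bins. The representation, operators, and parameters are illustrative, not MLGGA's:

```python
import random

def first_fit(items, capacity):
    """Pack items, in the given order, into the first bin they fit."""
    bins = []
    for size in items:
        for b in bins:
            if sum(b) + size <= capacity:
                b.append(size)
                break
        else:
            bins.append([size])
    return bins

def ga_bin_packing(items, capacity, pop=30, gens=60, seed=1):
    """Toy permutation GA: chromosomes are item orders, fitness is bin count."""
    rng = random.Random(seed)
    population = [rng.sample(items, len(items)) for _ in range(pop)]
    cost = lambda perm: len(first_fit(perm, capacity))
    for _ in range(gens):
        population.sort(key=cost)
        parents = population[: pop // 2]  # truncation selection
        children = []
        for p in parents:
            child = p[:]
            i, j = rng.randrange(len(child)), rng.randrange(len(child))
            child[i], child[j] = child[j], child[i]  # swap mutation
            children.append(child)
        population = parents + children
    best = min(population, key=cost)
    return first_fit(best, capacity)

packing = ga_bin_packing([4, 8, 1, 4, 2, 1], capacity=10)
```

A grouping GA such as MLGGA operates on whole bins rather than item permutations, which preserves good groupings across crossover; the sketch above only conveys the fitness landscape being searched.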

Journal ArticleDOI
TL;DR: An M2M-based smart health service for human UI/UX using motion recognition that can easily respond to dynamic changes in the wireless environment and conduct systematic management based on the user's motion recognition, using technology that supports mobility among sensor nodes in M2M.
Abstract: Home networks currently dominated by human-object or human-human information production, exchange, and processing paradigms are transitioning to machine-to-machine (M2M) communication due to the sudden introduction of embedded devices. Recently, due to the spread of IT equipment, more M2M-related devices are being used, and M2M-based projects are underway in various fields such as the M2M-based u-city, u-port, u-work, u-traffic, etc. M2M has been applied in various fields, and u-healthcare is attracting attention in the M2M medical field. U-healthcare refers to technology in which ordinary patients can receive prescription services from experts by continuously monitoring changes in their health status during daily life at home, based on wired and wireless communications infrastructures. In this paper, we propose an M2M-based smart health service for human UI/UX using motion recognition. Non-IP protocols, rather than TCP/IP, have been used in the sensor networks applied to M2M-based u-healthcare. However, sensors should be connected to the Internet in order to expand the use of services and facilitate management of the M2M-based sensor network. Therefore, we designed an M2M-based smart health service considering network mobility, since data measured by the sensors should be transferred over the Internet. Unlike existing healthcare platforms, the M2M-based smart health service supports motion recognition as well as bio-information. The smart health service for motion recognition can sense four kinds of emotions (anger, sadness, neutrality, and joy) as well as stress using sensors. Further, it can measure the state of the individual by recognizing the user's respiratory and heart rates using an ECG sensor. In the existing medical environment, most medical information systems managing patient data use a centralized server structure.
Using a fixed network, it is easy to collect and process limited data, but there are limits to processing a large amount of data collected from M2M devices in real time. Generally, M2M communication used in u-healthcare consists of many networked devices and gateways. An M2M network may use standardized wireless technology based on the requirements of a particular device. Network mobility occurs when the connecting point changes according to the movement of a network, while the terminal remains connected without changing its address. If a terminal within the network communicates with any corresponding node, communication between the terminal and the corresponding node should continue without interruption. The method proposed in this paper can easily respond to dynamic changes in the wireless environment and conduct systematic management based on the user's motion recognition, using technology that supports mobility among sensor nodes in M2M.

Journal ArticleDOI
TL;DR: Experimental results on two real data-intensive applications show that SLDP is energy-efficient, space-saving and able to improve MapReduce performance in a heterogeneous Hadoop cluster significantly.
Abstract: The data placement decisions of the Hadoop distributed file system (HDFS) are very important for data locality, which is a primary criterion for task scheduling in the MapReduce model and eventually affects application performance. The existing HDFS rack-aware data placement strategy and replication scheme work well with the MapReduce framework in homogeneous Hadoop clusters, but in practice such a data placement policy can noticeably reduce MapReduce performance and may cause increased energy dissipation in heterogeneous environments. Besides, HDFS employs an inflexible replication factor for each data block by default, which gives rise to unnecessary waste of storage space when there is a lot of inactive data in the Hadoop system. In this paper, we propose a novel data placement strategy (SLDP) for heterogeneous Hadoop clusters. SLDP first adopts a heterogeneity-aware algorithm to divide the various nodes into several virtual storage tiers (VSTs), and then places data blocks across the nodes in each VST in a circular fashion according to the hotness of the data. Furthermore, SLDP uses hotness-proportional replication to save disk space and also has an effective power-control function. Experimental results on two real data-intensive applications show that SLDP is energy-efficient, space-saving, and able to significantly improve MapReduce performance in a heterogeneous Hadoop cluster.
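The hotness-proportional replication idea can be illustrated as follows; the scaling and the replica bounds are assumptions for illustration, not SLDP's actual parameters:

```python
def replica_factor(access_count, total_accesses, min_rep=1, max_rep=5):
    """Assign a block a replica count proportional to its hotness.

    Hot blocks get more replicas (better locality for popular data),
    while cold blocks keep a single copy to save disk space. The
    linear scaling and bounds here are illustrative assumptions.
    """
    if total_accesses == 0:
        return min_rep
    hotness = access_count / total_accesses  # fraction of all accesses
    return max(min_rep, min(max_rep, round(hotness * max_rep * 2)))
```

Contrast this with stock HDFS, which replicates every block the same fixed number of times (three by default) regardless of how often it is read.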

Journal ArticleDOI
TL;DR: This paper proposes ontology-driven slope modeling for disaster management services through the convergence of construction and transportation technology with IT, addressing social issues related to disaster prevention and response and judging the potential risk of disasters, to contribute to public safety and quality of life.
Abstract: These days, with the development of information technology, new paradigms have been created through academic and technological convergence in various areas. IT convergence draws much attention as the next-generation technology for disaster prevention and management in the construction and transportation area. Along with global warming, global climate changes and unusual weather occur around the world, and consequently disasters become more severe. An IT-convergence-based disaster management service makes it possible to quickly respond to unexpected disasters in the ubiquitous environment and mitigate them. Although research on disaster prevention and management has constantly been conducted, development of technology for disaster prediction and prevention has been relatively slow. For efficient safety and disaster prevention and management in next-generation IT convergence, it is essential to establish a systematic disaster prevention technology and a disaster prevention information system. In this paper, we propose ontology-driven slope modeling for a disaster management service through the convergence of construction and transportation technology with IT. User profile, environment information, location information, weather index, slope stability, disaster occurrence, disaster statistics and analysis, and the forest fire disaster index are used to build internal context information, external context information, and service context information. Ontology-based context-awareness modeling of landslides and other disasters is constructed, and relevant rules are generated by an inference engine. Based on the ontology of external and internal context awareness, the service inference rules derived by the inference engine are produced using Protégé 5.0. According to the service inference rules, the disaster control services best fitting the user's environment are provided.
By addressing the social issues related to disaster prevention and response and judging the potential risk of disasters, the proposed method can contribute to improving public safety and quality of life. Social consensus on the necessity of preventing urban climate disasters can be formed more easily, and a ripple effect is expected on the situational response to natural disasters.

Journal ArticleDOI
TL;DR: Simulations demonstrate near-optimality of the proposed algorithms in terms of makespan and fairness for the proposed load balancing scheme.
Abstract: In this paper an algorithm is proposed to balance the loads in a distributed computing system based on game theory, modeling the load balancing problem as a non-cooperative game among the users. The proposed load balancing game, which is infinite and with perfect information, aims to establish fairness at both the system and user levels. The optimal or near-optimal solution of the game is approximated by a genetic algorithm and an introduced hybrid population-based simulated annealing algorithm, using the concept of Nash equilibrium. Since all users' responses are shown to converge to their near-optimal solutions, the distribution of users' jobs is "fair". Simulations demonstrate near-optimality of the proposed algorithms in terms of makespan and fairness for the proposed load balancing scheme.
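The hybrid population-based simulated annealing algorithm is not specified in the abstract; the sketch below shows plain simulated annealing minimizing makespan for a job-to-node assignment, as a simplified stand-in for the proposed approach (cooling schedule and move operator are illustrative assumptions):

```python
import math
import random

def makespan(assign, jobs, n_nodes):
    """Largest total load on any node under the given assignment."""
    loads = [0.0] * n_nodes
    for job, node in zip(jobs, assign):
        loads[node] += job
    return max(loads)

def anneal(jobs, n_nodes, steps=5000, t0=10.0, seed=0):
    """Minimise makespan by randomly reassigning one job per step."""
    rng = random.Random(seed)
    assign = [rng.randrange(n_nodes) for _ in jobs]
    cur = makespan(assign, jobs, n_nodes)
    best, best_cost = assign[:], cur
    for step in range(steps):
        temp = t0 * (1 - step / steps) + 1e-9  # linear cooling
        cand = assign[:]
        cand[rng.randrange(len(jobs))] = rng.randrange(n_nodes)
        cost = makespan(cand, jobs, n_nodes)
        # Accept improvements always, worse moves with Boltzmann probability
        if cost <= cur or rng.random() < math.exp((cur - cost) / temp):
            assign, cur = cand, cost
            if cur < best_cost:
                best, best_cost = assign[:], cur
    return best, best_cost

best, best_cost = anneal([5, 3, 8, 2, 7, 4, 6, 1], n_nodes=2)
```

In the paper's game-theoretic setting the objective also includes user-level fairness at a Nash equilibrium, not only the system-level makespan minimized here.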

Journal ArticleDOI
TL;DR: A novel scheme to deliver real-time video content through an improved UDP-based protocol is proposed to improve the practicability of multimedia transmission in the healthcare system.
Abstract: With the rise of robotics and cloud computing technology, human-centric healthcare services attract wide attention in order to meet the great challenges of traditional healthcare, especially limited medical resources. This paper presents a healthcare system based on cloud computing and robotics, which consists of wireless body area networks, a robot, a software system, and a cloud platform. The system is expected to accurately measure the user's physiological information for analysis and feedback, assisted by the robot, which is integrated with various sensors. In order to improve the practicability of multimedia transmission in the healthcare system, this paper proposes a novel scheme to deliver real-time video content through an improved UDP-based protocol. Finally, the proposed approach is evaluated in an experimental study on a real testbed.
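The improved UDP-based protocol is not specified in the abstract; the loopback sketch below only illustrates the underlying pattern of splitting a video frame into sequence-numbered datagrams and reassembling them at the receiver. The 6-byte header layout and chunk size are assumptions for illustration:

```python
import socket
import struct

CHUNK = 1024  # payload bytes per datagram (illustrative)

def send_frame(sock, addr, frame_id, payload):
    """Split one frame into sequence-numbered UDP datagrams."""
    chunks = [payload[i:i + CHUNK] for i in range(0, len(payload), CHUNK)]
    for seq, chunk in enumerate(chunks):
        # Header: frame id, chunk sequence number, total chunk count
        header = struct.pack("!HHH", frame_id, seq, len(chunks))
        sock.sendto(header + chunk, addr)

# Loopback demo: receiver and sender in one process
rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
rx.bind(("127.0.0.1", 0))
tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

frame = bytes(range(256)) * 10  # 2560-byte stand-in for an encoded frame
send_frame(tx, rx.getsockname(), 7, frame)

parts, total = {}, None
while total is None or len(parts) < total:
    data, _ = rx.recvfrom(CHUNK + 64)
    fid, seq, total = struct.unpack("!HHH", data[:6])
    parts[seq] = data[6:]
rebuilt = b"".join(parts[i] for i in sorted(parts))
rx.close()
tx.close()
```

A real-time protocol on top of this would additionally handle loss, reordering across frames, and playout deadlines, which UDP itself does not provide.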

Journal ArticleDOI
TL;DR: A learning automata-assisted distributive intrusion detection system is designed based on clustering that yields an improvement of 10 % in detection rate of malicious nodes when compared with the existing schemes.
Abstract: In recent years, vehicular cloud computing (VCC) has emerged as a new technology that is being used in a wide range of applications in the area of multimedia-based healthcare. In VCC, vehicles act as intelligent machines that collect and transfer healthcare data to local or global sites for storage and computation, as vehicles have comparatively limited storage and computation power for handling multimedia files. However, due to dynamic changes in topology and the lack of centralized monitoring points, this information can be altered or misused. These security breaches can result in disastrous consequences such as loss of life or financial fraud. Therefore, to address these issues, a learning automata-assisted distributive intrusion detection system is designed based on clustering. Although the proposed scheme can be applied in a number of applications, we have taken a multimedia-based healthcare application to illustrate it. In the proposed scheme, learning automata (LA) are assumed to be stationed on the vehicles; they take clustering decisions intelligently and select one of the members of the group as a cluster-head. The cluster-heads then assist in the efficient storage and dissemination of information through a cloud-based infrastructure. To secure the proposed scheme from malicious activities, a standard cryptographic technique is used in which the automaton learns from the environment and takes adaptive decisions to identify any malicious activity in the network. A reward or penalty is given by the stochastic environment in which an automaton performs its actions, so that it updates its action probability vector after receiving the reinforcement signal from the environment. The proposed scheme was evaluated using extensive simulations on ns-2 with SUMO.
The results obtained indicate that the proposed scheme yields an improvement of 10 % in the detection rate of malicious nodes when compared with existing schemes.
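The action-probability update described above is classically realized by a linear reward-penalty scheme; the sketch below assumes the standard L_RP update with illustrative learning rates, not the paper's exact parameters:

```python
def la_update(probs, action, reward, a=0.1, b=0.05):
    """Linear reward-penalty (L_RP) update of an action probability vector.

    On reward, the chosen action's probability grows and the others
    shrink; on penalty, the chosen action shrinks and the freed mass is
    spread over the alternatives. `a` and `b` are illustrative rates.
    """
    r = len(probs)
    new = list(probs)
    if reward:
        for i in range(r):
            if i == action:
                new[i] = probs[i] + a * (1 - probs[i])
            else:
                new[i] = (1 - a) * probs[i]
    else:
        for i in range(r):
            if i == action:
                new[i] = (1 - b) * probs[i]
            else:
                new[i] = b / (r - 1) + (1 - b) * probs[i]
    return new

# One rewarded step reinforces action 0; one penalized step weakens it.
p = la_update([0.5, 0.5], action=0, reward=True)
q = la_update([0.5, 0.5], action=0, reward=False)
```

The vector always remains a probability distribution, so an automaton stationed on a vehicle can sample its next clustering decision directly from it.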

Journal ArticleDOI
TL;DR: This paper details the development and employment of the MapReduce framework for a personal-ontology-based recommender system; the results of an extensive performance study show that the proposed algorithm can scale recommender systems for all-pairs similarity searching.
Abstract: Recommender systems have proven useful in numerous contemporary applications, helping users effectively identify items of interest within massive and potentially overwhelming collections. Among recommender system techniques, the collaborative filtering mechanism is the most successful; it leverages the similar tastes of similar users, which serve as references for recommendation. However, a major weakness of the collaborative filtering mechanism is the cost of computing the pairwise similarity of users. Thus, the MapReduce framework was examined as a potential means to address this performance problem. This paper details the development and employment of the MapReduce framework, examining whether it improves the performance of a personal-ontology-based recommender system in a digital library. The results of this extensive performance study show that the proposed algorithm can scale recommender systems for all-pairs similarity searching.
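The pairwise-similarity computation that MapReduce parallelizes can be sketched in a single process as a map phase and a reduce phase. Cosine similarity over co-rated items is an assumption here, since the abstract does not name the similarity measure the system uses:

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

def map_phase(ratings):
    """Map: for each item, emit all user pairs that co-rated it."""
    by_item = defaultdict(list)
    for user, item, score in ratings:
        by_item[item].append((user, score))
    for entries in by_item.values():
        for (u, su), (v, sv) in combinations(sorted(entries), 2):
            yield (u, v), (su, sv)

def reduce_phase(pairs):
    """Reduce: accumulate dot products and norms per user pair.

    Note: norms here cover only co-rated items, a common approximation.
    """
    acc = defaultdict(lambda: [0.0, 0.0, 0.0])
    for key, (su, sv) in pairs:
        acc[key][0] += su * sv
        acc[key][1] += su * su
        acc[key][2] += sv * sv
    return {k: dot / (sqrt(nu) * sqrt(nv))
            for k, (dot, nu, nv) in acc.items()}

ratings = [("a", "x", 1), ("b", "x", 1), ("a", "y", 2), ("b", "y", 2)]
sims = reduce_phase(map_phase(ratings))
```

Keying the shuffle by user pair is what lets the all-pairs computation spread across reducers instead of running as a quadratic loop on one machine.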

Journal ArticleDOI
TL;DR: This paper proposes a novel approach to the cold-start problem that combines similarity values obtained from a movie's "Facebook Page" with users' similarity computed from ratings cast in the authors' Movie Rating System, producing a new user similarity value.
Abstract: Recommender systems are generally known as predictive ecosystems that recommend appropriate lists of items reflecting users' similar preferences or interests. Nevertheless, one of the most discussed issues in recommendation system research is the cold-start problem. In this paper we propose a novel approach to address this problem by incorporating similarity values obtained from movie "Facebook Pages". To achieve this, we first compute users' similarity according to the ratings cast in our Movie Rating System. Then we compute a similarity value from users' genre interest in the "Like" information extracted from "Facebook Pages". Finally, all the similarity values are combined to produce a new user similarity value. Our experimental results show that our approach outperforms the benchmark algorithms on the cold-start problem. To evaluate whether our system is strong enough to give users recommendations of higher accuracy, we also conducted a prediction coverage analysis in this work.
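The combination of rating-based and "Like"-based similarity can be sketched as a weighted blend; the Jaccard measure over liked genres and the mixing weight below are illustrative assumptions, not the paper's actual combination rule:

```python
def jaccard(a, b):
    """Jaccard similarity of two sets (e.g., liked movie genres)."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def combined_similarity(rating_sim, likes_u, likes_v, alpha=0.5):
    """Blend rating-based similarity with 'Like'-based genre similarity.

    For a cold-start user with few ratings, rating_sim is unreliable,
    so the social signal from liked Facebook Pages carries the weight.
    `alpha` is an illustrative mixing parameter.
    """
    return alpha * rating_sim + (1 - alpha) * jaccard(likes_u, likes_v)

# A brand-new user with no co-ratings still gets a nonzero similarity
# from overlapping genre "Likes".
sim = combined_similarity(0.0, {"action", "drama"}, {"action"})
```

This is exactly why the approach helps with cold start: the second term is available on day one, before the user has cast enough ratings for collaborative filtering to work.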

Journal ArticleDOI
TL;DR: Design factors for interactive mobile AR storytelling systems are presented; narrative theory is applied to the design to explore the actual levels of interactivity achievable with the mobile AR medium, probing the potential of AR as a creative medium for the general public.
Abstract: An increasing number of personal mobile devices such as smartphones are playing an important role in our daily lives. Among the constituent technologies of such a pervasive computing environment, mobile augmented reality (mobile AR) is a technique that extends the physical world with virtual objects or information in truly mobile settings. That is, away from the carefully conditioned environments of research laboratories and special-purpose work areas, ordinary people can engage with location-aware or physical-object-related content using their portable devices. Deploying attractive mobile AR services, however, has been regarded as quite difficult because the relevant computer-vision-based techniques are very complex. Because of that, discussions of the practical possibilities of AR have tended to stay within location-aware information delivery services or consumption platforms for developer-supplied content. Against these popular research trends pursuing the technical advancement of AR, this paper attempts to explore the potential of AR as a creative medium for the general public. AR can provide a good storytelling environment because a specific location or real-world objects can easily become story subject matter and stimulate people's imagination. There exist, however, some barriers preventing users from actively participating in onsite creative activity to construct their own stories. In a mobile situation, it can be cumbersome to create a narrative on site, since sequences of events must be authored. Therefore, careful design around these situational characteristics of mobile AR is essential for realizing real-time mobile interactive AR narratives. This paper thus presents design factors for interactive mobile AR storytelling systems and applies narrative theory to the design to explore the actual levels of interactivity achievable using the mobile AR medium.
Technical difficulties are ruled out as much as possible so that the design can focus on real-time onsite story creation. The proposed design concepts are developed as three kinds of prototypes, each reflecting a different level of narrative interactivity. They were also deployed in a public exhibition, and the suggested design factors were evaluated through user experience measures and interviews. The results of the evaluation show that even a simple AR interactive narrative setup can have strong power to give users a playful experience of in-situ narrative creation.