
Showing papers in "Journal of Grid Computing in 2019"


Journal ArticleDOI
TL;DR: A framework called EDoT is proposed based on the research trends, common practices, and techniques used for detecting events on Twitter and can serve as a guideline for developing event detection methods, especially for researchers who are new in this area.
Abstract: In the last few years, Twitter has become a popular platform for sharing opinions, experiences, news, and views in real-time. Twitter presents an interesting opportunity for detecting events happening around the world. The content (tweets) published on Twitter is short and poses diverse challenges for detecting and interpreting event-related information. This article provides insights into ongoing research and helps in understanding recent research trends and techniques used for event detection using Twitter data. We classify techniques and methodologies according to event types, orientation of content, event detection tasks, their evaluation, and common practices. We highlight the limitations of existing techniques and accordingly propose solutions to address the shortcomings. We propose a framework called EDoT based on the research trends, common practices, and techniques used for detecting events on Twitter. EDoT can serve as a guideline for developing event detection methods, especially for researchers who are new in this area. We also describe and compare data collection techniques, the effectiveness and shortcomings of various Twitter and non-Twitter-based features, and discuss various evaluation measures and benchmarking methodologies. Finally, we discuss the trends, limitations, and future directions for detecting events on Twitter.

68 citations


Journal ArticleDOI
TL;DR: A framework for the self-management of cloud resources for the execution of clustered workloads, named SCOOTER, is proposed; it efficiently schedules provisioned cloud resources and maintains the Service Level Agreement (SLA) by considering self-management properties and the maximum possible set of QoS parameters required to improve cloud-based services.
Abstract: Provisioning of adequate resources to cloud workloads depends on the Quality of Service (QoS) requirements of these cloud workloads. Based on the workload requirements (QoS) of cloud users, the discovery and allocation of the best workload-resource pair is an optimization problem. Acceptable QoS can be offered only if provisioning of resources is appropriately controlled. So, there is a need for a QoS-based resource provisioning framework for the autonomic scheduling of resources to observe the behavior of the services and adjust it dynamically in order to satisfy the QoS requirements. In this paper, a framework for the self-management of cloud resources for the execution of clustered workloads, named SCOOTER, is proposed. It efficiently schedules the provisioned cloud resources and maintains the Service Level Agreement (SLA) by considering the properties of self-management and the maximum possible set of QoS parameters required to improve cloud-based services. Finally, the performance of SCOOTER has been evaluated in a cloud environment, demonstrating optimized QoS parameters such as execution cost, energy consumption, execution time, SLA violation rate, fault detection rate, intrusion detection rate, resource utilization, resource contention, throughput and waiting time.
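
The autonomic behavior SCOOTER targets follows the classic monitor-analyze-plan-execute pattern. Below is a minimal Python sketch of such a QoS-driven loop; the metric names, SLA thresholds and scaling actions are illustrative assumptions, not SCOOTER's actual rules.

```python
# Minimal sketch of a QoS-driven autonomic provisioning loop (MAPE-style).
# All thresholds, metric names and actions are illustrative assumptions.

def monitor(cluster):
    """Collect QoS metrics for the current workload (stubbed here)."""
    return {"execution_time": 42.0, "sla_violation_rate": 0.07}

def analyze(metrics, sla):
    """Compare observed QoS against SLA targets; return violated parameters."""
    return {name: metrics[name] - limit
            for name, limit in sla.items() if metrics[name] > limit}

def plan(violations):
    """Decide scaling actions for each violated QoS parameter."""
    actions = []
    if "sla_violation_rate" in violations:
        actions.append("provision_extra_vm")
    if "execution_time" in violations:
        actions.append("migrate_to_faster_resource")
    return actions

def execute(cluster, actions):
    for action in actions:
        print(f"applying action: {action}")

sla = {"execution_time": 30.0, "sla_violation_rate": 0.05}
cluster = object()  # placeholder for a real resource manager handle
execute(cluster, plan(analyze(monitor(cluster), sla)))
```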

58 citations


Journal ArticleDOI
TL;DR: A workflow-net based framework for agent cooperation is proposed to enable collaboration among fog computing devices and form a cooperative IoT service delivery system; experimental results show that the cooperation process increases the number of achieved tasks and is performed in a timely manner.
Abstract: Most Internet of Things (IoT)-based service requests require excessive computation which exceeds an IoT device’s capabilities. Cloud-based solutions were introduced to outsource most of the computation to the data center. The integration of multi-agent IoT systems with cloud computing technology makes it possible to provide faster, more efficient and real-time solutions. Multi-agent cooperation for distributed systems such as fog-based cloud computing has gained popularity in contemporary research areas such as service composition and IoT robotic systems. Enhanced cloud computing performance gains and fog site load distribution are direct achievements of such cooperation. In this article, we propose a workflow-net based framework for agent cooperation to enable collaboration among fog computing devices and form a cooperative IoT service delivery system. A cooperation operator is used to find the topology and structure of the resulting cooperative set of fog computing agents. The operator shifts the problem defined as a set of workflow-nets into algebraic representations to provide a mechanism for solving the optimization problem mathematically. IoT device resource and collaboration capabilities are properties which are considered in the selection process of the cooperating IoT agents from different fog computing sites. Experimental results in the form of simulation and implementation show that the cooperation process increases the number of achieved tasks and is performed in a timely manner.

57 citations


Journal ArticleDOI
TL;DR: This paper proposes to leverage state information about the network to inform service placement decisions through a fast heuristic algorithm, which is critical to quickly react to changing conditions, and shows that the results contribute to higher QoE, a crucial parameter for using services from volunteer-based systems.
Abstract: Community networks (CNs) have gained momentum in the last few years with the increasing number of spontaneously deployed WiFi hotspots and home networks. These networks, owned and managed by volunteers, offer various services to their members and to the public. While Internet access is the most popular service, the provision of services of local interest within the network is enabled by the emerging technology of CN micro-clouds. By putting services closer to users, micro-clouds pursue not only a better service performance, but also a low entry barrier for the deployment of mainstream Internet services within the CN. Unfortunately, the provisioning of these services is not so simple. Due to the large and irregular topology and the high software and hardware diversity of CNs, a "careful" placement of micro-cloud services over the network is required to optimize service performance. This paper proposes to leverage state information about the network to inform service placement decisions, and to do so through a fast heuristic algorithm, which is critical to quickly react to changing conditions. To evaluate its performance, we compare our heuristic with one based on random placement in Guifi.net, the biggest CN worldwide. Our experimental results show that our heuristic consistently outperforms random placement by 2x in bandwidth gain. We quantify the benefits of our heuristic on a real live video-streaming service, and demonstrate that video chunk losses decrease significantly, attaining a 37% decrease in the packet loss rate. Further, using a popular Web 2.0 service, we demonstrate that the client response times decrease up to an order of magnitude when using our heuristic. Since these improvements translate into the QoE (Quality of Experience) perceived by the user, our results are relevant for contributing to higher QoE, a crucial parameter for using services from volunteer-based systems and adopting CN micro-clouds as an ecosystem for service deployment.
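
The core idea, choosing a placement informed by network state rather than at random, can be sketched as a greedy heuristic. The toy topology, the hop-count path choice and the bottleneck-bandwidth scoring below are simplifying assumptions, not the paper's exact algorithm.

```python
# Sketch of a state-informed service placement heuristic: place the service
# on the node whose worst-case client bandwidth is highest. The topology,
# bandwidth model and scoring are illustrative assumptions.
import networkx as nx

G = nx.Graph()
G.add_weighted_edges_from([  # (u, v, available bandwidth in Mbps)
    ("n1", "n2", 50), ("n2", "n3", 10), ("n1", "n4", 80), ("n4", "n3", 40),
], weight="bw")

def bottleneck_bw(G, src, dst):
    """Path bandwidth = smallest link capacity along it (the shortest hop
    path is used here as a simple approximation of the widest path)."""
    path = nx.shortest_path(G, src, dst)
    return min(G[u][v]["bw"] for u, v in zip(path, path[1:]))

def place(G, clients):
    """Pick the candidate node maximizing the worst-case client bandwidth."""
    return max(G.nodes, key=lambda n: min(
        bottleneck_bw(G, n, c) for c in clients if c != n))

print(place(G, clients=["n2", "n3"]))
```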

40 citations


Journal ArticleDOI
TL;DR: The proposed algorithm is more efficient, faster, and less complex, yields improved results, and outperforms some of the best techniques used for mammogram classification based on Sensitivity, Specificity, Accuracy, and Area Under the ROC Curve.
Abstract: Widespread use of electronic health records is a major cause of massive datasets that ultimately result in Big Data. Computer-aided systems for healthcare can be an effective tool to automatically process such big data. Breast cancer is one of the major causes of the high mortality rate among women in the world, since it is difficult to detect due to the lack of early symptoms. There are a number of techniques and advanced technologies available to detect breast tumors nowadays. One of the common approaches for breast tumor detection is mammography. The similarity between normal (unaffected) tissues and masses (affected tissues) is often so high that it leads to false positives (FP). In the field of medicine, the cost of false positives is very high because they result in false diagnoses and can lead to serious consequences. Therefore, it is a challenge for researchers to correctly distinguish between normal and affected tissues to increase the detection accuracy. Existing approaches apply a Gabor filter bank for feature extraction to the entire input image, which yields poor results. The proposed system optimizes the Gabor filter bank to select the most appropriate Gabor filters using a metaheuristic algorithm known as "Cuckoo Search". The proposed algorithm is run over sub-images in order to extract more descriptive features. Moreover, feature subset selection is used to reduce the feature size, because features extracted from the segmented region of interest are high-dimensional and cannot be handled easily. The resulting algorithm is more efficient, faster, and less complex, and yields improved results. The proposed method is tested on 2000 mammograms taken from the DDSM database and outperforms some of the best techniques used for mammogram classification based on Sensitivity, Specificity, Accuracy, and Area Under the ROC Curve.
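
A rough sketch of how Cuckoo Search can drive Gabor filter selection is given below. The parameter ranges, the Lévy-flight approximation and the fitness function (a stand-in for a real classifier score on extracted features) are all illustrative assumptions.

```python
# Minimal Cuckoo Search sketch for selecting Gabor filter parameters
# (wavelength, orientation). The fitness function is a stand-in for a real
# classifier score on extracted features; all constants are assumptions.
import math, random

def fitness(params):
    wavelength, theta = params
    # Placeholder: reward mid-range wavelengths and ~45 degree orientations.
    return -((wavelength - 8.0) ** 2) - ((theta - math.pi / 4) ** 2)

def levy_step(scale=0.1):
    # Heavy-tailed step, a common simple stand-in for a Levy flight.
    return scale / ((random.random() + 1e-12) ** 1.5) * random.choice((-1, 1))

def cuckoo_search(n_nests=15, iters=200, pa=0.25):
    nests = [(random.uniform(2, 16), random.uniform(0, math.pi))
             for _ in range(n_nests)]
    for _ in range(iters):
        w, t = nests[random.randrange(n_nests)]
        new = (min(16, max(2, w + levy_step())), (t + levy_step()) % math.pi)
        j = random.randrange(n_nests)          # random nest to compare against
        if fitness(new) > fitness(nests[j]):
            nests[j] = new
        # Abandon a fraction pa of the worst nests, rebuilding them randomly.
        nests.sort(key=fitness)
        for k in range(int(pa * n_nests)):
            nests[k] = (random.uniform(2, 16), random.uniform(0, math.pi))
    return max(nests, key=fitness)

print(cuckoo_search())
```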

39 citations


Journal ArticleDOI
TL;DR: Two IoT-aware multi-resource task scheduling algorithms for heterogeneous cloud environments, namely main resource load balancing and time balancing, are proposed to obtain better load balance, Service-Level Agreement (SLA) compliance and IoT task response time, while reducing energy consumption as much as possible.
Abstract: Cloud computing and the Internet of Things (IoT) are two of the most important technologies that have significantly changed human life. However, with the growing prevalence of the Cloud-IoT paradigm, load imbalance and higher SLA violation rates lead to more resource wastage and energy consumption. Although there are many studies of Cloud-IoT from the perspective of the offloading side, few of them have focused on how the offloaded workloads are handled in the Cloud. This paper proposes two IoT-aware multi-resource task scheduling algorithms for heterogeneous cloud environments, namely main resource load balancing and time balancing. The algorithms aim to obtain better results in terms of load balance, Service-Level Agreement (SLA) compliance and IoT task response time, and meanwhile to reduce energy consumption as much as possible. Both are devised to assign a single task to a properly selected Virtual Machine (VM) each time. Tasks placed in a pre-processed queue are assigned sequentially, and the VM selection rule is based on the newly introduced notions of relative load and relative time cost. Besides, two customized parameters that influence the result of task pre-processing are provided for users or administrators to flexibly control the behavior of the algorithms. According to the experiments, main resource load balancing performs well in terms of SLA and load balance, while time balancing is good at saving time and energy. Besides, both of them perform well in IoT task response time.
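
The relative-load idea can be illustrated with a short sketch: each task goes to the VM whose dominant (main) resource would be least loaded after placement. Resource names, capacities and the pre-processing order are illustrative assumptions.

```python
# Sketch of "relative load" VM selection: assign each task to the VM whose
# most stressed resource would be least loaded after placement.
# Resource names and capacities are illustrative assumptions.

vms = [
    {"name": "vm1", "cpu": 8.0, "mem": 16.0, "used_cpu": 5.0, "used_mem": 4.0},
    {"name": "vm2", "cpu": 4.0, "mem": 32.0, "used_cpu": 1.0, "used_mem": 20.0},
]

def relative_load(vm, task):
    """Load of the VM's dominant resource if the task were placed on it."""
    cpu = (vm["used_cpu"] + task["cpu"]) / vm["cpu"]
    mem = (vm["used_mem"] + task["mem"]) / vm["mem"]
    return max(cpu, mem)      # the main (dominant) resource defines the load

def schedule(tasks, vms):
    # Pre-process: order tasks by total demand (an assumed heuristic order).
    for task in sorted(tasks, key=lambda t: -(t["cpu"] + t["mem"])):
        vm = min(vms, key=lambda v: relative_load(v, task))
        vm["used_cpu"] += task["cpu"]
        vm["used_mem"] += task["mem"]
        print(f'{task["id"]} -> {vm["name"]}')

schedule([{"id": "t1", "cpu": 1.0, "mem": 2.0},
          {"id": "t2", "cpu": 2.0, "mem": 1.0}], vms)
```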

36 citations


Journal ArticleDOI
TL;DR: This paper proposes DeepSeq – a deep learning architecture that utilizes only the protein sequence information to predict its associated functions and discusses how the same architecture can be used to solve even more complicated problems such as prediction of 2D and 3D structure as well as protein-protein interactions.
Abstract: Accurate annotation of protein functions is important for a profound understanding of molecular biology. A large number of proteins remain uncharacterized because of the sparsity of available supporting information. For a large set of uncharacterized proteins, the only type of information available is their amino acid sequence. This motivates the need to develop sequence-based computational techniques that can precisely annotate uncharacterized proteins. In this paper, we propose DeepSeq – a deep learning architecture – that utilizes only the protein sequence information to predict its associated functions. The prediction process does not require handcrafted features; rather, the architecture automatically extracts representations from the input sequence data. Results of our experiments with DeepSeq indicate significant improvements in terms of prediction accuracy when compared with other sequence-based methods. Our deep learning model achieves an overall validation accuracy of 86.72%, with an F1 score of 71.13%. We achieved improved results for the protein function prediction problem through DeepSeq by utilizing sequence information only. Moreover, using the automatically learned features and without any changes to DeepSeq, we successfully solved a different problem, i.e. protein function localization, with no human intervention. Finally, we discuss how the same architecture can be used to solve even more complicated problems such as prediction of 2D and 3D structure as well as protein-protein interactions.
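
As a rough illustration of a sequence-only predictor, the PyTorch sketch below embeds amino acids and applies a 1D convolution with global max pooling; the layer sizes, the number of function labels and the architecture details are assumptions, not DeepSeq's actual topology.

```python
# Minimal sketch of a sequence-only protein function predictor in PyTorch.
# Layer sizes, the function-label count and the architecture are assumptions.
import torch
import torch.nn as nn

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"          # 20 residues, index 0 = padding
N_FUNCTIONS = 128                              # assumed number of GO terms

class SeqFunctionNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(len(AMINO_ACIDS) + 1, 32, padding_idx=0)
        self.conv = nn.Conv1d(32, 64, kernel_size=9, padding=4)
        self.head = nn.Linear(64, N_FUNCTIONS)

    def forward(self, x):                      # x: (batch, seq_len) residue ids
        h = self.embed(x).transpose(1, 2)      # (batch, 32, seq_len)
        h = torch.relu(self.conv(h))
        h = h.max(dim=2).values                # global max pool over positions
        return self.head(h)                    # multi-label function logits

def encode(seq, max_len=512):
    ids = [AMINO_ACIDS.index(a) + 1 for a in seq[:max_len]]
    return torch.tensor(ids + [0] * (max_len - len(ids))).unsqueeze(0)

model = SeqFunctionNet()
logits = model(encode("MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"))
probs = torch.sigmoid(logits)                  # independent per-function scores
```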

35 citations


Journal ArticleDOI
TL;DR: A score-based edge service scheduling algorithm that evaluates network, compute, and reliability capabilities of edge nodes and outputs the maximum scoring mapping between resources and services with regard to four critical aspects of service quality is proposed.
Abstract: Latency-sensitive and data-intensive applications, such as IoT or mobile services, are leveraged by Edge computing, which extends the cloud ecosystem with distributed computational resources in proximity to data providers and consumers. This brings significant benefits in terms of lower latency and higher bandwidth. However, by definition, edge computing has limited resources with respect to cloud counterparts; thus, there exists a trade-off between proximity to users and resource utilization. Moreover, service availability is a significant concern at the edge of the network, where extensive support systems as in cloud data centers are not usually present. To overcome these limitations, we propose a score-based edge service scheduling algorithm that evaluates network, compute, and reliability capabilities of edge nodes. The algorithm outputs the maximum scoring mapping between resources and services with regard to four critical aspects of service quality. Our simulation-based experiments on live video streaming services demonstrate significant improvements in both network delay and service time. Moreover, we compare edge computing with cloud computing and content delivery networks within the context of latency-sensitive and data-intensive applications. The results suggest that our edge-based scheduling algorithm is a viable solution for high service quality and responsiveness in deploying such applications.
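
One way to realize a maximum-scoring mapping between services and edge nodes is the Hungarian algorithm over a composite score matrix, as sketched below; the score components and weights are illustrative assumptions rather than the paper's exact scoring.

```python
# Sketch of a score-based service-to-node mapping. Each node is scored on
# network, compute and reliability; the weights and the use of the Hungarian
# algorithm to maximize the total score are illustrative assumptions.
import numpy as np
from scipy.optimize import linear_sum_assignment

# rows = services, cols = edge nodes; each entry is a score in [0, 1]
network = np.array([[0.9, 0.4, 0.6], [0.5, 0.8, 0.3]])
compute = np.array([[0.7, 0.9, 0.2], [0.6, 0.5, 0.9]])
reliability = np.array([[0.8, 0.6, 0.9], [0.9, 0.7, 0.4]])

score = 0.4 * network + 0.4 * compute + 0.2 * reliability

# linear_sum_assignment minimizes cost, so negate the scores to maximize.
rows, cols = linear_sum_assignment(-score)
for service, node in zip(rows, cols):
    print(f"service {service} -> node {node} (score {score[service, node]:.2f})")
```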

31 citations


Journal ArticleDOI
TL;DR: This work proposes a novel distributed protocol for a face recognition system that exploits the computational capabilities of the surveillance devices (i.e. cameras) to perform the recognition of the person.
Abstract: Video surveillance systems have become an indispensable tool for the security and organization of public and private areas. Most of the current commercial video surveillance systems rely on a classical client/server architecture to perform face and object recognition. In order to support the more complex and advanced video surveillance systems proposed in recent years, companies are required to invest resources in maintaining the servers dedicated to the recognition tasks. In this work, we propose a novel distributed protocol for a face recognition system that exploits the computational capabilities of the surveillance devices (i.e. cameras) to perform the recognition of the person. The cameras fall back to a centralized server if their hardware capabilities are not sufficient to perform the recognition. In order to evaluate the proposed algorithm we simulate and test the 1NN and weighted kNN classification algorithms via extensive experiments on a freely available dataset. As a prototype of surveillance devices we have considered Raspberry Pi boards. By means of simulations, we show that our algorithm is able to remove up to 50% of the load from the server with no negative impact on the quality of the surveillance service.
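
The on-camera matching with server fallback can be sketched as follows; the feature vectors, the inverse-distance weighting and the confidence threshold are illustrative assumptions.

```python
# Sketch of on-camera weighted kNN face matching with a server fallback.
# The confidence threshold and distance model are illustrative assumptions.
import math

def weighted_knn(query, gallery, k=3):
    """gallery: list of (feature_vector, label). Weight = inverse distance."""
    neighbors = sorted(gallery, key=lambda it: math.dist(query, it[0]))[:k]
    votes = {}
    for feat, label in neighbors:
        votes[label] = votes.get(label, 0.0) + 1.0 / (math.dist(query, feat) + 1e-9)
    label = max(votes, key=votes.get)
    return label, votes[label] / sum(votes.values())

def send_to_server(query):          # stub standing in for the remote call
    return "resolved-by-server"

def recognize(query, gallery, threshold=0.6):
    label, confidence = weighted_knn(query, gallery)
    if confidence >= threshold:
        return label                # resolved locally on the camera
    return send_to_server(query)    # fall back to the centralized server

gallery = [((0.1, 0.2), "alice"), ((0.9, 0.8), "bob"), ((0.15, 0.25), "alice")]
print(recognize((0.12, 0.22), gallery))
```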

27 citations


Journal ArticleDOI
TL;DR: The article discusses resource and application monitoring, resource management, and data forecasting from both performance and architectural perspectives of enterprise systems, together with novel trends, including cloud elasticity and artificial intelligence-based load prediction algorithms.
Abstract: Today, enterprise applications impose more and more resource requirements to support an ascending number of clients and to deliver them an acceptable Quality of Service (QoS). To ensure such requirements are met, it is essential to apply appropriate resource and application monitoring techniques. Such techniques collect data to enable predictions and actions which can offer better system performance. Typically, system administrators need to consider different data sources and establish the relationships among them by themselves. To address these gaps, and considering the context of general network-based systems, we propose a survey that combines a discussion of system monitoring, data prediction, and resource management procedures in a unified view. The article discusses resource and application monitoring, resource management, and data forecasting from both performance and architectural perspectives of enterprise systems. Our idea is to describe consolidated subjects such as monitoring metrics and resource scheduling, together with novel trends, including cloud elasticity and artificial intelligence-based load prediction algorithms. This survey links the aforesaid three pillars, emphasizing relationships among them and also pointing out opportunities and research challenges in the area.

27 citations


Journal ArticleDOI
TL;DR: A novel reservation plan adaptation system based on machine learning that allows the updating of a reservation plan initially prepared by an administrator and makes it possible to adapt reservation plans one or more weeks ahead.
Abstract: In this paper we propose a novel reservation plan adaptation system based on machine learning. In the context of cloud auto-scaling, an important issue is the ability to define and use a resource reservation plan, which enables efficient resource scheduling. If necessary, the plan may allocate new resources upon reservation where a sufficient amount of resources is available. Our solution allows the updating of a reservation plan initially prepared by an administrator. It makes it possible to adapt reservation plans one or more weeks ahead. Hence, it allows time for the administrator to analyze the plan and discover potential problems with resource under-provisioning or over-provisioning, which may prevent server overload in the former case and unnecessary expenses in the latter. It also makes it possible to extract and analyze the knowledge learned, which may provide useful information about resource usage characteristics. The proposed solution is tested on OpenStack using real Wikipedia server traffic data. Experimental results demonstrate that machine learning enables an improvement in resource usage.
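
A minimal sketch of the underlying idea, learning a week-ahead reservation plan from historical load, is shown below; the hour-of-week feature, the gradient-boosting regressor and the headroom margin are assumptions, not the paper's actual model.

```python
# Sketch of learning a reservation plan one week ahead from historical load.
# Feature choice (hour-of-week), the regressor and the safety margin are
# illustrative assumptions.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)
hours = np.arange(4 * 7 * 24)                     # four weeks of hourly samples
load = 100 + 40 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 5, hours.size)

X = (hours % (7 * 24)).reshape(-1, 1)             # position within the week
model = GradientBoostingRegressor().fit(X, load)

next_week = np.arange(7 * 24).reshape(-1, 1)
predicted = model.predict(next_week)
headroom = 1.15                                   # safety margin (assumption)
reservation_plan = np.ceil(predicted * headroom)  # resources to reserve per hour
print(reservation_plan[:24])                      # plan for the first day
```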

Journal ArticleDOI
TL;DR: The experimental results show that the efficient job scheduling approach can effectively reduce the job response time and improve the throughput of the cluster, and the task scheduling method can reduce the response time of tasks, improve the QoS satisfaction rate and minimize the cost of the public cloud.
Abstract: With the advent of the era of big data, many companies have taken their most important steps into the hybrid cloud to handle large amounts of data. In a hybrid cloud environment, cloud burst technology enables applications to be processed at a lower cost in a private cloud and to burst into the public cloud when the resources of the private cloud are exhausted. However, there are many challenges in a hybrid cloud environment, such as heterogeneous jobs, different cloud providers, and how to deploy a new application with minimum monetary cost. In this paper, an efficient job scheduling approach for heterogeneous workloads in the private cloud is proposed to ensure high resource utilization. Moreover, a task scheduling method based on a BP neural network in the hybrid cloud is proposed to ensure that tasks can be completed within the deadline specified by the user. The experimental results show that the efficient job scheduling approach can effectively reduce the job response time and improve the throughput of the cluster. The task scheduling method can reduce the response time of tasks, improve the QoS satisfaction rate and minimize the cost of the public cloud.

Journal ArticleDOI
TL;DR: This paper is a systematic analysis of the usage of Bitcoin metadata over the years; it discusses all the known techniques to embed metadata in the Bitcoin blockchain, then extracts metadata and analyses them from different angles.
Abstract: Besides recording transfers of currency, the Bitcoin blockchain is being used to save metadata — i.e. arbitrary pieces of data which do not affect transfers of bitcoins. This can be done by using different techniques, and for different purposes. For instance, a growing number of protocols embed metadata in the blockchain to certify and transfer the ownership of a variety of assets beyond cryptocurrency. A point of debate in the Bitcoin community is whether metadata negatively impact the effectiveness of Bitcoin with respect to its primary function. This paper is a systematic analysis of the usage of Bitcoin metadata over the years. We discuss all the known techniques to embed metadata in the Bitcoin blockchain; we then extract metadata, and analyse them from different angles.
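
The most common embedding technique is the OP_RETURN opcode, which marks an output as provably unspendable and lets it carry an arbitrary payload. The sketch below extracts such a payload from an output script; the example script is fabricated for illustration.

```python
# Sketch of extracting OP_RETURN metadata from a Bitcoin output script.
# OP_RETURN (0x6a) marks an unspendable output carrying arbitrary data;
# the example script below is fabricated.

OP_RETURN = 0x6a

def extract_metadata(script_hex):
    script = bytes.fromhex(script_hex)
    if not script or script[0] != OP_RETURN:
        return None                       # not a metadata-carrying output
    # Handle the common single-push case: <OP_RETURN> <pushlen> <data>.
    if len(script) >= 2 and 1 <= script[1] <= 75:
        return script[2:2 + script[1]]
    return script[1:]                     # fallback: raw payload

payload = extract_metadata("6a0b68656c6c6f20776f726c64")   # pushes "hello world"
print(payload.decode())
```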

Journal ArticleDOI
TL;DR: This paper proposes a complex, semi-simulation environment that aims to provide a solution for these IoT challenges: an Android-based, mobile IoT device simulator called MobIoTSim, together with a customizable cloud gateway to manage these devices by receiving, processing and visualizing sensor data coming from MobIoTSim.
Abstract: The Internet of Things (IoT) is the latest trend of the current ICT evolution, represented by a huge number of powerful smart devices that have started to appear on the Internet. Responding to this new trend, many cloud providers have started to offer services for IoT management. Recent advances have already shown that cloud computing can be used to serve IoT needs by performing data generation, processing and visualization tasks. In this currently forming ecosystem, IoT system developers need to purchase, connect and configure these devices, and they also have to choose the right infrastructure provider offering the combination of protocols and data structures fitting their applications. In this paper, we propose a complex, semi-simulation environment that aims to provide a solution for these IoT challenges. Our main contribution is the design of an Android-based, mobile IoT device simulator called MobIoTSim. We also propose a customizable cloud gateway to manage these devices by receiving, processing and visualizing sensor data coming from MobIoTSim. To stay as close as possible to real-world applications, we created an IoT trace archive service called SUMMON, which gathers real-world sensor data for use by MobIoTSim. Finally, we demonstrate how to create IoT applications utilizing numerous IoT devices with this environment, and evaluate the device management scalability and responsiveness of its components.

Journal ArticleDOI
TL;DR: A decentralised end-to-end voting platform (from voter to candidate) based on block-chain technology, which studies and exploits both the non-permissioned ledger of Bitcoin and the MultiChain permissioned public ledger.
Abstract: We propose a decentralised end-to-end voting platform (from voter to candidate) based on block-chain technology. In particular, we study and exploit both the non-permissioned ledger of Bitcoin and the MultiChain permissioned ledger. We describe the main architectural choices behind the two implementations, including the pre-voting and post-voting phases. Similar approaches are not as decentralised as our application, where it is possible to cast a vote directly to the block-chain, without any intermediate level. Benefits and drawbacks of each implementation are explained. The Bitcoin block-chain consists of a large number of already available nodes in the related peer-to-peer network; moreover, its reliability and resistance to attacks are also well established. With MultiChain we instead exploit a fine-grained permission system: MultiChain is a permissioned public ledger. Hence, with it we can also satisfy two more properties of end-to-end voting systems: uncoercibility and receipt-freeness, and data confidentiality and neutrality. Moreover, we can avoid the costs and price fluctuations related to Bitcoin.

Journal ArticleDOI
TL;DR: This work is the first of its kind to enhance text according to the emotional state detected from EEG brainwaves, relieving an individual from thinking of and typing words, which can sometimes be a complicated procedure.
Abstract: Often people might not be able to express themselves properly on social media, for example by not being able to think of words representative of their emotional state. In this paper, we propose an end-to-end system which aims to enhance a user-input sentence according to his/her current emotional state. It works by a) detecting the emotion of the user and b) enhancing the input sentence by inserting emotive words to make the sentence more representative of the emotional state of the user. The emotional state of the user is recognized by analyzing Electroencephalogram (EEG) signals from the brain. For text enhancement, we modify the words corresponding to the detected emotion using a correlation finder scheme. Next, the correctness of the sentence is verified using a Long Short Term Memory (LSTM) Networks based Language Modeling framework. An accuracy of 74.95% has been recorded for the classification of five emotional states in a dataset of 25 participants using EEG signals. Similarly, promising results have been obtained for the text enhancement task and the overall end-to-end system. To the best of our knowledge, this work is the first of its kind to enhance text according to the emotional state detected from EEG brainwaves. The system also relieves an individual from thinking of and typing words, which can sometimes be a complicated procedure.
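
The enhancement step can be sketched as generating candidate insertions of emotive words and keeping the variant the language model scores highest. The emotion lexicon and the toy bigram scorer below are stand-ins for the paper's EEG classifier and LSTM language model.

```python
# Sketch of the enhancement step: try inserting emotive words for the detected
# emotion at every position and keep the variant the language model scores
# highest. The lexicon and toy bigram scorer are illustrative stand-ins.

EMOTIVE = {"happy": ["wonderful", "great"], "sad": ["gloomy", "dreary"]}
BIGRAMS = {("a", "wonderful"): 3, ("wonderful", "day"): 3,
           ("a", "great"): 2, ("great", "day"): 2}

def lm_score(words):
    """Toy bigram plausibility score (the real system uses an LSTM LM)."""
    return sum(BIGRAMS.get(pair, 0) for pair in zip(words, words[1:]))

def enhance(sentence, emotion):
    words = sentence.split()
    best, best_score = sentence, float("-inf")
    for adj in EMOTIVE[emotion]:
        for i in range(len(words) + 1):
            candidate = words[:i] + [adj] + words[i:]
            score = lm_score(candidate)
            if score > best_score:
                best, best_score = " ".join(candidate), score
    return best

print(enhance("what a day at the beach", "happy"))
# -> "what a wonderful day at the beach"
```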

Journal ArticleDOI
TL;DR: The dynamic profile questions were more usable than both the text-based and image-based questions, and a response time factor may be implemented to identify and report impersonation attacks.

Journal ArticleDOI
TL;DR: A new multicast scheme for (m,k)-firm streams is proposed to deliver data packets to group members and achieve a longer network lifetime than existing schemes.
Abstract: As a data source for Big Data, wireless sensor networks play a great role in data collection and dissemination. In particular, real-time communication remains one of the crucial research challenges because of the high complexity and severe networking requirements on resource-constrained nodes in wireless sensor networks. Moreover, current schemes assume a general traffic model for real-time delivery, so they lack adaptability. To solve this problem, a few routing protocols have been designed to accommodate a newer real-time model, (m,k)-firm, which is regarded as the most applicable scheme for event-based as well as query-based applications in wireless sensor networks. However, since current schemes for (m,k)-firm streams support unicast communications only, they cannot be applied to multicast communications where many group members are willing to receive data packets from the sink node. To overcome this problem, we propose a new multicast scheme for (m,k)-firm streams to deliver data packets to group members. To construct a multicast tree, different types of overlay trees are constructed according to the distance based priority (DBP) value. Simulation results prove that the proposed scheme can meet the (m,k)-firm requirement and achieves a longer network lifetime than existing schemes.
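
The DBP value underlying the scheme is the classic distance-based priority for (m,k)-firm streams: how close a stream's recent deadline history is to violating its (m,k) constraint. A minimal sketch follows; the example history and constraint are illustrative.

```python
# Sketch of the distance-based priority (DBP) value used to rank (m,k)-firm
# streams (Hamdaoui & Ramanathan): how many consecutive deadline misses a
# stream can still tolerate before violating its (m,k) constraint.

def dbp(history, m, k):
    """history: the last k deadline outcomes, most recent first (True = met)."""
    met_positions = [i + 1 for i, ok in enumerate(history[:k]) if ok]
    if len(met_positions) < m:
        return 0                    # already in a dynamic failure state
    return k - met_positions[m - 1] + 1

# A (3,5)-firm stream whose last five deadlines were met/missed/met/met/missed:
print(dbp([True, False, True, True, False], m=3, k=5))   # -> 2

# Streams (or multicast branches) with a smaller DBP are closer to failure
# and therefore receive higher priority.
```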

Journal ArticleDOI
TL;DR: The aim is to provide a general interface for supporting programmable scaling policies utilizing monitoring metrics originating from infrastructure, application or any external components to implement practically any possible business logic for a given application.
Abstract: With the increasing utilization of cloud computing and container technologies, orchestration is becoming an important area on both cloud and container levels. Beyond resource allocation, deployment and configuration, scaling is a key functionality in orchestration in terms of policy, description and flexibility. This paper presents an approach where the aim is to provide a high degree of flexibility in terms of available monitoring metrics and in terms of the definition of elasticity rules to implement practically any possible business logic for a given application. The aim is to provide a general interface for supporting programmable scaling policies utilizing monitoring metrics originating from infrastructure, application or any external components. The paper introduces a component, called Policy Keeper performing the auto-scaling based on user-defined rules, details how this component is operating in the auto-scaling framework, called MiCADO and demonstrates a deadline-based scaling use case.
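
A programmable scaling rule can be sketched as a user-supplied expression over monitoring metrics that yields a target replica count, as below; the rule format and metric names are illustrative assumptions, not the actual Policy Keeper syntax.

```python
# Sketch of programmable scaling-rule evaluation: a user-defined rule is an
# expression over monitoring metrics (from infrastructure, application or
# external sources) that yields a target replica count. The rule format and
# metric names are illustrative assumptions.

def evaluate_policy(rule, metrics, current_replicas, lo=1, hi=10):
    env = dict(metrics, replicas=current_replicas)
    target = eval(rule, {"__builtins__": {}}, env)   # sandboxed expression
    return max(lo, min(hi, int(target)))             # clamp to allowed range

metrics = {"cpu_avg": 0.85, "queue_length": 120, "deadline_seconds": 600}
rule = "replicas + 1 if cpu_avg > 0.8 or queue_length > 100 else replicas"
print(evaluate_policy(rule, metrics, current_replicas=3))   # -> 4
```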

Journal ArticleDOI
TL;DR: This paper evaluates two different dynamic community discovery classes in DOSNs, proves that the social graph has high instability so that distributed solutions to manage the dynamism are needed, and shows that the Temporal Trade-off class is the most promising one.
Abstract: The community structure is one of the most studied features of Online Social Networks (OSNs). Community detection guarantees several advantages for both centralized and decentralized social networks. Decentralized Online Social Networks (DOSNs) have been proposed to provide more control over private data. Several challenges in DOSNs can be faced by exploiting communities. The detection of communities and the management of their evolution is a hard process, especially in highly dynamic environments where churn is a real problem. In this paper, we focus our attention on the analysis of dynamic community detection in DOSNs by studying a real Facebook dataset. We evaluate two different dynamic community discovery classes to understand which of them can be applied to a distributed environment. Results prove that the social graph has high instability, so that distributed solutions to manage the dynamism are needed, and show that the Temporal Trade-off class is the most promising one.

Journal ArticleDOI
TL;DR: Experimental results show the effectiveness of the hybrid simulation-optimization approach for optimizing the number of allocated virtual machines and the scheduling of tasks regarding performability.
Abstract: Given the characteristics of dynamic provisioning and illusion of unlimited resources, clouds are becoming a popular alternative for running scientific workflows. In a cloud system for processing workflow applications, the system’s performance is heavily influenced by two factors: the scheduling strategy and failure of components. Failures in a cloud system can simultaneously affect several users and depreciate the number of available computing resources. A bad scheduling strategy can increase the expected makespan and the idle time of physical machines. In this paper, we propose an optimization method for the scheduling of scientific workflows on cloud systems. The method comprises the use of a meta-heuristic algorithm coupled to a performability model that provides the fitnesses of explored solutions. For being able to represent the combined effect of scheduling and component failures, we adopted discrete event simulation for the performability model. Experimental results show the effectiveness of the hybrid simulation-optimization approach for optimizing the number of allocated virtual machines and the scheduling of tasks regarding performability.
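
The coupling can be sketched as a metaheuristic that proposes task-to-VM mappings while a stochastic simulation of failures supplies the fitness. The hill climber, failure probabilities and durations below are illustrative stand-ins for the paper's meta-heuristic and discrete event simulation.

```python
# Sketch of the hybrid simulation-optimization idea: a metaheuristic proposes
# task-to-VM schedules and a stochastic failure simulation returns the fitness
# (expected makespan). All rates and durations are illustrative assumptions.
import random

TASKS = [4, 2, 7, 3, 5]            # task durations
N_VMS = 2
FAIL_PROB = 0.05                   # chance a VM fails during a task (assumed)
REPAIR_TIME = 6

def simulate(schedule, runs=200):
    """Estimate expected makespan of a task->VM mapping under random failures."""
    total = 0.0
    for _ in range(runs):
        finish = [0.0] * N_VMS
        for task, vm in zip(TASKS, schedule):
            penalty = REPAIR_TIME if random.random() < FAIL_PROB else 0
            finish[vm] += task + penalty
        total += max(finish)
    return total / runs

def hill_climb(iters=300):
    best = [random.randrange(N_VMS) for _ in TASKS]
    best_fit = simulate(best)
    for _ in range(iters):
        cand = best[:]
        cand[random.randrange(len(cand))] = random.randrange(N_VMS)
        fit = simulate(cand)
        if fit < best_fit:          # keep schedules with lower expected makespan
            best, best_fit = cand, fit
    return best, best_fit

print(hill_climb())
```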

Journal ArticleDOI
TL;DR: An integration model of the Grid and the Cloud is proposed using the HTCondor batch system and the NorduGrid ARC middleware to enable batch job execution in any public or private Cloud by deploying a virtualized Grid cluster using the ARC middleware, a PaaS model for running Grid applications.
Abstract: Scientific computing has evolved considerably in recent years. Scientific applications have become more complex and require an increasing number of computing resources to perform on a large scale. Grid computing has become widely used and is the chosen infrastructure for many scientific calculations and projects, even though it demands a steep learning curve. The computing and storage resources in the Grid are limited, heterogeneous and often overloaded. This heterogeneity is not only present in the hardware setups, but also in the software composition, where configuration permissions are limited. It also has a negative effect on the portability of scientific applications. The use of Cloud resources could eliminate those constraints. In the Cloud, resources are provisioned on demand and can be scaled up and down, while scientists can easily customize their execution environments in the form of virtual machines. Extending the Grid with Cloud resources would improve the utilization of shared resources and would enable the use of additional resources when the Grid resources are overloaded – known as Cloud bursting. We propose an integration model of the Grid and the Cloud using the HTCondor batch system and the NorduGrid ARC middleware. This model enables batch job execution in any public or private Cloud by deploying a virtualized Grid cluster using the ARC middleware, a PaaS model for running Grid applications. An evaluation of the virtual Grid cluster was made and compared with the physical one by running NAMD simulations.

Journal ArticleDOI
TL;DR: A three-phase Dynamic Data Replication Algorithm (DDRA) for deploying resources has been proposed to improve the efficiency of information duplication in cloud storage systems; it can enhance availability, access efficiency and load balancing in hierarchical cloud computing environments.
Abstract: The rapid advancement of network technology has changed the way the world operates and has produced a large number of network application services for users. In order to provide more convenient services, network service providers need to provide a more stable and higher-capacity system. Therefore, cloud computing technology has been developed over the recent decade. Network service providers can reduce the cost of cloud computing services by using virtualization and data replication techniques. Besides, an efficient information duplication strategy is necessary to reduce the workload and enhance the capability of the system. Therefore, a three-phase Dynamic Data Replication Algorithm (DDRA) for deploying resources is proposed in this paper to improve the efficiency of information duplication in cloud storage systems. In the first two phases, the proposed algorithm determines suitable service nodes to balance the workload according to the service nodes' workloads. In the third phase, a dynamic duplication deployment scheme is designed to achieve higher access performance and better load balancing between service nodes for the overall environment. As a result, the proposed algorithm can enhance availability, access efficiency and load balancing in hierarchical cloud computing environments.

Journal ArticleDOI
TL;DR: This research analyzed public views, sentiments and opinions shared on social media about a democratic participatory activity called Azadi-March, which was held in Pakistan with participation of online users from all over the world.
Abstract: The growing public endorsement of social media has changed public life dramatically. Public views and suggestions have now become important for both organizations and individuals. Big data scientists and data mining analysts are increasingly moving their attention toward sentiment analysis because of the growing rate of user-generated content over microblogging sites. Sentiment analysis is a research field related to computationally identifying public views, feelings, recommendations, opinions and sentiments about focused entities. The research literature shows traces of research work on product and movie reviews for better decision making using big data analysis. Big data analytics offer remarkable opportunities to individuals as well as organizations by providing proficient decision-making frameworks and improved forecasting models. Sociopolitical collaboration has gained much attention from online users over the past few years. In this research we analyzed public views, sentiments and opinions shared on social media about a democratic participatory activity called Azadi-March, which was held in Pakistan with participation of online users from all over the world. We carried out computational semantic orientation on public tweets for analyzing public awareness and the effects of online communication through social media on real-world public decision making. We employed an unsupervised approach for the identification and scoring of tweets. We used a lexicon-based approach in which annotated lexica are used for scoring verbs, adverbs and other parts of speech. A corpus is used for scoring adjectives and informal opinion indicators. Emoji, exclamatory statements and other additional features are incorporated for supplementary analysis. We noticed that emoticons and NetLingo play a significant role in sentiment orientation. Opinion groups are generated from all retrieved tweets and the aggregate sentiment weights of opinion groups are computed. The findings of this study indicate that our proposed lexicon-based approach outperforms contemporary machine learning techniques by achieving 86% average accuracy at sentence-level sentiment analysis.
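
A minimal sketch of lexicon-based scoring of this kind is shown below; the lexicon entries, emoticon weights and exclamation bonus are tiny illustrative assumptions, not the annotated lexica used in the study.

```python
# Sketch of lexicon-based tweet scoring: words and emoticons are looked up in
# annotated lexica and a tweet's polarity is the aggregate score. The lexicon
# entries and weights below are tiny illustrative assumptions.

LEXICON = {"love": 2.0, "great": 1.5, "freedom": 1.0,
           "hate": -2.0, "corrupt": -1.5}
EMOTICONS = {":)": 1.0, ":(": -1.0}

def score_tweet(tweet):
    score = 0.0
    for token in tweet.lower().split():
        score += LEXICON.get(token.strip(".,"), 0.0)
        score += EMOTICONS.get(token, 0.0)
    score += 0.5 * tweet.count("!")   # exclamations as opinion indicators
    return score

tweets = ["Great march for freedom !", "Corrupt system , we hate it :("]
for t in tweets:
    polarity = "positive" if score_tweet(t) > 0 else "negative"
    print(f"{score_tweet(t):+.1f} {polarity}: {t}")
```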

Journal ArticleDOI
TL;DR: Three user groups are identified: Privacy Guardians are highly concerned and guard their privacy carefully, Privacy Cynics are moderately concerned but feel powerless, and Privacy Pragmatists are less concerned and trade privacy for benefits.
Abstract: With our lives being increasingly digital, most users are concerned about their online privacy. Still, many users provide manifold data online and show no protection behaviors. Research has found different explanations for this privacy paradox: users perform a privacy calculus (weighing benefits and concerns about data sharing), make affective and inconsiderate decisions, or are overtaxed by the complexity of privacy protection practices. Complementing these theories, we hypothesize that different user types approach privacy differently. In interviews and focus groups (N = 25), we see that users show a different reasoning for their online behaviors. Subsequently, an online survey (N = 337) was carried out. Using cluster analyses, we identify three user groups: Privacy Guardians are highly concerned and guard their privacy carefully, Privacy Cynics are moderately concerned but feel powerless, and Privacy Pragmatists are less concerned and trade privacy for benefits. The user groups need to be addressed by individually tailored information and communication strategies to be able to adequately benefit from the digital era according to their requirements.
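
The grouping step can be illustrated with a small cluster analysis sketch; the two synthetic survey dimensions (concern and protection behavior) and k = 3 are assumptions that merely mirror the three reported groups.

```python
# Sketch of deriving user groups via cluster analysis. The synthetic survey
# dimensions (concern, protection behavior) and k = 3 are assumptions that
# mirror the three groups reported in the study.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Columns: privacy concern, protection behavior (both on a 1-5 scale).
guardians = rng.normal([4.5, 4.5], 0.3, (30, 2))
cynics = rng.normal([3.5, 1.5], 0.3, (30, 2))
pragmatists = rng.normal([2.0, 2.0], 0.3, (30, 2))
X = np.vstack([guardians, cynics, pragmatists])

labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
for c in range(3):
    concern, behavior = X[labels == c].mean(axis=0)
    print(f"cluster {c}: concern={concern:.1f}, behavior={behavior:.1f}")
```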

Journal ArticleDOI
TL;DR: Experimental results of the mechanism highlight that an SDN-aware allocation solution can reduce the data center usage and improve the quality-of-service perceived by hosted tenants.
Abstract: Virtual Infrastructures (VIs) emerged as a potential solution for network evolution and cloud services provisioning on the Internet. Deploying VIs, however, is still challenging, mainly due to the rigid management of networking resources. By splitting control and data planes, Software-Defined Networks (SDN) enable custom and more flexible management, allowing for reducing data center usage, as well as providing mechanisms to guarantee bandwidth and latency control on switches and endpoints. However, reaping the benefits of SDN for VI embedding in cloud data centers is not trivial. Allocation frameworks require combined information from the control plane (e.g., isolation policies, flow identification) and the data plane (e.g., storage capacity, flow table configuration) to find a suitable solution. In this context, the present work proposes a mixed integer programming formulation for the VI allocation problem that considers the main challenges regarding SDN-based cloud data centers. Some constraints are then relaxed, resulting in a linear program for which a heuristic is introduced. Experimental results of the mechanism, termed QVIA-SDN, highlight that an SDN-aware allocation solution can reduce data center usage and improve the quality of service perceived by hosted tenants.
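
The relaxed allocation problem can be sketched as a small linear program, for example with PuLP; the instance sizes, capacities and the fractional placement variables are illustrative assumptions, not the paper's full formulation.

```python
# Sketch of a relaxed (linear) allocation problem: map virtual nodes to
# physical hosts minimizing data center usage, with fractional placement
# variables. Instance sizes and capacities are illustrative assumptions.
import pulp

hosts = {"h1": 16, "h2": 8}                  # CPU capacity per host
vnodes = {"v1": 6, "v2": 10, "v3": 4}        # CPU demand per virtual node

prob = pulp.LpProblem("vi_allocation", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", (vnodes, hosts), lowBound=0, upBound=1)  # relaxed
used = pulp.LpVariable.dicts("used", hosts, lowBound=0, upBound=1)

prob += pulp.lpSum(used[h] for h in hosts)                  # minimize hosts used
for v in vnodes:                                            # place every vnode
    prob += pulp.lpSum(x[v][h] for h in hosts) == 1
for h in hosts:                                             # respect capacity
    prob += pulp.lpSum(vnodes[v] * x[v][h] for v in vnodes) <= hosts[h] * used[h]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for v in vnodes:
    for h in hosts:
        if x[v][h].value() > 1e-6:
            print(f"{v} -> {h}: {x[v][h].value():.2f}")
```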

Journal ArticleDOI
TL;DR: An open-source framework to deploy elastic virtual clusters running on elastic physical clusters is introduced; the resulting multi-elastic virtualized datacenter provides users with the ability to deploy customized scalable computing clusters while reducing its energy footprint.
Abstract: Computer clusters are widely used platforms to execute different computational workloads. Indeed, the advent of virtualization and Cloud computing has paved the way to deploy virtual elastic clusters on top of Cloud infrastructures, which are typically backed by physical computing clusters. In turn, the advances in Green computing have fostered the ability to dynamically power on the nodes of physical clusters as required. Therefore, this paper introduces an open-source framework to deploy elastic virtual clusters running on elastic physical clusters where the computing capabilities of the virtual clusters are dynamically changed to satisfy both the user application’s computing requirements and to minimise the amount of energy consumed by the underlying physical cluster that supports an on-premises Cloud. For that, we integrate: i) an elasticity manager both at the infrastructure level (power management) and at the virtual infrastructure level (horizontal elasticity); ii) an automatic Virtual Machine (VM) consolidation agent that reduces the amount of powered on physical nodes using live migration and iii) a vertical elasticity manager to dynamically and transparently change the memory allocated to VMs, thus fostering enhanced consolidation. A case study based on real datasets executed on a production infrastructure is used to validate the proposed solution. The results show that a multi-elastic virtualized datacenter provides users with the ability to deploy customized scalable computing clusters while reducing its energy footprint.

Journal ArticleDOI
TL;DR: Simulation results confirmed that the proposed approach performs better in terms of traffic overhead and average end-to-end delay as compared to an existing state-of-the-art approach.
Abstract: Big data involves a large amount of data generation, storage, transfer from one place to another, and analysis to extract meaningful information. Information centric networking (ICN) is an infrastructure that transfers big data from one node to another and provides in-network caches. For the software defined network-based ICN approach, a recently proposed centralized cache server architecture deploys a single cache server based on path-stretch values. Despite the advantages of a centralized cache in ICN, a single cache server for a large network has a scalability issue. Moreover, it only considers the path-stretch ratio for cache server deployment; consequently, the traffic cannot be reduced optimally. To resolve such issues, we propose to deploy multiple cache servers based on the joint optimization of multiple parameters, namely: (i) closeness centrality; (ii) betweenness centrality; (iii) path-stretch values; and (iv) load balancing in the network. Our proposed approach first computes the locations and the number of cache servers based on the network topology information in an offline manner, and the cache servers are placed at their corresponding locations in the network. Next, the controller installs flow rules at the switches such that the switches can forward a request for content to one of the nearest cache servers. Upon receiving a content request, if the request matches the contents stored at the cache server, the content is delivered to the requesting node; otherwise, the request is forwarded to the controller. In the next step, the controller computes the path such that the content provider first sends the content to the cache server. Finally, a copy of the content is forwarded to the requesting node. Simulation results confirmed that the proposed approach performs better in terms of traffic overhead and average end-to-end delay as compared to an existing state-of-the-art approach.
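
The joint use of centralities for placement can be sketched as ranking switches by a weighted combination of closeness and betweenness and picking the top-k locations; the topology, weights and k below are assumptions, and the path-stretch and load terms of the actual optimization are omitted.

```python
# Sketch of centrality-based cache server placement: rank switches by a
# weighted mix of closeness and betweenness centrality and pick the top-k.
# The topology, weights and k are illustrative assumptions.
import networkx as nx

G = nx.Graph([("s1", "s2"), ("s2", "s3"), ("s3", "s4"),
              ("s2", "s5"), ("s5", "s6"), ("s3", "s6")])

closeness = nx.closeness_centrality(G)
betweenness = nx.betweenness_centrality(G)

def score(node, a=0.5, b=0.5):
    return a * closeness[node] + b * betweenness[node]

k = 2                                   # number of cache servers to deploy
servers = sorted(G.nodes, key=score, reverse=True)[:k]
print("cache servers at:", servers)

# Each switch then forwards content requests to its nearest cache server.
nearest = {n: min(servers, key=lambda s: nx.shortest_path_length(G, n, s))
           for n in G.nodes}
print(nearest)
```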

Journal ArticleDOI
TL;DR: This work presents a Stochastic Petri Net (SPN) approach for evaluating cloud-based DR solutions for IT environments, which allows evaluating various performability metrics and can help DR coordinators to choose the most appropriate DR solution.
Abstract: An increasing number of organizations are relying on cloud-based Disaster Recovery (DR) solutions to ensure high availability of Information Technology (IT) environments. The flexibility of the cloud resources as well as their pay-as-you-go pricing model has enabled organizations to adopt cost-effective yet reliable DR services. Although cloud-based DR solutions have been used in many organizations, such DR solutions have not been properly assessed in terms of their capacity to meet user demand under disaster occurrences, and the possibility of using the DR cloud for performance improvements. In this work, we present a Stochastic Petri Net (SPN) approach for evaluating cloud-based DR solutions for IT environments. Our approach allows evaluating various performability metrics (e.g., response time, throughput, availability and others), and thus, can help DR coordinators to choose the most appropriate DR solution. A real-world case study is presented to demonstrate the applicability of the approach. We also validate the accuracy of our analytic approach by comparing analytic results with the ones obtained from the cloud simulator CloudSim Plus.
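
As a tiny stand-in for the performability evaluation, the sketch below computes the steady-state availability of a primary site backed by a cloud DR replica from assumed failure and repair times; the paper's SPN model captures far richer behavior (response time, throughput, failover delays).

```python
# Minimal stand-in for the performability evaluation: steady-state
# availability of a primary site backed by a cloud DR site, from assumed
# mean times to failure and repair. The SPN in the paper models far richer
# behavior than this two-component simplification.

def availability(mttf, mttr):
    """Steady-state availability of a single repairable component."""
    return mttf / (mttf + mttr)

primary = availability(mttf=1000.0, mttr=8.0)    # hours (assumed)
dr_site = availability(mttf=2000.0, mttr=4.0)

# Service is down only when both the primary and the DR replica are down,
# assuming independent failures and instantaneous failover (simplification).
combined = 1 - (1 - primary) * (1 - dr_site)
print(f"primary={primary:.5f}, with DR={combined:.5f}")
```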

Journal ArticleDOI
TL;DR: Peer-to-peer, Internet of Things, Smart Cities and distributed sensing are examples of modern ICT paradigms that aim to describe globally cooperative infrastructures built upon objects' intelligence and self-configuring capabilities.