
Showing papers on "Service level objective published in 2019"


01 Jan 2019
TL;DR: This document provides some exemplary use cases for service function chaining in mobile service provider networks to localize and explain the application domain of service chaining within mobile networks as far as it is required to complement the SFC problem statement and architecture framework of the working group.
Abstract: This document provides some exemplary use cases for service function chaining in mobile service provider networks. The objective of this draft is not to cover all conceivable service chains in detail. Rather, the intention is to localize and explain the application domain of service chaining within mobile networks as far as it is required to complement the SFC problem statement and architecture framework of the working group. Service function chains typically reside in a LAN segment which links the mobile access network to the actual application platforms located in the carrier's datacenters or somewhere else in the Internet. Service function chains (SFC) ensure a fair distribution of network resources according to agreed service policies, enhance the performance of service delivery or take care of security and privacy. SFCs may also include Value Added Services (VAS). Commonly, SFCs are typical middle box based services. General considerations and specific use cases are presented in this document to demonstrate the different technical requirements of these goals for service function chaining in mobile service provider networks. The specification of service function chaining for mobile networks must take into account an interaction between service function chains and the 3GPP Policy and Charging Control (PCC) environment.

105 citations


Proceedings ArticleDOI
08 Jul 2019
TL;DR: Spock is proposed, a new scalable and elastic control system that exploits both VMs and serverless functions to reduce cost and ensure SLO for elastic web services and yields significant cost savings.
Abstract: We are witnessing the emergence of elastic web services which are hosted in public cloud infrastructures. For reasons of cost-effectiveness, it is crucial for the elasticity of these web services to match the dynamically-evolving user demand. Traditional approaches employ clusters of virtual machines (VMs) to dynamically scale resources based on application demand. However, they still face challenges such as higher cost due to over-provisioning or service level objective (SLO) violations due to under-provisioning. Motivated by this observation, we propose Spock, a new scalable and elastic control system that exploits both VMs and serverless functions to reduce cost and ensure SLO for elastic web services. We show that under two different scaling policies, Spock reduces SLO violations of queries by up to 74% when compared to VM-based resource procurement schemes. Further, Spock yields significant cost savings of up to 33% compared to traditional approaches which use only VMs.
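The core idea, combining provisioned VMs for the steady load with serverless functions for bursts, can be illustrated with a small capacity-planning sketch. This is not Spock's actual controller; the per-VM capacity, prices and function names below are assumptions made for illustration only.

```python
# A minimal sketch (not the authors' implementation) of the idea behind Spock:
# serve the baseline load from provisioned VMs and absorb bursts with
# serverless functions instead of over-provisioning VMs.

import math

VM_CAPACITY_RPS = 100            # assumed requests/sec one VM can serve within the SLO
VM_COST_PER_HOUR = 0.10          # assumed on-demand VM price
FN_COST_PER_REQUEST = 0.0000004  # assumed per-invocation serverless price

def plan_capacity(predicted_rps: float, burst_rps: float) -> dict:
    """Provision VMs for the predicted steady load; route the burst overflow to functions."""
    vms = math.ceil(predicted_rps / VM_CAPACITY_RPS)
    overflow_rps = max(0.0, burst_rps - vms * VM_CAPACITY_RPS)
    hourly_cost = (vms * VM_COST_PER_HOUR
                   + overflow_rps * 3600 * FN_COST_PER_REQUEST)
    return {"vms": vms, "overflow_rps": overflow_rps, "hourly_cost": hourly_cost}

if __name__ == "__main__":
    # A burst of 450 rps over a predicted 300 rps baseline.
    print(plan_capacity(predicted_rps=300, burst_rps=450))
```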

51 citations


Journal ArticleDOI
TL;DR: Experimental results show that the incentive contracts have a positive impact on both service requesters and providers and that the incentive mechanism outperforms the existing combinatorial auction-based approaches in finding optimal solutions.
Abstract: QoS-aware service selection seeks to find the optimal service providers to achieve the optimization goal of a service requester, such as the maximization of utility, while satisfying global QoS requirements. Service providers are usually self-interested and have some private information, such as minimum prices, that would significantly factor into the decision making of the service requester. Thus, service requesters face a decision making dilemma with incomplete information. Recent work has used iterative combinatorial auctions to address this problem. However, such studies do not sufficiently consider that the service requester can elicit the private information from service providers by observing their actions. This can help the service selection process achieve better outcomes. In this paper, we propose a type of incentive contract that can motivate the service providers to offer the QoS and prices that the service requester prefers. Based on the incentive contracts, we propose an incentive mechanism for effective service selection. In the mechanism, a service requester offers a set of incentive contracts to the service providers and then elicits their private information based on their responses to the incentive contracts. The process is iterated until the service requester finally obtains a solution that fulfills the global QoS requirements. Experimental results show that the incentive contracts have a positive impact on both service requesters and providers and that the incentive mechanism outperforms the existing combinatorial auction-based approaches in finding optimal solutions.

24 citations


Proceedings ArticleDOI
Jeffrey C. Mogul, John Wilkes
13 May 2019
TL;DR: It is argued that a mutually beneficial set of Service Level Expectations and Customer Behavior Expectations ameliorates many of the problems of today's SLOs by explicitly sharing risk between customer and service provider.
Abstract: Cloud customers want strong, understandable promises (Service Level Objectives, or SLOs) that their applications will run reliably and with adequate performance, but cloud providers don't want to offer them, because they are technically hard to meet in the face of arbitrary customer behavior and the hidden interactions brought about by statistical multiplexing of shared resources. Existing cloud SLOs are more concerned with defending against corner cases than defining normal behavior. This and other tensions make SLOs surprisingly hard to define. We show that this problem shares some similarities with the challenges of applying statistics to make decisions based on sampled data. We argue that a mutually beneficial set of Service Level Expectations (SLEs) and Customer Behavior Expectations (CBEs) ameliorates many of the problems of today's SLOs by explicitly sharing risk between customer and service provider.

22 citations


Proceedings ArticleDOI
16 Jun 2019
TL;DR: A solution to the self-adaptive problem of vertical elasticity for co-located containerized applications that learns performance models relating SLOs to workload, resource limits and service level indicators, and derives limits that meet SLOs and minimize resource consumption via a combination of optimization and restricted brute-force search.
Abstract: With changing workloads, cloud service providers can leverage vertical container scaling (adding/removing resources) so that Service Level Objective (SLO) violations are minimized and spare resources are maximized. In this paper, we investigate a solution to the self-adaptive problem of vertical elasticity for co-located containerized applications. First, the system learns performance models that relate SLOs to workload, resource limits and service level indicators. Second, it derives limits that meet SLOs and minimize resource consumption via a combination of optimization and restricted brute-force search. Third, it vertically scales containers based on the derived limits. We evaluated our technique on a Kubernetes private cloud of 8 nodes with three deployed applications. The results registered two SLO violations out of 16 validation tests; acceptably low derivation times facilitate realistic deployment. Violations are primarily attributed to application specifics, such as garbage collection, which require further research to be circumvented.
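The limit-derivation step (a learned performance model plus a restricted brute-force search) can be sketched as follows. The latency model, the CPU grid and the application names are assumptions standing in for the models the system learns online.

```python
# A minimal sketch, under assumed models, of the limit-derivation step: given a
# learned latency model per application, search a restricted grid of CPU limits
# and keep the smallest limit whose predicted latency meets the SLO.

def predicted_p95_ms(app: str, workload_rps: float, cpu_millicores: int) -> float:
    # Stand-in for the learned performance model (e.g. a regression fit offline).
    base = {"frontend": 40.0, "backend": 60.0}[app]
    return base + 2000.0 * workload_rps / cpu_millicores

def derive_limits(apps, workload_rps, slo_ms, grid=range(250, 4001, 250)):
    """Restricted brute-force: per-app search over a coarse CPU-limit grid."""
    limits = {}
    for app in apps:
        feasible = [c for c in grid
                    if predicted_p95_ms(app, workload_rps, c) <= slo_ms[app]]
        limits[app] = min(feasible) if feasible else max(grid)  # fall back to the largest limit
    return limits

if __name__ == "__main__":
    print(derive_limits(["frontend", "backend"], workload_rps=80,
                        slo_ms={"frontend": 100, "backend": 150}))
```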

19 citations


Proceedings ArticleDOI
03 Apr 2019
TL;DR: This work presents MMLite, a functionally decomposed and stateless MME design wherein individual control procedures are implemented as microservices and states are decoupled from their processing, thus enabling elasticity and fault tolerance; for SLO compliance, a multi-level load balancing approach is developed.
Abstract: With increase in cellular-enabled IoT devices having diverse traffic characteristics and service level objectives (SLOs), handling the control traffic in a scalable and resource-efficient manner in the cellular packet core network is critical. The traditional monolithic design of the cellular core adopted by service-providers is inflexible with respect to the diverse requirements and bursty loads of IoT devices, specifically for properties such as elasticity, customizability, and scalability. To address this key challenge, we focus on the most critical control plane component of the cellular packet core network, the Mobility Management Entity (MME). We present MMLite, a functionally decomposed and stateless MME design wherein individual control procedures are implemented as microservices and states are decoupled from their processing, thus enabling elasticity and fault tolerance. For SLO compliance, we develop a multi-level load balancing approach based on skewed consistent hashing to efficiently distribute incoming connections. We evaluate the performance benefits of MMLite over existing approaches with respect to scaling, fault tolerance, SLO compliance and resource efficiency.
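The load-balancing idea of skewed consistent hashing can be sketched as a hash ring in which higher-capacity workers receive proportionally more virtual points. This is an illustrative reconstruction, not MMLite's implementation; the worker names and weights are made up.

```python
# A minimal sketch of skewed consistent hashing: nodes with more spare capacity
# get proportionally more virtual points on the hash ring, so new connections
# skew toward them.

import bisect
import hashlib

def _h(key: str) -> int:
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class SkewedRing:
    def __init__(self, node_weights: dict, points_per_weight: int = 40):
        self._ring = sorted(
            (_h(f"{node}#{i}"), node)
            for node, w in node_weights.items()
            for i in range(int(w * points_per_weight)))
        self._keys = [k for k, _ in self._ring]

    def route(self, connection_id: str) -> str:
        idx = bisect.bisect(self._keys, _h(connection_id)) % len(self._ring)
        return self._ring[idx][1]

if __name__ == "__main__":
    # worker-2 is assumed to have twice the spare capacity of worker-1.
    ring = SkewedRing({"mme-worker-1": 1.0, "mme-worker-2": 2.0})
    print([ring.route(f"ue-{i}") for i in range(5)])
```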

16 citations


Proceedings ArticleDOI
01 Dec 2019
TL;DR: Proof-of-Concept (PoC) evaluation results show the potential of the CloudSim extension in terms of execution efficiency and simulation realism.
Abstract: The performance of Functions-as-a-Service (FaaS) would be significantly improved by organizing cloud servers into a hierarchical distributed architecture, resulting in low-latency access and faster response when compared to a centralized cloud. However, the distributed organization introduces a new type of decision-making problem of placing and executing functions on a specific cloud server. In order to handle the problem, we extended a well-known cloud computing simulator, CloudSim. The extended CloudSim enables users to define FaaS functions with various characteristics and service level objectives (SLOs), place them across geo-distributed cloud servers, and evaluate per-function performance. Proof-of-Concept (PoC) evaluation results show the potential of our CloudSim extension in terms of execution efficiency and simulation realism.

15 citations


Proceedings ArticleDOI
TL;DR: DeCaf, as discussed by the authors, is a system for automated diagnosis and triaging of key performance indicator (KPI) issues from service logs; it uses machine learning along with pattern mining to help service owners automatically root-cause and triage performance issues.
Abstract: Large scale cloud services use Key Performance Indicators (KPIs) for tracking and monitoring performance. They usually have Service Level Objectives (SLOs) baked into the customer agreements which are tied to these KPIs. Dependency failures, code bugs, infrastructure failures, and other problems can cause performance regressions. It is critical to minimize the time and manual effort in diagnosing and triaging such issues to reduce customer impact. The large volume of logs and the mix of attribute types (categorical, continuous) in the logs make diagnosis of regressions non-trivial. In this paper, we present the design, implementation and experience from building and deploying DeCaf, a system for automated diagnosis and triaging of KPI issues using service logs. It uses machine learning along with pattern mining to help service owners automatically root cause and triage performance issues. We present the learnings and results from case studies on two large scale cloud services in Microsoft where DeCaf successfully diagnosed 10 known and 31 unknown issues. DeCaf also automatically triages the identified issues by leveraging historical data. Our key insights are that for any such diagnosis tool to be effective in practice, it should a) scale to large volumes of service logs and attributes, b) support different types of KPIs and ranking functions, and c) be integrated into the DevOps processes.
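One ingredient, mining log attributes that are over-represented in KPI-violating rows, can be sketched with a simple lift-style score. DeCaf combines this kind of pattern mining with machine-learned models and ranking functions; the scoring and the log fields below are illustrative assumptions.

```python
# A minimal sketch of the pattern-mining idea: score single-attribute predicates
# by how much more often they occur in KPI-violating log rows than in healthy
# ones, then rank them as candidate root causes.

from collections import Counter

def rank_predicates(rows, kpi, slo_threshold):
    bad = [r for r in rows if r[kpi] > slo_threshold]
    good = [r for r in rows if r[kpi] <= slo_threshold]
    scores = {}
    for attr in set().union(*rows) - {kpi}:
        bad_counts = Counter(r[attr] for r in bad)
        good_counts = Counter(r[attr] for r in good)
        for value, b in bad_counts.items():
            support_bad = b / max(len(bad), 1)
            support_good = good_counts[value] / max(len(good), 1)
            scores[(attr, value)] = support_bad - support_good  # simple lift-style score
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    logs = [{"region": "eu", "api": "read", "latency_ms": 30},
            {"region": "us", "api": "write", "latency_ms": 480},
            {"region": "us", "api": "write", "latency_ms": 510},
            {"region": "eu", "api": "write", "latency_ms": 35}]
    print(rank_predicates(logs, kpi="latency_ms", slo_threshold=100)[:3])
```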

15 citations


Journal ArticleDOI
01 Dec 2019
TL;DR: This paper proposes an approach for automating the process of providing a latency-aware failover strategy through a server placement algorithm leveraging genetic algorithms that factor in the proximity of users and inter-DC latencies.
Abstract: Despite advances in Cloud computing, ensuring high availability (HA) remains a challenge due to varying loads and the potential for Cloud outages. Deploying applications in distributed Clouds can help overcome this challenge by geo-replicating applications across multiple Cloud data centers (DCs). However, this distributed deployment can be a performance bottleneck due to network latencies between users and DCs as well as inter-DC latencies incurred during the geo-replication process. For most web applications, both HA and Performance (HAP) are essential and need to meet pre-agreed Service Level Objectives (SLOs). Efficiently placing and managing primary and backup replicas of applications in distributed Clouds to achieve HAP is a challenging task. Existing solutions consider either HA or performance but not both. In this paper we propose an approach for automating the process of providing a latency-aware failover strategy through a server placement algorithm leveraging genetic algorithms that factor in the proximity of users and inter-DC latencies. To facilitate the distributed deployment of applications and avoid the overheads of Clouds, we utilize container technologies. To evaluate our proposed approach, we conduct experiments on the Australia-wide National eResearch Collaboration Tools and Resources (NeCTAR - www.nectar.org.au) Research Cloud. Our results show at least a 23.3% and 22.6% improvement in response times under normal and failover conditions, respectively, compared to traditional, latency-unaware approaches. Also, the 95th percentile of response times in our approach is at most 1.5 ms above the SLO compared to 11–32 ms using other approaches.
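The placement idea can be sketched with a toy genetic algorithm whose fitness combines user-to-primary latency and inter-DC replication latency. The latency figures and DC names are invented for illustration, and the sketch is far simpler than the paper's algorithm.

```python
# A minimal sketch of latency-aware primary/backup placement: a tiny genetic
# algorithm that picks a (primary, backup) DC pair minimizing a weighted sum of
# user-to-primary latency and primary-to-backup replication latency.

import random

USER_TO_DC = {"dc-syd": 20, "dc-mel": 35, "dc-per": 70}   # ms, aggregated over user regions (assumed)
INTER_DC = {("dc-syd", "dc-mel"): 12, ("dc-syd", "dc-per"): 45,
            ("dc-mel", "dc-per"): 40}

def fitness(pair):
    primary, backup = pair
    repl = INTER_DC.get((primary, backup)) or INTER_DC.get((backup, primary), 999)
    return USER_TO_DC[primary] + 0.5 * repl               # lower is better

def evolve(generations=30, pop_size=8):
    dcs = list(USER_TO_DC)
    pop = [tuple(random.sample(dcs, 2)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness)
        survivors = pop[: pop_size // 2]
        children = [(random.choice(survivors)[0], random.choice(dcs)) for _ in survivors]
        pop = survivors + [c for c in children if c[0] != c[1]]
        pop += [tuple(random.sample(dcs, 2)) for _ in range(pop_size - len(pop))]  # refill/mutate
    return min(pop, key=fitness)

if __name__ == "__main__":
    print(evolve())   # e.g. ('dc-syd', 'dc-mel')
```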

15 citations


Proceedings ArticleDOI
16 Jun 2019
TL;DR: This work mined real SLOs published on the web, extracted their goals and characterized them, and collected 75 SLOs where response time, query percentile and reporting period were specified to confirm and refute common perceptions.
Abstract: Service level objectives (SLOs) stipulate performance goals for cloud applications, microservices, and infrastructure. SLOs are widely used, in part, because system managers can tailor goals to their products, companies, and workloads. Systems research intended to support strong SLOs should target realistic performance goals used by system managers in the field. Evaluations conducted with uncommon SLO goals may not translate to real systems. Some textbooks discuss the structure of SLOs but (1) they only sketch SLO goals and (2) they use outdated examples. We mined real SLOs published on the web, extracted their goals and characterized them. Many web documents discuss SLOs loosely but few provide details and reflect real settings. Systematic literature review (SLR) prunes results and reduces bias by (1) modeling expected SLO structure and (2) detecting and removing outliers. We collected 75 SLOs where response time, query percentile and reporting period were specified. We used these SLOs to confirm and refute common perceptions. For example, we found few SLOs with response time guarantees below 10 ms for 90% or more queries. This reality bolsters perceptions that single digit SLOs face fundamental research challenges.

13 citations


Book
05 Jun 2019
TL;DR: In this article, a series of focus groups were conducted to identify the context-specific set of service quality expectations that customers hold for each of these scenes, and Formal Concept Analysis (FCA) was applied to these findings to graphically illustrate how customer expectations for airline service quality vary by service scene.
Abstract: Purpose – The purpose of this paper is to demonstrate the critical role that context plays in measuring service quality. Design/methodology/approach – This study replicated an experiment methodology to show that customers perceive an airline service drama as a sequence of scenes. A series of focus groups were then conducted to identify the context-specific set of service quality expectations that customers hold for each of these scenes. Finally, Formal Concept Analysis (FCA), a mathematical modeling technique, was applied to these findings to graphically illustrate how customer expectations for airline service quality vary by service scene. Findings – Results from this study indicate that static measures of service quality are apparently inadequate in explaining customer expectations during more enduring service encounters. The FCA hierarchical model developed in this study revealed profound differences in customer service expectations across the six airline service scenes. These results suggest that more...

Proceedings ArticleDOI
09 Dec 2019
TL;DR: This paper designs MRburst (Multi-Resource burstable performance scheduler) to automatically limit multiple resources and make the application comply with a user-defined service level objective (SLO) while minimizing wasted resources.
Abstract: During the past few years, all leading cloud providers introduced burstable instances that can sprint their performance for a limited period to address sudden workload variations. Despite the availability of burstable instances, there is no clear understanding of how to minimize the waste of resources by regulating their burst capacity to the workload requirements. This is especially true when it comes to non-CPU-intensive applications. In this paper, we investigate how to limit network and I/O usage to optimize the efficiency of the bursting process. We also study which resource shall be controlled to benefit both cloud providers and end-users. We design MRburst (Multi-Resource burstable performance scheduler) to automatically limit multiple resources (i.e., network, I/O, and CPU) and make the application comply with a user-defined service level objective (SLO) while minimizing wasted resources. MRburst is evaluated on Amazon EC2 using two multi-resource applications: an FTP server and a Ceph system. Experimental results show that MRburst outperforms state-of-the-art approaches by allowing instances to speed up their performance for a period up to 2.4 times longer while meeting the SLO.
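The regulation loop, tightening or relaxing per-resource caps against a user-defined SLO, can be sketched as a simple multiplicative controller. This is not MRburst's actual policy; the resources, step size and performance units are assumptions.

```python
# A minimal sketch of the regulation idea: periodically compare measured
# application performance against the user-defined SLO and tighten or relax
# per-resource caps (network, I/O, CPU) multiplicatively, so burst credits are
# not spent on headroom the SLO does not need. Measurement and enforcement
# hooks are placeholders.

def adjust_caps(caps: dict, measured_perf: float, slo_target: float,
                step: float = 0.1) -> dict:
    """caps: fraction of each resource's burst limit currently allowed (0..1]."""
    if measured_perf < slo_target:                  # violating: give more headroom
        return {r: min(1.0, c * (1 + step)) for r, c in caps.items()}
    return {r: max(0.05, c * (1 - step)) for r, c in caps.items()}  # meeting: reclaim slack

if __name__ == "__main__":
    caps = {"network": 1.0, "io": 1.0, "cpu": 1.0}
    for perf in [90, 110, 120, 95]:                 # measured values vs. an SLO of 100 units
        caps = adjust_caps(caps, measured_perf=perf, slo_target=100)
        print(caps)
```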

Proceedings ArticleDOI
01 Apr 2019
TL;DR: This paper introduces a fully integrated SLA management framework in a real 5G environment that allows network operators to choose between different SLOs during the SLA Template generation, and then automatically formulate an Agreement, based on each network slice instantiation and the corresponding NS.
Abstract: A key feature of fifth generation (5G) and Software Defined Networking (SDN) is the assurance of high levels of quality of service (QoS). To this end, Service Level Agreements (SLAs) are introduced in order to bridge the gap between network operators and their customers. An SLA is a contract between the operator and the internal or external customer, which determines what Network Services (NSs) are offered and the guaranteed level of performance. Taking into consideration the above-mentioned needs, in this paper we introduce a fully integrated SLA management framework in a real 5G environment. In this demonstration we aim to bind business requirements as Service Level Objectives (SLOs) between network operators and the customers, with measurable resource attributes. To achieve this, we allow network operators to choose between different SLOs during the SLA Template generation, and then automatically formulate an Agreement, based on each network slice instantiation and the corresponding NS. Finally, we provide a monitoring system in order to detect and alert on any violations.

Journal ArticleDOI
TL;DR: The multi-tenancy architecture allows software-as-a-service applications to serve multiple tenants with a single instance.
Abstract: In SaaS applications, which are mainly built following a multi-tenant architecture, the support of service variability among tenants is limited. This is due to the principle of sharing the same instance of the SaaS service between the tenants; thus, the particularities of each tenant's needs are not considered. Having to customize and adapt a SaaS service to tenants' needs and their specific contexts, while maintaining multi-tenancy, has emerged as a persisting challenge. The SaaS service adaptation not only has to follow the functional requirements of the different tenants, but it also has to cope with the non-functional requirements at their various levels, which are subject to change in time and space, i.e. within the same tenant and between tenants, respectively. For example, a tenant can require just a minimal security level, as it needs solely to be authenticated and has no other security concerns, while another may require a high security level and advanced, supplementary security mechanisms such as access control and encryption. There are also situations when the same tenant changes its desired quality level over time; for instance, when there is an increase of service requests in a critical period, the tenant may want to decrease the response time and increase the uptime value in order to smoothly perform its requests. Therefore, the application needs to be adapted dynamically in order to meet every tenant's quality requirements as if it were the only one consuming the service. Software Product Line Engineering (SPLE) has largely tackled the variability management field in the literature. It enables high reusability in shorter time, at lower costs and with higher quality by creating product families or lines that share commonalities and have variation points. The derivation of the final product is performed at the level of those variation points while considering the inter- and intra-dependencies between the artifacts of a product line. Our proposed approach is based on the SPLE principles to build SaaS services tailored to tenant-specific Service Level Agreements (SLAs). We defined the Domain Engineering phase to build SLA families. Based on domain analysis, the non-functional requirements of the tenants, their commonalities, variation points and variants are captured and modeled. A Generic SLA is generated which contains the terms and the Service Level Objectives (SLOs) of all the contracting tenants. From this Generic SLA, the proposed middleware generates core and tenant-specific policies. The Configurator, a middleware component, adapts the application according to these defined policies based on annotations in the form of key-value pairs. As contributions of our article [1]: we proposed the architecture of the middleware and its different components; we defined the Policy metamodel, which models policies retrieved from the Generic SLA as annotations, and described the model-to-model transformations from the VariableSLA metamodel to the Policy metamodel; and we proposed a Configurator metamodel that models and describes how the Configurator component of the middleware configures the SaaS service. Our approach differs from the literature in that it enables building a family of SLAs: a generic SLA and tenant-specific policies, thus linking the non-functional variability with the SLA. Moreover, it is not limited to a specific quality attribute but supports any quality attribute expressed in the SLA. It allows configurations at design time and at runtime through annotating the SaaS service components with adequate quality levels.
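The policy-derivation step, going from a Generic SLA to tenant-specific key-value annotations that the Configurator applies, can be sketched as below. The data shapes and field names are assumptions, not the paper's metamodels.

```python
# A minimal sketch, under assumed data shapes, of deriving tenant-specific
# policies (key-value annotations) from a Generic SLA that holds the SLOs of
# all contracting tenants; a Configurator component would then apply these
# annotations to the shared SaaS instance. Field names are illustrative.

GENERIC_SLA = {
    "core": {"uptime_pct": 99.0, "auth": "basic"},           # commonalities
    "tenants": {                                             # variation points
        "tenant-a": {"security_level": "high", "encryption": "aes-256",
                     "response_time_ms": 200},
        "tenant-b": {"security_level": "minimal", "response_time_ms": 800},
    },
}

def derive_policies(generic_sla: dict) -> dict:
    """Merge core terms with each tenant's variant terms into flat annotations."""
    policies = {}
    for tenant, variants in generic_sla["tenants"].items():
        annotations = dict(generic_sla["core"])
        annotations.update(variants)                          # tenant overrides/extends the core
        policies[tenant] = annotations
    return policies

if __name__ == "__main__":
    for tenant, policy in derive_policies(GENERIC_SLA).items():
        print(tenant, policy)
```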

Proceedings ArticleDOI
01 Jun 2019
TL;DR: Potential improvements of autoscaling are explored by designing and evaluating several prediction-based autoscaling policies and demonstrating that the combination of horizontal and vertical scaling enables more flexibility and reduces costs.
Abstract: With the growing complexity of microservice applications and the proliferation of containers, scaling of cloud applications has become challenging. Containers enable the adaptation of application capacity to the changing workload at a finer level of granularity than was possible with virtual machines alone. The common way to automate the adaptation of a cloud application is via autoscaling. Autoscaling is provided both at the level of virtual machines and at the level of containers. Its accuracy on dynamic workloads suffers significantly from the reactive nature of the available autoscaling solutions. The aim of the paper is to explore potential improvements of autoscaling by designing and evaluating several prediction-based autoscaling policies. These policies are: naive (used as a baseline), best resource pair, only-Delta-load, always-resize, and resize-when-beneficial. The scaling policies were implemented in the Scaling Policy Derivation Tool (SPDT). SPDT takes the long-term forecast of the workload and the capacity model of the microservices as input to produce a sequence of scaling actions scheduled for execution in the future, with the aim of meeting the service level objectives and minimizing costs. Policies implemented in SPDT were evaluated for three microservice applications and several workload patterns. The tests demonstrate that the combination of horizontal and vertical scaling enables more flexibility and reduces costs. Schedule derivation according to some policies might be compute-intensive; therefore, careful consideration of the optimization objective (e.g. cost minimization or timeliness of the scaling policy) is required from the user of SPDT.
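The schedule-derivation idea, turning a workload forecast and a capacity model into timed scaling actions, can be sketched as follows. The capacity model and the simple replica-count policy are assumptions; SPDT's actual policies (best resource pair, only-Delta-load, etc.) are richer than this.

```python
# A minimal sketch of schedule derivation: walk a long-term workload forecast,
# use a per-microservice capacity model (requests/sec one replica sustains
# within the SLO) to compute the replica count per window, and emit a scaling
# action only when the count changes.

def derive_schedule(forecast, capacity_rps_per_replica, min_replicas=1):
    """forecast: list of (timestamp, predicted_rps); returns scheduled scaling actions."""
    schedule, current = [], None
    for ts, rps in forecast:
        needed = max(min_replicas, -(-int(rps) // capacity_rps_per_replica))  # ceiling division
        if needed != current:
            schedule.append({"at": ts, "scale_to": needed})
            current = needed
    return schedule

if __name__ == "__main__":
    forecast = [("10:00", 120), ("10:15", 260), ("10:30", 250), ("10:45", 90)]
    print(derive_schedule(forecast, capacity_rps_per_replica=100))
```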

Proceedings ArticleDOI
01 Jan 2019
TL;DR: This paper models DVFS at the job level, through which Service Level Objectives can be guaranteed with respect to prescribed means or quantiles of service delays according to given Service Level Agreements (SLAs) between user and service provider.
Abstract: Dynamic Voltage and Frequency Scaling (DVFS) is a method to save energy consumption of electronic devices and to protect them against overheating by automatic sensing and adaptation of their energy consumption. This can be accomplished either on the program instruction level for electronic devices or on the task or job level for server clusters. This paper models DVFS at the job level, through which Service Level Objectives can be guaranteed with respect to prescribed means or quantiles of service delays according to given Service Level Agreements (SLAs) between user and service provider. The two parameters V (voltage) and f (frequency) cannot be changed independently of each other; typically only several combinations of V and f values are implemented in hardware for several power states. In this paper, a novel analysis of operating DVFS is suggested for server clusters of Cloud Data Centers (CDCs) under prescribed bounds on service level objectives which are defined by SLAs. The method is based on the theory of queuing models of the type GI/G/n for a server cluster to establish a relationship between SLA parameters and the power consumption, and it is performed for the example of the Intel Pentium M Processor with Enhanced SpeedStep Power Management. As a result of this method, precise bounds are provided for the load ranges of service request rates λ for each power mode which guarantee minimum power consumption dependent on given SLA values and job arrival and service statistics. As the instantaneous load in a CDC can be highly volatile, the current load level is usually monitored by periodic sensing, which may result in a rather high frequency of DVFS range changes and corresponding overhead. For that reason, an automated smoothing method is suggested which reduces the frequency of DVFS range changes significantly. This method is based on a Finite State Machine (FSM) with hysteresis levels.
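The smoothing mechanism, an FSM over discrete power states with hysteresis so that the DVFS range does not flap with every load fluctuation, can be sketched as follows. The load thresholds are illustrative placeholders, not the GI/G/n-derived bounds from the paper.

```python
# A minimal sketch of the smoothing idea: a finite state machine over discrete
# (V, f) power states that only scales down when the sensed request rate drops
# below the lower state's limit by a hysteresis margin, reducing the frequency
# of DVFS range changes.

POWER_STATES = [            # (state name, assumed max sustainable load in req/s)
    ("low", 300), ("mid", 700), ("high", 1200),
]

class DvfsFsm:
    def __init__(self, hysteresis: float = 0.1):
        self.idx, self.h = 0, hysteresis

    def sense(self, request_rate: float) -> str:
        _, limit = POWER_STATES[self.idx]
        if request_rate > limit and self.idx < len(POWER_STATES) - 1:
            self.idx += 1          # scale up immediately to protect the delay SLO
        elif self.idx > 0 and request_rate < POWER_STATES[self.idx - 1][1] * (1 - self.h):
            self.idx -= 1          # scale down only once past the hysteresis band
        return POWER_STATES[self.idx][0]

if __name__ == "__main__":
    fsm = DvfsFsm()
    for rate in [250, 320, 690, 640, 620, 200]:
        print(rate, "->", fsm.sense(rate))
```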

Proceedings ArticleDOI
27 Mar 2019
TL;DR: An application- and provider-independent SLO modelling language (SLO-ML) is proposed to enable customers to specify the required SLOs and the architecture to realise it is sketched.
Abstract: The diversity of cloud offerings motivated the proposition of cloud modelling languages (CMLs) to abstract complexities related to selection of cloud services. However, current CMLs lack the support for modelling service level objectives (SLOs) that are required for the customer applications. Consequently, we propose an application- and provider-independent SLO modelling language (SLO-ML) to enable customers to specify the required SLOs. We also sketch the architecture to realise SLO-ML.

Journal Article
TL;DR: This survey identifies the state of the art covering concepts, approaches and open problems of SLA establishment, deployment and management, and describes SLAs' characteristics and objectives.
Abstract: Information and Communication Technology (ICT) is being provided to meet a variety of end-user demands; therefore, better and improved management of services is crucial. Service Level Agreements (SLAs) are essential and play a key role in managing the provided services among the network entities. This survey identifies the state of the art covering concepts, approaches and open problems of SLA establishment, deployment and management. This paper is organised so that the reader can access a variety of proposed SLA methods and models, and it provides an overview of the SLA actors and elements. It also describes SLAs' characteristics and objectives. Existing SLA methodologies are explained and categorised, followed by the Service Quality Categories (SQD) and Quality-Based Service Descriptions (QSD). SLA modelling and architectures are discussed, and open research problems and future research directions are introduced. The establishment of reliable, safe and QoE-aware computer networking needs a group of services that goes beyond pure networking services. Therefore, within the paper this broader set of services is taken into consideration, and for each Service Level Objective (SLO) the related service domains are indicated. The purpose of this survey is to identify existing research gaps in utilising SLA elements to develop a generic methodology, considering all quality parameters beyond the Quality of Service (QoS) and what must or can be taken into account to define, establish and deploy an SLA. This study is part of active research on how to specify and develop an SLA that achieves win-win agreements among all actors.

Proceedings ArticleDOI
02 Dec 2019
TL;DR: This paper investigates the state of the art for SLO support in both cloud providers' SLAs and CMLs in order to identify the gaps for SLO support, and outlines research directions towards achieving MDE-based cloud brokerage.
Abstract: The current large selection of cloud instances that are functionally equivalent makes selecting the right cloud service a challenging decision. We envision a model-driven engineering (MDE) approach to raise the level of abstraction for cloud service selection. One way to achieve this is through a domain-specific language (DSL) for modelling the service level objectives (SLOs) and a brokerage system that utilises the SLO model to select services. However, this demands an understanding of the provider SLAs and the capabilities of the current cloud modelling languages (CMLs). This paper investigates the state of the art for SLO support in both cloud providers' SLAs and CMLs in order to identify the gaps for SLO support. We then outline research directions towards achieving MDE-based cloud brokerage.

Journal ArticleDOI
TL;DR: Analysis of a multi-priority system with access-time service level requirements for dynamically arriving requests showed improved access times for semi-urgent-class and stable-class patients with the largest 10–50% access times over a set of reported statistics.

Book ChapterDOI
25 Jun 2019
TL;DR: An ontology is proposed, WIoT-SLA, that combines IoT service properties with two prominent web service SLA specifications: WS-Agreement and WSLA to take advantage of their complementary features and achieve semantic interoperability.
Abstract: In the Internet of Things (IoT), billions of physical devices, distributed over a large geographic area, provide a near real-time state of the world. These devices’ capabilities can be abstracted as IoT services and delivered to users in a demand-driven way. In such a dynamic large-scale environment, a service provider who supports a service level agreement (SLA) can have a comprehensive competitive edge in terms of service quality management, service customization, optimized resource allocation, and trustworthiness. However, there is no consistent way of drafting an SLA with respect to describing heterogeneous IoT services, which obstructs automatic service selection, SLA negotiation, and SLA monitoring. In this paper, we propose an ontology, WIoT-SLA, to achieve semantic interoperability. We combine IoT service properties with two prominent web service SLA specifications: WS-Agreement and WSLA, to take advantage of their complementary features. This ontology is used to formalize the SLAs and SLA negotiation offers, which further facilitates the service selection and automatic SLA negotiation. It can also be used by a monitoring engine to detect SLA violations by providing the semantics of service level objectives (SLOs) and quality metrics. To evaluate our work, a prototype is implemented to demonstrate its feasibility and efficiency.

Proceedings ArticleDOI
10 Jun 2019
TL;DR: Greeniac is presented, a cluster-level task manager that employs Reinforcement Learning to identify optimal configurations at the server- and cluster-levels for different workloads and achieves up to 28% energy saving compared to best-case cluster scheduling techniques with local HMP-aware scheduling on a 4-server fog cluster.
Abstract: Fog computing has the potential to be an energy-efficient alternative to cloud computing for guaranteeing latency requirements of Latency-critical (LC) IoT services. However, even in fog computing low energy-efficiency of homogeneous multi-core server processors can be a major contributor to energy wastage. Recent studies have shown that Heterogeneous Multi-core Processors (HMPs) can improve energy efficiency of servers by adapting to dynamic load changes of LC-services. However, proposed approaches optimize energy only at a single server level. In our work, we demonstrate that optimization at the cluster-level across many HMP-servers can offer much greater energy savings through optimal work distribution across the HMP-servers while still guaranteeing the Service Level Objectives (SLO) of LC-services. In this paper, we present Greeniac, a cluster-level task manager that employs Reinforcement Learning to identify optimal configurations at the server- and cluster-levels for different workloads. We develop a server-level service scheduler and a cluster-level load balancing module to assign services and distribute tasks across HMP servers based on the learned configurations. In addition to meeting the required SLO targets, Greeniac achieves up to 28% energy saving compared to best-case cluster scheduling techniques with local HMP-aware scheduling on a 4-server fog cluster, with potentially larger savings in a larger cluster.
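The learning loop can be sketched with tabular Q-learning over (load level, core configuration) pairs, rewarding low energy only when the latency SLO is met. The environment model, configurations and reward shaping below are toy assumptions, not Greeniac's actual formulation.

```python
# A minimal sketch of the learning loop: tabular Q-learning that rewards
# configurations meeting the latency SLO with low energy and penalizes SLO
# misses. The environment model is a toy stand-in for real HMP-server
# measurements.

import random

CONFIGS = ["little-only", "mixed", "big-only"]
LOADS = ["low", "med", "high"]

def environment(load, config):
    """Toy model returning (latency_ms, energy_w); replace with real measurements."""
    energy = {"little-only": 10, "mixed": 18, "big-only": 30}[config]
    speedup = {"little-only": 0.6, "mixed": 1.0, "big-only": 1.5}[config]
    latency = {"low": 20, "med": 45, "high": 90}[load] / speedup
    return latency, energy

def train(slo_ms=80, episodes=3000, alpha=0.2, eps=0.1):
    q = {(l, c): 0.0 for l in LOADS for c in CONFIGS}
    for _ in range(episodes):
        load = random.choice(LOADS)
        config = (random.choice(CONFIGS) if random.random() < eps
                  else max(CONFIGS, key=lambda c: q[(load, c)]))
        latency, energy = environment(load, config)
        reward = -energy if latency <= slo_ms else -1000     # heavy penalty on an SLO miss
        q[(load, config)] += alpha * (reward - q[(load, config)])
    return {load: max(CONFIGS, key=lambda c: q[(load, c)]) for load in LOADS}

if __name__ == "__main__":
    print(train())   # e.g. {'low': 'little-only', 'med': 'little-only', 'high': 'big-only'}
```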

Proceedings ArticleDOI
09 Jul 2019
TL;DR: Preliminary simulation results are presented that enable the development of a generic methodology for SLA modeling and establishment that will lead to a win-win situation for all involved actors.
Abstract: The question of how to specify, provide and measure service quality for network end-users has been of utmost interest for service and network infrastructure providers and their clients as well. The Service Level Agreement (SLA) is a beneficial tool in formalizing the interrelationships resulting from a negotiation among all participating actors, with the target of achieving a common comprehension concerning the delivery of services, their priorities, quality, responsibilities, and other relevant parameters. A horizontal SLA is an agreement between two service providers existing at the same architectural layer (for example, two Internet Protocol (IP) or two Optical Transport Network (OTN) domains). A vertical SLA is an agreement between two individual providers at two different architectural layers (for instance, between an optical network and the core MPLS network). A service has to be defined without ambiguity utilizing Service Level Specifications (SLS), and three information types must be described: i) the QoX metrics as well as their corresponding thresholds; ii) a method of service performance measurement; iii) the service schedule. In this work we present preliminary simulation results that enable the development of a generic methodology for SLA modeling and establishment that will lead to a win-win situation for all involved actors. As an example, we pay special attention to the benefits obtained by optical network operators.
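The three information types of an SLS can be captured in a small record type, for example as below; the field and metric names are illustrative assumptions rather than a standardized schema.

```python
# A minimal sketch of a Service Level Specification record carrying the three
# information types named above: the QoX metrics with their thresholds, the
# measurement method, and the service schedule.

from dataclasses import dataclass, field

@dataclass
class QoXMetric:
    name: str              # e.g. "one-way delay"
    threshold: float
    unit: str

@dataclass
class ServiceLevelSpec:
    metrics: list[QoXMetric] = field(default_factory=list)
    measurement_method: str = ""   # how performance is measured (probes, passive counters, ...)
    schedule: str = ""             # when the service/guarantee applies

if __name__ == "__main__":
    sls = ServiceLevelSpec(
        metrics=[QoXMetric("one-way delay", 10.0, "ms"),
                 QoXMetric("packet loss", 0.1, "%")],
        measurement_method="active probes every 60 s, 5-minute averages",
        schedule="business hours, Mon-Fri 08:00-20:00")
    print(sls)
```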

Proceedings ArticleDOI
09 Dec 2019
TL;DR: A versatile tool for cost-effective SLO tuning, named k8-resource-optimizer, that relies on black-box performance tuning algorithms and can find near-optimal configurations for different multi-tenant deployment settings and different types of resource parameters is proposed.
Abstract: Resource management concepts of container orchestration platforms such as Kubernetes can be used to achieve multi-tenancy with quality of service differentiation between tenants. However, to support cost-effective enforcement of Service Level Objectives (SLOs) about response time or throughput, an automated resource optimization approach is needed for mapping custom SLOs of different tenants to cost-efficient resource allocation policies. We propose a versatile tool for cost-effective SLO tuning, named k8-resource-optimizer, which relies on black-box performance tuning algorithms. We illustrate and validate the tool for optimizing different resource configuration properties of a simple job processing application. Our experiments showed that k8-resource-optimizer can find near-optimal configurations for different multi-tenant deployment settings and different types of resource parameters. However, an open research challenge is that, when the number of parameters increases, the total tuning cost may also increase beyond what is acceptable for contemporary cloud-native applications. We briefly discuss three possible complementary solutions to tackle this challenge.
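The tuning idea, evaluating candidate resource configurations against a tenant's SLO and keeping the cheapest feasible one, can be sketched as below. For brevity the sketch enumerates a tiny grid where a real black-box tuner (and k8-resource-optimizer itself) would sample candidates more cleverly; the benchmark and cost functions are placeholders.

```python
# A minimal sketch of SLO-driven resource tuning: try candidate (CPU, memory)
# requests, run a benchmark (stubbed out here), and keep the cheapest
# configuration whose measured throughput meets the tenant's SLO.

import itertools

CPU_OPTIONS_M = [250, 500, 1000, 2000]        # millicores
MEM_OPTIONS_MI = [256, 512, 1024]             # MiB

def benchmark(cpu_m, mem_mi):
    """Placeholder for deploying the configuration and measuring throughput (req/s)."""
    return 0.04 * cpu_m + 0.02 * mem_mi

def cost(cpu_m, mem_mi):
    return cpu_m * 0.00002 + mem_mi * 0.00001  # assumed per-unit prices

def tune(slo_throughput):
    feasible = [(cpu, mem)
                for cpu, mem in itertools.product(CPU_OPTIONS_M, MEM_OPTIONS_MI)
                if benchmark(cpu, mem) >= slo_throughput]
    return min(feasible, key=lambda c: cost(*c)) if feasible else None

if __name__ == "__main__":
    print(tune(slo_throughput=40))   # cheapest (cpu_m, mem_mi) meeting the SLO
```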

Patent
19 Mar 2019
TL;DR: In this paper, the backup engine of a first storage system receives a request to perform a backup session from the first storage systems to a second storage system based on a backup service level objective (SLO) that has been configured for the backup session.
Abstract: A backup engine of a first storage system receives a request to perform a backup session from the first storage system to a second storage system based on a backup service level objective (SLO) that has been configured for the backup session. In response to the request, it is determined that a first backup resource allocated for the backup session by the first storage system cannot satisfy the SLO based on statistics of prior backup sessions in view of characteristics of the backup session to be performed. A dynamic resource allocation (DRA) module is to dynamically perform a first DRA to modify the first backup resource to satisfy the SLO. The backup engine then initiates the backup session by transmitting backup data from the first storage system to the second storage system using the modified first backup resource.

Journal ArticleDOI
13 Aug 2019
TL;DR: In this article, the authors explore the relative importance of service quality dimensions across a "select" service context and how to allocate resources in a fashion that is consistent with customer priorities.
Abstract: Today's world economy is becoming more service oriented. This situation can be attributed to the increased importance and significance being witnessed in the service industry by both the developed and the developing economies of the world. In fact, growth in the service sector is often used as an indicator to measure a country's economic growth. The service industry has since become the mainstay of the economy of many nations. The service industry offers services to customers. These customers often expect value for their money. Customer satisfaction is of equal importance in the service industry, just as in product-related businesses. Three forces dominate the prevailing marketing environment in the service sector: increasing competition from private players, changing and improving technologies, and continuous shifts in the regulatory environment, which have led to growing customer sophistication. Customers have become more and more aware of their requirements and demand higher standards of services. Their perceptions and expectations are continually evolving, making it difficult for the service providers to measure and manage services effectively. The key lies in improving the service selectively, paying attention to more critical service attributes/dimensions as a part of customer service management. It is imperative to understand how sensitive customers are to various service attributes or dimensions. Allocating resources in a fashion that is consistent with customer priorities can enhance the effectiveness of service operations. In addition, customer service attribute priorities need to be fully explored in service-specific contexts. This paper is an attempt to explore the relative importance of service quality dimensions across a "select" service context.

Proceedings ArticleDOI
02 Dec 2019
TL;DR: The architectural design of SLO-ML and the associated broker that realises the deployment operations are presented; its promise is expressed in terms of gained productivity and experienced usability, and its limitations are highlighted as application requirements grow.
Abstract: Cloud modelling languages (CMLs) are designed to assist customers in tackling the diversity of services in the current cloud market. While many CMLs have been proposed in the literature, they lack practical support for automating the selection of services based on the specific service level objectives of a customer's application. We put forward SLO-ML, a novel and generative CML to capture service level requirements. Subsequently, SLO-ML selects the services to honour the customer's requirements and generates the deployment code appropriate to these services. We present the architectural design of SLO-ML and the associated broker that realises the deployment operations. We evaluate SLO-ML using an experimental case study with a group of researchers and developers using a real-world cloud application. We also assess SLO-ML's overheads through empirical scalability tests. We express the promises of SLO-ML in terms of gained productivity and experienced usability, and we highlight its limitations by analysing it as application requirements grow.

Proceedings ArticleDOI
01 Oct 2019
TL;DR: The results show that the proposed strategy can dynamically adjust the physical distribution of data before the disaster arrives, reaching the goal of rapid data evacuation.
Abstract: SLOs (Service Level Objectives) define specific quantitative parameters for network service performance such as network bandwidth, response time, delay, safety requirements, etc. The sum of all the factors that cause a violation of the SLOs in a certain zone is called the Zone Risk of the data center network. This paper quantifies the Zone Risks of the data center network; under the SDN network architecture, an RBF Neural Network is used to predict the Zone Risks of the network. According to the level of Zone Risk, the data is placed in advance on the data center nodes with the lowest Zone Risks. In this way, the strategy proposed in this paper significantly reduces the total amount of data that needs to be evacuated and improves the quality and efficiency of data evacuation when a large-scale disaster arrives. In order to verify the effectiveness of the proposed strategy, we use Mininet and OpenDaylight as the SDN emulator and the controller, respectively. The results show that the proposed strategy can dynamically adjust the physical distribution of data before the disaster arrives, reaching the goal of rapid data evacuation.
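The placement step, moving data ahead of time onto the nodes with the lowest predicted Zone Risk, can be sketched as a greedy assignment. The risk values stand in for the RBF neural network's predictions, and the block sizes and capacities are made up.

```python
# A minimal sketch of risk-aware placement: given a predicted Zone Risk per
# data center node, move data blocks to the lowest-risk nodes that still have
# capacity before the disaster arrives.

def place_blocks(blocks_gb: dict, predicted_risk: dict, capacity_gb: dict) -> dict:
    """Greedy placement of data blocks onto nodes ordered by ascending Zone Risk."""
    placement, free = {}, dict(capacity_gb)
    nodes_by_risk = sorted(predicted_risk, key=predicted_risk.get)
    for block, size in sorted(blocks_gb.items(), key=lambda kv: -kv[1]):  # biggest blocks first
        for node in nodes_by_risk:
            if free[node] >= size:
                placement[block], free[node] = node, free[node] - size
                break
    return placement

if __name__ == "__main__":
    print(place_blocks(
        blocks_gb={"b1": 40, "b2": 25, "b3": 10},
        predicted_risk={"dc-1": 0.7, "dc-2": 0.1, "dc-3": 0.3},
        capacity_gb={"dc-1": 100, "dc-2": 50, "dc-3": 50}))
```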

Journal ArticleDOI
TL;DR: An optimized memory bandwidth management approach for ensuring quality of service (QoS) and high server utilization is proposed; it is experimentally found that the proposed approach can achieve up to 99% SLO assurance and improve server utilization by up to 6.5×.
Abstract: Latency-critical workloads such as web search engines, social networks and finance market applications are sensitive to tail latencies for meeting service level objectives (SLOs). Since unexpected tail latencies are caused by sharing hardware resources with other co-executing workloads, a service provider executes the latency-critical workload alone. Thus, the data center for the latency-critical workloads has exceedingly low hardware resource utilization. For improving hardware resource utilization, the service provider has to co-locate the latency-critical workloads and other batch processing ones. However, because memory bandwidth, unlike cores and cache memory, cannot be provided in isolation, the latency-critical workloads experience poor performance isolation even when cores and cache memory are allocated to the workloads in isolation. To solve this problem, we propose an optimized memory bandwidth management approach for ensuring quality of service (QoS) and high server utilization. To provide isolated shared resources, including memory bandwidth, to the latency-critical workload and the co-executing batch processing ones, our approach first performs a few pre-profilings with a divide-and-conquer method under the assumption that memory bandwidth contention is at its worst. Second, we predict the memory bandwidth needed to meet the SLO for all queries-per-second (QPS) levels based on the results of the pre-profilings. Then, our approach allocates the amount of isolated memory bandwidth that guarantees the SLO to the latency-critical workload and the rest of the memory bandwidth to the co-executing batch processing ones. It is experimentally found that our proposed approach can achieve up to 99% SLO assurance and improve server utilization by up to 6.5×.
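The allocation idea, finding the smallest isolated memory-bandwidth share that still meets the tail-latency SLO at a given QPS and giving the remainder to batch jobs, can be sketched with a binary search over a predicted-latency model. The model below is a toy stand-in for the paper's pre-profiling-based prediction.

```python
# A minimal sketch of SLO-driven bandwidth partitioning: binary-search the
# smallest isolated memory-bandwidth share whose predicted tail latency stays
# under the SLO, and hand the remaining share to co-located batch jobs.

def predicted_tail_latency_ms(qps: float, bw_share: float) -> float:
    """Toy model: latency grows as the bandwidth share shrinks and QPS grows."""
    return 5.0 + 0.0002 * qps / max(bw_share, 1e-3)

def min_bw_share_for_slo(qps: float, slo_ms: float, iters: int = 30) -> float:
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if predicted_tail_latency_ms(qps, mid) <= slo_ms:
            hi = mid            # meets the SLO: try a smaller share
        else:
            lo = mid
    return hi

if __name__ == "__main__":
    lc_share = min_bw_share_for_slo(qps=50000, slo_ms=20)
    print(f"latency-critical share: {lc_share:.2f}, batch share: {1 - lc_share:.2f}")
```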

Patent
03 Dec 2019
TL;DR: In this paper, a decision tree is used to select a correction technique, biased by the cost of deployment; the decision tree maintains an average benefit of each technique over time, with rankings based on maximizing cost-benefit.
Abstract: Storage group performance targets are achieved by managing resources using discrete techniques that are selected based on learned cost-benefit rank. The techniques include delaying the start of IOs based on storage group association, making a storage group active or passive on a port, and biasing front-end cores. A performance goal may be assigned to each storage group based on the volume of IOs and the difference between an observed response time and a target response time. A decision tree is used to select a correction technique, biased based on the cost of deployment. The decision tree maintains an average benefit of each technique over time, with rankings based on maximizing cost-benefit.