Author

Jordà Polo

Bio: Jordà Polo is an academic researcher from Barcelona Supercomputing Center. The author has contributed to research in the topics of Scheduling (computing) and Scalability, has an h-index of 9, and has co-authored 26 publications receiving 776 citations. Previous affiliations of Jordà Polo include the Polytechnic University of Catalonia.

Papers
Proceedings ArticleDOI
19 Apr 2010
TL;DR: This paper introduces a new task scheduler for a MapReduce framework that enables performance-driven management of MapReduce tasks, allowing applications to meet their performance objectives without over-provisioning of physical resources.
Abstract: MapReduce is a data-driven programming model proposed by Google in 2004 which is especially well suited for distributed data analytics applications. We consider the management of MapReduce applications in an environment where multiple applications share the same physical resources. Such sharing is in line with recent trends in data center management which aim to consolidate workloads in order to achieve cost and energy savings. In a shared environment, it is necessary to predict and manage the performance of workloads given a set of performance goals defined for them. In this paper, we address this problem by introducing a new task scheduler for a MapReduce framework that allows performance-driven management of MapReduce tasks. The proposed task scheduler dynamically predicts the performance of concurrent MapReduce jobs and adjusts the resource allocation for the jobs. It allows applications to meet their performance objectives without over-provisioning of physical resources.
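
The predict-and-adjust loop the abstract describes can be sketched roughly as follows. This is an illustrative assumption, not the paper's actual model: the function names and the wave-based completion estimate are made up for the sketch.

```python
import math

def slots_needed(pending_tasks, avg_task_time, time_to_deadline, slots_now):
    """Estimate how many concurrent task slots a job needs to finish its
    remaining tasks before its deadline, assuming tasks run in waves of
    equal average duration (an illustrative simplification)."""
    if time_to_deadline <= 0:
        return pending_tasks  # deadline already passed: ask for maximum parallelism
    waves = max(1, math.floor(time_to_deadline / avg_task_time))
    return max(slots_now, math.ceil(pending_tasks / waves))

# 120 pending tasks of ~30 s each with 300 s left allow floor(300/30) = 10
# waves, so the job needs ceil(120/10) = 12 slots.
print(slots_needed(120, 30.0, 300.0, 8))  # -> 12
```

A scheduler built on such an estimate would re-run it periodically for each concurrent job and rebalance slots toward jobs that are behind their goals.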

224 citations

Book ChapterDOI
12 Dec 2011
TL;DR: A resource-aware scheduling technique for MapReduce multi-job workloads that aims at improving resource utilization across machines while observing user-provided completion time goals for each job.
Abstract: We present a resource-aware scheduling technique for MapReduce multi-job workloads that aims at improving resource utilization across machines while observing completion time goals. Existing MapReduce schedulers define a static number of slots to represent the capacity of a cluster, creating a fixed number of execution slots per machine. This abstraction works for homogeneous workloads, but fails to capture the different resource requirements of individual jobs in multi-user environments. Our technique leverages job profiling information to dynamically adjust the number of slots on each machine, as well as workload placement across them, to maximize the resource utilization of the cluster. In addition, our technique is guided by user-provided completion time goals for each job. Source code of our prototype is available at [1].
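
A minimal sketch of the dynamic-slot idea described above, assuming made-up per-task CPU and memory profile fields; the real scheduler works from richer job profiles and also decides workload placement across machines.

```python
def dynamic_slots(machine, tasks):
    """Greedily pack profiled tasks onto one machine instead of using a
    fixed slot count: a 'slot' exists only while the machine's remaining
    CPU and memory capacity can hold the task's profiled demand.
    Illustrative sketch; the field names are assumptions."""
    cpu_left, mem_left = machine["cpu"], machine["mem"]
    placed = 0
    for t in sorted(tasks, key=lambda t: t["cpu"], reverse=True):
        if t["cpu"] <= cpu_left and t["mem"] <= mem_left:
            cpu_left -= t["cpu"]
            mem_left -= t["mem"]
            placed += 1
    return placed  # this machine's slot count for the current workload mix

machine = {"cpu": 8.0, "mem": 16.0}
tasks = [{"cpu": 2.0, "mem": 4.0}] * 3 + [{"cpu": 1.0, "mem": 2.0}] * 4
print(dynamic_slots(machine, tasks))  # -> 5
```

The point of the contrast with fixed slots: a heavier job mix would yield fewer slots on the same machine, and a lighter mix more, rather than a constant number per node.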

174 citations

Proceedings ArticleDOI
28 Sep 2015
TL;DR: The goal of this work is to compare CPU and network benchmark performance in the two aforementioned models of microservices architecture and thereby provide benchmark analysis guidance for system designers.
Abstract: Microservices architecture has started a new trend for application development for a number of reasons: (1) to reduce complexity by using tiny services, (2) to scale, remove and deploy parts of the system easily, (3) to improve flexibility to use different frameworks and tools, (4) to increase the overall scalability, and (5) to improve the resilience of the system. Containers have empowered the usage of microservices architectures by being lightweight, providing fast start-up times, and having a low overhead. Containers can be used to develop applications based on monolithic architectures, where the whole system runs inside a single container, or on a microservices architecture, where one or a few processes run inside each container. Two models can be used to implement a microservices architecture using containers: master-slave or nested-container. The goal of this work is to compare the performance of CPU and network benchmarks in the two aforementioned models of microservices architecture and hence provide benchmark analysis guidance for system designers.

126 citations

Proceedings ArticleDOI
13 Sep 2010
TL;DR: This work describes the changes introduced in the Adaptive Scheduler to make it hardware-aware and able to co-schedule accelerable and non-accelerable jobs on the same heterogeneous MapReduce cluster, making the most of the underlying hybrid systems.
Abstract: Next generation data centers will be composed of thousands of hybrid systems in an attempt to increase overall cluster performance and to minimize energy consumption. New programming models, such as MapReduce, specifically designed to make the most of very large infrastructures will be leveraged to develop massively distributed services. At the same time, data centers will bring an unprecedented degree of workload consolidation, hosting in the same infrastructure distributed services from many different users. In this paper we present our advancements in leveraging the Adaptive MapReduce Scheduler to meet user defined high level performance goals while transparently and efficiently exploiting the capabilities of hybrid systems. While the Adaptive Scheduler was already able to dynamically allocate resources to co-located MapReduce jobs based on their completion time goals, it was completely unaware of specific hardware capabilities. In our work we describe the changes introduced in the Adaptive Scheduler to make it hardware-aware and able to co-schedule accelerable and non-accelerable jobs on the same heterogeneous MapReduce cluster, making the most of the underlying hybrid systems. The developed prototype is tested in a cluster of Cell/BE blades and relies on the use of accelerated and non-accelerated versions of the MapReduce tasks of different deployed applications to dynamically select the best version to run on each node. Decisions are made based on the workload composition and the jobs' completion time goals. Results show that the augmented Adaptive Scheduler provides dynamic resource allocation across jobs, hardware affinity when possible, and is even able to spread jobs' tasks across accelerated and non-accelerated nodes in order to meet performance goals in extreme conditions. To our knowledge this is the first MapReduce scheduler and prototype able to manage high-level performance goals even in the presence of hybrid systems and accelerable jobs.
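
The version-selection decision the abstract describes can be sketched as a simple per-placement rule. All field names here are illustrative assumptions, not the Adaptive Scheduler's interface.

```python
def choose_task_version(node, job):
    """Pick which binary of a task to launch on a node. Accelerable jobs
    prefer accelerated nodes; a job behind its completion-time goal may
    also spill onto non-accelerated nodes rather than wait.
    Illustrative sketch only."""
    if not job["accelerable"]:
        return "cpu"
    if node["has_accelerator"]:
        return "accelerated"
    # Behind schedule: run the plain version here instead of waiting
    # for an accelerated node to free up.
    return "cpu" if job["behind_goal"] else None  # None = don't place here

print(choose_task_version({"has_accelerator": False},
                          {"accelerable": True, "behind_goal": True}))  # -> cpu
```

This captures the paper's "spread tasks across accelerated and non-accelerated nodes in extreme conditions" behavior as a single threshold decision; the real scheduler folds it into its completion-time estimates.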

58 citations


Cited by
Proceedings ArticleDOI
14 Jun 2011
TL;DR: This work designs a MapReduce performance model and implements a novel SLO-based scheduler in Hadoop that determines job ordering and the amount of resources to allocate for meeting the job deadlines and validate the approach using a set of realistic applications.
Abstract: MapReduce and Hadoop represent an economically compelling alternative for efficient large scale data processing and advanced analytics in the enterprise. A key challenge in shared MapReduce clusters is the ability to automatically tailor and control resource allocations to different applications for achieving their performance goals. Currently, there is no job scheduler for MapReduce environments that, given a job completion deadline, could allocate the appropriate amount of resources to the job so that it meets the required Service Level Objective (SLO). In this work, we propose a framework, called ARIA, to address this problem. It comprises three inter-related components. First, for a production job that is routinely executed on a new dataset, we build a job profile that compactly summarizes critical performance characteristics of the underlying application during the map and reduce stages. Second, we design a MapReduce performance model, that for a given job (with a known profile) and its SLO (soft deadline), estimates the amount of resources required for job completion within the deadline. Finally, we implement a novel SLO-based scheduler in Hadoop that determines job ordering and the amount of resources to allocate for meeting the job deadlines. We validate our approach using a set of realistic applications. The new scheduler effectively meets the jobs' SLOs until the job demands exceed the cluster resources. The results of the extensive simulation study are validated through detailed experiments on a 66-node Hadoop cluster.
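
A toy version of the kind of profile-based bound such a performance model uses, with the profile reduced to task counts and average durations; ARIA's actual model tracks map and reduce allocations separately and derives lower and upper bounds on wave completion.

```python
import math

def resources_for_deadline(n_map, t_map, n_red, t_red, deadline):
    """With s slots, a job takes roughly (n_map/s)*t_map + (n_red/s)*t_red
    seconds, so solve for the smallest integer s that meets the deadline.
    Illustrative simplification of a profile-driven SLO estimate."""
    total_work = n_map * t_map + n_red * t_red
    return max(1, math.ceil(total_work / deadline))

# 200 map tasks of 20 s plus 50 reduce tasks of 40 s is 6000 s of work;
# a 600 s soft deadline therefore needs about ceil(6000/600) = 10 slots.
print(resources_for_deadline(200, 20.0, 50, 40.0, 600.0))  # -> 10
```

An SLO-based scheduler would run this estimate per job at ordering time and admit or throttle jobs according to the slots each deadline implies.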

494 citations

Proceedings ArticleDOI
08 Oct 2018
TL;DR: Gandiva is introduced, a new cluster scheduling framework that utilizes domain-specific knowledge to improve latency and efficiency of training deep learning models in a GPU cluster, and achieves better utilization by transparently migrating and time-slicing jobs for a better job-to-resource fit.
Abstract: We introduce Gandiva, a new cluster scheduling framework that utilizes domain-specific knowledge to improve latency and efficiency of training deep learning models in a GPU cluster. One key characteristic of deep learning is feedback-driven exploration, where a user often runs a set of jobs (or a multi-job) to achieve the best result for a specific mission and uses early feedback on accuracy to dynamically prioritize or kill a subset of jobs; simultaneous early feedback on the entire multi-job is critical. A second characteristic is the heterogeneity of deep learning jobs in terms of resource usage, making it hard to achieve best-fit a priori. Gandiva addresses these two challenges by exploiting a third key characteristic of deep learning: intra-job predictability, as they perform numerous repetitive iterations called mini-batch iterations. Gandiva exploits intra-job predictability to time-slice GPUs efficiently across multiple jobs, thereby delivering low latency. This predictability is also used for introspecting job performance and dynamically migrating jobs to better-fit GPUs, thereby improving cluster efficiency. We show via a prototype implementation and micro-benchmarks that Gandiva can speed up hyper-parameter searches during deep learning by up to an order of magnitude, and achieves better utilization by transparently migrating and time-slicing jobs to achieve better job-to-resource fit. We also show that, in a real workload of jobs running in a 180-GPU cluster, Gandiva improves aggregate cluster utilization by 26%, pointing to a new way of managing large GPU clusters for deep learning.
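
The time-slicing idea can be sketched as round-robin scheduling where jobs are only suspended at mini-batch boundaries. This is a sketch under stated assumptions: a real implementation must checkpoint and restore GPU state, which Gandiva makes cheap precisely by suspending at the iteration boundary, where the memory footprint is smallest.

```python
from collections import deque

def time_slice(jobs, quantum_iters):
    """Round-robin GPU time-slicing at mini-batch boundaries: each job
    runs up to quantum_iters iterations, is suspended at the next
    iteration boundary, and is re-queued until it finishes.
    Returns the resulting (job, iterations run) schedule."""
    queue = deque(jobs)  # each job: {"name": str, "remaining": int}
    schedule = []
    while queue:
        job = queue.popleft()
        ran = min(quantum_iters, job["remaining"])
        job["remaining"] -= ran
        schedule.append((job["name"], ran))
        if job["remaining"] > 0:
            queue.append(job)
    return schedule

jobs = [{"name": "a", "remaining": 5}, {"name": "b", "remaining": 3}]
print(time_slice(jobs, 2))  # -> [('a', 2), ('b', 2), ('a', 2), ('b', 1), ('a', 1)]
```

Because every job makes progress each round, all jobs in a multi-job produce early accuracy feedback concurrently instead of running to completion one at a time.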

282 citations

Proceedings ArticleDOI
04 Nov 2016
TL;DR: This paper presents a systematic mapping study of microservices architectures and their implementation, focusing on identifying architectural challenges, the architectural diagrams/views and quality attributes related to microservice systems.
Abstract: The accelerating progress of network speed, reliability and security creates an increasing demand to move software and services from being stored and processed locally on users' machines to being managed by third parties that are accessible through the network. This has created the need to develop new software development methods and software architectural styles that meet these new demands. One such example in software architectural design is the recent emergence of the microservices architecture to address the maintenance and scalability demands of online service providers. As microservice architecture is a new research area, the need for a systematic mapping study is crucial in order to summarise the progress so far and identify the gaps and requirements for future studies. In this paper we present a systematic mapping study of microservices architectures and their implementation. Our study focuses on identifying architectural challenges, the architectural diagrams/views and quality attributes related to microservice systems.

272 citations