Author

Valentin Kravtsov

Bio: Valentin Kravtsov is an academic researcher from Technion – Israel Institute of Technology. The author has contributed to research in the topics of grid and grid computing. The author has an h-index of 7 and has co-authored 10 publications receiving 207 citations.

Papers
Journal ArticleDOI
01 Apr 2008
TL;DR: The DataMiningGrid system provides tools and services facilitating the grid-enabling of data mining applications without any intervention on the application side and critical features of the system include flexibility, extensibility, scalability, efficiency, conceptual simplicity and ease of use.
Abstract: The DataMiningGrid system has been designed to meet the requirements of modern and distributed data mining scenarios. Based on the Globus Toolkit and other open technology and standards, the DataMiningGrid system provides tools and services facilitating the grid-enabling of data mining applications without any intervention on the application side. Critical features of the system include flexibility, extensibility, scalability, efficiency, conceptual simplicity and ease of use. The system has been developed and evaluated on the basis of a diverse set of use cases from different sectors in science and technology. The DataMiningGrid software is freely available under Apache License 2.0.

88 citations

Journal ArticleDOI
TL;DR: The authors developed the DataMiningGrid system, which integrates a diverse set of programs and application scenarios within a single framework, and features scalability, flexible extensibility, sophisticated support for relevant standards and different users.
Abstract: As modern data mining applications increase in complexity, so too do their demands for resources. Grid computing is one of several emerging networked computing paradigms promising to meet the requirements of heterogeneous, large-scale, and distributed data mining applications. Despite this promise, there are still too many issues to be resolved before grid technology is commonly applied to large-scale data mining tasks. To address some of these issues, the authors developed the DataMiningGrid system. It integrates a diverse set of programs and application scenarios within a single framework, and features scalability, flexible extensibility, sophisticated support for relevant standards and different users.

30 citations

Proceedings ArticleDOI
15 Jun 2009
TL;DR: A simple yet powerful methodology for application-agnostic diagnosis and remediation of performance hot spots in elastic multi-tiered client/server applications, deployed as collections of black-box Virtual Machines (VMs).
Abstract: In this work we present a simple yet powerful methodology for application-agnostic diagnosis and remediation of performance hot spots in elastic multi-tiered client/server applications, deployed as collections of black-box Virtual Machines (VMs). Our novel out-of-band black-box performance management system, Network Analysis for Remediating Performance Bottlenecks (NAP), listens to the TCP/IP traffic on the virtual network interfaces of the VMs comprising an application and analyzes statistical properties of this traffic. From this analysis, which is application-independent and transparent to the VMs, NAP identifies performance bottlenecks that might affect application performance and derives remediation decisions that are most likely to alleviate the application performance degradation. We prototyped our solution for the Xen hypervisor and evaluated it using the popular Trade6 benchmark, which simulates a typical e-commerce application. Our results show that NAP successfully identifies performance bottlenecks in a complex multi-tier application setting, while incurring negligible performance overhead.
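The abstract does not give NAP's algorithm, but the core idea (inferring a bottleneck tier purely from network-level statistics, without instrumenting the VMs) can be illustrated with a hypothetical sketch. All names here are illustrative: we assume per-tier counters of requests in and responses out per sampling interval, and flag the tier whose backlog grows fastest.

```python
# Hypothetical sketch of black-box bottleneck detection in the spirit of NAP:
# each tier is observed only through network counters (requests_in,
# responses_out) per sampling interval; the tier whose request backlog
# grows fastest is reported as the likely bottleneck.

def find_bottleneck(tier_samples):
    """tier_samples: {tier_name: [(requests_in, responses_out), ...]}.

    Returns the tier with the largest average backlog growth per
    interval, or None if no tier's backlog is growing.
    """
    worst_tier, worst_growth = None, 0.0
    for tier, samples in tier_samples.items():
        # cumulative requests that entered the tier but were never answered
        backlog = sum(r_in - r_out for r_in, r_out in samples)
        growth = backlog / len(samples)  # average net queue growth per interval
        if growth > worst_growth:
            worst_tier, worst_growth = tier, growth
    return worst_tier
```

A remediation step would then, for example, scale out or migrate the flagged tier; the sketch deliberately stays application-independent, mirroring the black-box premise of the paper.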

28 citations

Proceedings ArticleDOI
09 Dec 2009
TL;DR: The concept of topology-aware grid applications is derived from parallelized computational models of complex systems that are executed on heterogeneous resources, either because they require specialized hardware for certain calculations, or because their parallelization is flexible enough to exploit such resources.
Abstract: The concept of topology-aware grid applications is derived from parallelized computational models of complex systems that are executed on heterogeneous resources, either because they require specialized hardware for certain calculations, or because their parallelization is flexible enough to exploit such resources. Here we describe two such applications: a multi-body simulation of stellar evolution, and an evolutionary algorithm that is used for reverse-engineering gene regulatory networks. We then describe the topology-aware middleware we have developed to facilitate the "modeling-implementing-executing" cycle of complex systems applications. The developed middleware allows topology-aware simulations to run on geographically distributed clusters with or without firewalls between them. Additionally, we describe advanced co-allocation and scheduling techniques that take the applications' topologies into account. Results are given based on running the topology-aware applications on the Grid'5000 infrastructure.

17 citations

Proceedings Article
01 Jan 2008
TL;DR: This work designs an architecture for a quasi-opportunistic supercomputer within the EU-supported project QosCosGrid, and presents the results obtained from studying and identifying the requirements a grid needs to meet in order to facilitate quasi-opportunistic supercomputing.
Abstract: Grids are becoming mission-critical components in research and industry, offering sophisticated solutions for leveraging large-scale computing and storage resources. Grid resources are usually shared among multiple organizations in an opportunistic manner. However, an opportunistic or "best effort" quality-of-service scheme may be inadequate in situations where a large number of resources must be allocated, and for applications that rely on static, stable execution environments. The goal of this work is to implement what we refer to as quasi-opportunistic supercomputing. A quasi-opportunistic supercomputer facilitates demanding parallel computing applications on the basis of massive, non-dedicated resources in grid computing environments. Within the EU-supported project QosCosGrid we are developing a quasi-opportunistic supercomputer. In this work we present the results obtained from studying and identifying the requirements a grid needs to meet in order to facilitate quasi-opportunistic supercomputing. Based on these requirements we have designed an architecture for a quasi-opportunistic supercomputer. The paper presents and discusses this architecture.

14 citations


Cited by
Proceedings Article
01 Jan 2003

1,212 citations

Proceedings ArticleDOI
26 Oct 2011
TL;DR: CloudScale is a system that automates fine-grained elastic resource scaling for multi-tenant cloud computing infrastructures that can achieve significantly higher SLO conformance than other alternatives with low resource and energy cost.
Abstract: Elastic resource scaling lets cloud systems meet application service level objectives (SLOs) with minimum resource provisioning costs. In this paper, we present CloudScale, a system that automates fine-grained elastic resource scaling for multi-tenant cloud computing infrastructures. CloudScale employs online resource demand prediction and prediction error handling to achieve adaptive resource allocation without assuming any prior knowledge about the applications running inside the cloud. CloudScale can resolve scaling conflicts between applications using migration, and integrates dynamic CPU voltage/frequency scaling to achieve energy savings with minimal effect on application SLOs. We have implemented CloudScale on top of Xen and conducted extensive experiments using a set of CPU- and memory-intensive applications (RUBiS, Hadoop, IBM System S). The results show that CloudScale can achieve significantly higher SLO conformance than other alternatives with low resource and energy cost. CloudScale is non-intrusive and light-weight, and imposes negligible overhead.
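The "prediction error handling" that the abstract mentions can be made concrete with a minimal, hypothetical sketch: the allocator grants the predicted demand plus a padding derived from recent under-estimation errors, so that transient mispredictions are less likely to cause SLO violations. The class and parameter names below are illustrative, not CloudScale's actual API.

```python
# Minimal sketch, assuming CloudScale-style error padding: allocations
# equal the predicted demand plus the worst recent under-estimation,
# so repeated under-prediction is quickly compensated.

class PaddedAllocator:
    def __init__(self, window=5):
        self.window = window
        self.errors = []  # recent (actual - predicted) prediction errors

    def allocate(self, predicted):
        # pad by the worst recent under-estimation; never pad negatively
        pad = max(self.errors, default=0)
        return predicted + max(pad, 0)

    def observe(self, predicted, actual):
        # record the signed prediction error, keeping a sliding window
        self.errors.append(actual - predicted)
        self.errors = self.errors[-self.window:]
```

In a real system the padding would decay and combine with the online demand predictor; the sketch only shows why tracking signed errors lets under-estimation be corrected without permanently over-provisioning.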

662 citations

Proceedings Article
24 Mar 1997
TL;DR: A fresh look at the nature of complexity in the building of computer-based systems, motivated by failure post-mortems whose causes range from hardware failures through software errors to major system-level mistakes.
Abstract: Every organisation, from the scale of whole countries down to small companies, has a list of system developments which have ended in various forms of disaster. The nature of the failures varies, but typical examples are cost overruns, timescale overruns and, sometimes, loss of life. The post-mortems of these systems reveal a wide range of reasons, all the way from hardware failures, through software errors, right up to major system-level mistakes. More importantly, a large number of these systems share one attribute: complexity. This paper presents a fresh look at the nature of complexity in the building of computer-based systems.

620 citations

Proceedings ArticleDOI
01 Oct 2010
TL;DR: This paper presents a novel PRedictive Elastic reSource Scaling (PRESS) scheme for cloud systems that unobtrusively extracts fine-grained dynamic patterns in application resource demands and adjusts their resource allocations automatically.
Abstract: Cloud systems require elastic resource allocation to minimize resource provisioning costs while meeting service level objectives (SLOs). In this paper, we present a novel PRedictive Elastic reSource Scaling (PRESS) scheme for cloud systems. PRESS unobtrusively extracts fine-grained dynamic patterns in application resource demands and adjusts their resource allocations automatically. Our approach leverages light-weight signal processing and statistical learning algorithms to achieve online predictions of dynamic application resource requirements. We have implemented the PRESS system on Xen and tested it using RUBiS and an application load trace from Google. Our experiments show that we can achieve good resource prediction accuracy with less than 5% over-estimation error and near-zero under-estimation error, and that elastic resource scaling can significantly reduce both resource waste and SLO violations.
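The abstract's "light-weight signal processing" for pattern extraction can be illustrated with a simplified, hypothetical sketch (PRESS itself uses signature extraction over FFT-derived patterns; here we substitute plain autocorrelation for brevity). The assumption is a demand signal with one dominant repeating period; we locate it and forecast by replaying the last observed cycle with a small padding against under-estimation.

```python
# Simplified sketch of pattern-based demand prediction in the spirit of
# PRESS: find the dominant period of the demand series by autocorrelation,
# then predict the next values by replaying the most recent full cycle.

def dominant_period(series, max_lag=None):
    """Return the lag in [2, max_lag] with the highest autocorrelation."""
    n = len(series)
    max_lag = max_lag or n // 2
    mean = sum(series) / n
    dev = [x - mean for x in series]

    def autocorr(lag):
        return sum(dev[i] * dev[i + lag] for i in range(n - lag))

    return max(range(2, max_lag + 1), key=autocorr)

def predict(series, horizon, padding=1.05):
    """Forecast `horizon` future values; `padding` inflates the replayed
    cycle slightly to bias against costly under-estimation."""
    period = dominant_period(series)
    return [series[-period + (i % period)] * padding for i in range(horizon)]
```

The padding factor mirrors the abstract's asymmetric error goal: a small amount of deliberate over-estimation in exchange for near-zero under-estimation.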

591 citations

Proceedings Article
01 Jan 2013
TL;DR: AGILE uses wavelets to provide a medium-term resource demand prediction with enough lead time to start up new application server instances before performance falls short, and it uses dynamic VM cloning to reduce application startup times.
Abstract: Dynamically adjusting the number of virtual machines (VMs) assigned to a cloud application to keep up with load changes and interference from other users typically requires detailed application knowledge and an ability to know the future, neither of which is readily available to infrastructure service providers or application owners. The result is that systems either need to be over-provisioned (costly), or risk missing their performance Service Level Objectives (SLOs) and having to pay penalties (also costly). AGILE deals with both issues: it uses wavelets to provide a medium-term resource demand prediction with enough lead time to start up new application server instances before performance falls short, and it uses dynamic VM cloning to reduce application startup times. Tests using RUBiS and Google cluster traces show that AGILE can predict varying resource demands over the medium term with up to a 3.42× better true-positive rate and 0.34× the false-positive rate of existing schemes. Given a target SLO violation rate, AGILE can efficiently handle dynamic application workloads, reducing both penalties and user dissatisfaction.
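To make the wavelet-based prediction idea concrete, here is an illustrative sketch (not AGILE's actual code) of one step of the Haar wavelet transform, the simplest discrete wavelet: a signal is split into a coarse approximation (pairwise averages) and detail coefficients (pairwise differences). A wavelet-based predictor in this spirit would extrapolate each scale separately and reconstruct the forecast; smooth medium-term trends live in the approximation, while short-lived fluctuations stay in the details.

```python
# Illustrative one-level Haar wavelet decomposition and its inverse.
# The approximation carries the slow trend; the details carry the
# fast fluctuations that a medium-term predictor can treat separately.

def haar_step(signal):
    """One Haar decomposition level; len(signal) must be even."""
    approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Reconstruct the original signal from one decomposition level."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([a + d, a - d])
    return out
```

Repeating `haar_step` on the approximation yields the multi-scale view that makes medium-term lead times possible: the coarsest scale changes slowly, so it can be extrapolated further ahead than the raw signal.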

267 citations