Author

David Carmeli

Bio: David Carmeli is an academic researcher from Technion – Israel Institute of Technology. The author has contributed to research in topics: Grid & Quasi-opportunistic supercomputing. The author has an h-index of 3, co-authored 3 publications receiving 27 citations.

Papers
Proceedings Article
01 Jan 2008
TL;DR: This work designs an architecture for a quasi-opportunistic supercomputer within the EU-supported project QosCosGrid, and presents the results obtained from studying and identifying the requirements a grid needs to meet in order to facilitate quasi-opportunistic supercomputing.
Abstract: Grids are becoming mission-critical components in research and industry, offering sophisticated solutions in leveraging large-scale computing and storage resources. Grid resources are usually shared among multiple organizations in an opportunistic manner. However, an opportunistic or "best effort" quality-of-service scheme may be inadequate in situations where a large number of resources need to be allocated and where applications rely on static, stable execution environments. The goal of this work is to implement what we refer to as quasi-opportunistic supercomputing. A quasi-opportunistic supercomputer facilitates demanding parallel computing applications on the basis of massive, non-dedicated resources in grid computing environments. Within the EU-supported project QosCosGrid we are developing a quasi-opportunistic supercomputer. In this work we present the results obtained from studying and identifying the requirements a grid needs to meet in order to facilitate quasi-opportunistic supercomputing. Based on these requirements we have designed an architecture for a quasi-opportunistic supercomputer. The paper presents and discusses this architecture.

14 citations

Journal Article
01 Sep 2010
TL;DR: A complete scheduling framework for multi-cluster, heterogeneous environments that provides an efficient solution for the scheduling of topology-aware applications and can be easily configured to support a variety of scheduling policies is proposed.
Abstract: Scheduling of large-scale, distributed topology-aware applications requires that not only the properties of the requested machines be considered, but also the properties of the machines' interconnections. This requirement severely complicates the scheduling process, as even a matching between a single multi-processor task and available machines in a single time slot becomes an NP-complete problem with no polynomial approximation. In this paper we propose a complete scheduling framework for multi-cluster, heterogeneous environments that provides, in practice, an efficient solution for the scheduling of topology-aware applications. The proposed framework is very flexible as it is composed of pluggable components and can be easily configured to support a variety of scheduling policies. We also describe three novel scheduling and coallocation algorithms that were developed and plugged into the framework. The proposed scheduling framework was integrated into the QosCosGrid system, where it is used as the main decision-making module.

8 citations

Book Chapter
09 Jun 2008
TL;DR: This paper describes a novel quasi-opportunistic supercomputing system that enables execution of demanding parallel applications in grids through identification and implementation of the set of key technologies required to realize the vision of grids as (virtual) supercomputers.
Abstract: The ultimate vision of grid computing is virtual supercomputers of unprecedented power, through utilization of geographically dispersed, distributively owned resources. Despite the overwhelming success of grids there still exist many demanding applications considered the exclusive prerogative of real supercomputers (i.e. tightly coupled parallel applications like complex systems simulations). These rely on a static execution environment with predictable performance, provided through efficient co-allocation of a large number of reliable interconnected resources. In this paper, we describe a novel quasi-opportunistic supercomputing system that enables execution of demanding parallel applications in grids through identification and implementation of the set of key technologies required to realize the vision of grids as (virtual) supercomputers. These technologies include an incentive-based framework based on ideas from economics; a co-allocation subsystem that is enhanced by communication topology-aware allocation mechanisms; a fault-tolerant message passing library that hides the failures of the underlying resources; and data pre-staging orchestration.

5 citations


Cited by
Proceedings Article
24 Mar 1997
TL;DR: This paper presents a fresh look at the nature of complexity in the building of computer-based systems, whose failures have a wide range of causes, from hardware failures through software errors to major system-level mistakes.
Abstract: Every organisation from the scale of whole countries down to small companies has a list of system developments which have ended in various forms of disaster. The nature of the failures varies but typical examples are: cost overruns; timescale overruns and sometimes, loss of life. The post-mortems to these systems reveal a wide range of reasons all the way from hardware failures, through software errors right to major system level mistakes. More importantly a large number of these systems share one attribute: complexity. This paper presents a fresh look at the nature of complexity in the building of computer based systems.

620 citations

Journal Article
TL;DR: A high-level and well-defined Multiscale Modeling Language (MML) is enhanced that describes and specifies multiscale models and their computational architecture in a modular way and is applied to two selected applications in nanotechnology and biophysics, showing its capabilities.

72 citations

Journal Article
TL;DR: This work investigates the performance of distributed multiscale computing of component-based models, guided by six multiscale applications with different characteristics and from several disciplines, finding that the first mode has the apparent benefit of increasing simulation speed, and the second mode can increase simulation speed if local resources are limited.
Abstract: Multiscale simulations model phenomena across natural scales using monolithic or component-based code, running on local or distributed resources. In this work, we investigate the performance of distributed multiscale computing of component-based models, guided by six multiscale applications with different characteristics and from several disciplines. Three modes of distributed multiscale computing are identified: supplementing local dependencies with large-scale resources, load distribution over multiple resources, and load balancing of small- and large-scale resources. We find that the first mode has the apparent benefit of increasing simulation speed, and the second mode can increase simulation speed if local resources are limited. Depending on resource reservation and model coupling topology, the third mode may result in a reduction of resource consumption.

44 citations

Book Chapter
20 May 2009
TL;DR: The middleware developed in the QosCosGrid project is described, which provides advance reservation and resource co-allocation functionality as well as support for parallel applications based on OpenMPI (for C/C++ and Fortran) or ProActive for Java.
Abstract: The aim of the QosCosGrid project is to bring supercomputer-like performance and structure to cross-cluster computations. To support parallel complex systems simulations, QosCosGrid provides six reusable templates that may be instantiated with simulation-specific code to help with developing parallel applications using the ProActive Java library. The templates include static and dynamic graphs, cellular automata and mobile agents. In this work, we show that little performance is lost when a ProActive cellular automata simulation is executed across two distant administrative domains. We describe the middleware developed in the QosCosGrid project, which provides advance reservation and resource co-allocation functionality as well as support for parallel applications based on OpenMPI (for C/C++ and Fortran) or ProActive for Java. In particular, we describe how we modified ProActive Java to enable inter-cluster communication through firewalls. The bulk of the QosCosGrid software is available in open source from the QosCosGrid project website: www.qoscosgrid.org.

32 citations

Book Chapter
01 Jan 2012
TL;DR: In this chapter the authors present the new capabilities of QosCosGrid (QCG) middleware for advanced job and resource management in the grid environment and prove the usefulness of the new functionality.
Abstract: In this chapter we present the new capabilities of QosCosGrid (QCG) middleware for advanced job and resource management in the grid environment. By connecting many computing clusters together, QosCosGrid offers easy-to-use mapping, execution and monitoring capabilities for a variety of complex computations, such as parameter sweep, workflows, MPI or hybrid MPI-OpenMP as well as multiscale simulations. Thanks to QosCosGrid, large-scale programming models written in Fortran, C, C++ or Java can be automatically distributed over a network of computing resources with guaranteed Quality of Service, for example a guaranteed startup time of a job. Consequently, applications can be run at specified periods with reduced execution and waiting times. This enables more complex problem instances to be addressed. In order to prove the usefulness of the new functionality of QosCosGrid, a detailed description of the system along with a real use case scenario from the quantum chemistry science domain will be presented in this chapter.

31 citations