High-throughput computing

About: High-throughput computing is a research topic. Over its lifetime, 325 publications have been published within this topic, receiving 9,831 citations.


Papers
Proceedings ArticleDOI
08 Nov 2004
TL;DR: The goals of BOINC, the design issues that were confronted, and the solutions to these problems are described.
Abstract: BOINC (Berkeley Open Infrastructure for Network Computing) is a software system that makes it easy for scientists to create and operate public-resource computing projects. It supports diverse applications, including those with large storage or communication requirements. PC owners can participate in multiple BOINC projects, and can specify how their resources are allocated among these projects. We describe the goals of BOINC, the design issues that we confronted, and our solutions to these problems.
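
The resource-share policy mentioned above lends itself to a short illustration. The sketch below is a toy, not BOINC's actual scheduler: it simply splits a volunteer's CPU time across projects in proportion to user-assigned shares, and the project names and share values are hypothetical.

    # Illustrative sketch only (not BOINC's implementation): proportional
    # allocation of a volunteer's CPU time across projects by resource share.
    def allocate_cpu_hours(resource_shares, total_cpu_hours):
        """Split total_cpu_hours among projects in proportion to their shares."""
        total_share = sum(resource_shares.values())
        return {
            project: total_cpu_hours * share / total_share
            for project, share in resource_shares.items()
        }

    # Hypothetical volunteer: one project gets twice the share of the others.
    shares = {"SETI@home": 200, "Einstein@Home": 100, "Rosetta@home": 100}
    print(allocate_cpu_hours(shares, total_cpu_hours=24.0))
    # -> {'SETI@home': 12.0, 'Einstein@Home': 6.0, 'Rosetta@home': 6.0}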

2,061 citations

Journal ArticleDOI
TL;DR: The history and philosophy of the Condor project are provided, and the paper describes how it has interacted with other projects and evolved along with the field of distributed computing.
Abstract: Since 1984, the Condor project has enabled ordinary users to do extraordinary computing. Today, the project continues to explore the social and technical problems of cooperative computing on scales ranging from the desktop to the world-wide computational Grid. In this paper, we provide the history and philosophy of the Condor project and describe how it has interacted with other projects and evolved along with the field of distributed computing. We outline the core components of the Condor system and describe how the technology of computing must correspond to social structures. Throughout, we reflect on the lessons of experience and chart the course travelled by research ideas as they grow into production systems. Copyright © 2005 John Wiley & Sons, Ltd.

1,969 citations

Proceedings ArticleDOI
28 Jul 1998
TL;DR: The classified advertisement (classad) matchmaking framework is developed and implemented: a flexible and general approach to resource management in distributed environments with decentralized ownership of resources.
Abstract: Conventional resource management systems use a system model to describe resources and a centralized scheduler to control their allocation. We argue that this paradigm does not adapt well to distributed systems, particularly those built to support high throughput computing. Obstacles include heterogeneity of resources, which makes uniform allocation algorithms difficult to formulate, and distributed ownership, leading to widely varying allocation policies. Faced with these problems, we developed and implemented the classified advertisement (classad) matchmaking framework, a flexible and general approach to resource management in distributed environments with decentralized ownership of resources. Novel aspects of the framework include a semi-structured data model that combines schema, data, and query in a simple but powerful specification language, and a clean separation of the matching and claiming phases of resource allocation. The representation and protocols result in a robust, scalable and flexible framework that can evolve with changing resources. The framework was designed to solve real problems encountered in the deployment of Condor, a high throughput computing system developed at the University of Wisconsin-Madison. Condor is heavily used by scientists at numerous sites around the world. It derives much of its robustness and efficiency from the matchmaking architecture.
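
A minimal sketch of the two-sided matching idea described above, assuming ads are Python dictionaries and each Requirements entry is a predicate over the other party's ad; the real ClassAd language uses its own expression syntax, so this is illustrative only:

    # Classad-style symmetric matchmaking sketch: both sides publish
    # attributes plus a Requirements predicate, and a match needs both to hold.
    machine_ad = {
        "Type": "Machine", "Memory": 4096, "Arch": "X86_64",
        "Requirements": lambda other: other.get("Owner") != "untrusted",
    }
    job_ad = {
        "Type": "Job", "Owner": "alice", "ImageSize": 2048,
        "Requirements": lambda other: other.get("Memory", 0) >= 2048,
    }

    def matches(ad_a, ad_b):
        # Each ad's Requirements is evaluated against the *other* ad,
        # mirroring the two-sided nature of classad matching.
        return ad_a["Requirements"](ad_b) and ad_b["Requirements"](ad_a)

    if matches(job_ad, machine_ad):
        # Matching and claiming are separate phases: after a match, the job
        # would contact the machine directly to claim the resource.
        print("match found; proceed to claim phase")

The design choice the abstract highlights is visible even in this toy: the matchmaker only establishes compatibility, and claiming the resource is a later, separate step.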

829 citations

Journal ArticleDOI
TL;DR: FireWorks has been used to complete over 50 million CPU-hours worth of computational chemistry and materials science calculations at the National Energy Research Supercomputing Center; its implementation strategy, which rests on Python and NoSQL databases (MongoDB), is also discussed.
Abstract: This paper introduces FireWorks, a workflow software for running high-throughput calculation workflows at supercomputing centers. FireWorks has been used to complete over 50 million CPU-hours worth of computational chemistry and materials science calculations at the National Energy Research Supercomputing Center. It has been designed to serve the demanding high-throughput computing needs of these applications, with extensive support for (i) concurrent execution through job packing, (ii) failure detection and correction, (iii) provenance and reporting for long-running projects, (iv) automated duplicate detection, and (v) dynamic workflows (i.e., modifying the workflow graph during runtime). We have found that these features are highly relevant to enabling modern data-driven and high-throughput science applications, and we discuss our implementation strategy that rests on Python and NoSQL databases (MongoDB). Finally, we present performance data and limitations of our approach along with planned future work. Copyright © 2015 John Wiley & Sons, Ltd.
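
For flavor, a minimal usage sketch following the pattern in the FireWorks quickstart documentation (LaunchPad, Firework, ScriptTask, launch_rocket); it assumes a MongoDB instance running with default settings on localhost, and the exact API may differ between versions:

    from fireworks import Firework, LaunchPad, ScriptTask
    from fireworks.core.rocket_launcher import launch_rocket

    launchpad = LaunchPad()                      # workflow state lives in MongoDB
    launchpad.reset('', require_password=False)  # wipe the database (demo only)

    # A Firework wraps one or more tasks; here, a single shell command.
    fw = Firework(ScriptTask.from_str('echo "hello high-throughput computing"'))
    launchpad.add_wf(fw)

    launch_rocket(launchpad)                     # pull one job from the queue and run it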

405 citations

Proceedings ArticleDOI
15 May 2001
TL;DR: The paper presents the design of XtremWeb; two essential features of this design are multi-application support and high performance, the latter ensured by scalability, fault tolerance, efficient scheduling and a large base of volunteer PCs.
Abstract: Global computing achieves high throughput computing by harvesting a very large number of unused computing resources connected to the Internet. This parallel computing model targets a parallel architecture defined by a very high number of nodes, poor communication performance and continuously varying resources. The unprecedented scale of the global computing architecture paradigm requires us to revisit many basic issues related to parallel architecture programming models, performance models, and the class of applications or algorithms suitable for this architecture. XtremWeb is an experimental global computing platform dedicated to providing a tool for such studies. The paper presents the design of XtremWeb. Two essential features of this design are multi-applications and high-performance. Accepting multiple applications allows institutions or enterprises to set up their own global computing applications or experiments. High performance is ensured by scalability, fault tolerance, efficient scheduling and a large base of volunteer PCs. We also present an implementation of the first global application running on XtremWeb.
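
The fault tolerance and scheduling that the abstract credits for high performance are commonly built on a pull-based work queue with task leases: idle volunteers request work, and tasks leased to nodes that vanish are re-issued. The sketch below is a generic illustration of that pattern under assumed names (WorkQueue, LEASE_SECONDS), not XtremWeb's actual protocol:

    import time

    LEASE_SECONDS = 3600  # hypothetical lease length

    class WorkQueue:
        def __init__(self, tasks):
            self.pending = list(tasks)   # tasks not yet handed out
            self.leased = {}             # task -> lease deadline

        def request_work(self):
            """Called by a volunteer node asking for a task."""
            self._reclaim_expired()
            if not self.pending:
                return None
            task = self.pending.pop()
            self.leased[task] = time.time() + LEASE_SECONDS
            return task

        def report_result(self, task, result):
            self.leased.pop(task, None)  # task is done; drop its lease
            return result

        def _reclaim_expired(self):
            now = time.time()
            for task, deadline in list(self.leased.items()):
                if deadline < now:       # node vanished: re-queue its task
                    del self.leased[task]
                    self.pending.append(task)

    # Hypothetical usage: a volunteer pulls a task, computes, and reports back.
    queue = WorkQueue(tasks=range(4))
    task = queue.request_work()
    if task is not None:
        queue.report_result(task, result=task * task)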

378 citations

Network Information
Related Topics (5)
Server: 79.5K papers, 1.4M citations (75% related)
Cloud computing: 156.4K papers, 1.9M citations (73% related)
Scheduling (computing): 78.6K papers, 1.3M citations (72% related)
Network topology: 52.2K papers, 1M citations (71% related)
Network packet: 159.7K papers, 2.2M citations (70% related)
Performance Metrics
No. of papers in the topic in previous years

Year    Papers
2021    15
2020    6
2019    14
2018    17
2017    21
2016    15