High-throughput computing

About: High-throughput computing is a research topic. Over its lifetime, 325 publications have been published within this topic, receiving 9,831 citations.


Papers
Proceedings ArticleDOI
08 Nov 2004
TL;DR: The goals of BOINC, the design issues that were confronted, and the solutions to these problems are described.
Abstract: BOINC (Berkeley Open Infrastructure for Network Computing) is a software system that makes it easy for scientists to create and operate public-resource computing projects. It supports diverse applications, including those with large storage or communication requirements. PC owners can participate in multiple BOINC projects, and can specify how their resources are allocated among these projects. We describe the goals of BOINC, the design issues that we confronted, and our solutions to these problems.
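
The resource-share policy mentioned above lends itself to a short illustration. The sketch below is a toy, not BOINC's actual scheduler: it simply splits a volunteer's CPU time across projects in proportion to user-assigned shares, and the project names and share values are hypothetical.

    # Illustrative sketch only (not BOINC's implementation): proportional
    # allocation of a volunteer's CPU time across projects by resource share.
    def allocate_cpu_hours(resource_shares, total_cpu_hours):
        """Split total_cpu_hours among projects in proportion to their shares."""
        total_share = sum(resource_shares.values())
        return {
            project: total_cpu_hours * share / total_share
            for project, share in resource_shares.items()
        }

    # Hypothetical volunteer: one project gets twice the share of the others.
    shares = {"SETI@home": 200, "Einstein@Home": 100, "Rosetta@home": 100}
    print(allocate_cpu_hours(shares, total_cpu_hours=24.0))
    # -> {'SETI@home': 12.0, 'Einstein@Home': 6.0, 'Rosetta@home': 6.0}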

2,061 citations

Journal ArticleDOI
TL;DR: The history and philosophy of the Condor project are provided, and the paper describes how it has interacted with other projects and evolved along with the field of distributed computing.
Abstract: Since 1984, the Condor project has enabled ordinary users to do extraordinary computing. Today, the project continues to explore the social and technical problems of cooperative computing on scales ranging from the desktop to the world-wide computational Grid. In this paper, we provide the history and philosophy of the Condor project and describe how it has interacted with other projects and evolved along with the field of distributed computing. We outline the core components of the Condor system and describe how the technology of computing must correspond to social structures. Throughout, we reflect on the lessons of experience and chart the course travelled by research ideas as they grow into production systems. Copyright © 2005 John Wiley & Sons, Ltd.

1,969 citations

Proceedings ArticleDOI
28 Jul 1998
TL;DR: The classified advertisement (classad) matchmaking framework is developed and implemented: a flexible and general approach to resource management in distributed environments with decentralized ownership of resources.
Abstract: Conventional resource management systems use a system model to describe resources and a centralized scheduler to control their allocation. We argue that this paradigm does not adapt well to distributed systems, particularly those built to support high throughput computing. Obstacles include heterogeneity of resources, which makes uniform allocation algorithms difficult to formulate, and distributed ownership, leading to widely varying allocation policies. Faced with these problems, we developed and implemented the classified advertisement (classad) matchmaking framework, a flexible and general approach to resource management in distributed environments with decentralized ownership of resources. Novel aspects of the framework include a semi-structured data model that combines schema, data, and query in a simple but powerful specification language, and a clean separation of the matching and claiming phases of resource allocation. The representation and protocols result in a robust, scalable and flexible framework that can evolve with changing resources. The framework was designed to solve real problems encountered in the deployment of Condor, a high throughput computing system developed at the University of Wisconsin-Madison. Condor is heavily used by scientists at numerous sites around the world. It derives much of its robustness and efficiency from the matchmaking architecture.
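
A minimal sketch of the two-sided matching idea described above, assuming ads are Python dictionaries and each Requirements entry is a predicate over the other party's ad; the real ClassAd language uses its own expression syntax, so this is illustrative only:

    # Classad-style symmetric matchmaking sketch: both sides publish
    # attributes plus a Requirements predicate, and a match needs both to hold.
    machine_ad = {
        "Type": "Machine", "Memory": 4096, "Arch": "X86_64",
        "Requirements": lambda other: other.get("Owner") != "untrusted",
    }
    job_ad = {
        "Type": "Job", "Owner": "alice", "ImageSize": 2048,
        "Requirements": lambda other: other.get("Memory", 0) >= 2048,
    }

    def matches(ad_a, ad_b):
        # Each ad's Requirements is evaluated against the *other* ad,
        # mirroring the two-sided nature of classad matching.
        return ad_a["Requirements"](ad_b) and ad_b["Requirements"](ad_a)

    if matches(job_ad, machine_ad):
        # Matching and claiming are separate phases: after a match, the job
        # would contact the machine directly to claim the resource.
        print("match found; proceed to claim phase")

The design choice the abstract highlights is visible even in this toy: the matchmaker only establishes compatibility, and claiming the resource is a later, separate step.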

829 citations

Journal ArticleDOI
TL;DR: FireWorks has been used to complete over 50 million CPU-hours worth of computational chemistry and materials science calculations at the National Energy Research Supercomputing Center; its implementation strategy, which rests on Python and NoSQL databases (MongoDB), is also discussed.
Abstract: This paper introduces FireWorks, a workflow software for running high-throughput calculation workflows at supercomputing centers. FireWorks has been used to complete over 50 million CPU-hours worth of computational chemistry and materials science calculations at the National Energy Research Supercomputing Center. It has been designed to serve the demanding high-throughput computing needs of these applications, with extensive support for (i) concurrent execution through job packing, (ii) failure detection and correction, (iii) provenance and reporting for long-running projects, (iv) automated duplicate detection, and (v) dynamic workflows (i.e., modifying the workflow graph during runtime). We have found that these features are highly relevant to enabling modern data-driven and high-throughput science applications, and we discuss our implementation strategy that rests on Python and NoSQL databases (MongoDB). Finally, we present performance data and limitations of our approach along with planned future work. Copyright © 2015 John Wiley & Sons, Ltd.
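
For flavor, a minimal usage sketch following the pattern in the FireWorks quickstart documentation (LaunchPad, Firework, ScriptTask, launch_rocket); it assumes a MongoDB instance running with default settings on localhost, and the exact API may differ between versions:

    from fireworks import Firework, LaunchPad, ScriptTask
    from fireworks.core.rocket_launcher import launch_rocket

    launchpad = LaunchPad()                      # workflow state lives in MongoDB
    launchpad.reset('', require_password=False)  # wipe the database (demo only)

    # A Firework wraps one or more tasks; here, a single shell command.
    fw = Firework(ScriptTask.from_str('echo "hello high-throughput computing"'))
    launchpad.add_wf(fw)

    launch_rocket(launchpad)                     # pull one job from the queue and run it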

405 citations

Proceedings ArticleDOI
15 May 2001
TL;DR: The paper presents the design of XtremWeb; two essential features of this design are multi-application support and high performance, the latter ensured by scalability, fault tolerance, efficient scheduling and a large base of volunteer PCs.
Abstract: Global computing achieves high throughput computing by harvesting a very large number of unused computing resources connected to the Internet. This parallel computing model targets a parallel architecture defined by a very high number of nodes, poor communication performance and continuously varying resources. The unprecedented scale of the global computing architecture paradigm requires us to revisit many basic issues related to parallel architecture programming models, performance models, and the class of applications or algorithms suitable for this architecture. XtremWeb is an experimental global computing platform dedicated to providing a tool for such studies. The paper presents the design of XtremWeb. Two essential features of this design are multi-applications and high-performance. Accepting multiple applications allows institutions or enterprises to set up their own global computing applications or experiments. High performance is ensured by scalability, fault tolerance, efficient scheduling and a large base of volunteer PCs. We also present an implementation of the first global application running on XtremWeb.
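
The fault tolerance and scheduling that the abstract credits for high performance are commonly built on a pull-based work queue with task leases: idle volunteers request work, and tasks leased to nodes that vanish are re-issued. The sketch below is a generic illustration of that pattern under assumed names (WorkQueue, LEASE_SECONDS), not XtremWeb's actual protocol:

    import time

    LEASE_SECONDS = 3600  # hypothetical lease length

    class WorkQueue:
        def __init__(self, tasks):
            self.pending = list(tasks)   # tasks not yet handed out
            self.leased = {}             # task -> lease deadline

        def request_work(self):
            """Called by a volunteer node asking for a task."""
            self._reclaim_expired()
            if not self.pending:
                return None
            task = self.pending.pop()
            self.leased[task] = time.time() + LEASE_SECONDS
            return task

        def report_result(self, task, result):
            self.leased.pop(task, None)  # task is done; drop its lease
            return result

        def _reclaim_expired(self):
            now = time.time()
            for task, deadline in list(self.leased.items()):
                if deadline < now:       # node vanished: re-queue its task
                    del self.leased[task]
                    self.pending.append(task)

    # Hypothetical usage: a volunteer pulls a task, computes, and reports back.
    queue = WorkQueue(tasks=range(4))
    task = queue.request_work()
    if task is not None:
        queue.report_result(task, result=task * task)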

378 citations

Network Information
Related Topics (5)
Server: 79.5K papers, 1.4M citations (75% related)
Cloud computing: 156.4K papers, 1.9M citations (73% related)
Scheduling (computing): 78.6K papers, 1.3M citations (72% related)
Network topology: 52.2K papers, 1M citations (71% related)
Network packet: 159.7K papers, 2.2M citations (70% related)
Performance Metrics
No. of papers in the topic in previous years

Year    Papers
2021    15
2020    6
2019    14
2018    17
2017    21
2016    15