Journal, ISSN: 1895-1767

Scalable Computing: Practice and Experience 

About: Scalable Computing: Practice and Experience is an academic journal published by Scalable Computing: Practice and Experience. The journal publishes mainly in the areas of cloud computing and scalability. Its ISSN is 1895-1767, and it is open access. Over its lifetime, the journal has published 732 papers, which have received 15,460 citations.


Papers
Journal Article
TL;DR: The book's main purpose is to update designers and users of parallel numerical algorithms on the latest research in the field, presenting novel ideas, results, work in progress, and state-of-the-art techniques in parallel and distributed computing for numerical and computational optimization problems in scientific and engineering applications.
Abstract: Edited by Tianruo Yang. Kluwer Academic Publishers, Dordrecht, Netherlands, 1999, 248 pp. ISBN 0-7923-8588-8, $135.00. This book contains a selection of contributed and invited papers presented at the workshop Frontiers of Parallel Numerical Computations and Applications, in the IEEE 7th Symposium on the Frontiers on Massively Parallel Computers (Frontiers '99) at Annapolis, Maryland, February 20-25, 1999. Its main purpose is to update the designers and users of parallel numerical algorithms with the latest research in the field. A broad spectrum of topics on parallel numerical computations, with applications to some of the more challenging engineering problems, is covered. Parallel algorithm designers and engineers who make extensive use of parallel numerical computations, as well as graduate students in computer science, scientific computing, various engineering fields, and applied mathematics, should benefit from reading it. The first part is addressed to a larger audience and presents papers on parallel numerical algorithms. Two new libraries are presented: PSPASSES and PoLAPACK. PSPASSES is a collection of parallel direct solvers for sparse symmetric positive definite linear systems, characterized by high performance and good scalability. The PoLAPACK library contains LU and QR codes based on a new blocking strategy that guarantees good performance regardless of the physical block size. Next, an efficient approach to solving stiff ordinary differential equations by the diagonal implicitly iterated Runge-Kutta (DIIRK) method is described. DIIRK yields a fast parallel implementation thanks to a reduced number of function evaluations and an automatic stepsize control mechanism. Finally, minimization of sufficiently smooth non-linear functionals is sought via parallel space decomposition. Here, a theoretical background of the problem and two equivalent algorithms are presented.
New research directions for classical solvers are treated in the next three papers: first, reduction of the global synchronization in the biconjugate gradient method; second, a new, more efficient Jacobi ordering for multiple-port hypercubes; and finally, an analysis of the theoretical performance of an improved version of the quasi-minimal residual method. Parallel numerical applications constitute the second part of the book, with results from fluid mechanics, material sciences, signal and image processing, dynamic systems, semiconductor technology, and electronic circuits and systems design. With one exception, the authors describe in detail parallel implementations of the algorithms and numerical results. First, a 3D-elasticity problem is solved using an additive overlapping domain decomposition algorithm. Second, an overlapping mesh technique is used in a parallel solver for the compressible flow problem. Then, a parallel version of a complex numerical algorithm to solve a lubrication problem studied in tribology is introduced. Next, a timid approach to parallel computing of the cavity flow by the finite element method is presented. The problem solved is rather small for today's needs and only up to 6 processors are used. This is also the only paper that does not present results from numerical experiments. The remaining applications discussed in the subsequent chapters are: a large-scale multidisciplinary design optimization problem with application to the design of a supersonic commercial aircraft, a report on progress in the parallel solution of an electromagnetic scattering problem using boundary integral methods, and an optimal solution to the convection-diffusion equation modeling the concentration of a pollutant in the air. The book is of definite interest to readers who keep up to date with parallel numerical computation research.
The main purpose, to present novel ideas, results, work in progress, and state-of-the-art techniques in the area of parallel and distributed computing for numerical and computational optimization problems in scientific and engineering applications, is clearly achieved. However, due to its content it cannot serve as a textbook for a computer science or engineering class. Overall, it is a reference-type book to be kept by specialists and in a library rather than a book to be purchased for self-introduction to the field. Most of the papers presented are results of ongoing research, and so they rely heavily on previous results. On the other hand, with only one exception, the results presented in the papers are a great source of information for researchers currently involved in the field. Michelle Pal, Los Alamos National Laboratory

4,696 citations

Journal Article
TL;DR: This comprehensive text/reference provides a foundation for understanding and implementing the parallel programming skills needed to achieve breakthrough results by developing parallel applications that perform well on certain classes of graphics processing units (GPUs).
Abstract: Programming Massively Parallel Processors: A Hands-on Approach. David Kirk and Wen-mei Hwu. ISBN: 978-0-12-381472-2. Copyright 2010. Introduction: This book is designed for graduate and undergraduate students and practitioners from any science and engineering discipline who use computational power to further their field of research. This comprehensive text/reference provides a foundation for understanding and implementing the parallel programming skills needed to achieve breakthrough results by developing parallel applications that perform well on certain classes of graphics processing units (GPUs). The book guides the reader through hands-on programming in CUDA, an extension to the C language and a parallel programming environment supported on NVIDIA GPUs and emulated on less parallel CPUs. Because parallel programming on any high-performance computer is complex and requires knowledge of the underlying hardware in order to write an efficient program, this book's focus on a particular hardware platform is an advantage over other texts. The book takes readers through a series of techniques for writing and optimizing parallel programs for several real-world applications. Such experience opens the door for the reader to learn parallel programming in depth. Outline of the Book: Kirk and Hwu effectively organize and link a wide spectrum of parallel programming concepts by focusing on practical applications, in contrast to most general parallel programming texts, which are mostly conceptual and theoretical. The authors are both affiliated with NVIDIA; Kirk is an NVIDIA Fellow and Hwu is principal investigator for the first NVIDIA CUDA Center of Excellence at the University of Illinois at Urbana-Champaign. Their coverage in the book can be divided into four sections. The first part (Chapters 1–3) starts by defining GPUs and their modern architectures and then provides a history of graphics pipelines and GPU computing.
It also covers data parallelism, the basics of the CUDA memory and threading models, the CUDA extensions to the C language, and the basic programming and debugging tools. The second part (Chapters 4–7) builds the student's programming skills by explaining the CUDA memory model and its memory types, strategies for reducing global memory traffic, the CUDA threading model and granularity (including thread scheduling and basic latency-hiding techniques), GPU hardware performance features, techniques to hide latency in memory accesses, floating-point arithmetic, modern computer system architecture, and the common data-parallel programming patterns needed to develop a high-performance parallel application. The third part (Chapters 8–11) provides a broad range of parallel execution models and parallel programming principles, in addition to a brief introduction to OpenCL. It also includes a wide range of application case studies, such as advanced MRI reconstruction and molecular visualization and analysis. The last chapter (Chapter 12) discusses the great potential of future GPU architectures. It provides commentary on the evolution of memory architecture, kernel execution control, and programming environments. Summary: In general, this book is well written and well organized. Many difficult parallel computing concepts are explained clearly, to the benefit of beginners and even advanced parallel programmers. It provides a good starting point for beginning parallel programmers who can access a Tesla GPU. The book targets specific hardware and evaluates performance on that hardware. As mentioned in the book, approximately 200 million CUDA-capable GPUs are in active use. Therefore, chances are that many beginning parallel programmers have access to a Tesla GPU.
Also, this book gives clear descriptions of Tesla GPU architecture, which lays a solid foundation for both beginning parallel programmers and experienced parallel programmers. The book can also serve as a good reference book for advanced parallel computing courses. Jie Cheng, University of Hawaii Hilo

1,511 citations

Journal Article
TL;DR: This book is designed for readers who want to learn how to develop general-purpose parallel applications on graphics processing units (GPUs) using CUDA C, a programming language that combines industry-standard C with additional features that exploit the CUDA architecture.
Abstract: CUDA by Example: An Introduction to General-Purpose GPU Programming. Jason Sanders and Edward Kandrot. ISBN-13: 978-0131387683. Addison-Wesley Professional; 1 edition (July 29, 2010). Introduction: This book is designed for readers who want to learn how to develop general-purpose parallel applications on graphics processing units (GPUs) using CUDA C. CUDA C is a programming language that combines industry-standard C with additional features that exploit the CUDA architecture. With a proper introduction to NVIDIA's CUDA architecture and an in-depth explanation of how to set up the development environment, this is an easy-to-read, easy-to-understand, hands-on book. Readers are assumed to have at least a background in the C language. Through this book, readers will not only gain experience in CUDA C development, but will also learn a good deal about the underlying hardware, which in turn can help software developers build more efficient and effective applications. Outline of the Book: This book is very well organized. Each chapter consists of a general introduction, chapter objectives, and a chapter review. Sanders and Kandrot are senior software engineers at NVIDIA, in the CUDA Platform group and the CUDA Algorithm team, respectively. The first chapter provides background on the history of GPUs and the CUDA architecture. Special features of the CUDA architecture enable the GPU to perform general-purpose computation in addition to traditional graphics computation. Readers can easily understand the benefit of the CUDA architecture by reading through three different applications, ranging from the medical field to the environmental field. In Chapter 2, Sanders and Kandrot give readers complete lists of the hardware and software needed to run CUDA C applications. All software can be downloaded for free from websites suggested by the authors.
Then, with a familiar Hello World program in Chapter 3, the authors show that CUDA C is fundamentally standard C with additional features that let the application developer specify which code runs on the device (GPU and its memory) and which on the host (CPU and system memory). With the proper background in place, the use of CUDA C to run parallel programs on the GPU is discussed in Chapters 4 through 7. In Chapter 8, the authors illustrate how to combine rendering and general-purpose computation using CUDA C. Readers without a background in OpenGL or DirectX can skip this chapter and go on to the next. However, the chapter is a great addition to the book, since it gives readers a complete view of CUDA C. Although CUDA C can make tasks that are complicated for a single-threaded application easier through parallel processing, some operations that are simple in a single-threaded application require special care on a massively parallel architecture; Chapter 9 discusses this topic. In contrast to the parallelism discussed in the preceding chapters, which refers to parallel execution of a function on different sets of data, Chapter 10 exposes readers to a different class of parallelism on the GPU, in which two or more completely independent tasks are performed in parallel. Chapter 11 covers how to develop CUDA C applications on multiple GPUs. For further study, Chapter 12 presents more tools to aid CUDA C development and more resources to take the reader's CUDA C development skills to another level. Summary: Jason Sanders and Edward Kandrot wrote this book in a way that is very easy to read and follow. They also maintain a great sense of humor throughout the book. Reading it is not only a discovery of CUDA C but also a joyful journey. It is highly recommended for computer science majors who are interested in learning CUDA C application development.
This book is recommended for adoption as a textbook for undergraduate students studying parallel programming. Jie Cheng, University of Hawaii Hilo

937 citations

Journal Article
TL;DR: Stephen J. Hartley first provides a complete explanation of the features of Java necessary to write concurrent programs, including topics such as exception handling, interfaces, and packages, and takes a different approach than most Java references.
Abstract: Stephen J. Hartley. Oxford University Press, New York, 1998, 260 pp. ISBN 0-19-511315-2, $45.00. Concurrent Programming is a thorough treatment of Java multi-threaded programming for both stand-alone and distributed environments. Designed mostly for students in concurrent or parallel programming classes, the text is also an excellent reference for the practicing professional developing multi-threaded programs or applets. Hartley first provides a complete explanation of the features of Java necessary to write concurrent programs, including topics such as exception handling, interfaces, and packages. He then gives the reader a solid background for writing multi-threaded programs and also presents the problems introduced when writing concurrent programs, namely race conditions, mutual exclusion, and deadlock. Hartley also provides several software solutions that do not require the use of common process and thread mechanisms. Once the groundwork is laid for writing concurrent programs, Hartley takes a different approach than most Java references. Rather than presenting how Java handles mutual exclusion with the synchronized keyword (although it is covered later), he first looks at semaphore-based solutions to classic concurrent problems such as the bounded buffer, readers-writers, and the dining philosophers. Hartley also uses the same approach to develop Java classes for monitors and message passing. This unique approach to introducing concurrency allows readers to understand both how Java threads are synchronized and how the basic synchronization mechanism can be used to construct more abstract tools such as semaphores. If the text has a shortcoming, it is the lack of sufficient coverage of remote method invocation (RMI), although there is a section covering RMI. This is quite understandable, as RMI is a fairly recent phenomenon in the Java community.
Also, the classes that Hartley provides could easily use RMI rather than sockets to handle communication. The strengths of the book include its readability, several examples at the end of chapters, a package similar to Xtango that provides algorithm animation, and a supportive web site by the author (see www.mcs.drexel.edu/~shartley/ConcProgJava/index.html) including compressed source code. As Java becomes more dominant on the server side of multi-tier applications, writing thread-safe concurrent applications becomes even more important. Concurrent Programming is a strong step towards teaching students and professionals such skills. Greg Gagne, Westminster College of Salt Lake City, Salt Lake City, Utah
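The review above mentions semaphore-based solutions to the bounded-buffer problem and building semaphores from Java's basic synchronization mechanism. The following is a minimal sketch of that idea and is not code from Hartley's book: the class names CountingSemaphore, BoundedBuffer, and BoundedBufferDemo are hypothetical, and the semaphore is assumed to be built only from synchronized, wait, and notifyAll.

```java
// Hypothetical counting semaphore built from Java's built-in monitor primitives.
class CountingSemaphore {
    private int permits;

    CountingSemaphore(int initial) {
        if (initial < 0) throw new IllegalArgumentException("initial < 0");
        permits = initial;
    }

    // P operation: block until a permit is available, then take it.
    synchronized void acquire() throws InterruptedException {
        while (permits == 0) {
            wait();
        }
        permits--;
    }

    // V operation: return a permit and wake any waiting threads.
    synchronized void release() {
        permits++;
        notifyAll();
    }
}

// Classic three-semaphore bounded buffer: "empty" counts free slots,
// "full" counts filled slots, and "mutex" protects the indices.
class BoundedBuffer<T> {
    private final Object[] slots;
    private int head = 0;
    private int tail = 0;
    private final CountingSemaphore empty;
    private final CountingSemaphore full = new CountingSemaphore(0);
    private final CountingSemaphore mutex = new CountingSemaphore(1);

    BoundedBuffer(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity <= 0");
        slots = new Object[capacity];
        empty = new CountingSemaphore(capacity);
    }

    void put(T item) throws InterruptedException {
        empty.acquire();            // wait for a free slot
        mutex.acquire();
        slots[tail] = item;
        tail = (tail + 1) % slots.length;
        mutex.release();
        full.release();             // signal one more filled slot
    }

    @SuppressWarnings("unchecked")
    T take() throws InterruptedException {
        full.acquire();             // wait for a filled slot
        mutex.acquire();
        T item = (T) slots[head];
        slots[head] = null;
        head = (head + 1) % slots.length;
        mutex.release();
        empty.release();            // signal one more free slot
        return item;
    }
}

class BoundedBufferDemo {
    // One producer sends 1..items; one consumer sums what it receives.
    static long run(int items, int capacity) throws InterruptedException {
        BoundedBuffer<Integer> buffer = new BoundedBuffer<>(capacity);
        long[] sum = new long[1];
        Thread producer = new Thread(() -> {
            try {
                for (int i = 1; i <= items; i++) buffer.put(i);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                for (int i = 0; i < items; i++) sum[0] += buffer.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
        return sum[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(1000, 4)); // prints 500500
    }
}
```

In current Java one would normally reach for java.util.concurrent.Semaphore or ArrayBlockingQueue instead; the hand-built semaphore is used here only to mirror the layering the review describes.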

587 citations

Journal Article
TL;DR: Yair Censor and Stavros A. Zenios, Oxford University Press, New York, 1997, 539 pp.
Abstract: Yair Censor and Stavros A. Zenios, Oxford University Press, New York, 1997, 539 pp., ISBN 0-19-510062-X, $85.00

486 citations

Performance
Metrics
No. of papers from the Journal in previous years
Year    Papers
2023    6
2022    33
2021    8
2020    54
2019    43
2018    30