scispace - formally typeset
Search or ask a question
Author

Robert Schreiber

Bio: Robert Schreiber is an academic researcher from Hewlett-Packard. The author has contributed to research in topics: Matrix (mathematics) & Shared memory. The author has an hindex of 49, co-authored 182 publications receiving 12755 citations. Previous affiliations of Robert Schreiber include Ames Research Center & Cornell University.


Papers
More filters
Journal ArticleDOI
01 Sep 1991
TL;DR: A new set of benchmarks has been developed for the performance evaluation of highly parallel supercom puters that mimic the computation and data move ment characteristics of large-scale computational fluid dynamics applications.
Abstract: A new set of benchmarks has been developed for the performance evaluation of highly parallel supercom puters. These consist of five "parallel kernel" bench marks and three "simulated application" benchmarks. Together they mimic the computation and data move ment characteristics of large-scale computational fluid dynamics applications. The principal distinguishing feature of these benchmarks is their "pencil and paper" specification-all details of these benchmarks are specified only algorithmically. In this way many of the difficulties associated with conventional bench- marking approaches on highly parallel systems are avoided.

2,246 citations

Book ChapterDOI
23 Jun 2008
TL;DR: This paper describes a CF algorithm alternating-least-squares with weighted-?-regularization(ALS-WR), which is implemented on a parallel Matlab platform and shows empirically that the performance of ALS-WR monotonically improves with both the number of features and thenumber of ALS iterations.
Abstract: Many recommendation systems suggest items to users by utilizing the techniques of collaborative filtering(CF) based on historical records of items that the users have viewed, purchased, or rated Two major problems that most CF approaches have to contend with are scalability and sparseness of the user profiles To tackle these issues, in this paper, we describe a CF algorithm alternating-least-squares with weighted-?-regularization(ALS-WR), which is implemented on a parallel Matlab platform We show empirically that the performance of ALS-WR (in terms of root mean squared error(RMSE)) monotonically improves with both the number of features and the number of ALS iterations We applied the ALS-WR algorithm on a large-scale CF problem, the Netflix Challenge, with 1000 hidden features and obtained a RMSE score of 08985, which is one of the best results based on a pure method In addition, combining with the parallel version of other known methods, we achieved a performance improvement of 591% over Netflix's own CineMatch recommendation system Our method is simple and scales well to very large datasets

776 citations

Journal ArticleDOI
01 Jun 2008
TL;DR: This work believes that in comparison with an electrically-connected many-core alternative that uses the same on-stack interconnect power, Corona can provide 2 to 6 times more performance on many memory intensive workloads, while simultaneously reducing power.
Abstract: We expect that many-core microprocessors will push performance per chip from the 10 gigaflop to the 10 teraflop range in the coming decade. To support this increased performance, memory and inter-core bandwidths will also have to scale by orders of magnitude. Pin limitations, the energy cost of electrical signaling, and the non-scalability of chip-length global wires are significant bandwidth impediments. Recent developments in silicon nanophotonic technology have the potential to meet these off- and on-stack bandwidth requirements at acceptable power levels. Corona is a 3D many-core architecture that uses nanophotonic communication for both inter-core communication and off-stack communication to memory or I/O devices. Its peak floating-point performance is 10 teraflops. Dense wavelength division multiplexed optically connected memory modules provide 10 terabyte per second memory bandwidth. A photonic crossbar fully interconnects its 256 low-power multithreaded cores at 20 terabyte per second bandwidth. We have simulated a 1024 thread Corona system running synthetic benchmarks and scaled versions of the SPLASH-2 benchmark suite. We believe that in comparison with an electrically-connected many-core alternative that uses the same on-stack interconnect power, Corona can provide 2 to 6 times more performance on many memory intensive workloads, while simultaneously reducing power.

688 citations

Book
01 Nov 1993
TL;DR: High Performance Fortran is a set of extensions to Fortran expressing parallel execution at a relatively high level that brings the convenience of sequential Fortran a step closer to today's complex parallel machines.
Abstract: From the Publisher: High Performance Fortran (HPF) is a set of extensions to Fortran expressing parallel execution at a relatively high level. For the thousands of scientists, engineers, and others who wish to take advantage of the power of both vector and parallel supercomputers, five of the principal authors of HPF have teamed up here to write a tutorial for the language. There is an increasing need for a common parallel Fortran that can serve as a programming interface with the new parallel machines that are appearing on the market. While HPF does not solve all the problems of parallel programming, it does provide a portable, high-level expression for data- parallel algorithms that brings the convenience of sequential Fortran a step closer to today's complex parallel machines.

677 citations

Journal ArticleDOI
TL;DR: The matrix computation language and environment MATLAB is extended to include sparse matrix storage and operations, and nearly all the operations of MATLAB now apply equally to full or sparse matrices, without any explicit action by the user.
Abstract: The matrix computation language and environment MATLAB is extended to include sparse matrix storage and operations. The only change to the outward appearance of the MATLAB language is a pair of commands to create full or sparse matrices. Nearly all the operations of MATLAB now apply equally to full or sparse matrices, without any explicit action by the user. The sparse data structure represents a matrix in space proportional to the number of nonzero entries, and most of the operations compute sparse results in time proportional to the number of arithmetic operations on nonzeros.

613 citations


Cited by
More filters
Book
01 Apr 2003
TL;DR: This chapter discusses methods related to the normal equations of linear algebra, and some of the techniques used in this chapter were derived from previous chapters of this book.
Abstract: Preface 1. Background in linear algebra 2. Discretization of partial differential equations 3. Sparse matrices 4. Basic iterative methods 5. Projection methods 6. Krylov subspace methods Part I 7. Krylov subspace methods Part II 8. Methods related to the normal equations 9. Preconditioned iterations 10. Preconditioning techniques 11. Parallel implementations 12. Parallel preconditioners 13. Multigrid methods 14. Domain decomposition methods Bibliography Index.

13,484 citations

Journal ArticleDOI
TL;DR: SciPy as discussed by the authors is an open source scientific computing library for the Python programming language, which includes functionality spanning clustering, Fourier transforms, integration, interpolation, file I/O, linear algebra, image processing, orthogonal distance regression, minimization algorithms, signal processing, sparse matrix handling, computational geometry, and statistics.
Abstract: SciPy is an open source scientific computing library for the Python programming language. SciPy 1.0 was released in late 2017, about 16 years after the original version 0.1 release. SciPy has become a de facto standard for leveraging scientific algorithms in the Python programming language, with more than 600 unique code contributors, thousands of dependent packages, over 100,000 dependent repositories, and millions of downloads per year. This includes usage of SciPy in almost half of all machine learning projects on GitHub, and usage by high profile projects including LIGO gravitational wave analysis and creation of the first-ever image of a black hole (M87). The library includes functionality spanning clustering, Fourier transforms, integration, interpolation, file I/O, linear algebra, image processing, orthogonal distance regression, minimization algorithms, signal processing, sparse matrix handling, computational geometry, and statistics. In this work, we provide an overview of the capabilities and development practices of the SciPy library and highlight some recent technical developments.

12,774 citations

Journal ArticleDOI
TL;DR: As the Netflix Prize competition has demonstrated, matrix factorization models are superior to classic nearest neighbor techniques for producing product recommendations, allowing the incorporation of additional information such as implicit feedback, temporal effects, and confidence levels.
Abstract: As the Netflix Prize competition has demonstrated, matrix factorization models are superior to classic nearest neighbor techniques for producing product recommendations, allowing the incorporation of additional information such as implicit feedback, temporal effects, and confidence levels

9,583 citations

Journal ArticleDOI
TL;DR: DARTEL has been applied to intersubject registration of 471 whole brain images, and the resulting deformations were evaluated in terms of how well they encode the shape information necessary to separate male and female subjects and to predict the ages of the subjects.

6,999 citations

Journal ArticleDOI
TL;DR: An overview of the lattice Boltzmann method, a parallel and efficient algorithm for simulating single-phase and multiphase fluid flows and for incorporating additional physical complexities, is presented.
Abstract: We present an overview of the lattice Boltzmann method (LBM), a parallel and efficient algorithm for simulating single-phase and multiphase fluid flows and for incorporating additional physical complexities. The LBM is especially useful for modeling complicated boundary conditions and multiphase interfaces. Recent extensions of this method are described, including simulations of fluid turbulence, suspension flows, and reaction diffusion systems.

6,565 citations