scispace - formally typeset
Author

Michael Hirohama

Bio: Michael Hirohama is an academic researcher from the University of California, Berkeley. The author has an h-index of 3 and has co-authored 3 publications receiving 454 citations.

Papers
Journal ArticleDOI
TL;DR: The design and implementation decisions made for the POSTGRES data manager are discussed, with attention restricted to the DBMS backend functions.
Abstract: The design and implementation decisions made for the POSTGRES data manager are discussed. Attention is restricted to the DBMS backend functions. The POSTGRES data model and query language, the rules system, the storage system, the implementation, and the current status and performance are discussed.

432 citations

Book ChapterDOI
01 Dec 2018
TL;DR: The purpose of this paper is to reflect on the design and implementation decisions made and to offer advice to implementors who might follow some of the authors' paths, with particular attention to the DBMS "backend" functions.
Abstract: Currently, POSTGRES is about 90 000 lines of code in C and is being used by assorted "bold and brave" early users. The system has been constructed by a team of five part-time students led by a full-time chief programmer over the last three years. During this period, we have made a large number of design and implementation choices. Moreover, in some areas we would do things quite differently if we were to start from scratch again. The purpose of this paper is to reflect on the design and implementation decisions we made and to offer advice to implementors who might follow some of our paths. In this paper, we restrict our attention to the DBMS "backend" functions. In another paper, some of us treat Picasso, the application development environment that is being built on top of POSTGRES.

16 citations


Cited by
Journal ArticleDOI
TL;DR: This survey describes a wide array of practical query evaluation techniques for both relational and postrelational database systems, including iterative execution of complex query evaluation plans, the duality of sort- and hash-based set-matching algorithms, types of parallel query execution and their implementation, and special operators for emerging database application domains.
Abstract: Database management systems will continue to manage large data volumes. Thus, efficient algorithms for accessing and manipulating large sets and sequences will be required to provide acceptable performance. The advent of object-oriented and extensible database systems will not solve this problem. On the contrary, modern data models exacerbate the problem: In order to manipulate large sets of complex objects as efficiently as today's database systems manipulate simple records, query-processing algorithms and software will become more complex, and a solid understanding of algorithm and architectural issues is essential for the designer of database management software. This survey provides a foundation for the design and implementation of query execution facilities in new database management systems. It describes a wide array of practical query evaluation techniques for both relational and postrelational database systems, including iterative execution of complex query evaluation plans, the duality of sort- and hash-based set-matching algorithms, types of parallel query execution and their implementation, and special operators for emerging database application domains.
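The duality of sort- and hash-based set matching that the survey highlights can be seen in miniature by computing the same equi-join both ways. This is an illustrative sketch with made-up relations, not code from the survey:

```python
# The same equi-join computed twice: once hash-based, once sort-based.
# Relations are lists of dicts; the data below is hypothetical.

def hash_join(r, s, key):
    # Build phase: hash one relation on the join key.
    table = {}
    for row in r:
        table.setdefault(row[key], []).append(row)
    # Probe phase: stream the other relation against the hash table.
    out = []
    for row in s:
        for match in table.get(row[key], []):
            out.append({**match, **row})
    return out

def sort_merge_join(r, s, key):
    # Sort both inputs on the key, then merge in one synchronized pass.
    r = sorted(r, key=lambda t: t[key])
    s = sorted(s, key=lambda t: t[key])
    out, i, j = [], 0, 0
    while i < len(r) and j < len(s):
        if r[i][key] < s[j][key]:
            i += 1
        elif r[i][key] > s[j][key]:
            j += 1
        else:
            # Emit all s-rows sharing this key with r[i].
            k = j
            while k < len(s) and s[k][key] == r[i][key]:
                out.append({**r[i], **s[k]})
                k += 1
            i += 1
    return out

emp = [{"dept": 1, "name": "a"}, {"dept": 2, "name": "b"}]
dep = [{"dept": 1, "title": "eng"}, {"dept": 3, "title": "ops"}]
assert hash_join(emp, dep, "dept") == sort_merge_join(emp, dep, "dept")
```

The build/probe structure of the hash variant and the sort/merge structure of the sort variant are duals: each pays a preparation cost (building the table, sorting the inputs) to make the matching pass cheap.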

1,427 citations

Journal ArticleDOI
TL;DR: This work presents an approach that integrates a relational database retrieval system with a color analysis technique, and shows how using a coarse granularity for content analysis improves the ability to retrieve images efficiently.
Abstract: Selecting from a large, expanding collection of images requires carefully chosen search criteria. We present an approach that integrates a relational database retrieval system with a color analysis technique. The Chabot project was initiated at our university to study storage and retrieval of a vast collection of digitized images. These images are from the State of California Department of Water Resources. The goal was to integrate a relational database retrieval system with content analysis techniques that would give our querying system a better method for handling images. Our simple color analysis method, if used in conjunction with other search criteria, improves our ability to retrieve images efficiently. The best result is obtained when text-based search criteria are combined with content-based criteria and when a coarse granularity is used for content analysis.
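The combination the paper advocates, a text/metadata predicate joined with a coarse color-content predicate, can be sketched as follows. The image records and the "mostly red" test are hypothetical illustrations, not Chabot's actual code:

```python
# Combining a text-based criterion with a coarse content-based one.
# Histograms use one bucket per named color (coarse granularity).

def mostly(color, histogram, threshold=0.5):
    # An image qualifies if the named color's bucket holds more than
    # `threshold` of the pixel mass. All values below are hypothetical.
    total = sum(histogram.values())
    return total > 0 and histogram.get(color, 0) / total > threshold

images = [
    {"id": 1, "caption": "sunset over dam", "hist": {"red": 70, "blue": 30}},
    {"id": 2, "caption": "sunset at beach", "hist": {"red": 20, "blue": 80}},
    {"id": 3, "caption": "canal survey",    "hist": {"red": 90, "blue": 10}},
]

# Text alone matches 1 and 2; color alone matches 1 and 3;
# the conjunction narrows the result to the intended image.
hits = [im["id"] for im in images
        if "sunset" in im["caption"] and mostly("red", im["hist"])]
assert hits == [1]
```

The point of the coarse granularity is that a handful of named-color buckets is cheap to store and index alongside ordinary relational attributes, yet still discriminating enough when conjoined with text criteria.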

780 citations

Proceedings ArticleDOI
03 Dec 1995
TL;DR: This paper shows how to use application-disclosed access patterns (hints) to expose and exploit I/O parallelism and to dynamically allocate file buffers among three competing demands: prefetching hinted blocks, caching hinted blocks for reuse, and caching recently used data for unhinted accesses.
Abstract: The underutilization of disk parallelism and file cache buffers by traditional file systems induces I/O stall time that degrades the performance of modern microprocessor-based systems. In this paper, we present aggressive mechanisms that tailor file system resource management to the needs of I/O-intensive applications. In particular, we show how to use application-disclosed access patterns (hints) to expose and exploit I/O parallelism and to dynamically allocate file buffers among three competing demands: prefetching hinted blocks, caching hinted blocks for reuse, and caching recently used data for unhinted accesses. Our approach estimates the impact of alternative buffer allocations on application execution time and applies a cost-benefit analysis to allocate buffers where they will have the greatest impact. We implemented informed prefetching and caching in DEC's OSF/1 operating system and measured its performance on a 150 MHz Alpha equipped with 15 disks running a range of applications including text search, 3D scientific visualization, relational database queries, speech recognition, and computational chemistry. Informed prefetching reduces the execution time of the first four of these applications by 20% to 87%. Informed caching reduces the execution time of the fifth application by up to 30%.
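The cost-benefit idea can be sketched greedily: each of the three competing demands reports the marginal benefit (saved stall time) of one more buffer, and each buffer goes to the demand that benefits most. The estimator shapes and numbers below are hypothetical, not the paper's actual model:

```python
# Greedy cost-benefit buffer allocation among competing demands.
# demands maps a name to a function from buffers-already-held to the
# marginal benefit of one more buffer. All estimators are made up.

def allocate(buffers, demands):
    held = {name: 0 for name in demands}
    for _ in range(buffers):
        # Give the next buffer to whichever demand benefits most now.
        best = max(demands, key=lambda n: demands[n](held[n]))
        held[best] += 1
    return held

demands = {
    # Prefetching hinted blocks: large early wins, diminishing returns.
    "prefetch_hinted": lambda k: 10.0 / (k + 1),
    # Caching hinted blocks for reuse: moderate, also diminishing.
    "cache_hinted": lambda k: 6.0 / (k + 1),
    # LRU caching for unhinted accesses: smallest estimated benefit.
    "cache_unhinted": lambda k: 4.0 / (k + 1),
}

held = allocate(8, demands)
assert sum(held.values()) == 8
```

Because every demand's marginal benefit shrinks as it accumulates buffers, the greedy loop naturally balances the three uses rather than starving any one of them, which mirrors the dynamic equilibrium the paper's allocator seeks.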

770 citations

Proceedings ArticleDOI
01 Jun 1993
TL;DR: This paper presents a first detailed study of spatial join processing using R-trees, particularly the R*-tree, and presents several techniques for improving its execution time with respect to both CPU time and I/O time.
Abstract: Spatial joins are one of the most important operations for combining spatial objects of several relations. The efficient processing of a spatial join is extremely important since its execution time is superlinear in the number of spatial objects of the participating relations, and this number of objects may be very high. In this paper, we present a first detailed study of spatial join processing using R-trees, particularly R*-trees. R-trees are very suitable for supporting spatial queries, and the R*-tree is one of the most efficient members of the R-tree family. Starting from a straightforward approach, we present several techniques for improving its execution time with respect to both CPU time and I/O time. Eventually, we end up with an algorithm whose total execution time is improved over the first approach by an order of magnitude. Using a buffer of reasonable size, I/O time is almost optimal, i.e. it almost corresponds to the time for reading each required page of the relations exactly once. The performance of the various approaches is investigated in an experimental performance comparison where several large data sets from real applications are used.
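The core of the R-tree join is a synchronized descent of both trees that recurses only into pairs of entries whose bounding rectangles intersect, pruning most of the cross product. A minimal sketch with tiny hand-built trees (the node layout and data are illustrative, not the paper's implementation):

```python
# Synchronized R-tree traversal for a spatial (intersection) join.
# Rectangles are (xmin, ymin, xmax, ymax) tuples.

def intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

class Node:
    def __init__(self, mbr, children=None, leaf=False):
        self.mbr, self.children, self.leaf = mbr, children or [], leaf

def rtree_join(n1, n2, out):
    # Prune: non-overlapping subtrees cannot contribute result pairs.
    if not intersects(n1.mbr, n2.mbr):
        return
    if n1.leaf and n2.leaf:
        out.append((n1.mbr, n2.mbr))
    elif n1.leaf:
        for c in n2.children:
            rtree_join(n1, c, out)
    elif n2.leaf:
        for c in n1.children:
            rtree_join(c, n2, out)
    else:
        # Only child pairs with overlapping MBRs are visited.
        for c1 in n1.children:
            for c2 in n2.children:
                rtree_join(c1, c2, out)

leaves_a = [Node((0, 0, 1, 1), leaf=True), Node((5, 5, 6, 6), leaf=True)]
leaves_b = [Node((0.5, 0.5, 2, 2), leaf=True), Node((9, 9, 10, 10), leaf=True)]
root_a = Node((0, 0, 6, 6), leaves_a)
root_b = Node((0.5, 0.5, 10, 10), leaves_b)
result = []
rtree_join(root_a, root_b, result)
assert result == [((0, 0, 1, 1), (0.5, 0.5, 2, 2))]
```

The paper's CPU and I/O optimizations refine this skeleton, e.g. by ordering the child-pair visits to improve buffer locality, but the MBR-intersection pruning is what makes the join's cost far below the full cross product.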

637 citations

Proceedings Article
01 Jan 2005
TL;DR: An in-depth investigation into why database systems tend to achieve only low IPC on modern CPUs in compute-intensive application areas, leading to a new set of guidelines for query-processor design and to the X100 engine for the MonetDB system that follows them.
Abstract: Database systems tend to achieve only low IPC (instructions-per-cycle) efficiency on modern CPUs in compute-intensive application areas like decision support, OLAP and multimedia retrieval. This paper starts with an in-depth investigation into why this happens, focusing on the TPC-H benchmark. Our analysis of various relational systems and MonetDB leads us to a new set of guidelines for designing a query processor. The second part of the paper describes the architecture of our new X100 query engine for the MonetDB system that follows these guidelines. On the surface, it resembles a classical Volcano-style engine, but the crucial difference, basing all execution on the concept of vector processing, makes it highly CPU efficient. We evaluate the power of MonetDB/X100 on the 100 GB version of TPC-H, showing its raw execution power to be between one and two orders of magnitude higher than previous technology.
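The contrast with a Volcano-style engine can be sketched in a few lines: instead of one interpreted next() call per tuple, operators exchange fixed-size vectors of values, amortizing interpretation overhead over tight per-vector loops. The operators and data here are hypothetical illustrations, not X100 code:

```python
# A toy vectorized pipeline: scan -> filter -> aggregate, where each
# operator call processes a whole vector rather than a single tuple.

VECTOR_SIZE = 4  # real engines use on the order of 100-1000 values

def scan(column):
    # Produce the column in fixed-size vectors.
    for i in range(0, len(column), VECTOR_SIZE):
        yield column[i:i + VECTOR_SIZE]

def select_gt(vectors, threshold):
    # One interpreted call filters an entire vector.
    for vec in vectors:
        yield [v for v in vec if v > threshold]

def agg_sum(vectors):
    total = 0
    for vec in vectors:
        total += sum(vec)  # tight loop over the vector
    return total

data = [3, 8, 1, 9, 4, 7, 2, 6]
assert agg_sum(select_gt(scan(data), 5)) == 30
```

In a compiled engine the per-vector loops become branch-friendly, cache-resident code over contiguous arrays, which is where the order-of-magnitude IPC improvement the paper reports comes from; the per-call interpretation cost is paid once per vector instead of once per tuple.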

548 citations