
Showing papers by "Michael D. Smith" published in 1997


Proceedings Article
01 Oct 1997
TL;DR: This paper focuses on the operating system support that is required to collect and manage profile information on an end-user's workstation in an automatic, continuous, and transparent manner and shows that Morph can use statistical profiles to improve application performance.
Abstract: The Morph system provides a framework for automatic collection and management of profile information and application of profile-driven optimizations. In this paper, we focus on the operating system support that is required to collect and manage profile information on an end-user's workstation in an automatic, continuous, and transparent manner. Our implementation for a Digital Alpha machine running Digital UNIX 4.0 achieves run-time overheads of less than 0.3% during profile collection. Through the application of three code layout optimizations, we further show that Morph can use statistical profiles to improve application performance. With appropriate system support, automatic profiling and optimization is both possible and effective.

219 citations
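
To make the idea of a statistical profile in the entry above concrete, here is a minimal sketch of one way sampled data could be attributed to procedures. It assumes the profile is a list of sampled program-counter values and that a sorted symbol table of (start address, name) pairs is available; the function name `attribute_samples`, the addresses, and the procedure names are all hypothetical and are not taken from the Morph implementation.

```python
import bisect
from collections import Counter


def attribute_samples(pc_samples, symbols):
    """Count how many sampled program counters fall in each procedure.

    symbols is assumed to be a list of (start_address, name) pairs sorted
    by start_address; a procedure extends to the next start address.
    Samples below the first start address are ignored.
    """
    starts = [start for start, _name in symbols]
    counts = Counter()
    for pc in pc_samples:
        # Index of the last procedure whose start address is <= pc.
        index = bisect.bisect_right(starts, pc) - 1
        if index >= 0:
            counts[symbols[index][1]] += 1
    return counts


# Hypothetical symbol table and samples, for illustration only.
symbols = [(0x1000, "main"), (0x1400, "parse"), (0x2000, "emit")]
samples = [0x1004, 0x1400, 0x1404, 0x2010, 0x0800]
profile = attribute_samples(samples, symbols)
```

A layout optimizer could then rank procedures by these counts, which is cheap because only the samples, not every executed instruction, are recorded.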


Patent
22 Jan 1997
TL;DR: An automated data transmission system adds sensing circuitry to a standard security system installed at an industrial or residential site and communicates warning signals and other information directly to law enforcement or security personnel, reducing response times.
Abstract: An automated data transmission system for communicating warning signals and other information directly to law enforcement or security personnel in order to reduce response times. In the disclosed embodiment of the invention, sensing circuitry is added to a standard security system installed at an industrial or residential site. The sensing circuitry detects activation of the security system by sensing an outgoing warning call. Following activation of the security system, an automatic dialer or other automated transmitter circuitry is used to initiate a call to a processing computer at a predetermined phone number. The processing computer utilizes Caller ID™ information to determine the phone number assigned to the automatic dialer. The processing computer also maintains a database that associates assorted pieces of information with each phone number in the list, including the key map code location of the business, address of the protected business, name, date and time, and the phone number of a company contact. This information is communicated, through a paging service provider, to a dedicated group of one or more alphanumeric paging devices. The group of dedicated paging devices is provided to the law enforcement personnel assigned to protect the geographic region in which the secured industrial or residential site is located. Time-consuming human involvement in the notification process is thereby eliminated.

104 citations


Proceedings Article
01 Dec 1997
TL;DR: Describes an algorithm for procedure placement, one type of code-placement algorithm, that differs significantly from previous approaches in the information used to drive placement: it gathers temporal ordering information that summarizes the interleaving of procedures in a program trace.
Abstract: Instruction cache performance is very important to instruction fetch efficiency and overall processor performance. The layout of an executable has a substantial effect on the cache miss rate during execution. This means that the performance of an executable can be improved significantly by applying a code-placement algorithm that minimizes instruction cache conflicts. We describe an algorithm for procedure placement, one type of code-placement algorithm, that significantly differs from previous approaches in the type of information used to drive the placement algorithm. In particular, we gather temporal ordering information that summarizes the interleaving of procedures in a program trace. Our algorithm uses this information along with cache configuration and procedure size information to better estimate the conflict cost of a potential procedure ordering. We compare the performance of our algorithm with previously published procedure-placement algorithms and show noticeable improvements in the instruction cache behavior.

92 citations
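
As an illustration of what "temporal ordering information that summarizes the interleaving of procedures" could mean in the entry above, the sketch below counts, for each pair of procedures, how often one runs between two consecutive invocations of the other. This is one simple interpretation chosen for illustration, not the data structure or algorithm from the paper; the function name `interleaving_counts` and the example trace are hypothetical.

```python
from collections import defaultdict


def interleaving_counts(trace):
    """Summarize procedure interleaving in a trace.

    For each unordered pair {p, q}, count the number of times q executed
    between two consecutive occurrences of p (or vice versa). Pairs with
    high counts are candidates for conflicting in the instruction cache
    if they are mapped to overlapping cache lines.
    """
    counts = defaultdict(int)
    # For each procedure: the distinct procedures seen since its last call.
    seen_since = {}
    for proc in trace:
        if proc in seen_since:
            for other in seen_since[proc]:
                counts[frozenset((proc, other))] += 1
        seen_since[proc] = set()
        for earlier, between in seen_since.items():
            if earlier != proc:
                between.add(proc)
    return dict(counts)


# Hypothetical procedure trace, for illustration only.
pairs = interleaving_counts(["A", "B", "A", "C", "A", "B"])
```

A placement algorithm could combine such counts with procedure sizes and the cache configuration to estimate the conflict cost of a candidate ordering, as the abstract describes.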


01 Jan 1997
TL;DR: An ephemeral instrumentation system for gathering branch biases and post-processing that data into a traditional edge profile is described and it is shown that it collects useful profiles with low overhead.
Abstract: Program profiling is a mechanism that is useful for performance evaluation and code optimization. Profiling techniques that provide detailed information with extremely low overhead are especially important for systems that continuously monitor or dynamically optimize running executables. In this paper, we describe an approach for program profiling called ephemeral instrumentation and show that it collects useful profiles with low overhead. This approach builds on ideas from both program instrumentation and statistical sampling; it produces binaries that are able to periodically record aspects of their executions in great detail. It works because program behavior is predictable and because we are able to convert ephemeral profiles into traditional formats. This paper describes an ephemeral instrumentation system for gathering branch biases and post-processing that data into a traditional edge profile. We evaluate the usefulness of such profiles by using them to drive a superblock scheduler. Our experimental results show that we can gather ephemeral profiles with extremely low overheads (1-5%) while acquiring profile data that rivals the usefulness of complete profiles gathered at much higher overheads.

64 citations
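
The post-processing step mentioned in the entry above, turning branch biases into a traditional edge profile, might look roughly like the following sketch. It assumes an acyclic control-flow graph whose blocks are supplied in topological order, a known entry count, and a bias giving the probability of following a two-way block's first successor; these assumptions, the function name `edge_profile_from_biases`, and the example graph are all hypothetical and do not describe the paper's actual system.

```python
def edge_profile_from_biases(cfg, biases, topo_order, entry_count):
    """Estimate edge counts from branch biases on an acyclic CFG.

    cfg maps each block to a tuple of successors (zero, one, or two).
    biases maps each two-way block to the probability of its first
    successor. topo_order lists blocks in topological order, entry first.
    """
    freq = {block: 0.0 for block in topo_order}
    freq[topo_order[0]] = float(entry_count)
    edges = {}
    for block in topo_order:
        succs = cfg.get(block, ())
        if len(succs) == 1:
            probs = [1.0]
        elif len(succs) == 2:
            probs = [biases[block], 1.0 - biases[block]]
        elif len(succs) == 0:
            continue
        else:
            raise ValueError("sketch handles at most two successors")
        for succ, prob in zip(succs, probs):
            weight = freq[block] * prob
            edges[(block, succ)] = weight
            freq[succ] += weight  # flow into the successor block
    return edges


# Hypothetical diamond-shaped CFG: 75% of executions take the "then" side.
cfg = {"entry": ("then", "else"), "then": ("exit",), "else": ("exit",), "exit": ()}
profile = edge_profile_from_biases(
    cfg, {"entry": 0.75}, ["entry", "then", "else", "exit"], 100
)
```

Loops would need block frequencies from another source or an iterative solution; the sketch leaves that out to keep the propagation step visible.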


Proceedings Article
01 May 1997
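TL;DR: Presents a branch alignment algorithm that usually achieves the minimum possible pipeline penalty and, on the authors' benchmarks, averages within 0.3% of a provable optimum. An older greedy method is also close to the lower bound on control penalties, yet programs produced by the new method run noticeably faster in actual execution.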
Abstract: Branch alignment reorders the basic blocks of a program to minimize pipeline penalties due to control-transfer instructions. Prior work in branch alignment has produced useful heuristic methods. We present a branch alignment algorithm that usually achieves the minimum possible pipeline penalty and on our benchmarks averages within 0.3% of a provable optimum. We compare the control penalties and running times of our algorithm to an older, greedy approach and observe that both the greedy method and our method are close to the lower bound on control penalties, suggesting that greedy is good enough. Surprisingly, in actual execution our method produces programs that run noticeably faster than the greedy method. We also report results from training and testing on different data sets, validating that our results can be achieved in real-world usage. Training and testing on different data sets slightly reduced the benefits from both branch alignment algorithms, but the ranking of the algorithms does not change, and the bulk of the benefits remain.

52 citations
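
For readers unfamiliar with greedy basic-block reordering of the kind the entry above compares against, here is a minimal sketch of a generic greedy chaining heuristic: edges are visited from heaviest to lightest, and two chains are joined when the edge runs from the tail of one to the head of another. It is not the algorithm from the paper or necessarily the exact greedy method it evaluates; the function name `greedy_layout`, the block names, and the edge weights are hypothetical.

```python
def greedy_layout(blocks, edge_weights):
    """Order basic blocks so that heavy edges become fall-throughs.

    blocks lists block names with the entry block first. edge_weights
    maps (src, dst) pairs to execution counts from a profile.
    """
    chains = {block: [block] for block in blocks}   # chain head -> blocks
    head_of = {block: block for block in blocks}    # block -> its chain head
    by_weight = sorted(edge_weights.items(), key=lambda item: -item[1])
    for (src, dst), _weight in by_weight:
        src_head, dst_head = head_of[src], head_of[dst]
        if src_head == dst_head:
            continue  # joining would create a cycle within one chain
        if chains[src_head][-1] != src or dst_head != dst:
            continue  # src must end its chain and dst must start its chain
        merged = chains.pop(dst_head)
        chains[src_head].extend(merged)
        for block in merged:
            head_of[block] = src_head
    # Emit the remaining chains in the original order of their head blocks.
    layout = []
    for block in blocks:
        if block in chains:
            layout.extend(chains[block])
    return layout


# Hypothetical profile: the A -> C -> D path is hot, so it is laid out contiguously.
weights = {("A", "B"): 10, ("A", "C"): 90, ("B", "D"): 10, ("C", "D"): 90}
order = greedy_layout(["A", "B", "C", "D"], weights)
```

The sketch optimizes only for fall-through weight; a real branch aligner would also model the machine's specific penalties for taken, not-taken, and mispredicted branches.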