Author

Steven Lucco

Other affiliations: Microsoft
Bio: Steven Lucco is an academic researcher from the University of California, Berkeley. His research topics include compilers and dynamic priority scheduling. He has an h-index of 10 and has co-authored 15 publications receiving 1,741 citations. Previous affiliations of Steven Lucco include Microsoft.

Papers
Proceedings ArticleDOI
01 Dec 1993
TL;DR: It is demonstrated that for frequently communicating modules, implementing fault isolation in software rather than hardware can substantially improve end-to-end application performance.
Abstract: One way to provide fault isolation among cooperating software modules is to place each in its own address space. However, for tightly-coupled modules, this solution incurs prohibitive context switch overhead. In this paper, we present a software approach to implementing fault isolation within a single address space. Our approach has two parts. First, we load the code and data for a distrusted module into its own fault domain, a logically separate portion of the application's address space. Second, we modify the object code of a distrusted module to prevent it from writing or jumping to an address outside its fault domain. Both these software operations are portable and programming language independent. Our approach poses a tradeoff relative to hardware fault isolation: substantially faster communication between fault domains, at a cost of slightly increased execution time for distrusted modules. We demonstrate that for frequently communicating modules, implementing fault isolation in software rather than hardware can substantially improve end-to-end application performance.
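One way to realize the object-code modification described above is address sandboxing: mask the target of every unsafe store so that it can only land inside the module's fault domain. The C below is a sketch of that idea under invented names and constants (DOMAIN_SIZE, domain_base, sandboxed_store32); the actual system rewrites the module's machine code rather than relying on a source-level helper.

    #include <stdint.h>
    #include <stdlib.h>

    /* Hypothetical fault domain: one aligned, power-of-two-sized region of
     * the application's address space. In the paper the masking is inserted
     * into the distrusted module's object code; here it is a C helper for
     * illustration only. */
    #define DOMAIN_SIZE (1u << 20)              /* 1 MiB fault domain */

    static uint8_t *domain_base;                /* set up by the loader */

    /* Address sandboxing: rather than checking the target address and
     * trapping on a violation, force it into the fault domain, so a
     * rewritten store can never touch memory outside the domain. */
    static inline void sandboxed_store32(uintptr_t target, uint32_t value)
    {
        uintptr_t offset = target & (DOMAIN_SIZE - 4);   /* keep low bits, word-aligned */
        *(uint32_t *)(domain_base + offset) = value;     /* always lands in-domain */
    }

    int main(void)
    {
        domain_base = aligned_alloc(DOMAIN_SIZE, DOMAIN_SIZE);
        if (!domain_base)
            return 1;
        sandboxed_store32(0xDEADBEEF, 42);      /* wild pointer, store stays inside */
        free(domain_base);
        return 0;
    }

The masking adds a few instructions to each unsafe store, which is the slight execution-time cost the abstract trades for much cheaper cross-domain communication.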

1,370 citations

Proceedings ArticleDOI
01 May 1996
TL;DR: Omniware uses software fault isolation, a technology developed to provide safe extension code for databases and operating systems, to achieve a unique combination of language-independence and excellent performance.
Abstract: This paper evaluates the design and implementation of Omniware: a safe, efficient, and language-independent system for executing mobile program modules. Previous approaches to implementing mobile code rely on either language semantics or abstract machine interpretation to enforce safety. In the former case, the mobile code system sacrifices universality to gain safety by dictating a particular source language or type system. In the latter case, the mobile code system sacrifices performance to gain safety through abstract machine interpretation. Omniware uses software fault isolation, a technology developed to provide safe extension code for databases and operating systems, to achieve a unique combination of language-independence and excellent performance. Software fault isolation uses only the semantics of the underlying processor to determine whether a mobile code module can corrupt its execution environment. This separation of programming language implementation from program module safety enables our mobile code system to use a radically simplified virtual machine as its basis for portability. We measured the performance of Omniware using a suite of four SPEC92 programs on the Pentium, PowerPC, Mips, and Sparc processor architectures. Including the overhead for enforcing safety on all four processors, OmniVM executed the benchmark programs within 21% of the speed of the optimized, unsafe code produced by the vendor-supplied compiler.

113 citations

Proceedings ArticleDOI
01 Jul 1992
TL;DR: A fundamental relationship is shown between three quantities that characterize an irregular parallel computation: the total available parallelism, the optimal grain size, and the statistical variance of execution times for individual tasks. This relationship yields a dynamic scheduling algorithm that substantially reduces the overhead of executing irregular parallel operations.
Abstract: This paper develops a methodology for compiling and executing irregular parallel programs. Such programs implement parallel operations whose size and work distribution depend on input data. We show a fundamental relationship between three quantities that characterize an irregular parallel computation: the total available parallelism, the optimal grain size, and the statistical variance of execution times for individual tasks. This relationship yields a dynamic scheduling algorithm that substantially reduces the overhead of executing irregular parallel operations. We incorporated this algorithm into an extended Fortran compiler. The compiler accepts as input a subset of Fortran D which includes blocked and cyclic decompositions and perfect alignment; it outputs Fortran 77 augmented with calls to library routines written in C. For irregular parallel operations, the compiled code gathers information about available parallelism and task execution time variance and uses this information to schedule the operation. On distributed memory architectures, the compiler encodes information about data access patterns for the runtime scheduling system so that it can preserve communication locality. We evaluated these compilation techniques using a set of application programs, including climate modeling, circuit simulation, and x-ray tomography, that contain irregular parallel operations. The results demonstrate that, for these applications, the dynamic techniques described here achieve near-optimal efficiency on large numbers of processors. In addition, they perform significantly better, on these problems, than any previously proposed static or dynamic scheduling algorithm.
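The abstract does not give the scheduling formula, but the flavor of a variance-aware grain-size rule can be sketched as follows. This is only an illustrative guess, not the paper's algorithm: next_chunk and its inputs are invented, and the rule simply combines a guided-self-scheduling-style bound with the observed spread of task times.

    #include <stddef.h>

    /* Illustrative chunk-size rule (not the paper's): hand out larger chunks
     * while plenty of parallelism remains, and shrink them when individual
     * task times vary a lot, so no worker gets stuck with one huge chunk at
     * the end of the operation. */
    static size_t next_chunk(size_t tasks_left, unsigned workers,
                             double mean_task_time, double stddev_task_time)
    {
        double guided = tasks_left / (double)(2 * workers);          /* classic bound   */
        double cv     = stddev_task_time / (mean_task_time + 1e-9);  /* relative spread */
        double chunk  = guided / (1.0 + cv);                         /* shrink if noisy */
        return chunk < 1.0 ? 1 : (size_t)chunk;
    }

A runtime that gathers the mean and variance of recent task times, as the compiled code described in the abstract does, would consult a rule of this shape each time a worker asks for more iterations.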

79 citations

Proceedings ArticleDOI
01 Jun 1993
TL;DR: The paper implements and evaluates two complementary techniques for reducing the overhead of monitoring memory updates: checking code inserted directly into the debugged program, and data flow algorithms that eliminate checks on some classes of write instructions at the cost of more complex remaining checks.
Abstract: A data breakpoint associates debugging actions with programmer-specified conditions on the memory state of an executing program. Data breakpoints provide a means for discovering program bugs that are tedious or impossible to isolate using control breakpoints alone. In practice, programmers rarely use data breakpoints, because they are either unimplemented or prohibitively slow in available debugging software. In this paper, we present the design and implementation of a practical data breakpoint facility. A data breakpoint facility must monitor all memory updates performed by the program being debugged. We implemented and evaluated two complementary techniques for reducing the overhead of monitoring memory updates. First, we checked write instructions by inserting checking code directly into the program being debugged. The checks use a segmented bitmap data structure that minimizes address lookup complexity. Second, we developed data flow algorithms that eliminate checks on some classes of write instructions but may increase the complexity of the remaining checks. We evaluated these techniques on the SPARC using the SPEC benchmarks. Checking each write instruction using a segmented bitmap achieved an average overhead of 42%. This overhead is independent of the number of breakpoints in use. Data flow analysis eliminated an average of 79% of the dynamic write checks. For scientific programs such as the NAS kernels, analysis reduced write checks by a factor of ten or more. On the SPARC these optimizations reduced the average overhead to 25%.
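The segmented bitmap check can be pictured with the sketch below; the constants, the 32-bit address assumption, and the function names are illustrative choices, not the paper's. The key property is that the common case costs a couple of shifts, a mask, and two loads, independent of how many breakpoints are set.

    #include <stddef.h>
    #include <stdint.h>

    /* Two-level ("segmented") bitmap over a 32-bit address space: the top
     * level indexes fixed-size segments, each holding one bit per 4-byte
     * word of memory. Segments with no breakpoints all share one zero-filled
     * bitmap, so memory use stays small. */
    #define SEG_SHIFT 16                              /* 64 KiB of memory per segment */
    #define SEG_WORDS ((1u << SEG_SHIFT) / 4 / 32)    /* 512 bitmap words per segment */

    static uint32_t  zero_segment[SEG_WORDS];         /* shared "no breakpoints" bitmap */
    static uint32_t *segment_table[1u << (32 - SEG_SHIFT)];

    static void bitmap_init(void)
    {
        for (size_t i = 0; i < sizeof segment_table / sizeof segment_table[0]; i++)
            segment_table[i] = zero_segment;
    }

    /* The check conceptually inserted before each write instruction. */
    static inline int write_is_watched(uint32_t addr)
    {
        uint32_t  word = addr >> 2;                        /* global word index    */
        uint32_t *seg  = segment_table[addr >> SEG_SHIFT]; /* which segment bitmap */
        return (seg[(word >> 5) & (SEG_WORDS - 1)] >> (word & 31)) & 1;
    }

    int main(void)
    {
        bitmap_init();
        return write_is_watched(0x1000);   /* no breakpoints set: returns 0 */
    }

Setting a breakpoint would replace the shared zero bitmap for the affected segment with a private copy and set the bits for the watched words; the data flow optimizations described in the abstract then remove the check entirely for writes that provably cannot hit a watched address.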

70 citations

01 Jan 1996
TL;DR: This paper evaluated Omniware under the Solaris 2.4 operating system on a SPARCstation 5 using eight C benchmark programs, including five programs from the C SPEC92 benchmark suite, and showed that Omniware modules execute at near native speeds.
Abstract: This paper describes Omniware, a system for producing and executing mobile code. Next generation Web applications will use mobile code to specify dynamic behavior in Web pages, implement new Web protocols and data formats, and dynamically distribute computation between servers and browsers. Like all mobile code systems, Omniware provides portability and safety. The same compiled Omniware module can be executed transparently on different machines, and a module’s access to host resources can be precisely controlled. In addition to portability and safety, Omniware has two unique features. First, Omniware is open. Omniware uses software fault isolation (SFI) to enforce safe execution of standard programming languages, enabling Web developers to leverage the vast store of existing software and programming expertise. For example, Omniware developers can use C++ to create programs for Web pages. Second, Omniware is fast. We evaluated Omniware under the Solaris 2.4 operating system on a SPARCstation 5 using eight C benchmark programs, including five programs from the C SPEC92 benchmark suite. We evaluated the performance of Omniware in two ways. First, we showed that Omniware modules can be represented compactly, reducing the space consumption compared to SunPro cc shared object files by an average of 38%. Second, we showed that Omniware modules execute at near native speeds. Including the runtime overhead necessary to ensure that Omniware modules are both portable and safe, our benchmark programs ran within 6% of native performance.
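The claim that a module's access to host resources can be precisely controlled comes down to the host resolving the module's imports itself. The sketch below is a generic illustration of that idea, not the Omniware interface; all names in it (host_import, resolve_import, safe_log) are invented.

    #include <stdio.h>
    #include <string.h>

    /* Generic illustration (not the Omniware API): the host resolves a
     * mobile module's imports against an explicit table, so the module can
     * reach only the host services the embedder chose to expose. */
    typedef int (*host_fn)(const char *arg);

    static int safe_log(const char *msg) { return printf("module: %s\n", msg); }

    struct host_import { const char *name; host_fn fn; };

    static const struct host_import exports[] = {
        { "log", safe_log },
        /* no "open_file", no "connect": the module cannot reach them at all */
    };

    static host_fn resolve_import(const char *name)
    {
        for (size_t i = 0; i < sizeof exports / sizeof exports[0]; i++)
            if (strcmp(exports[i].name, name) == 0)
                return exports[i].fn;
        return NULL;            /* unresolvable imports are rejected at load time */
    }

    int main(void)
    {
        host_fn log_fn = resolve_import("log");
        return log_fn ? (log_fn("hello from a mobile module") < 0) : 1;
    }

Software fault isolation then guarantees that the module cannot bypass such a table by forging pointers into the host's code or data.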

42 citations


Cited by
Proceedings ArticleDOI
20 Mar 2004
TL;DR: The design of the LLVM representation and compiler framework is evaluated in three ways: the size and effectiveness of the representation, including the type information it provides; compiler performance for several interprocedural problems; and illustrative examples of the benefits LLVM provides for several challenging compiler problems.
Abstract: We describe LLVM (low level virtual machine), a compiler framework designed to support transparent, lifelong program analysis and transformation for arbitrary programs, by providing high-level information to compiler transformations at compile-time, link-time, run-time, and in idle time between runs. LLVM defines a common, low-level code representation in static single assignment (SSA) form, with several novel features: a simple, language-independent type-system that exposes the primitives commonly used to implement high-level language features; an instruction for typed address arithmetic; and a simple mechanism that can be used to implement the exception handling features of high-level languages (and setjmp/longjmp in C) uniformly and efficiently. The LLVM compiler framework and code representation together provide a combination of key capabilities that are important for practical, lifelong analysis and transformation of programs. To our knowledge, no existing compilation approach provides all these capabilities. We describe the design of the LLVM representation and compiler framework, and evaluate the design in three ways: (a) the size and effectiveness of the representation, including the type information it provides; (b) compiler performance for several interprocedural problems; and (c) illustrative examples of the benefits LLVM provides for several challenging compiler problems.
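The SSA form at the heart of the representation can be illustrated with a small C function; the comments below are a conceptual paraphrase of how an SSA-based compiler sees it, not actual LLVM output.

    /* A join point that SSA makes explicit. In an SSA-based representation
     * such as LLVM's, each value is defined exactly once, and values that
     * depend on which branch executed are merged by a phi instruction at
     * the join block. */
    int clamp_nonneg(int x)
    {
        int y;
        if (x < 0)
            y = 0;      /* SSA: y1 = 0                               */
        else
            y = x;      /* SSA: y2 = x                               */
        return y;       /* SSA: y3 = phi(y1 from then, y2 from else) */
    }

The language-independent type information and the typed address-arithmetic instruction mentioned in the abstract sit on top of this same low-level form.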

4,841 citations

Journal ArticleDOI
TL;DR: A structured view of research on information-flow security is given, particularly focusing on work that uses static program analysis to enforce information-flow policies, and some important open challenges are identified.
Abstract: Current standard security practices do not provide substantial assurance that the end-to-end behavior of a computing system satisfies important security policies such as confidentiality. An end-to-end confidentiality policy might assert that secret input data cannot be inferred by an attacker through the attacker's observations of system output; this policy regulates information flow. Conventional security mechanisms such as access control and encryption do not directly address the enforcement of information-flow policies. A promising new approach has been developed: the use of programming-language techniques for specifying and enforcing information-flow policies. In this paper, we survey the past three decades of research on information-flow security, particularly focusing on work that uses static program analysis to enforce information-flow policies. We give a structured view of work in the area and identify some important open challenges.
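The kind of program a static information-flow analysis rejects is easy to sketch. The labels in the comments below are informal annotations for this example only; a security-typed language would attach them to the types and reject both assignments at compile time.

    #include <stdio.h>

    /* Two classic flows that an information-flow analysis rules out. The
     * "high"/"low" labels are informal, not the syntax of any particular
     * security-typed language. */
    int main(void)
    {
        int secret_pin = 4921;   /* high: confidential input        */
        int public_out = 0;      /* low: attacker-observable output */

        /* Explicit flow: secret data assigned directly to a public sink. */
        public_out = secret_pin;

        /* Implicit flow: the branch condition leaks a bit of the secret
         * through which assignment runs, even though no secret value is
         * copied directly. */
        if (secret_pin > 5000)
            public_out = 1;
        else
            public_out = 0;

        printf("%d\n", public_out);   /* visible to the attacker */
        return 0;
    }

An analysis enforcing noninterference guarantees that low outputs are unaffected by high inputs, which rules out both patterns.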

2,058 citations

Proceedings ArticleDOI
01 Jan 1997
TL;DR: The paper shows how proof-carrying code might be used to develop safe assembly-language extensions of ML programs, and proves the adequacy of concrete representations for the safety policy, the safety proofs, and the proof validation.
Abstract: This paper describes proof-carrying code (PCC), a mechanism by which a host system can determine with certainty that it is safe to execute a program supplied (possibly in binary form) by an untrusted source. For this to be possible, the untrusted code producer must supply with the code a safety proof that attests to the code's adherence to a previously defined safety policy. The host can then easily and quickly validate the proof without using cryptography and without consulting any external agents. In order to gain preliminary experience with PCC, we have performed several case studies. We show in this paper how proof-carrying code might be used to develop safe assembly-language extensions of ML programs. In the context of this case study, we present and prove the adequacy of concrete representations for the safety policy, the safety proofs, and the proof validation. Finally, we briefly discuss how we use proof-carrying code to develop network packet filters that are faster than similar filters developed using other techniques and are formally guaranteed to be safe with respect to a given operating system safety policy.
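The host-side flow can be pictured with the sketch below. It is only a shape: struct module, vc_holds_under_proof, and execute_native are invented placeholders, and the stub checker rejects everything; the real system generates a verification condition from the code and the safety policy and checks the attached proof against it.

    #include <stdbool.h>
    #include <stddef.h>

    /* Code plus proof, as shipped by the untrusted producer. */
    struct module {
        const unsigned char *code;  size_t code_len;
        const unsigned char *proof; size_t proof_len;
    };

    /* Placeholder checker: a real one derives a verification condition from
     * the code and the host's safety policy and validates the supplied
     * proof against it, quickly and without cryptography. */
    static bool vc_holds_under_proof(const struct module *m)
    {
        (void)m;
        return false;               /* this stub rejects everything */
    }

    static void execute_native(const unsigned char *code, size_t len)
    {
        (void)code; (void)len;      /* placeholder for jumping into the code */
    }

    /* Host-side loading: validate once, then run at full native speed. */
    static int load_untrusted(const struct module *m)
    {
        if (!vc_holds_under_proof(m))
            return -1;              /* reject: proof missing or invalid */
        execute_native(m->code, m->code_len);
        return 0;
    }

    int main(void)
    {
        struct module m = { 0 };               /* empty module, no proof   */
        return load_untrusted(&m) == 0;        /* rejected, so returns 0   */
    }

Because the proof travels with the code, the host trusts only its own checker and safety policy, not the code producer or any external authority.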

1,799 citations

Journal ArticleDOI
TL;DR: The authors illustrate how the routers of an IP network could be augmented to perform customized processing on the datagrams flowing through them, and how these active routers could interoperate with legacy routers, which transparently forward datagrams in the traditional manner.
Abstract: Active networks are a novel approach to network architecture in which the switches (or routers) of the network perform customized computations on the messages flowing through them. This approach is motivated by both lead user applications, which perform user-driven computation at nodes within the network today, and the emergence of mobile code technologies that make dynamic network service innovation attainable. The authors discuss two approaches to the realization of active networks and provide a snapshot of the current research issues and activities. They illustrate how the routers of an IP network could be augmented to perform such customized processing on the datagrams flowing through them. These active routers could also interoperate with legacy routers, which transparently forward datagrams in the traditional manner.

1,489 citations

Proceedings ArticleDOI
03 Dec 1995
TL;DR: The prototype exokernel system implemented here is at least five times faster on operations such as exception dispatching and interprocess communication, and allows applications to control machine resources in ways not possible in traditional operating systems.
Abstract: Traditional operating systems limit the performance, flexibility, and functionality of applications by fixing the interface and implementation of operating system abstractions such as interprocess communication and virtual memory. The exokernel operating system architecture addresses this problem by providing application-level management of physical resources. In the exokernel architecture, a small kernel securely exports all hardware resources through a low-level interface to untrusted library operating systems. Library operating systems use this interface to implement system objects and policies. This separation of resource protection from management allows application-specific customization of traditional operating system abstractions by extending, specializing, or even replacing libraries. We have implemented a prototype exokernel operating system. Measurements show that most primitive kernel operations (such as exception handling and protected control transfer) are ten to 100 times faster than in Ultrix, a mature monolithic UNIX operating system. In addition, we demonstrate that an exokernel allows applications to control machine resources in ways not possible in traditional operating systems. For instance, virtual memory and interprocess communication abstractions are implemented entirely within an application-level library. Measurements show that application-level virtual memory and interprocess communication primitives are five to 40 times faster than Ultrix's kernel primitives. Compared to state-of-the-art implementations from the literature, the prototype exokernel system is at least five times faster on operations such as exception dispatching and interprocess communication.
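The division of labor the abstract describes can be pictured with the sketch below; every name in it is invented for illustration and is not the interface of the prototype system. The point is only that the kernel securely exports the raw hardware event while the policy lives in a library linked into the application.

    #include <stdint.h>

    typedef void (*pf_handler)(uintptr_t fault_addr);

    /* Stand-in for the kernel's low-level interface: securely deliver this
     * application's page faults to application-level code. */
    static pf_handler registered_handler;
    static void exo_register_pagefault_handler(pf_handler h)
    {
        registered_handler = h;
    }

    /* Library-OS side: an application-specific paging policy, implemented
     * entirely in a library linked into the application. */
    static void libos_handle_fault(uintptr_t addr)
    {
        /* e.g. pick a victim page using the application's own knowledge,
         * request a physical page from the kernel, install the mapping. */
        (void)addr;
    }

    int main(void)
    {
        exo_register_pagefault_handler(libos_handle_fault);
        return 0;
    }

Because the abstraction is implemented in the library rather than fixed by the kernel, each application can replace this policy without changing anything the kernel protects.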

1,309 citations