Conference

Architectural Support for Programming Languages and Operating Systems 

About: Architectural Support for Programming Languages and Operating Systems (ASPLOS) is an academic conference. The conference publishes mainly in the areas of cache and compiler research. Over its lifetime, the conference has published 1,175 papers, which have received 121,240 citations.
Topics: Cache, Compiler, Shared memory, CPU cache, Speedup


Papers
Journal ArticleDOI
12 Nov 2000
TL;DR: Key requirements are identified, a small device representative of the class is developed, a tiny event-driven operating system is designed, and the system is shown to support efficient modularity and concurrency-intensive operation.
Abstract: Technological progress in integrated, low-power, CMOS communication devices and sensors makes a rich design space of networked sensors viable. They can be deeply embedded in the physical world and spread throughout our environment like smart dust. The missing elements are an overall system architecture and a methodology for systematic advance. To this end, we identify key requirements, develop a small device that is representative of the class, design a tiny event-driven operating system, and show that it provides support for efficient modularity and concurrency-intensive operation. Our operating system fits in 178 bytes of memory, propagates events in the time it takes to copy 1.25 bytes of memory, context switches in the time it takes to copy 6 bytes of memory, and supports two-level scheduling. The analysis lays a groundwork for future architectural advances.
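The event-driven, run-to-completion design described here is easy to picture in code. Below is a minimal sketch of a TinyOS-style two-level scheduler, assuming a fixed FIFO task queue; the names, queue depth, and API are illustrative, not the paper's actual implementation.

```c
/* Minimal sketch of a two-level, event-driven scheduler in the TinyOS
 * style. Level 1: hardware events (interrupts) run immediately and may
 * post tasks. Level 2: posted tasks run to completion from a FIFO queue.
 * All names and sizes here are hypothetical. */
#include <stdint.h>

#define MAX_TASKS 8                   /* assumed queue depth */
typedef void (*task_fn)(void);

static task_fn queue[MAX_TASKS];      /* FIFO of pending tasks */
static uint8_t head = 0, count = 0;

/* Called from event (interrupt) context: defer longer work to task level.
 * A real kernel would disable interrupts around these queue updates. */
int post_task(task_fn t) {
    if (count == MAX_TASKS) return -1;        /* queue full: report */
    queue[(head + count) % MAX_TASKS] = t;
    count++;
    return 0;
}

/* Main loop: drain the task queue; tasks never preempt one another,
 * which keeps per-task state (and thus RAM usage) tiny. */
void scheduler_loop(void) {
    for (;;) {
        while (count > 0) {
            task_fn t = queue[head];
            head = (head + 1) % MAX_TASKS;
            count--;
            t();
        }
        /* sleep here until the next hardware event (platform-specific) */
    }
}
```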

3,648 citations

Journal ArticleDOI
12 Nov 2000
TL;DR: OceanStore's monitoring of usage patterns allows adaptation to regional outages and denial-of-service attacks; monitoring also enhances performance through pro-active movement of data.
Abstract: OceanStore is a utility infrastructure designed to span the globe and provide continuous access to persistent information. Since this infrastructure is comprised of untrusted servers, data is protected through redundancy and cryptographic techniques. To improve performance, data is allowed to be cached anywhere, anytime. Additionally, monitoring of usage patterns allows adaptation to regional outages and denial of service attacks; monitoring also enhances performance through pro-active movement of data. A prototype implementation is currently under development.
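One way to tolerate untrusted servers, in the spirit described above, is self-verifying data: name each block by a hash of its contents and check that hash on every read, so a corrupt or malicious replica is simply skipped. The sketch below illustrates the idea; the fetch function, replica count, and the non-cryptographic FNV-1a checksum are stand-ins, not OceanStore's actual mechanisms.

```c
/* Conceptual sketch of reading a self-verifying block from redundant,
 * untrusted servers. fetch_from_server() is a hypothetical transport;
 * FNV-1a stands in for a cryptographic hash. */
#include <stdint.h>
#include <stddef.h>

#define REPLICAS 4                    /* assumed replication factor */

/* Hypothetical: fetch block `id` from server `s` into buf.
 * Returns byte count, or -1 on failure. */
extern int fetch_from_server(int s, uint64_t id, uint8_t *buf, size_t cap);

static uint64_t fnv1a(const uint8_t *p, size_t n) {
    uint64_t h = 1469598103934665603ULL;
    for (size_t i = 0; i < n; i++) { h ^= p[i]; h *= 1099511628211ULL; }
    return h;
}

/* The block's name is the hash of its contents, so any server's copy can
 * be verified locally; we try replicas until one checks out. */
int read_block(uint64_t id, uint8_t *buf, size_t cap) {
    for (int s = 0; s < REPLICAS; s++) {
        int n = fetch_from_server(s, id, buf, cap);
        if (n > 0 && fnv1a(buf, (size_t)n) == id)
            return n;                 /* verified copy found */
    }
    return -1;                        /* all replicas failed or corrupt */
}
```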

3,376 citations

Proceedings ArticleDOI
01 Oct 2002
TL;DR: The goal is to use the least energy, storage, and other resources necessary to maintain a reliable system with a very high 'data homing' success rate, and it is believed that the domain-centric protocols and energy tradeoffs presented here for ZebraNet will have general applicability in other wireless and sensor applications.
Abstract: Over the past decade, mobile computing and wireless communication have become increasingly important drivers of many new computing applications. The field of wireless sensor networks particularly focuses on applications involving autonomous use of compute, sensing, and wireless communication devices for both scientific and commercial purposes. This paper examines the research decisions and design tradeoffs that arise when applying wireless peer-to-peer networking techniques in a mobile sensor network designed to support wildlife tracking for biology research. The ZebraNet system includes custom tracking collars (nodes) carried by animals under study across a large, wild area; the collars operate as a peer-to-peer network to deliver logged data back to researchers. The collars include global positioning system (GPS), Flash memory, wireless transceivers, and a small CPU; essentially each node is a small, wireless computing device. Since there is no cellular service or broadcast communication covering the region where animals are studied, ad hoc, peer-to-peer routing is needed. Although numerous ad hoc protocols exist, additional challenges arise because the researchers themselves are mobile and thus there is no fixed base station towards which to aim data. Overall, our goal is to use the least energy, storage, and other resources necessary to maintain a reliable system with a very high 'data homing' success rate. We plan to deploy a 30-node ZebraNet system at the Mpala Research Centre in central Kenya. More broadly, we believe that the domain-centric protocols and energy tradeoffs presented here for ZebraNet will have general applicability in other wireless and sensor applications.
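A history-based forwarding rule conveys the flavor of such domain-centric protocols: when collars meet, data should move toward the peer with the better record of eventually reaching the base station, and stay put otherwise to save radio energy. The sketch below is illustrative only; the neighbor structure and success metric are assumptions, not the paper's exact design.

```c
/* Illustrative history-based next-hop choice for a mobile sensor network.
 * `success` is an assumed running estimate of how often a node's past
 * transfers eventually reached the base station ("homed"). */
#include <stdint.h>

typedef struct {
    uint16_t node_id;
    float    success;     /* delivery-history score in [0, 1] */
} neighbor_t;

/* Forward only if some neighbor's history beats our own; otherwise keep
 * the data and spare the energy of a hop that gains nothing. */
int choose_next_hop(const neighbor_t *nbrs, int n, float my_success) {
    int best = -1;
    float best_score = my_success;
    for (int i = 0; i < n; i++) {
        if (nbrs[i].success > best_score) {
            best_score = nbrs[i].success;
            best = i;
        }
    }
    return best;          /* -1 means: hold the data for now */
}
```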

2,128 citations

Proceedings ArticleDOI
01 Oct 2002
TL;DR: This work quantifies the effectiveness of Basic Block Vectors in capturing program behavior across several different architectural metrics, explores the large-scale behavior of several programs, and develops a set of clustering-based algorithms capable of analyzing this behavior.
Abstract: Understanding program behavior is at the foundation of computer architecture and program optimization. Many programs have wildly different behavior on even the very largest of scales (over the complete execution of the program). This realization has ramifications for many architectural and compiler techniques, from thread scheduling, to feedback directed optimizations, to the way programs are simulated. However, in order to take advantage of time-varying behavior, we must first develop the analytical tools necessary to automatically and efficiently analyze program behavior over large sections of execution. Our goal is to develop automatic techniques that are capable of finding and exploiting the Large Scale Behavior of programs (behavior seen over billions of instructions). The first step towards this goal is the development of a hardware independent metric that can concisely summarize the behavior of an arbitrary section of execution in a program. To this end, we examine the use of Basic Block Vectors. We quantify the effectiveness of Basic Block Vectors in capturing program behavior across several different architectural metrics, explore the large scale behavior of several programs, and develop a set of algorithms based on clustering capable of analyzing this behavior. We then demonstrate an application of this technology to automatically determine where to simulate for a program to help guide computer architecture research.
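Concretely, a Basic Block Vector is a per-interval histogram of basic-block execution counts, normalized so that intervals of different lengths can be compared; intervals with similar vectors are executing similar code and thus likely belong to the same phase. A minimal sketch, assuming a fixed block count and using Manhattan distance as the similarity measure:

```c
/* Sketch of the Basic Block Vector (BBV) idea: per-interval execution
 * counts, normalized and compared by Manhattan distance. NUM_BLOCKS and
 * the API are illustrative. */
#include <math.h>

#define NUM_BLOCKS 4096   /* assumed number of static basic blocks */

/* Normalize raw counts so the vector sums to 1, making intervals of
 * different instruction lengths directly comparable. */
void normalize_bbv(const unsigned long *counts, double *bbv) {
    unsigned long total = 0;
    for (int i = 0; i < NUM_BLOCKS; i++) total += counts[i];
    for (int i = 0; i < NUM_BLOCKS; i++)
        bbv[i] = total ? (double)counts[i] / (double)total : 0.0;
}

/* Small distance means the two intervals executed similar code, i.e.,
 * they likely belong to the same program phase; a clustering algorithm
 * can then group intervals and pick one representative per cluster. */
double bbv_distance(const double *a, const double *b) {
    double d = 0.0;
    for (int i = 0; i < NUM_BLOCKS; i++) d += fabs(a[i] - b[i]);
    return d;
}
```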

1,702 citations

Proceedings ArticleDOI
24 Feb 2014
TL;DR: This study designs an accelerator for large-scale CNNs and DNNs, with a special emphasis on the impact of memory on accelerator design, performance and energy, and shows that it is possible to design an accelerator with a high throughput, capable of performing 452 GOP/s in a small footprint.
Abstract: Machine-Learning tasks are becoming pervasive in a broad range of domains, and in a broad range of systems (from embedded systems to data centers). At the same time, a small set of machine-learning algorithms (especially Convolutional and Deep Neural Networks, i.e., CNNs and DNNs) are proving to be state-of-the-art across many applications. As architectures evolve towards heterogeneous multi-cores composed of a mix of cores and accelerators, a machine-learning accelerator can achieve the rare combination of efficiency (due to the small number of target algorithms) and broad application scope. Until now, most machine-learning accelerator designs have focused on efficiently implementing the computational part of the algorithms. However, recent state-of-the-art CNNs and DNNs are characterized by their large size. In this study, we design an accelerator for large-scale CNNs and DNNs, with a special emphasis on the impact of memory on accelerator design, performance and energy. We show that it is possible to design an accelerator with a high throughput, capable of performing 452 GOP/s (key NN operations such as synaptic weight multiplications and neuron output additions) in a small footprint of 3.02 mm² and 485 mW; compared to a 128-bit 2 GHz SIMD processor, the accelerator is 117.87x faster, and it can reduce the total energy by 21.08x. The accelerator characteristics are obtained after layout at 65 nm. Such a high throughput in a small footprint can open up the usage of state-of-the-art machine-learning algorithms in a broad set of systems and for a broad set of applications.
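The arithmetic at the heart of such an accelerator is a tiled multiply-accumulate over synaptic weights and neuron inputs, with tiles sized so that working sets fit in small on-chip buffers rather than repeatedly hitting main memory. Below is a rough sketch of a fully connected (classifier) layer in that style; the tile sizes are illustrative, not the accelerator's actual parameters.

```c
/* Tiled fully connected layer: the kind of synaptic-weight multiply and
 * neuron-output accumulate an NN accelerator performs. Tile sizes are
 * assumptions chosen only to illustrate buffer reuse. */
#define TILE_OUT 16   /* output neurons processed together */
#define TILE_IN  16   /* input synapses consumed per step */

void classifier_layer(int n_out, int n_in,
                      const float *in,        /* inputs  [n_in] */
                      const float *w,         /* weights [n_out * n_in] */
                      float *out) {           /* outputs [n_out] */
    for (int no = 0; no < n_out; no += TILE_OUT) {
        for (int o = no; o < no + TILE_OUT && o < n_out; o++)
            out[o] = 0.0f;
        /* Each input tile is reused across all TILE_OUT outputs, which
         * is what makes small on-chip buffers pay off. */
        for (int ni = 0; ni < n_in; ni += TILE_IN)
            for (int o = no; o < no + TILE_OUT && o < n_out; o++)
                for (int i = ni; i < ni + TILE_IN && i < n_in; i++)
                    out[o] += w[o * n_in + i] * in[i];   /* MAC */
    }
}
```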

1,582 citations

Performance Metrics
No. of papers from the conference in previous years
Year    Papers
2021    75
2020    86
2019    81
2018    63
2017    65
2016    58