Executing a program on the MIT tagged-token dataflow architecture

doi:10.1109/12.48862

Home
/
Papers
/
Executing a program on the MIT tagged-token dataflow architecture

Journal Article•DOI•

Executing a program on the MIT tagged-token dataflow architecture

Arvind¹, Rishiyur S. Nikhil¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Mar 1990-IEEE Transactions on Computers (IEEE Computer Society)-Vol. 39, Iss: 3, pp 300-318

TL;DR: An overview of current thinking on dataflow architecture is provided by describing example Id programs, their compilation to dataflow graphs, and their execution on the TTDA, a multiprocessor architecture.

read less

Abstract: The MIT Tagged-Token Dataflow Project has an unconventional, but integrated approach to general-purpose high-performance parallel computing. Rather than extending conventional sequential languages, Id, a high-level language with fine-grained parallelism and determinacy implicit in its operational semantics, is used. Id programs are compiled to dynamic dataflow graphs, which constitute a parallel machine language. Dataflow graphs are directly executed on the MIT tagged-token dataglow architecture (TTDA), a multiprocessor architecture. An overview of current thinking on dataflow architecture is provided by describing example Id programs, their compilation to dataflow graphs, and their execution on the TTDA. Related work and the status of the project are described. >

...read moreread less

Citations

PDF

Open Access

More filters

Proceedings Article•DOI•

TensorFlow: a system for large-scale machine learning

[...]

Martín Abadi¹, Paul Barham¹, Jianmin Chen¹, Zhifeng Chen¹, Andy Davis¹, Jeffrey Dean¹, Matthieu Devin¹, Sanjay Ghemawat¹, Geoffrey Irving¹, Michael Isard¹, Manjunath Kudlur¹, Josh Levenberg¹, Rajat Monga¹, Sherry Moore¹, Derek G. Murray¹, Benoit Steiner¹, Paul A. Tucker¹, Vijay K. Vasudevan¹, Pete Warden¹, Martin Wicke¹, Yuan Yu¹, Xiaoqiang Zheng¹ - Show less +18 more•Institutions (1)

Google¹

02 Nov 2016

TL;DR: TensorFlow as mentioned in this paper is a machine learning system that operates at large scale and in heterogeneous environments, using dataflow graphs to represent computation, shared state, and the operations that mutate that state.

...read moreread less

Abstract: TensorFlow is a machine learning system that operates at large scale and in heterogeneous environments. Tensor-Flow uses dataflow graphs to represent computation, shared state, and the operations that mutate that state. It maps the nodes of a dataflow graph across many machines in a cluster, and within a machine across multiple computational devices, including multicore CPUs, general-purpose GPUs, and custom-designed ASICs known as Tensor Processing Units (TPUs). This architecture gives flexibility to the application developer: whereas in previous "parameter server" designs the management of shared state is built into the system, TensorFlow enables developers to experiment with novel optimizations and training algorithms. TensorFlow supports a variety of applications, with a focus on training and inference on deep neural networks. Several Google services use TensorFlow in production, we have released it as an open-source project, and it has become widely used for machine learning research. In this paper, we describe the TensorFlow dataflow model and demonstrate the compelling performance that TensorFlow achieves for several real-world applications.

...read moreread less

10,913 citations

Posted Content•

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

[...]

Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Yangqing Jia, Rafal Jozefowicz, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Rajat Monga, Sherry Moore, Derek G. Murray, Chris Olah, Mike Schuster, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul A. Tucker, Vincent Vanhoucke, Vijay K. Vasudevan, Fernanda B. Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu, Xiaoqiang Zheng - Show less +36 more

01 Jan 2015-arXiv: Distributed, Parallel, and Cluster Computing

TL;DR: The TensorFlow interface and an implementation of that interface that is built at Google are described, which has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields.

...read moreread less

Abstract: TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards. The system is flexible and can be used to express a wide variety of algorithms, including training and inference algorithms for deep neural network models, and it has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields, including speech recognition, computer vision, robotics, information retrieval, natural language processing, geographic information extraction, and computational drug discovery. This paper describes the TensorFlow interface and an implementation of that interface that we have built at Google. The TensorFlow API and a reference implementation were released as an open-source package under the Apache 2.0 license in November, 2015 and are available at www.tensorflow.org.

...read moreread less

10,447 citations

Monograph•DOI•

Introduction to Parallel Computing

[...]

Zbigniew J. Czech

01 Jan 2016

TL;DR: In this article, a comprehensive introduction to parallel computing is provided, discussing theoretical issues such as the fundamentals of concurrent processes, models of parallel and distributed computing, and metrics for evaluating and comparing parallel algorithms, as well as practical issues, including methods of designing and implementing shared-and distributed-memory programs, and standards for parallel program implementation.

...read moreread less

Abstract: The constantly increasing demand for more computing power can seem impossible to keep up with. However, multicore processors capable of performing computations in parallel allow computers to tackle ever larger problems in a wide variety of applications. This book provides a comprehensive introduction to parallel computing, discussing theoretical issues such as the fundamentals of concurrent processes, models of parallel and distributed computing, and metrics for evaluating and comparing parallel algorithms, as well as practical issues, including methods of designing and implementing shared- and distributed-memory programs, and standards for parallel program implementation, in particular MPI and OpenMP interfaces. Each chapter presents the basics in one place followed by advanced topics, allowing novices and experienced practitioners to quickly find what they need. A glossary and more than 80 exercises with selected solutions aid comprehension. The book is recommended as a text for advanced undergraduate or graduate students and as a reference for practitioners.

...read moreread less

572 citations

Book•

Reconfigurable Computing: The Theory and Practice of FPGA-Based Computation

[...]

Scott Hauck, André DeHon

02 Nov 2007

TL;DR: This book is intended as an introduction to the entire range of issues important to reconfigurable computing, using FPGAs as the context, or "computing vehicles" to implement this powerful technology.

...read moreread less

Abstract: The main characteristic of Reconfigurable Computing is the presence of hardware that can be reconfigured to implement specific functionality more suitable for specially tailored hardware than on a simple uniprocessor. Reconfigurable computing systems join microprocessors and programmable hardware in order to take advantage of the combined strengths of hardware and software and have been used in applications ranging from embedded systems to high performance computing. Many of the fundamental theories have been identified and used by the Hardware/Software Co-Design research field. Although the same background ideas are shared in both areas, they have different goals and use different approaches.This book is intended as an introduction to the entire range of issues important to reconfigurable computing, using FPGAs as the context, or "computing vehicles" to implement this powerful technology. It will take a reader with a background in the basics of digital design and software programming and provide them with the knowledge needed to be an effective designer or researcher in this rapidly evolving field. · Treatment of FPGAs as computing vehicles rather than glue-logic or ASIC substitutes · Views of FPGA programming beyond Verilog/VHDL · Broad set of case studies demonstrating how to use FPGAs in novel and efficient ways

...read moreread less

531 citations

Proceedings Article•DOI•

Scheduling dynamic dataflow graphs with bounded memory using the token flow model

[...]

J.T. Buck¹, Edward A. Lee¹•Institutions (1)

University of California, Berkeley¹

27 Apr 1993

TL;DR: The authors build upon research by E. A. Lee (1991) concerning the token flow model by analyzing the properties of cycles of the schedule: sequences of actor executions that return the graph to its initial state.

...read moreread less

Abstract: The authors build upon research by E. A. Lee (1991) concerning the token flow model, an analytical model for the behavior of dataflow graphs with data-dependent control flow, by analyzing the properties of cycles of the schedule: sequences of actor executions that return the graph to its initial state. Necessary and sufficient conditions are given for the existence of a bounded cyclic schedule as well as sufficient conditions for execution of the graph in bounded memory. The techniques presented apply to a more general class of dataflow graphs than previous methods. >

...read moreread less

521 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90

Collapse

References

PDF

Open Access

More filters

Journal Article•DOI•

Advanced compiler optimizations for supercomputers

[...]

David Padua¹, Michael Wolfe¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

01 Dec 1986-Communications of The ACM

TL;DR: Compilers for vector or multiprocessor computers must have certain optimization features to successfully generate parallel code to be able to operate on parallel systems.

...read moreread less

Abstract: Compilers for vector or multiprocessor computers must have certain optimization features to successfully generate parallel code.

...read moreread less

758 citations

Proceedings Article•DOI•

Dependence graphs and compiler optimizations

[...]

David J. Kuck¹, Robert H. Kuhn¹, David Padua¹, Bruce Leasure¹, Michael Wolfe¹ - Show less +1 more•Institutions (1)

University of Illinois at Urbana–Champaign¹

26 Jan 1981

TL;DR: This paper defines such graphs and discusses two kinds of transformations, simple rewriting transformations that remove dependence arcs and abstraction transformations that deal more globally with a dependence graph.

...read moreread less

Abstract: Dependence graphs can be used as a vehicle for formulating and implementing compiler optimizations. This paper defines such graphs and discusses two kinds of transformations. The first are simple rewriting transformations that remove dependence arcs. The second are abstraction transformations that deal more globally with a dependence graph. These transformations have been implemented and applied to several different types of high-speed architectures.

...read moreread less

720 citations

Journal Article•DOI•

The NYU Ultracomputer—Designing an MIMD Shared Memory Parallel Computer

[...]

Gottlieb¹, Grishman², McAuliffe², Rudolph³, Snir² - Show less +1 more•Institutions (3)

Courant Institute of Mathematical Sciences¹, New York University², University of Illinois at Urbana–Champaign³

01 Feb 1983-IEEE Transactions on Computers

TL;DR: The design for the NYU Ultracomputer is presented, a shared-memory MIMD parallel machine composed of thousands of autonomous processing elements that uses an enhanced message switching network with the geometry of an Omega-network to approximate the ideal behavior of Schwartz's paracomputers model of computation.

...read moreread less

Abstract: We present the design for the NYU Ultracomputer, a shared-memory MIMD parallel machine composed of thousands of autonomous processing elements. This machine uses an enhanced message switching network with the geometry of an Omega-network to approximate the ideal behavior of Schwartz's paracomputer model of computation and to implement efficiently the important fetch-and-add synchronization primitive. We outine the hardware that would be required to build a 4096 processor system using 1990's technology. We also discuss system software issues, and present analytic studies of the network performance. Finally, we include a sample of our effort to implement and simulate parallel variants of important scientific programs.

...read moreread less

708 citations

Journal Article•DOI•

A New Implementation Technique for Applicative Languages

[...]

D. A. Turner¹•Institutions (1)

University of Kent¹

01 Jan 1979-Software - Practice and Experience

TL;DR: By using results from combinatory logic an applicative language, such as LISP, can be translated into a form from which all bound variables have been removed, and a machine is described which can efficiently execute the resulting code.

...read moreread less

Abstract: SU-Y It is shown how by using results from combinatory logic an applicative language, such as LISP, can be translated into a form from which all bound variables have been removed. A machine is described which can efficiently execute the resulting code. This implementation is compared with a conventional interpreter and found to have a number of advantages. Of these the most important is that programs which exploit higher order functions to achieve great compactness of expression are executed much more e5ciently. WORDS Applicative languages Combinators Bracket abstraction Normal graph reduction Lazy evaluation Substitution machine

...read moreread less

686 citations

Book Chapter•DOI•

First version of a data flow procedure language

[...]

J. B. Dennis¹•Institutions (1)

Massachusetts Institute of Technology¹

09 Apr 1974

TL;DR: The language is being used as a model for study of fundamental semantic constructs for programming, as a target language for evaluating trans-latability of programs expressed at the user-language level, and as a guide for research in advanced computer architecture.

...read moreread less

Abstract: A language for representing computational procedures based on the concept of data flow is presented in terms of a semantic model that permits concurrent execution of noninterfering program parts. Procedures in the language operate on elementary and structured values, and always define functional transformations of values. The language is equivalent in expressive power to a block structured language with internal procedure variables and is a generalization of pure Lisp. The language is being used as a model for study of fundamental semantic constructs for programming, as a target language for evaluating trans-latability of programs expressed at the user-language level, and as a guide for research in advanced computer architecture.

...read moreread less

682 citations