Topic

FLOPS

About: FLOPS is a research topic. Over its lifetime, 259 publications have been published within this topic, receiving 4,315 citations. The topic is also known as Floating Point Operations Per Second.


Papers
Book Chapter DOI
08 Sep 2018
TL;DR: ShuffleNet V2 proposes evaluating the direct metric (speed) on the target platform rather than relying only on FLOPs; from a series of controlled experiments it derives several practical guidelines for efficient network design.
Abstract: Currently, neural network architecture design is mostly guided by the indirect metric of computation complexity, i.e., FLOPs. However, the direct metric, e.g., speed, also depends on other factors such as memory access cost and platform characteristics. Thus, this work proposes to evaluate the direct metric on the target platform, beyond only considering FLOPs. Based on a series of controlled experiments, this work derives several practical guidelines for efficient network design. Accordingly, a new architecture is presented, called ShuffleNet V2. Comprehensive ablation experiments verify that our model is state of the art in terms of the speed-accuracy tradeoff.
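The FLOPs-versus-speed distinction the abstract draws can be made concrete with a small measurement. The sketch below is not from the paper; the layer sizes are arbitrary and it assumes PyTorch is available. It counts the analytic FLOPs of a single convolution and times the same layer, showing why the two metrics need not agree across platforms.

```python
# Minimal sketch: analytic FLOPs (indirect metric) vs. measured latency
# (direct metric) for one convolution. Shapes are illustrative only.
import time
import torch
import torch.nn as nn

def conv_flops(cin, cout, k, h_out, w_out):
    # Standard convolution: one multiply-accumulate counted as 2 FLOPs.
    return 2 * cin * cout * k * k * h_out * w_out

layer = nn.Conv2d(128, 128, kernel_size=3, padding=1)
x = torch.randn(1, 128, 56, 56)

with torch.no_grad():
    layer(x)                              # warm-up
    t0 = time.perf_counter()
    for _ in range(100):
        layer(x)
    latency_ms = (time.perf_counter() - t0) / 100 * 1e3

print(f"analytic FLOPs:   {conv_flops(128, 128, 3, 56, 56):,}")
print(f"measured latency: {latency_ms:.2f} ms (platform-dependent)")
```

The FLOP count is fixed by the layer shape, while the measured latency varies with memory access cost and platform characteristics, which is exactly the gap the paper's guidelines address.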

3,393 citations

Proceedings Article DOI
11 Jun 1998
TL;DR: Describes a family of semi-dynamic and dynamic edge-triggered flip-flops, intended for use with static and dynamic circuits respectively, that are used in the UltraSPARC-III microprocessor.
Abstract: Describes a family of semi-dynamic and dynamic edge-triggered flip-flops to be used with static and dynamic circuits, respectively. The flip-flops provide both short latency and the capability of incorporating logic functions with minimum delay penalty, properties which make them very attractive for high-performance microprocessor design. The circuits described are used in the UltraSPARC-III microprocessor.

192 citations

Journal Article DOI
G. F. Grohoski
TL;DR: The IBM RISC System/6000 processor is a second-generation RISC processor which reduces the execution pipeline penalties caused by branch instructions and also provides high floating-point performance.
Abstract: The IBM RISC System/6000 processor is a second-generation RISC processor which reduces the execution pipeline penalties caused by branch instructions and also provides high floating-point performance. It employs multiple functional units which operate concurrently to maximize the instruction execution rate. By employing these advanced machine-organization techniques, it can execute up to four instructions simultaneously. Approximately 11 MFLOPS are achieved on the LINPACK benchmarks.
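As a rough illustration of where a MFLOPS figure like this comes from, the sketch below uses the standard LINPACK operation count for solving a dense n x n system, about 2/3 n^3 + 2 n^2 floating-point operations. The problem size and solve time in the example are hypothetical, not measurements from the paper.

```python
# Sketch: deriving a MFLOPS rating from a LINPACK-style solve.
# flops ~= 2/3 * n^3 + 2 * n^2 for a dense n x n system;
# MFLOPS = flops / (seconds * 1e6).
def linpack_mflops(n, seconds):
    flops = (2.0 / 3.0) * n**3 + 2.0 * n**2
    return flops / (seconds * 1e6)

# Illustrative numbers only: a 100x100 system solved in ~0.0625 s
# works out to roughly 11 MFLOPS.
print(f"{linpack_mflops(100, 0.0625):.1f} MFLOPS")
```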

149 citations

Journal Article DOI
TL;DR: In this paper, a flop of a pair (X, B) is defined as a flip of a pair (X, B') which is crepant for K_X + B.
Abstract: A result by Birkar-Cascini-Hacon-McKernan, together with the boundedness of the length of extremal rays, implies that different minimal models can be connected by a sequence of flops. A flop of a pair (X, B) is a flip of a pair (X, B') which is crepant for K_X + B.
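For readers unfamiliar with the terminology, the crepancy condition behind this definition can be written out as follows; this is a sketch of the standard definition, not a quotation from the paper.

```latex
% Crepant condition (standard definition, stated for reference): for the
% birational map \phi : X \dashrightarrow X' and a common resolution
% p : W \to X, q : W \to X', with B' the strict transform of B,
p^*(K_X + B) = q^*(K_{X'} + B')
% i.e. the flip changes no discrepancies of K_X + B, which is what
% "crepant" means; a flop is a flip satisfying this condition.
```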

128 citations

Journal Article DOI
01 May 2002
TL;DR: Tarantula is an aggressive floating point machine targeted at technical, scientific and bioinformatics workloads that fully integrates into a virtual-memory cache-coherent system without changes to its coherency protocol, and achieves excellent "real-computation" per transistor and per watt ratios.
Abstract: Tarantula is an aggressive floating point machine targeted at technical, scientific and bioinformatics workloads, originally planned as a follow-on candidate to the EV8 processor [6, 5]. Tarantula adds to the EV8 core a vector unit capable of 32 double-precision flops per cycle. The vector unit fetches data directly from a 16 MByte second-level cache with a peak bandwidth of sixty-four 64-bit values per cycle. The whole chip is backed by a memory controller capable of delivering over 64 GBytes/s of raw bandwidth. Tarantula extends the Alpha ISA with new vector instructions that operate on new architectural state. Salient features of the architecture and implementation are: (1) it fully integrates into a virtual-memory cache-coherent system without changes to its coherency protocol, (2) it provides high bandwidth for non-unit-stride memory accesses, (3) it supports gather/scatter instructions efficiently, (4) it fully integrates with the EV8 core through a narrow, streamlined interface, rather than acting as a co-processor, (5) it can achieve a peak of 104 operations per cycle, and (6) it achieves excellent "real-computation" per transistor and per watt ratios. Our detailed simulations show that Tarantula achieves an average speedup of 5X over EV8, out of a peak speedup in terms of flops of 8X. Furthermore, performance on gather/scatter-intensive benchmarks such as Radix Sort is also remarkable: a speedup of almost 3X over EV8 and 15 sustained operations per cycle. Several benchmarks exceed 20 operations per cycle.
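The per-cycle figures in the abstract turn into peak FLOP rates only once a clock frequency is fixed. The sketch below is a back-of-the-envelope illustration with a purely hypothetical clock; the ratios are taken from the abstract, but the GFLOPS number is not from the paper or its simulations.

```python
# Back-of-the-envelope sketch of the peak figures implied by the abstract.
VECTOR_FLOPS_PER_CYCLE = 32           # Tarantula vector unit, double precision
PEAK_FLOPS_RATIO_VS_EV8 = 8           # "peak speedup in terms of flops of 8X"
EV8_FLOPS_PER_CYCLE = VECTOR_FLOPS_PER_CYCLE / PEAK_FLOPS_RATIO_VS_EV8  # -> 4

clock_hz = 1.5e9                      # hypothetical clock, for illustration only
peak_gflops = VECTOR_FLOPS_PER_CYCLE * clock_hz / 1e9

print(f"EV8 core:  {EV8_FLOPS_PER_CYCLE:.0f} flops/cycle")
print(f"Tarantula: {VECTOR_FLOPS_PER_CYCLE} flops/cycle, "
      f"about {peak_gflops:.0f} GFLOPS at {clock_hz / 1e9:.1f} GHz")
```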

112 citations


Network Information
Related Topics (5)
Matrix (mathematics): 105.5K papers, 1.9M citations, 75% related
Deep learning: 79.8K papers, 2.1M citations, 74% related
Robustness (computer science): 94.7K papers, 1.6M citations, 74% related
Wavelet: 78K papers, 1.3M citations, 74% related
Pixel: 136.5K papers, 1.5M citations, 74% related
Performance Metrics
No. of papers in the topic in previous years:
Year    Papers
2023    219
2022    409
2021    22
2020    13
2019    9
2018    13