Home
/
Authors
/
Massoud Pedram

Author

Massoud Pedram

Other affiliations: University of California, Berkeley, Syracuse University

Bio: Massoud Pedram is an academic researcher from University of Southern California. The author has contributed to research in topics: Energy consumption & CMOS. The author has an hindex of 77, co-authored 780 publications receiving 23047 citations. Previous affiliations of Massoud Pedram include University of California, Berkeley & Syracuse University.

Topics: Energy consumption, CMOS, Logic gate, Logic synthesis, Power management ...read more

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2003
2002
2001
2000
1999
1998
1997
1996
1995
1994
1993
1992
1991
1990
1989
1988

Papers

PDF

Open Access

More filters

Journal Article•DOI•

A Logic Verification Framework for SFQ and AQFP Superconducting Circuits

[...]

Arash Fayyazi¹, Mustafa Munir¹, Aswin Gopikanna¹, Shahin Nazarian¹, Massoud Pedram¹ - Show less +1 more•Institutions (1)

University of Southern California¹

14 Oct 2021-IEEE Transactions on Applied Superconductivity

TL;DR: In this article, a multicycle input dependency circuit model is proposed to explicitly capture the dependency of primary outputs of the circuit on sequences of internal signals and inputs, which can be used in the verification process of superconducting electronics.

...read moreread less

Abstract: Traditional logical equivalence checking (LEC), which plays a major role in the entire chip design process, faces challenges of meeting the requirements demanded by the many emerging technologies that are based on logic models different from the standard complementary metal oxide semiconductor (CMOS). In this article, we propose an LEC framework to be employed in the verification process of superconducting electronics (SCE). Our LEC framework is compatible with the existing CMOS technologies and can also check features and capabilities that are unique to SCE. For instance, the performance of nonresistively biased single-flux quantum (SFQ) circuits benefits from ultradeep pipelining, and verification of such circuits requires new models and algorithms. We, therefore, present the multicycle input dependency circuit model which is a novel model representation of design to explicitly capture the dependency of primary outputs of the circuit on sequences of internal signals and inputs. Embedding the proposed circuit model and several structural checking modules, the process of verification can be independent of the underlying technology and signaling. We benchmark the proposed framework on postsynthesis SFQ and 4-phase adiabatic quantum flux parametron netlists. Results show a comparative verification time of SFQ circuit benchmark, including 16-bit integer divider and ISCAS’85 circuits with respect to the ABC tool for similar CMOS circuits.

...read moreread less

1 citations

Proceedings Article•DOI•

A thermally-aware energy minimization methodology for global interconnects

[...]

Soheil Nazar Shahsavani¹, Alireza Shafaei¹, Shahin Nazarian¹, Massoud Pedram¹•Institutions (1)

University of Southern California¹

27 Mar 2017

TL;DR: If buffers are judiciously inserted in global interconnects, the buffer delay decrease is more pronounced than the interconnect delay increase, resulting in an overall performance improvement at higher temperatures, as shown in this paper.

...read moreread less

Abstract: As a result of the Temperature Effect Inversion (TEI) in FinFET-based designs, gate delays decrease with the increase of temperature. In contrast, the resistive characteristic and hence delay of global interconnects increase with the temperature. However, as shown in this paper, if buffers are judiciously inserted in global interconnects, the buffer delay decrease is more pronounced than the interconnect delay increase, resulting in an overall performance improvement at higher temperatures. More specifically, this work models the delay of buffer-inserted global interconnects vs. temperature in order to derive the optimal number and size of buffers for a given interconnect length and temperature. Furthermore, the paper addresses the problem of minimizing the buffered interconnect energy consumption by changing the supply voltage level or FinFET threshold voltage, and also presents a temperature-aware optimization policy for solving this problem. Simulation results show average interconnect energy savings of 16% with no performance penalty for five different benchmarks implemented on a 14nm FinFET technology.

...read moreread less

1 citations

Posted Content•

Towards Low-Latency Energy-Efficient Deep SNNs via Attention-Guided Compression.

[...]

Souvik Kundu, Gourav Datta, Massoud Pedram, Peter A. Beerel¹•Institutions (1)

University of Southern California¹

16 Jul 2021-arXiv: Neural and Evolutionary Computing

TL;DR: In this article, a non-iterative deep spiking neural network (SNN) training technique is proposed to achieve ultra-high compression with reduced spiking activity while maintaining high inference accuracy.

...read moreread less

Abstract: Deep spiking neural networks (SNNs) have emerged as a potential alternative to traditional deep learning frameworks, due to their promise to provide increased compute efficiency on event-driven neuromorphic hardware. However, to perform well on complex vision applications, most SNN training frameworks yield large inference latency which translates to increased spike activity and reduced energy efficiency. Hence,minimizing average spike activity while preserving accuracy indeep SNNs remains a significant challenge and opportunity.This paper presents a non-iterative SNN training technique thatachieves ultra-high compression with reduced spiking activitywhile maintaining high inference accuracy. In particular, our framework first uses the attention-maps of an un compressed meta-model to yield compressed ANNs. This step can be tuned to support both irregular and structured channel pruning to leverage computational benefits over a broad range of platforms. The framework then performs sparse-learning-based supervised SNN training using direct inputs. During the training, it jointly optimizes the SNN weight, threshold, and leak parameters to drastically minimize the number of time steps required while retaining compression. To evaluate the merits of our approach, we performed experiments with variants of VGG and ResNet, on both CIFAR-10 and CIFAR-100, and VGG16 on Tiny-ImageNet.The SNN models generated through the proposed technique yield SOTA compression ratios of up to 33.4x with no significant drops in accuracy compared to baseline unpruned counterparts. Compared to existing SNN pruning methods, we achieve up to 8.3x higher compression with improved accuracy.

...read moreread less

1 citations

DOI•

LCHC-DFT: A Low-Cost High-Coverage Design-for-Testability Technique to Detect Hard-to-Detect Faults in STT-MRAMs in the Presence of Process Variations

[...]

Shiva Taghipour, Mehdi Kamal, Rahebeh Niaraki Asli, Ali Afzali-Kusha, Massoud Pedram - Show less +1 more

01 Dec 2022-IEEE Transactions on Device and Materials Reliability

TL;DR: In this paper , the authors proposed a low-cost yet high-coverage design-for-testability (DFT) scheme for improving the detection of hard-to-detect (HtD) faults in STT-MRAMs.

...read moreread less

Abstract: This paper proposes a low-cost yet high-coverage design-for-testability (DFT) scheme for improving the detection of hard-to-detect (HtD) faults in STT-MRAMs. It is based on introducing a voltage mismatch in the sense amplifier (SA). This mismatch along with the defects in the cell may bias the SA, leading to an incorrect read output which in turn make the HtD faults detectable. The efficacy of the proposed scheme is evaluated by calculating its detection capability in the presence of resistive defects. Evaluation results obtained for a 20nm STT-MRAM memory design show that the proposed scheme provides HtD fault coverage of 81.8% and 86.5%, on average, for intra-cell and inter-cell defects, respectively, with a negligible area overhead (1.1% for a 2Kbit array). In comparison with conventional March tests, which only detect easy-to-detect (EtD) faults, the proposed DFT technique covers both HtD and EtD faults with a faster approach, namely by using a single read operation. Furthermore, to guarantee that the cells affected by HtD faults trigger an incorrect read output under process variation (PV), a timing strategy based on modifying the control unit of the proposed DFT is suggested. The strategy controls the stress introduced by the DFT unit by altering the timing of DFT control signals. The timing alteration enables the detection of HtD faults in the devices with higher deviations from the nominal one due to PV, avoiding yield loss or test escapes.

...read moreread less

1 citations

Journal Article•DOI•

OPLE: A Heuristic Custom Instruction Selection Algorithm Based on Partitioning and Local Exploration of Application Dataflow Graphs

[...]

Mehdi Kamal¹, Ali Afzali-Kusha¹, Saeed Safari¹, Massoud Pedram²•Institutions (2)

University of Tehran¹, University of Southern California²

09 Sep 2015-ACM Transactions in Embedded Computing Systems

TL;DR: The results reveal higher speedups for the OPLE algorithm, especially for larger identified candidate sets and/or small area budgets compared to those of the nonoptimal solutions, and demonstrate that in many cases OPLE is able to find the optimal solution.

...read moreread less

Abstract: In this article, a heuristic custom instruction (CI) selection algorithm is presented. The proposed algorithm, which is called OPLE for “Optimization based on Partitioning and Local Exploration,” uses a combination of greedy and optimal optimization methods. It searches for the near-optimal solution by reducing the search space based on partitioning the identified CI set. The partitioning of the identified set guarantees the success of the algorithm independent of the size of the identified set. First, the algorithm finds the near-optimal CIs from the candidate CIs for each part. Next, the suggested CIs from different parts are combined to determine the final selected CI set. To improve the set of the selected CIs, the solution is evolved by calling the algorithm iteratively. The efficacy of the algorithm is assessed by comparing its performance to those of optimal and nonoptimal methods. A comparative study is performed for a number of benchmarks under different area budgets and I/O constraints. The results reveal higher speedups for the OPLE algorithm, especially for larger identified candidate sets and/or small area budgets compared to those of the nonoptimal solutions. Compared to the nonoptimal techniques, the proposed algorithm provides 30p higher speedup improvement on average. The maximum improvement is 117p. The results also demonstrate that in many cases OPLE is able to find the optimal solution.

...read moreread less

1 citations

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
…
145
146
147
148
149
150
151
…
152
153
154
155
156
157
158
159
160
161
162
163

Collapse

Cited by

PDF

Open Access

More filters

Journal Article•DOI•

I and i

[...]

Kevin Barraclough

08 Dec 2001-BMJ

TL;DR: There is, I think, something ethereal about i —the square root of minus one, which seems an odd beast at that time—an intruder hovering on the edge of reality.

...read moreread less

Abstract: There is, I think, something ethereal about i —the square root of minus one. I remember first hearing about it at school. It seemed an odd beast at that time—an intruder hovering on the edge of reality. Usually familiarity dulls this sense of the bizarre, but in the case of i it was the reverse: over the years the sense of its surreal nature intensified. It seemed that it was impossible to write mathematics that described the real world in …

...read moreread less

33,785 citations

Journal Article•

다중혈관 관상동맥 환자에서 y-문합을 이용하여 양쪽 내흉동맥만을 사용한 우회술의 조기 성적

[...]

성기익, 이영탁, 박계현, 전태국, 박표원, 한일용, 장윤희 - Show less +3 more

01 Mar 2003-The Korean Journal of Thoracic and Cardiovascular Surgery

28,685 citations

Journal Article•DOI•

Machine learning

[...]

Thomas G. Dietterich¹•Institutions (1)

Oregon State University¹

01 Dec 1996-ACM Computing Surveys

TL;DR: Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis.

...read moreread less

Abstract: Machine Learning is the study of methods for programming computers to learn. Computers are applied to a wide range of tasks, and for most of these it is relatively easy for programmers to design and implement the necessary software. However, there are many tasks for which this is difficult or impossible. These can be divided into four general categories. First, there are problems for which there exist no human experts. For example, in modern automated manufacturing facilities, there is a need to predict machine failures before they occur by analyzing sensor readings. Because the machines are new, there are no human experts who can be interviewed by a programmer to provide the knowledge necessary to build a computer system. A machine learning system can study recorded data and subsequent machine failures and learn prediction rules. Second, there are problems where human experts exist, but where they are unable to explain their expertise. This is the case in many perceptual tasks, such as speech recognition, hand-writing recognition, and natural language understanding. Virtually all humans exhibit expert-level abilities on these tasks, but none of them can describe the detailed steps that they follow as they perform them. Fortunately, humans can provide machines with examples of the inputs and correct outputs for these tasks, so machine learning algorithms can learn to map the inputs to the outputs. Third, there are problems where phenomena are changing rapidly. In finance, for example, people would like to predict the future behavior of the stock market, of consumer purchases, or of exchange rates. These behaviors change frequently, so that even if a programmer could construct a good predictive computer program, it would need to be rewritten frequently. A learning program can relieve the programmer of this burden by constantly modifying and tuning a set of learned prediction rules. Fourth, there are applications that need to be customized for each computer user separately. Consider, for example, a program to filter unwanted electronic mail messages. Different users will need different filters. It is unreasonable to expect each user to program his or her own rules, and it is infeasible to provide every user with a software engineer to keep the rules up-to-date. A machine learning system can learn which mail messages the user rejects and maintain the filtering rules automatically. Machine learning addresses many of the same research questions as the fields of statistics, data mining, and psychology, but with differences of emphasis. Statistics focuses on understanding the phenomena that have generated the data, often with the goal of testing different hypotheses about those phenomena. Data mining seeks to find patterns in the data that are understandable by people. Psychological studies of human learning aspire to understand the mechanisms underlying the various learning behaviors exhibited by people (concept learning, skill acquisition, strategy change, etc.).

...read moreread less

13,246 citations

Pattern Recognition and Machine Learning

[...]

Christopher M. Bishop¹•Institutions (1)

Microsoft¹

01 Jan 2006

TL;DR: Probability distributions of linear models for regression and classification are given in this article, along with a discussion of combining models and combining models in the context of machine learning and classification.

...read moreread less

Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

...read moreread less

10,141 citations

Convex Analysisの二,三の進展について

[...]

徹丸山

01 Feb 1977

5,933 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse