Author

Massoud Pedram

Bio: Massoud Pedram is an academic researcher at the University of Southern California. The author has contributed to research on topics including deep learning and logic gates. The author has an h-index of 14 and has co-authored 36 publications receiving 464 citations.

Topics: Deep learning, Logic gate, Electronic circuit, Mobile device, Cloud computing ...

Papers

Journal Article•DOI•

JointDNN: An Efficient Training and Inference Engine for Intelligent Mobile Cloud Computing Services

Amir Erfan Eshratifar¹, Mohammad Saeed Abrishami¹, Massoud Pedram¹•Institutions (1)

University of Southern California¹

01 Feb 2021-IEEE Transactions on Mobile Computing

TL;DR: The authors propose JointDNN, an efficient, adaptive, and practical engine for collaborative computation between a mobile device and the cloud for DNNs in both the inference and training phases.

Abstract: Deep learning models are being deployed in many mobile intelligent applications. End-side services, such as intelligent personal assistants, autonomous cars, and smart home services often employ either simple local models on the mobile or complex remote models on the cloud. However, recent studies have shown that partitioning the DNN computations between the mobile and cloud can increase the latency and energy efficiencies. In this paper, we propose an efficient, adaptive, and practical engine, JointDNN, for collaborative computation between a mobile device and cloud for DNNs in both inference and training phase. JointDNN not only provides an energy and performance efficient method of querying DNNs for the mobile side but also benefits the cloud server by reducing the amount of its workload and communications compared to the cloud-only approach. Given the DNN architecture, we investigate the efficiency of processing some layers on the mobile device and some layers on the cloud server. We provide optimization formulations at layer granularity for forward- and backward-propagations in DNNs, which can adapt to mobile battery limitations and cloud server load constraints and quality of service. JointDNN achieves up to 18 and 32 times reductions on the latency and mobile energy consumption of querying DNNs compared to the status-quo approaches, respectively.
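The idea of running some layers on the mobile device and the rest on the cloud can be illustrated with a minimal sketch. This is not the optimization formulation from the paper: it assumes a single split point in a purely sequential network, considers latency only (no energy, battery, or server-load constraints), and ignores the cost of returning the result. The function name `best_split` and all of its parameters are hypothetical.

```python
from typing import Sequence, Tuple


def best_split(
    mobile_ms: Sequence[float],
    cloud_ms: Sequence[float],
    out_bytes: Sequence[int],
    input_bytes: int,
    bandwidth_bytes_per_ms: float,
) -> Tuple[int, float]:
    """Return (k, latency_ms): layers [0, k) run on the mobile, [k, n) on the cloud.

    Hypothetical inputs: per-layer latencies on each side, the output size of
    each layer, the raw input size, and the uplink bandwidth.
    """
    n = len(mobile_ms)
    if len(cloud_ms) != n or len(out_bytes) != n:
        raise ValueError("per-layer sequences must have the same length")

    best_k, best_latency = 0, float("inf")
    for k in range(n + 1):
        compute = sum(mobile_ms[:k]) + sum(cloud_ms[k:])
        if k == n:
            # Mobile-only: nothing is uploaded.
            transfer = 0.0
        else:
            # Upload the raw input (k == 0) or the output of layer k - 1.
            payload = input_bytes if k == 0 else out_bytes[k - 1]
            transfer = payload / bandwidth_bytes_per_ms
        latency = compute + transfer
        if latency < best_latency:
            best_k, best_latency = k, latency
    return best_k, best_latency


if __name__ == "__main__":
    # Made-up numbers: a small intermediate feature makes the middle split attractive.
    print(best_split([10, 20, 40], [1, 2, 4], [1000, 100, 4000], 5000, 10.0))
```

With these made-up numbers the cloud-only option (k = 0) is dominated by uploading the large input, and the mobile-only option (k = 3) by slow local compute, so a split after a layer with a small output is preferred.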

162 citations

Proceedings Article•DOI•

BottleNet: A Deep Learning Architecture for Intelligent Mobile Cloud Computing Services

Amir Erfan Eshratifar¹, Amirhossein Esmaili¹, Massoud Pedram¹•Institutions (1)

University of Southern California¹

29 Jul 2019

TL;DR: This paper introduces BottleNet, a deep learning architecture that reduces the feature size sent to the cloud, together with a training method that compensates for the potential accuracy loss due to lossy compression of features before transmission. BottleNet achieves on average a 5.1× improvement in end-to-end latency and a 6.9× improvement in mobile energy consumption compared with the cloud-only approach, with no accuracy loss.

Abstract: Recent studies have shown the latency and energy consumption of deep neural networks can be significantly improved by splitting the network between the mobile device and cloud. This paper introduces a new deep learning architecture, called BottleNet, for reducing the feature size needed to be sent to the cloud. Furthermore, we propose a training method for compensating for the potential accuracy loss due to the lossy compression of features before transmitting them to the cloud. BottleNet achieves on average 5.1× improvement in end-to-end latency and 6.9× improvement in mobile energy consumption compared to the cloud-only approach with no accuracy loss.

137 citations

Posted Content•

JointDNN: An Efficient Training and Inference Engine for Intelligent Mobile Cloud Computing Services

Amir Erfan Eshratifar¹, Mohammad Saeed Abrishami¹, Massoud Pedram¹•Institutions (1)

University of Southern California¹

25 Jan 2018-arXiv: Distributed, Parallel, and Cluster Computing

TL;DR: This paper proposes an efficient, adaptive, and practical engine, JointDNN, for collaborative computation between a mobile device and the cloud for DNNs in both the inference and training phases, and achieves up to 18 and 32 times reductions in the latency and mobile energy consumption of querying DNNs, respectively, compared to the status-quo approaches.

91 citations

Journal Article•DOI•

PBMap: A Path Balancing Technology Mapping Algorithm for Single Flux Quantum Logic Circuits

Ghasem Pasandi¹, Massoud Pedram¹•Institutions (1)

University of Southern California¹

01 Jun 2019-IEEE Transactions on Applied Superconductivity

TL;DR: A dynamic programming based algorithm for path balancing technology mapping is presented, which generates optimal solutions for dc-biased SFQ (e.g., rapid SFQ or RSFQ) circuits with tree structure and acts as an effective heuristic for circuits with general directed acyclic graph structure.

Abstract: This paper presents a path balancing technology mapping algorithm, which is a new algorithm for generating a mapping solution for a given Boolean network such that the average logic level difference among fanin gates of each gate in the network is minimized. Path balancing technology mapping is required in dc-biased single flux quantum (SFQ) circuits for guaranteeing the correct operation, and it is beneficial in CMOS circuits to reduce the hazard issues. We present a dynamic programming based algorithm for path balancing technology mapping, which generates optimal solutions for dc-biased SFQ (e.g., rapid SFQ or RSFQ) circuits with tree structure and acts as an effective heuristic for circuits with general directed acyclic graph structure. Experimental results show that our path balancing technology mapper reduces the balancing overhead by up to 2.7× and with an average of 21% compared to the state-of-the-art academic technology mappers.
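To make the notion of balancing overhead concrete, the following minimal sketch counts the buffers a mapped netlist would need so that all fanins of every gate arrive at the same logic level. It is an illustration only, not the PBMap algorithm: it measures the overhead of a given netlist rather than performing technology mapping. It assumes an acyclic netlist, primary inputs at level 0, one buffer per missing level on each edge with no sharing of buffer chains, and no balancing of primary outputs. The names `logic_levels` and `balancing_buffers` are hypothetical.

```python
from typing import Dict, List


def logic_levels(fanins: Dict[str, List[str]]) -> Dict[str, int]:
    """Logic level of every node; nodes without fanins (primary inputs) are level 0.

    Assumes the netlist is acyclic.
    """
    levels: Dict[str, int] = {}

    def level(node: str) -> int:
        if node not in levels:
            ins = fanins.get(node, [])
            levels[node] = 0 if not ins else 1 + max(level(i) for i in ins)
        return levels[node]

    for gate in fanins:
        level(gate)
    return levels


def balancing_buffers(fanins: Dict[str, List[str]]) -> int:
    """Buffers needed so each fanin of a gate sits exactly one level below it."""
    levels = logic_levels(fanins)
    return sum(
        levels[gate] - levels[src] - 1
        for gate, ins in fanins.items()
        for src in ins
    )


if __name__ == "__main__":
    # Hypothetical netlist: g3 reads input "a" directly, two levels too early.
    netlist = {"g1": ["a", "b"], "g2": ["g1", "c"], "g3": ["g2", "a"]}
    print(logic_levels(netlist))
    print(balancing_buffers(netlist))
```

In this hypothetical netlist, the edge from `c` to `g2` skips one level and the edge from `a` to `g3` skips two, so three buffers are counted in total.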

43 citations

Proceedings Article•DOI•

SFQmap: A Technology Mapping Tool for Single Flux Quantum Logic Circuits

Ghasem Pasandi¹, Alireza Shafaei¹, Massoud Pedram¹•Institutions (1)

University of Southern California¹

27 May 2018

TL;DR: A novel technology mapping tool, called SFQmap, is presented, which provides optimization methods for minimizing first the circuit depth and path balancing overhead and then the worst-case stage delay of mapped SFQ circuits.

Abstract: Single flux quantum (SFQ) logic is a promising candidate to replace the CMOS logic for high speed and low power applications due to its superiority in providing high performance and energy efficient circuits. However, developing effective Electronic Design Automation (EDA) tools, which cater to special characteristics and requirements of SFQ circuits such as depth minimization and path balancing, is essential to automate the whole process of designing large SFQ circuits. In this paper, a novel technology mapping tool, called SFQmap, is presented, which provides optimization methods for minimizing first the circuit depth and path balancing overhead and then the worst-case stage delay of mapped SFQ circuits. Compared with the state-of-the-art technology mappers, SFQmap reduces the depth and path balancing overhead by an average of 14% and 31%, respectively.

37 citations

Cited by

Journal Article•DOI•

Edge AI: On-Demand Accelerating Deep Neural Network Inference via Edge Computing

En Li¹, Liekang Zeng¹, Zhi Zhou¹, Xu Chen¹•Institutions (1)

Sun Yat-sen University¹

01 Jan 2020-IEEE Transactions on Wireless Communications

TL;DR: In this article, the authors propose Edgent, a framework that leverages edge computing for DNN collaborative inference through device-edge synergy, adaptively partitioning computation between device and edge for the purpose of coordinating the powerful cloud resource and the proximal edge resource for real-time DNN inference.

Abstract: As a key technology for enabling Artificial Intelligence (AI) applications in the 5G era, Deep Neural Networks (DNNs) have quickly attracted widespread attention. However, it is challenging to run computation-intensive DNN-based tasks on mobile devices due to the limited computation resources. What’s worse, traditional cloud-assisted DNN inference is heavily hindered by the significant wide-area network latency, leading to poor real-time performance as well as low quality of user experience. To address these challenges, in this paper, we propose Edgent, a framework that leverages edge computing for DNN collaborative inference through device-edge synergy. Edgent exploits two design knobs: (1) DNN partitioning that adaptively partitions computation between device and edge for the purpose of coordinating the powerful cloud resource and the proximal edge resource for real-time DNN inference; (2) DNN right-sizing that further reduces computing latency via early exiting inference at an appropriate intermediate DNN layer. In addition, considering the potential network fluctuation in real-world deployment, Edgent is properly designed to specialize for both static and dynamic network environments. Specifically, in a static environment where the bandwidth changes slowly, Edgent derives the best configurations with the assistance of regression-based prediction models, while in a dynamic environment where the bandwidth varies dramatically, Edgent generates the best execution plan through an online change point detection algorithm that maps the current bandwidth state to the optimal configuration. We implement an Edgent prototype based on the Raspberry Pi and the desktop PC, and the extensive experimental evaluations demonstrate Edgent’s effectiveness in enabling on-demand low-latency edge intelligence.

329 citations

Journal Article•DOI•

Transformative effects of IoT, Blockchain and Artificial Intelligence on cloud computing: Evolution, vision, trends and open challenges

Sukhpal Singh Gill¹, Shreshth Tuli², Minxian Xu³, Inderpreet Singh⁴, Karan Singh⁵,⁶, Dominic Lindsay⁷, Shikhar Tuli², Daria Smirnova⁷, Manmeet Singh²,⁸, Udit Jain², Haris Pervaiz⁷, Bhanu Sehgal⁹, Sukhwinder Singh Kaila, Sanjay Misra¹⁰,¹¹, Mohammad Sadegh Aslanpour¹², Harshit Mehta¹³, Vlado Stankovski¹⁴, Peter Garraghan⁷•Institutions (14)

Queen Mary University of London¹, Indian Institutes of Technology², Chinese Academy of Sciences³, Simon Fraser University⁴, University of Waterloo⁵, Amazon.com⁶, Lancaster University⁷, Indian Institute of Tropical Meteorology⁸, Accenture⁹, Atılım University¹⁰, Covenant University¹¹, Islamic Azad University¹², University of Texas at Austin¹³, University of Ljubljana¹⁴

01 Dec 2019

TL;DR: This article proposes a conceptual model for cloud futurology to explore the influence of emerging paradigms and technologies on the evolution of cloud computing, focusing on three paradigms: Blockchain, IoT, and Artificial Intelligence.

Abstract: Cloud computing plays a critical role in modern society and enables a range of applications from infrastructure to social media. Such system must cope with varying load and evolving usage reflecting societies’ interaction and dependency on automated computing systems whilst satisfying Quality of Service (QoS) guarantees. Enabling these systems are a cohort of conceptual technologies, synthesized to meet demand of evolving computing applications. In order to understand current and future challenges of such system, there is a need to identify key technologies enabling future applications. In this study, we aim to explore how three emerging paradigms (Blockchain, IoT and Artificial Intelligence) will influence future cloud computing systems. Further, we identify several technologies driving these paradigms and invite international experts to discuss the current status and future directions of cloud computing. Finally, we proposed a conceptual model for cloud futurology to explore the influence of emerging paradigms and technologies on evolution of cloud computing.

247 citations

On the decomposition of switching functions

Sze-Tsen Hu

01 Jun 1961

TL;DR: In this article, Ashenhurst's chart method is generalized to nondisjunctive decompositions by means of the don't care conditions, providing an effective method of constructing all decompositions of switching functions; such decompositions lead to designs of more economical switching circuits to realize a given switching function.

Abstract: A given switching function of n variables can frequently be decomposed into a composite function of several essentially simpler switching functions. Such decompositions lead to designs of more economical switching circuits to realize the given switching function. Ashenhurst's chart method is generalized to nondisjunctive decompositions by means of the don't care conditions. This extension provides an effective method of constructing all decompositions of switching functions.

227 citations

Journal Article•DOI•

Recent thermal management techniques for microprocessors

Joonho Kong¹, Sung Woo Chung¹, Kevin Skadron²•Institutions (2)

Korea University¹, University of Virginia²

14 Jun 2012-ACM Computing Surveys

TL;DR: The overall objective of this survey is to give microprocessor designers a broad perspective on various aspects of designing thermal-aware microprocessors and to guide future thermal management studies.

Abstract: Microprocessor design has recently encountered many constraints such as power, energy, reliability, and temperature. Among these challenging issues, temperature-related issues have become especially important within the past several years. We summarize recent thermal management techniques for microprocessors, focusing on those that affect or rely on the microarchitecture. We categorize thermal management techniques into six main categories: temperature monitoring, microarchitectural techniques, floorplanning, OS/compiler techniques, liquid cooling techniques, and thermal reliability/security. Temperature monitoring, a requirement for Dynamic Thermal Management (DTM), includes temperature estimation and sensor placement techniques for accurate temperature measurement or estimation. Microarchitectural techniques include both static and dynamic thermal management techniques that control hardware structures. Floorplanning covers a range of thermal-aware floorplanning techniques for 2D and 3D microprocessors. OS/compiler techniques include thermal-aware task scheduling and instruction scheduling techniques. Liquid cooling techniques are higher-capacity alternatives to conventional air cooling techniques. Thermal reliability/security issues cover temperature-dependent reliability modeling, Dynamic Reliability Management (DRM), and malicious codes that specifically cause overheating. Temperature-related issues will only become more challenging as process technology continues to evolve and transistor densities scale up faster than power per transistor scales down. The overall objective of this survey is to give microprocessor designers a broad perspective on various aspects of designing thermal-aware microprocessors and to guide future thermal management studies.

201 citations

Proceedings Article•DOI•

Approximation algorithm for the temperature-aware scheduling problem

Sushu Zhang¹, Karam S. Chatha¹•Institutions (1)

Arizona State University¹

05 Nov 2007

TL;DR: It is proved that the problem of performance optimization for a set of periodic tasks with discrete voltage/frequency states under thermal constraints is NP-hard, and a pseudo-polynomial optimal algorithm and a fully polynomial time approximation technique (FPTAS) are presented.

Abstract: The paper addresses the problem of performance optimization for a set of periodic tasks with discrete voltage/frequency states under thermal constraints. We prove that the problem is NP-hard, and present a pseudo-polynomial optimal algorithm and a fully polynomial time approximation technique (FPTAS) for the problem. The FPTAS technique is able to generate solutions in polynomial time that are guaranteed to be within a designer specified quality bound (QB) (say within 1% of the optimal). We evaluate our techniques by experimentation with multimedia and synthetic benchmarks mapped on the 70 nm CMOS technology processor. The experimental results demonstrate our techniques are able to match optimal solutions when QB is set at 5%, can generate solutions that are quite close to optimal ( 25%) for large task sets with 120 nodes (while the optimal solution takes several hundred seconds). We also analyze the effect of different thermal parameters, such as the initial temperature, the final temperature and the thermal resistance.

181 citations
