
Showing papers by "Shahin Nazarian published in 2020"


Proceedings ArticleDOI
20 Apr 2020
TL;DR: The proposed VRoC, a tweet-level variational autoencoder-based rumor classification system, consistently outperforms several state-of-the-art techniques, on both observed and unobserved rumors, by up to 26.9%, in terms of macro-F1 scores.
Abstract: Social media has become popular and has percolated into almost all aspects of our daily lives. While online posting proves very convenient for individual users, it also fosters the fast spreading of various rumors. The rapid and wide percolation of rumors can cause persistent and detrimental impacts. Therefore, researchers invest great effort in reducing the negative impacts of rumors. Towards this end, a rumor classification system aims to detect, track, and verify rumors in social media. Such systems typically include four components: (i) a rumor detector, (ii) a rumor tracker, (iii) a stance classifier, and (iv) a veracity classifier. In order to improve the state of the art in rumor detection, tracking, and verification, we propose VRoC, a tweet-level variational autoencoder-based rumor classification system. VRoC consists of a co-train engine that jointly trains variational autoencoders (VAEs) and rumor classification components. The co-train engine helps the VAEs tune their latent representations to be classifier-friendly. We also show that VRoC is able to classify unseen rumors with high levels of accuracy. For the PHEME dataset, VRoC consistently outperforms several state-of-the-art techniques, on both observed and unobserved rumors, by up to 26.9% in terms of macro-F1 scores.

42 citations
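To make the co-train idea concrete, here is a minimal PyTorch sketch of a VAE whose latent code also feeds a classifier head, so reconstruction, KL, and classification losses are optimized jointly. The layer sizes, bag-of-words style input, and loss weights are illustrative assumptions, not VRoC's actual configuration.

```python
# Minimal sketch of the VAE + classifier co-training idea behind VRoC (PyTorch).
# Layer sizes, loss weights, and the bag-of-words input are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAEClassifier(nn.Module):
    def __init__(self, vocab_size=5000, latent_dim=32, num_classes=2):
        super().__init__()
        self.enc = nn.Linear(vocab_size, 256)
        self.mu = nn.Linear(256, latent_dim)
        self.logvar = nn.Linear(256, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                 nn.Linear(256, vocab_size))
        self.clf = nn.Linear(latent_dim, num_classes)  # classifier reads the latent code

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization trick
        return self.dec(z), mu, logvar, self.clf(z)

def co_train_loss(x, y, x_rec, mu, logvar, logits, beta=1.0, gamma=1.0):
    rec = F.mse_loss(x_rec, x)                                      # reconstruction term
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())  # KL divergence term
    ce = F.cross_entropy(logits, y)       # classification term nudges the latent space
    return rec + beta * kld + gamma * ce  # to be classifier-friendly
```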


Journal ArticleDOI
31 Jul 2020
TL;DR: DeepTrust identifies proper multi-layered neural network (NN) topologies that have high projected trust probabilities, even when trained with untrusted data, and shows that an uncertain opinion of data is not always malicious when evaluating an NN's opinion and trustworthiness, whereas the disbelief opinion hurts trust the most.
Abstract: Artificial Intelligence (AI) plays a fundamental role in the modern world, especially when used as an autonomous decision maker. One common concern nowadays is "how trustworthy the AIs are." Human operators follow a strict educational curriculum and performance assessment that could be exploited to quantify how much we entrust them. To quantify the trust of AI decision makers, we must go beyond task accuracy, especially when facing limited, incomplete, misleading, controversial, or noisy datasets. Toward addressing these challenges, we describe DeepTrust, a Subjective Logic (SL) inspired framework that constructs a probabilistic logic description of an AI algorithm and takes into account the trustworthiness of both the dataset and the inner algorithmic workings. DeepTrust identifies proper multi-layered neural network (NN) topologies that have high projected trust probabilities, even when trained with untrusted data. We show that an uncertain opinion of data is not always malicious when evaluating the NN's opinion and trustworthiness, whereas the disbelief opinion hurts trust the most. Also, trust probability does not necessarily correlate with accuracy. DeepTrust also provides a projected trust probability of the NN's prediction, which is useful when the NN generates an over-confident output under problematic datasets. These findings open new analytical avenues for designing and improving the NN topology by optimizing opinion and trustworthiness, along with accuracy, in a multi-objective optimization formulation, subject to space and time constraints.

27 citations
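As background for the Subjective Logic terminology in the abstract, the sketch below shows the standard SL opinion tuple and its projected probability, and why high uncertainty damages projected trust less than high disbelief. How DeepTrust actually propagates opinions through an NN topology is not reproduced here.

```python
# A toy illustration of the Subjective Logic (SL) notions DeepTrust builds on.
# The opinion tuple and projected probability follow standard SL definitions.
from dataclasses import dataclass

@dataclass
class Opinion:
    belief: float        # b
    disbelief: float     # d
    uncertainty: float   # u, with b + d + u = 1
    base_rate: float = 0.5  # a

    def projected_probability(self) -> float:
        # P = b + a*u: the probability expected once uncertainty is resolved
        return self.belief + self.base_rate * self.uncertainty

# Highly uncertain data is not necessarily distrusted,
# while strong disbelief drives projected trust toward zero.
uncertain = Opinion(belief=0.2, disbelief=0.0, uncertainty=0.8)
disbelieved = Opinion(belief=0.1, disbelief=0.8, uncertainty=0.1)
print(uncertain.projected_probability())    # 0.6
print(disbelieved.projected_probability())  # 0.15
```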


Journal ArticleDOI
TL;DR: The proposed H2O-Cloud is highly scalable and considers comprehensive information, such as various workload scenarios, cloud platform configurations, user request information, and dynamic pricing models, to improve resource usage effectiveness while maintaining quality of service (QoS).
Abstract: Cloud computing has attracted both end-users and cloud service providers (CSPs) in recent years. Improving resource utilization rate (RUtR), such as CPU and memory usage on servers, while maintaining quality of service (QoS) is one key challenge faced by CSPs with warehouse-scale datacenters. Prior works proposed various algorithms to reduce energy cost or to improve RUtR, which either lack fine-grained task scheduling capabilities or fail to take a comprehensive system model into consideration. This article presents H2O-Cloud, a Hierarchical and Hybrid Online task scheduling framework for warehouse-scale Cloud service providers, to improve resource usage effectiveness while maintaining QoS. H2O-Cloud is highly scalable and considers comprehensive information, such as various workload scenarios, cloud platform configurations, user request information, and dynamic pricing models. The hierarchy and hybridity of the framework, combined with its deep reinforcement learning (DRL) engines, enable H2O-Cloud to efficiently start on-the-go scheduling and learning in an unpredictable environment without pretraining. Our experiments confirm the high efficiency of the proposed H2O-Cloud when compared to baseline approaches, in terms of energy and cost while maintaining QoS. Compared with a state-of-the-art DRL-based algorithm, H2O-Cloud achieves up to 201.17% energy cost efficiency improvement, 47.88% energy efficiency improvement, and 551.76% reward rate improvement.

16 citations
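The loop below is a minimal tabular Q-learning stand-in for the kind of online, pretraining-free decision making H2O-Cloud's DRL engines perform: assign each arriving task to a server and learn from a reward that trades off energy cost against QoS. The state encoding, action space, and reward shape are simplified assumptions, not the paper's hierarchical design.

```python
# Minimal online scheduling loop: epsilon-greedy tabular Q-learning over server choices.
import random
from collections import defaultdict

NUM_SERVERS = 4
Q = defaultdict(float)          # Q[(state, action)] -> estimated value
alpha, gamma, eps = 0.1, 0.9, 0.1

def choose_server(state):
    if random.random() < eps:                                        # explore
        return random.randrange(NUM_SERVERS)
    return max(range(NUM_SERVERS), key=lambda a: Q[(state, a)])      # exploit

def update(state, action, reward, next_state):
    # reward would combine, e.g., negative energy cost and a QoS penalty
    best_next = max(Q[(next_state, a)] for a in range(NUM_SERVERS))
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
```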


Proceedings ArticleDOI
TL;DR: This paper aims at integrating three powerful techniques namely Deep Learning, Approximate Computing, and Low Power Design into a strategy to optimize logic at the synthesis level to reduce the dynamic power and area of a digital CMOS circuit.
Abstract: This paper aims at integrating three powerful techniques, namely Deep Learning, Approximate Computing, and Low Power Design, into a strategy to optimize logic at the synthesis level. We utilize advances in deep learning to guide an approximate logic synthesis engine to minimize the dynamic power consumption of a given digital CMOS circuit, subject to a predetermined error rate at the primary outputs. Our framework, Deep-PowerX, focuses on replacing or removing gates on a technology-mapped network and uses a Deep Neural Network (DNN) to predict error rates at primary outputs of the circuit when a specific part of the netlist is approximated. The primary goal of Deep-PowerX is to reduce the dynamic power, whereas area reduction serves as a secondary objective. Using the said DNN, Deep-PowerX is able to reduce the exponential time complexity of standard approximate logic synthesis to linear time. Experiments are done on numerous open-source benchmark circuits. Results show significant reductions in power and area, by up to 1.47× and 1.43× compared to exact solutions and by up to 22% and 27% compared to state-of-the-art approximate logic synthesis tools, while having orders of magnitude lower run-time.

6 citations
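The sketch below illustrates the core loop implied by the abstract: candidate gate replacements or removals are accepted greedily as long as a trained DNN predicts that the primary-output error stays within budget, replacing exhaustive error evaluation. The function signature and callables are hypothetical stand-ins, not Deep-PowerX's actual interface.

```python
# Greedy approximation sketch: a DNN error predictor gates which netlist changes are kept.
def approximate_netlist(candidates, predict_error, power_saving, error_budget):
    """Greedily accept gate approximations while the predicted output error stays in budget.

    candidates: iterable of candidate changes (e.g., replace a gate with a constant)
    predict_error: callable, DNN inference on the set of accepted changes (hypothetical)
    power_saving: callable, estimated dynamic-power saving of one change (hypothetical)
    """
    accepted, current_error = [], 0.0
    # Consider the largest power savers first.
    for change in sorted(candidates, key=power_saving, reverse=True):
        trial_error = predict_error(accepted + [change])  # DNN call instead of simulation
        if trial_error <= error_budget:
            accepted.append(change)
            current_error = trial_error
    return accepted, current_error
```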


Book ChapterDOI
06 Oct 2020
TL;DR: SANSCrypt as mentioned in this paper adopts a new temporal dimension to logic encryption, by requiring the user to sporadically perform multiple authentications according to a protocol based on pseudo-random number generation.
Abstract: Sequential logic encryption is a countermeasure against reverse engineering of sequential circuits based on modifying the original finite state machine of the circuit such that the circuit enters a wrong state upon being reset. A user must apply a certain sequence of input patterns, i.e., a key sequence, for the circuit to transition to the correct state. The circuit then remains functional unless it is powered off or reset again. Most sequential encryption methods require the correct key to be applied only once. In this paper, we propose a novel Sporadic-Authentication-Based Sequential Logic Encryption method (SANSCrypt) that circumvents the potential vulnerability associated with a single-authentication mechanism. SANSCrypt adopts a new temporal dimension to logic encryption, by requiring the user to sporadically perform multiple authentications according to a protocol based on pseudo-random number generation. We provide implementation details of SANSCrypt and present a design that is amenable to time-sensitive applications. In SANSCrypt, the authentication task does not significantly disrupt the normal circuit operation, as it can be interrupted or postponed upon request from a high-priority task with minimal impact on the overall performance. Analysis and validation results on a set of benchmark circuits show that SANSCrypt offers a substantial output corruptibility if the key sequences are applied incorrectly. Moreover, it exhibits exponential resilience to existing attacks, including SAT-based attacks, while maintaining a reasonably low overhead.

4 citations
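A behavioral sketch of the sporadic re-authentication protocol follows: a pseudo-random sequence shared between the circuit and the authorized user decides both when the next authentication is due and which key sequence is expected. The seed handling and key derivation here are illustrative assumptions and do not reflect the actual hardware implementation.

```python
# Behavioral sketch of SANSCrypt-style sporadic authentication (not the hardware design).
import random

class SporadicAuth:
    def __init__(self, seed, key_space=2**16, max_gap=1000):
        self.prng = random.Random(seed)   # stands in for the shared on-chip PRNG
        self.key_space = key_space
        self.max_gap = max_gap
        self._schedule_next()

    def _schedule_next(self):
        # Both the re-authentication time and the expected key are pseudo-random.
        self.cycles_until_auth = self.prng.randrange(1, self.max_gap)
        self.expected_key = self.prng.randrange(self.key_space)

    def tick(self, supplied_key=None):
        """Advance one clock cycle; return False when authentication fails."""
        self.cycles_until_auth -= 1
        if self.cycles_until_auth > 0:
            return True                    # normal operation, no authentication needed
        ok = (supplied_key == self.expected_key)
        self._schedule_next()              # next authentication point is again pseudo-random
        return ok                          # a wrong or missing key corrupts subsequent outputs
```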


Proceedings ArticleDOI
01 Oct 2020
TL;DR: WAANSO, a scalable framework that incorporates a Wavelet Clustering based approach to cluster application tasks, is presented, and it is shown that WAANSO can significantly increase the MCS energy and performance efficiencies.
Abstract: System-on-chip (SoC) has migrated from single core to manycore architectures to cope with the increasing complexity of real-life applications. Application task mapping has a significant impact on the efficiency of manycore system (MCS) computation and communication. We present WAANSO, a scalable framework that incorporates a Wavelet Clustering based approach to cluster application tasks. We also introduce Ant Swarm Optimization (ASO) based on iterative execution of Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO) for task clustering and mapping to the MCS processing elements. We have shown that WAANSO can significantly increase the MCS energy and performance efficiencies. Based on our experiments on a 64-core system, WAANSO improves energy efficiency by 19%, compared to baseline approaches, namely DPSO, ACO and branch and bound (B&B). Additionally, the performance improves by 65.86% compared to Density-Based Spatial Clustering of Applications with Noise (DBSCAN) baseline.

3 citations
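The skeleton below captures the Ant Swarm Optimization (ASO) idea of iteratively alternating ACO and PSO passes over a task-to-core mapping and keeping the best mapping found. The step operators and cost model are placeholders rather than the paper's actual formulations.

```python
# Skeleton of the alternating ACO/PSO (Ant Swarm Optimization) search over task mappings.
def ant_swarm_optimize(initial_mapping, cost, aco_step, pso_step, iterations=50):
    """cost, aco_step, pso_step are caller-supplied placeholders for the real operators."""
    best, best_cost = initial_mapping, cost(initial_mapping)
    current = initial_mapping
    for _ in range(iterations):
        for step in (aco_step, pso_step):   # iterative execution of ACO then PSO
            current = step(current)
            c = cost(current)               # e.g. weighted energy + communication latency
            if c < best_cost:
                best, best_cost = current, c
    return best, best_cost
```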


Proceedings ArticleDOI
10 Aug 2020
TL;DR: Deep-PowerX as discussed by the authors uses a Deep Neural Network (DNN) to predict error rates at primary outputs of the circuit when a specific part of the netlist is approximated.
Abstract: This paper aims at integrating three powerful techniques, namely Deep Learning, Approximate Computing, and Low Power Design, into a strategy to optimize logic at the synthesis level. We utilize advances in deep learning to guide an approximate logic synthesis engine to minimize the dynamic power consumption of a given digital CMOS circuit, subject to a predetermined error rate at the primary outputs. Our framework, Deep-PowerX, focuses on replacing or removing gates on a technology-mapped network and uses a Deep Neural Network (DNN) to predict error rates at primary outputs of the circuit when a specific part of the netlist is approximated. The primary goal of Deep-PowerX is to reduce the dynamic power, whereas area reduction serves as a secondary objective. Using the said DNN, Deep-PowerX is able to reduce the exponential time complexity of standard approximate logic synthesis to linear time. Experiments are done on numerous open-source benchmark circuits. Results show significant reductions in power and area, by up to 1.47× and 1.43× compared to exact solutions and by up to 22% and 27% compared to state-of-the-art approximate logic synthesis tools, while having orders of magnitude lower run-time.

3 citations


Proceedings ArticleDOI
12 Oct 2020
TL;DR: In this article, the authors propose a framework for the design and optimization of a secure self-optimizing, self-adapting system-on-chip (S4oC) architecture.
Abstract: We propose a framework for the design and optimization of a secure self-optimizing, self-adapting system-on-chip (S4oC) architecture. The goal is to minimize the impact of attacks such as hardware Trojans and side-channel attacks by making real-time adjustments. S4oC learns to reconfigure itself, subject to various security measures and attacks, some of which are possibly unknown at design time. Furthermore, the data types and patterns of the target applications, environmental conditions, and sources of variations are incorporated. S4oC is a manycore system, modeled as a four-layer graph representing the model of computation (MoCp), model of connection (MoCn), model of memory (MoM), and model of storage (MoS), with a large number of elements including heterogeneous reconfigurable processing elements in the MoCp layer and memory elements in the MoM layer. Security-driven community detection and neural networks are utilized for application task clustering, and distributed reinforcement learning (RL) is used for task mapping.

3 citations
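To illustrate the four-layer graph model (MoCp, MoCn, MoM, MoS), here is a toy data structure with layered elements and cross-layer edges. The attributes and example elements are assumptions for illustration; the framework's actual schema is not described in the abstract.

```python
# Toy data model for a four-layer S4oC-style graph (computation, connection, memory, storage).
from collections import defaultdict

class S4oCModel:
    LAYERS = ("MoCp", "MoCn", "MoM", "MoS")

    def __init__(self):
        self.nodes = {layer: {} for layer in self.LAYERS}
        self.edges = defaultdict(list)   # (layer, node) -> list of connected (layer, node)

    def add_element(self, layer, name, **attrs):
        self.nodes[layer][name] = attrs  # e.g. a reconfigurable PE in MoCp, an SRAM bank in MoM

    def connect(self, a, b):
        self.edges[a].append(b)          # cross-layer edge, e.g. PE -> memory bank
        self.edges[b].append(a)

soc = S4oCModel()
soc.add_element("MoCp", "pe0", kind="reconfigurable")   # illustrative attributes
soc.add_element("MoM", "sram0", size_kb=512)
soc.connect(("MoCp", "pe0"), ("MoM", "sram0"))
```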


Posted Content
TL;DR: This work proposes a framework for the design and optimization of a secure self-optimizing, self-adapting system-on-chip (S4oC) architecture, to minimize the impact of attacks such as hardware Trojans and side-channel attacks by making real-time adjustments.
Abstract: We propose a framework for the design and optimization of a secure self-optimizing, self-adapting system-on-chip (S4oC) architecture. The goal is to minimize the impact of attacks such as hardware Trojans and side-channel attacks by making real-time adjustments. S4oC learns to reconfigure itself, subject to various security measures and attacks, some of which are possibly unknown at design time. Furthermore, the data types and patterns of the target applications, environmental conditions, and sources of variations are incorporated. S4oC is a manycore system, modeled as a four-layer graph representing the model of computation (MoCp), model of connection (MoCn), model of memory (MoM), and model of storage (MoS), with a large number of elements including heterogeneous reconfigurable processing elements in the MoCp layer and memory elements in the MoM layer. Security-driven community detection and neural networks are utilized for application task clustering, and distributed reinforcement learning (RL) is used for task mapping.

3 citations


Proceedings ArticleDOI
12 Feb 2020
TL;DR: In this article, the authors propose a method that replaces the augmentation in the raw input space with an approximate one that acts purely in the embedding space, which drastically reduces the computation, while the accuracy of models is negligibly compromised.
Abstract: Recent advances in the field of artificial intelligence have been made possible by deep neural networks. In applications where data are scarce, transfer learning and data augmentation techniques are commonly used to improve the generalization of deep learning models. However, fine-tuning a transfer model with data augmentation in the raw input space has a high computational cost, since the full network must be run for every augmented input. This is particularly critical when large models are implemented on embedded devices with limited computational and energy resources. In this work, we propose a method that replaces the augmentation in the raw input space with an approximate one that acts purely in the embedding space. Our experimental results show that the proposed method drastically reduces the computation, while the accuracy of models is negligibly compromised.

3 citations
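The sketch below shows the basic trick: run the frozen backbone once per sample, cache the embeddings, and fine-tune only the head on cheap perturbations applied directly in embedding space. The Gaussian noise used here is an illustrative stand-in for the paper's approximate augmentation.

```python
# Embedding-space augmentation sketch (PyTorch): the expensive backbone runs once per sample.
import torch
import torch.nn as nn

def finetune_head(backbone, head, images, labels, epochs=5, noise_std=0.1, lr=1e-3):
    backbone.eval()
    with torch.no_grad():
        emb = backbone(images)             # full network runs only once per sample
    opt = torch.optim.Adam(head.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        aug = emb + noise_std * torch.randn_like(emb)  # augmentation acts purely in embedding space
        loss = loss_fn(head(aug), labels)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return head
```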


Proceedings ArticleDOI
TL;DR: Analysis and validation results show that SANSCrypt offers a substantial output corruptibility if the key sequences are applied incorrectly, and exhibits an exponential resilience to existing attacks, including SAT-based attacks, while maintaining a reasonably low overhead.
Abstract: We propose SANSCrypt, a novel sequential logic encryption scheme to protect integrated circuits against reverse engineering. Previous sequential encryption methods focus on modifying the circuit state machine such that the correct functionality can be accessed by applying the correct key sequence only once. Considering the risk associated with one-time authentication, SANSCrypt adopts a new temporal dimension to logic encryption, by requiring the user to sporadically perform multiple authentications according to a protocol based on pseudo-random number generation. Analysis and validation results on a set of benchmark circuits show that SANSCrypt offers a substantial output corruptibility if the key sequences are applied incorrectly. Moreover, it exhibits an exponential resilience to existing attacks, including SAT-based attacks, while maintaining a reasonably low overhead.

Proceedings ArticleDOI
05 Oct 2020
TL;DR: SANSCrypt as mentioned in this paper adopts a new temporal dimension to logic encryption, by requiring the user to sporadically perform multiple authentications according to a protocol based on pseudorandom number generation.
Abstract: We propose SANSCrypt, a novel sequential logic encryption scheme to protect integrated circuits against reverse engineering. Previous sequential encryption methods focus on modifying the circuit state machine such that the correct functionality can be accessed by applying the correct key sequence only once. Considering the risk associated with one-time authentication, SANSCrypt adopts a new temporal dimension to logic encryption, by requiring the user to sporadically perform multiple authentications according to a protocol based on pseudorandom number generation. Analysis and validation results on a set of benchmark circuits show that SANSCrypt offers a substantial output corruptibility if the key sequences are applied incorrectly. Moreover, it exhibits an exponential resilience to existing attacks, including SAT-based attacks, while maintaining a reasonably low overhead.

Posted Content
TL;DR: A vertex cut framework is designed for partitioning LLVM IR graphs into clusters while taking into consideration the data communication and workload balance among clusters, and a memory-centric run-time mapping algorithm of linear time complexity is proposed to map the resulting clusters onto a multi-core platform.
Abstract: High-level applications, such as machine learning, are evolving from simple models based on multilayer perceptrons for simple image recognition to much deeper and more complex neural networks for self-driving vehicle control systems. The rapid increase in the consumption of memory and computational resources by these models demands the use of multi-core parallel systems to scale the execution of the complex emerging applications that depend on them. However, parallel programs running on high-performance computers often suffer from data communication bottlenecks, limited memory bandwidth, and synchronization overhead due to irregular critical sections. In this paper, we propose a framework to reduce the data communication and improve the scalability and performance of these applications in multi-core systems. We design a vertex cut framework for partitioning LLVM IR graphs into clusters while taking into consideration the data communication and workload balance among clusters. First, we construct LLVM graphs by compiling high-level programs into LLVM IR, instrumenting code to obtain the execution order of basic blocks and the execution time for each memory operation, and analyzing data dependencies in dynamic LLVM traces. Next, we formulate the problem as Weight Balanced $p$-way Vertex Cut, and propose a generic and flexible framework, wherein four different greedy algorithms are proposed for solving this problem. Lastly, we propose a memory-centric run-time mapping algorithm of linear time complexity to map clusters generated from the vertex cut algorithms onto a multi-core platform. We conclude that our best algorithm, WB-Libra, provides performance improvements of 1.56x and 1.86x over existing state-of-the-art approaches for 8 and 1024 clusters running on a multi-core platform, respectively.
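A compact greedy heuristic in the spirit of the weight-balanced p-way vertex cut formulation is sketched below: dependency edges between LLVM IR blocks are assigned to clusters, vertices whose edges span several clusters are replicated, and cluster load is balanced by edge weight. This is a generic vertex-cut greedy, not the WB-Libra algorithm itself.

```python
# Generic greedy p-way vertex cut: assign edges to clusters, replicate shared vertices.
from collections import defaultdict

def greedy_vertex_cut(edges, weights, p):
    load = [0.0] * p
    placed = defaultdict(set)        # vertex -> set of clusters it appears in (replicas)
    assignment = {}
    for (u, v) in edges:
        # Prefer clusters that already hold u or v (fewer replicas); break ties by load.
        candidates = (placed[u] & placed[v]) or (placed[u] | placed[v]) or set(range(p))
        c = min(candidates, key=lambda k: load[k])
        assignment[(u, v)] = c
        placed[u].add(c)
        placed[v].add(c)
        load[c] += weights.get((u, v), 1.0)   # edge weight ~ communication/computation cost
    replication = sum(len(s) for s in placed.values()) / max(len(placed), 1)
    return assignment, load, replication
```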

Posted Content
TL;DR: The Multi-Cycle Input Dependency (MCID) circuit model is presented, a novel design representation that explicitly captures the dependency of a circuit's primary outputs on sequences of internal signals and inputs.
Abstract: Traditional logical equivalence checking (LEC), which plays a major role in the entire chip design process, faces challenges in meeting the requirements demanded by the many emerging technologies that are based on logic models different from standard complementary metal oxide semiconductor (CMOS). In this paper, we propose a LEC framework to be employed in the verification process of beyond-CMOS circuits. Our LEC framework is compatible with existing CMOS technologies, but is also able to check features and capabilities that are unique to beyond-CMOS technologies. For instance, the performance of some emerging technologies benefits from ultra-deep pipelining, and verification of such circuits requires new models and algorithms. We therefore present the Multi-Cycle Input Dependency (MCID) circuit model, a novel design representation that explicitly captures the dependency of a circuit's primary outputs on sequences of internal signals and inputs. Embedding the proposed circuit model and several structural checking modules, the verification process can be independent of the underlying technology and signaling. We benchmark the proposed framework on post-synthesis rapid single-flux-quantum (RSFQ) netlists. Results show verification times for RSFQ benchmark circuits, including a 32-bit Kogge-Stone adder, a 16-bit integer divider, and ISCAS'85 circuits, that are comparable to those of the ABC tool on similar CMOS circuits.
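As a toy rendering of the MCID idea, the snippet below records, for each primary output, the set of inputs and internal signals it depends on together with a relative cycle offset, and compares two such dependency maps. The representation is purely illustrative of the concept, not the framework's data model.

```python
# Toy MCID-style dependency maps: output -> {(signal, cycle_offset)}, compared for equivalence.
def mcid_equivalent(model_a, model_b):
    """Each model maps an output name to a set of (signal, cycle_offset) pairs."""
    if model_a.keys() != model_b.keys():
        return False
    return all(model_a[out] == model_b[out] for out in model_a)

cmos_ref = {"sum": {("a", 0), ("b", 0), ("cin", 0)}}
rsfq_impl = {"sum": {("a", 0), ("b", 0), ("cin", 0)}}   # same dependencies despite deep pipelining
print(mcid_equivalent(cmos_ref, rsfq_impl))             # True
```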

Proceedings ArticleDOI
25 Mar 2020
TL;DR: NN-PARS as discussed by the authors is a neural network-based and parallelized circuit simulation framework with optimized event-driven scheduling of simulation tasks to maximize concurrency, according to the underlying GPU parallel processing capabilities.
Abstract: The shrinking of transistor geometries, as well as the increasing complexity of integrated circuits, significantly aggravates nonlinear design behavior. This demands accurate and fast circuit simulation to meet the design quality and time-to-market constraints. The existing circuit simulators, which utilize lookup tables and/or closed-form expressions, are either slow or inaccurate in analyzing the nonlinear behavior of designs with billions of transistors. To address these shortcomings, we present NN-PARS, a neural network (NN) based and parallelized circuit simulation framework with optimized event-driven scheduling of simulation tasks to maximize concurrency, according to the underlying GPU parallel processing capabilities. NN-PARS replaces the required memory queries in traditional techniques with parallelized NN-based computation tasks. Experimental results show that compared to a state-of-the-art current-based simulation method, NN-PARS reduces the simulation time by over two orders of magnitude in large circuits. NN-PARS also provides high accuracy levels in signal waveform calculations, with less than 2% error compared to HSPICE.
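The event-driven scheduling idea can be sketched as follows: simulation events sit in a time-ordered queue, and all events that become ready at the same timestamp are grouped so that their NN-based evaluations can be issued as one batched, GPU-friendly call. The event fields and the evaluate_batch callable are assumptions for illustration.

```python
# Event-driven scheduling sketch: batch all same-timestamp events for one parallel NN evaluation.
import heapq

def run_event_driven(initial_events, evaluate_batch):
    """initial_events: iterable of (time, gate_id, payload); evaluate_batch returns fan-out events."""
    queue = list(initial_events)
    heapq.heapify(queue)
    while queue:
        t = queue[0][0]
        batch = []
        while queue and queue[0][0] == t:          # gather every event scheduled for time t
            batch.append(heapq.heappop(queue))
        for new_event in evaluate_batch(batch):    # one batched NN inference for the whole group
            heapq.heappush(queue, new_event)       # fan-out events at later timestamps
```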

Posted Content
TL;DR: NN-PARS is presented, a neural network (NN) based and parallelized circuit simulation framework with optimized event-driven scheduling of simulation tasks to maximize concurrency, according to the underlying GPU parallel processing capabilities.
Abstract: The shrinking of transistor geometries, as well as the increasing complexity of integrated circuits, significantly aggravates nonlinear design behavior. This demands accurate and fast circuit simulation to meet the design quality and time-to-market constraints. The existing circuit simulators, which utilize lookup tables and/or closed-form expressions, are either slow or inaccurate in analyzing the nonlinear behavior of designs with billions of transistors. To address these shortcomings, we present NN-PARS, a neural network (NN) based and parallelized circuit simulation framework with optimized event-driven scheduling of simulation tasks to maximize concurrency, according to the underlying GPU parallel processing capabilities. NN-PARS replaces the required memory queries in traditional techniques with parallelized NN-based computation tasks. Experimental results show that compared to a state-of-the-art current-based simulation method, NN-PARS reduces the simulation time by over two orders of magnitude in large circuits. NN-PARS also provides high accuracy levels in signal waveform calculations, with less than $2\%$ error compared to HSPICE.

Posted Content
TL;DR: WAANSO as discussed by the authors is a scalable framework that incorporates a Wavelet clustering based approach to cluster application tasks and introduces Ant Swarm Optimization (ASO) based on iterative execution of ACO and PSO for task clustering and mapping to MCS processing elements.
Abstract: System-on-chip (SoC) has migrated from single core to manycore architectures to cope with the increasing complexity of real-life applications. Application task mapping has a significant impact on the efficiency of manycore system (MCS) computation and communication. We present WAANSO, a scalable framework that incorporates a Wavelet Clustering based approach to cluster application tasks. We also introduce Ant Swarm Optimization (ASO) based on iterative execution of Ant Colony Optimization (ACO) and Particle Swarm Optimization (PSO) for task clustering and mapping to the MCS processing elements. We have shown that WAANSO can significantly increase the MCS energy and performance efficiencies. Based on our experiments on a 64-core system, WAANSO improves energy efficiency by 19%, compared to baseline approaches, namely DPSO, ACO and branch and bound (B&B). Additionally, the performance improves by 65.86% compared to Density-Based Spatial Clustering of Applications with Noise (DBSCAN) baseline.

Posted Content
TL;DR: This work proposes a method that replaces the augmentation in the raw input space with an approximate one that acts purely in the embedding space, and results show that the proposed method drastically reduces the computation, while the accuracy of models is negligibly compromised.
Abstract: Recent advances in the field of artificial intelligence have been made possible by deep neural networks. In applications where data are scarce, transfer learning and data augmentation techniques are commonly used to improve the generalization of deep learning models. However, fine-tuning a transfer model with data augmentation in the raw input space has a high computational cost, since the full network must be run for every augmented input. This is particularly critical when large models are implemented on embedded devices with limited computational and energy resources. In this work, we propose a method that replaces the augmentation in the raw input space with an approximate one that acts purely in the embedding space. Our experimental results show that the proposed method drastically reduces the computation, while the accuracy of models is negligibly compromised.

Posted Content
TL;DR: CSM-NN as discussed by the authors is a scalable simulation framework with optimized neural network structures and processing algorithms, aimed at optimizing the simulation time by accounting for the latency of the required memory query and computation, given the underlying CPU and GPU parallel processing capabilities.
Abstract: The miniaturization of transistors down to 5nm and beyond, plus the increasing complexity of integrated circuits, significantly aggravates short channel effects and demands analysis and optimization of more design corners and modes. Simulators need to model output variables related to circuit timing, power, noise, etc., which exhibit nonlinear behavior. The existing simulation and sign-off tools, based on a combination of closed-form expressions and lookup tables, are either inaccurate or slow when dealing with circuits with billions of transistors. In this work, we present CSM-NN, a scalable simulation framework with optimized neural network structures and processing algorithms. CSM-NN is aimed at optimizing the simulation time by accounting for the latency of the required memory query and computation, given the underlying CPU and GPU parallel processing capabilities. Experimental results show that CSM-NN reduces the simulation time by up to $6\times$ compared to a state-of-the-art current source model based simulator running on a CPU. This speedup improves by up to $15\times$ when running on a GPU. CSM-NN also provides high accuracy levels, with less than $2\%$ error, compared to HSPICE.
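As a minimal stand-in for the per-cell models such a simulator would use, the sketch below defines a small MLP that maps operating conditions (input slew, output load, supply voltage) to timing quantities instead of interpolating lookup tables. The layer sizes and the choice of inputs and outputs are assumptions for illustration, not CSM-NN's actual network structure.

```python
# Tiny per-cell timing model (PyTorch): an MLP in place of lookup-table interpolation.
import torch
import torch.nn as nn

class CellTimingNN(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),   # inputs: input slew, output load, supply voltage
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),              # outputs: propagation delay, output slew
        )

    def forward(self, x):
        return self.net(x)

model = CellTimingNN()
conditions = torch.tensor([[0.05, 1.2, 0.7]])  # illustrative operating point
delay, out_slew = model(conditions)[0]          # untrained here; training data would come from SPICE
```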