
Showing papers on "Computation published in 2021"


Book
21 Aug 2021
TL;DR: Covers finite-volume methods for incompressible flows, computation of turbulent flows, acceleration of computations, and solution of algebraic systems of equations.
Abstract: Modeling of Continuum Mechanical Problems.- Discretization of Problem Domain.- Finite-Volume Methods.- Finite-Element Methods.- Time Discretization.- Solution of Algebraic Systems of Equations.- Properties of Numerical Methods.- Finite-Element Methods in Structural Mechanics.- Finite-Volume Methods for Incompressible Flows.- Computation of Turbulent Flows.- Acceleration of Computations.

135 citations


Journal ArticleDOI
15 Mar 2021
TL;DR: This work implements new randomized protocols that find very high quality contraction paths for arbitrary and large tensor networks, and introduces a hyper-optimization approach, where both the method applied and its algorithmic parameters are tuned during the path finding.
Abstract: Tensor networks represent the state-of-the-art in computational methods across many disciplines, including the classical simulation of quantum many-body systems and quantum circuits. Several applications of current interest give rise to tensor networks with irregular geometries. Finding the best possible contraction path for such networks is a central problem, with an exponential effect on computation time and memory footprint. In this work, we implement new randomized protocols that find very high quality contraction paths for arbitrary and large tensor networks. We test our methods on a variety of benchmarks, including the random quantum circuit instances recently implemented on Google quantum chips. We find that the paths obtained can be very close to optimal, and often many orders of magnitude better than the most established approaches. As different underlying geometries suit different methods, we also introduce a hyper-optimization approach, where both the method applied and its algorithmic parameters are tuned during the path finding. The increase in quality of contraction schemes found has significant practical implications for the simulation of quantum many-body systems and particularly for the benchmarking of new quantum chips. Concretely, we estimate a speed-up of over 10,000$\times$ compared to the original expectation for the classical simulation of the Sycamore `supremacy' circuits.
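
The practical role of such path optimizers can be illustrated with the general-purpose opt_einsum library (a stand-in here, not the authors' own tooling): its 'random-greedy' preset performs the same kind of randomized path search described above, and its path report shows the estimated cost a chosen path implies.

```python
# Illustrative sketch (not the authors' code): comparing contraction paths found
# by a deterministic greedy search vs. a randomized greedy search, using the
# opt_einsum library on a small ring of rank-3 tensors.
import numpy as np
import opt_einsum as oe

eq = "abc,cde,efg,ghi,ija->bdfhj"                 # ring of 5 tensors sharing bond indices
arrays = [np.random.rand(4, 4, 4) for _ in range(5)]

for optimizer in ("greedy", "random-greedy"):
    path, info = oe.contract_path(eq, *arrays, optimize=optimizer)
    print(optimizer)
    print(info)   # summary includes the estimated FLOP count and largest intermediate
```

For larger, irregular networks the randomized search usually reports a noticeably lower FLOP estimate than a single greedy pass, which is the effect the hyper-optimization approach above exploits.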

101 citations


Journal ArticleDOI
TL;DR: A novel GPU-accelerated placement framework DREAMPlace is proposed, by casting the analytical placement problem equivalently to training a neural network, to achieve speedup in global placement without quality degradation compared to the state-of-the-art multithreaded placer RePlAce.
Abstract: Placement for very large-scale integrated (VLSI) circuits is one of the most important steps for design closure. We propose a novel GPU-accelerated placement framework, DREAMPlace, by casting the analytical placement problem equivalently to training a neural network. Implemented on top of a widely adopted deep learning toolkit, PyTorch, with customized key kernels for wirelength and density computations, DREAMPlace can achieve around $40\times$ speedup in global placement without quality degradation compared to the state-of-the-art multithreaded placer RePlAce. We believe this work shall open up new directions for revisiting classical EDA problems with advancements in AI hardware and software.
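
The core idea, placement as gradient-based training, can be sketched in a few lines of PyTorch (a hedged toy, not the DREAMPlace kernels): cell coordinates are the trainable parameters, a log-sum-exp wirelength stands in for HPWL, and a simple out-of-die penalty plays the role of the density term.

```python
# Minimal sketch of "placement as neural-network training" (hedged; not the
# actual DREAMPlace kernels). Cell coordinates are trainable parameters, a
# smooth log-sum-exp wirelength approximates HPWL, and a soft penalty keeps
# cells inside the die; Adam performs the "training".
import torch

torch.manual_seed(0)
n_cells, n_nets, die = 200, 150, 100.0
# Each (hypothetical) net connects a random subset of cells, given as index lists.
nets = [torch.randint(0, n_cells, (torch.randint(2, 6, (1,)).item(),)) for _ in range(n_nets)]

pos = torch.nn.Parameter(torch.rand(n_cells, 2) * die)   # (x, y) per cell
opt = torch.optim.Adam([pos], lr=1.0)
gamma = 4.0                                              # smoothing temperature

def lse_wirelength(p):
    wl = 0.0
    for net in nets:
        q = p[net]                                       # pins of this net
        # log-sum-exp approximation of (max - min) per axis
        wl = wl + (gamma * (torch.logsumexp(q / gamma, 0)
                            + torch.logsumexp(-q / gamma, 0))).sum()
    return wl

for step in range(200):
    opt.zero_grad()
    density = ((pos - die / 2).abs() - die / 2).clamp(min=0).pow(2).sum()  # out-of-die penalty
    loss = lse_wirelength(pos) + 10.0 * density
    loss.backward()
    opt.step()

print("final smoothed wirelength:", lse_wirelength(pos).item())
```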

87 citations


Journal Article
TL;DR: Predictive coding converges asymptotically (and in practice, rapidly) to exact backprop gradients on arbitrary computation graphs using only local learning rules, raising the potential that standard machine learning algorithms could in principle be directly implemented in neural circuitry.
Abstract: The backpropagation of error (backprop) is a powerful algorithm for training machine learning architectures through end-to-end differentiation. Recently it has been shown that backprop in multilayer-perceptrons (MLPs) can be approximated using predictive coding, a biologically-plausible process theory of cortical computation which relies solely on local and Hebbian updates. The power of backprop, however, lies not in its instantiation in MLPs, but rather in the concept of automatic differentiation which allows for the optimisation of any differentiable program expressed as a computation graph. Here, we demonstrate that predictive coding converges asymptotically (and in practice rapidly) to exact backprop gradients on arbitrary computation graphs using only local learning rules. We apply this result to develop a straightforward strategy to translate core machine learning architectures into their predictive coding equivalents. We construct predictive coding CNNs, RNNs, and the more complex LSTMs, which include a non-layer-like branching internal graph structure and multiplicative interactions. Our models perform equivalently to backprop on challenging machine learning benchmarks, while utilising only local and (mostly) Hebbian plasticity. Our method raises the potential that standard machine learning algorithms could in principle be directly implemented in neural circuitry, and may also contribute to the development of completely distributed neuromorphic architectures.

81 citations


Journal ArticleDOI
TL;DR: For a given DNN-based application, DNNOff first rewrites the source code to implement a special program structure supporting on-demand offloading and, at runtime, automatically determines the offloading scheme.
Abstract: Deep neural networks (DNNs) have become increasingly popular in industrial IoT scenarios. Due to high demands on computational capability, it is hard for DNN-based applications to directly run on intelligent end devices with limited resources. Computation offloading technology offers a feasible solution by offloading some computation-intensive tasks to the cloud or edges. Supporting such capability is not easy due to two aspects: (1) Adaptability: offloading should occur dynamically among computation nodes. (2) Effectiveness: it needs to be determined which parts are worth offloading. This paper proposes a novel approach, called DNNOff. For a given DNN-based application, DNNOff first rewrites the source code to implement a special program structure supporting on-demand offloading, and at runtime, automatically determines the offloading scheme. We evaluated DNNOff on a real-world intelligent application, with three DNN models. Our results show that, compared with other approaches, DNNOff reduces response time by 12.4%-66.6% on average.
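
The offloading-scheme decision itself reduces to choosing a partition point in the layer graph. The sketch below (hypothetical profiling numbers, not DNNOff's actual estimator) picks the split that minimizes device time plus transfer time plus server time.

```python
# Hedged sketch of an offloading decision for a layered DNN: choose the layer
# after which execution moves to the edge/cloud. Numbers below are hypothetical
# profiling results, not from the DNNOff paper.
local_ms  = [12, 30, 45, 45, 20, 5]          # per-layer latency on the device
remote_ms = [1.5, 4, 6, 6, 2.5, 0.7]         # per-layer latency on the edge server
act_mb    = [3.0, 1.5, 0.8, 0.4, 0.1, 0.01]  # activation size after each layer (MB)
bandwidth_mbps = 40.0                        # uplink bandwidth

def total_latency(split):
    """Layers [0, split) run on the device, layers [split, n) run remotely."""
    n = len(local_ms)
    if split == n:                  # fully local: nothing to transfer
        upload_mb = 0.0
    elif split == 0:                # fully remote: ship the raw input (assumed 4 MB)
        upload_mb = 4.0
    else:                           # ship the activation at the split point
        upload_mb = act_mb[split - 1]
    upload_ms = upload_mb * 8 / bandwidth_mbps * 1000
    return sum(local_ms[:split]) + upload_ms + sum(remote_ms[split:])

best = min(range(len(local_ms) + 1), key=total_latency)
print("offload after layer", best, "->", round(total_latency(best), 1), "ms")
```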

73 citations


Journal ArticleDOI
TL;DR: A novel framework, named LyDROO, is proposed that combines the advantages of Lyapunov optimization and deep reinforcement learning (DRL), and guarantees to satisfy all the long-term constraints by solving the per-frame MINLP subproblems that are much smaller in size.
Abstract: Opportunistic computation offloading is an effective method to improve the computation performance of mobile-edge computing (MEC) networks under dynamic edge environment. In this paper, we consider a multi-user MEC network with time-varying wireless channels and stochastic user task data arrivals in sequential time frames. In particular, we aim to design an online computation offloading algorithm to maximize the network data processing capability subject to the long-term data queue stability and average power constraints. The online algorithm is practical in the sense that the decisions for each time frame are made without the assumption of knowing the future realizations of random channel conditions and data arrivals. We formulate the problem as a multi-stage stochastic mixed integer non-linear programming (MINLP) problem that jointly determines the binary offloading (each user computes the task either locally or at the edge server) and system resource allocation decisions in sequential time frames. To address the coupling in the decisions of different time frames, we propose a novel framework, named LyDROO, that combines the advantages of Lyapunov optimization and deep reinforcement learning (DRL). Specifically, LyDROO first applies Lyapunov optimization to decouple the multi-stage stochastic MINLP into deterministic per-frame MINLP subproblems. By doing so, it guarantees to satisfy all the long-term constraints by solving the per-frame subproblems that are much smaller in size. Then, LyDROO integrates model-based optimization and model-free DRL to solve the per-frame MINLP problems with very low computational complexity. Simulation results show that under various network setups, the proposed LyDROO achieves optimal computation performance while stabilizing all queues in the system. Besides, it induces very low computation time that is particularly suitable for real-time implementation in fast fading environments.
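
The Lyapunov decoupling step can be sketched as follows (a hedged toy with synthetic rates and a brute-force per-frame solver; LyDROO's DRL actor and resource-allocation subroutine are omitted): data queues turn the long-term stability constraint into per-frame weights (V + Q_i) on each user's rate.

```python
# Hedged sketch of the Lyapunov decoupling in LyDROO-style offloading: data
# queues convert long-term stability constraints into per-frame weights, and
# each frame a (here: brute-force) subproblem picks local vs. edge execution
# per user. Rates and arrivals are synthetic; the DRL component is omitted.
import itertools, random

random.seed(0)
N, T, V = 3, 200, 20.0                 # users, frames, throughput-vs-queue weight
Q = [0.0] * N                          # data queues (one per user)

for t in range(T):
    arrival = [random.uniform(0, 2) for _ in range(N)]          # Mb arriving this frame
    r_local = [random.uniform(0.5, 1.5) for _ in range(N)]      # Mb processed if local
    r_edge  = [random.uniform(1.0, 3.0) for _ in range(N)]      # Mb processed if offloaded

    # Per-frame subproblem: choose a binary offloading vector maximizing
    # sum_i (V + Q_i) * rate_i, with at most 2 users offloading (toy constraint).
    best, best_x = -1.0, None
    for x in itertools.product([0, 1], repeat=N):
        if sum(x) > 2:
            continue
        val = sum((V + Q[i]) * (r_edge[i] if x[i] else r_local[i]) for i in range(N))
        if val > best:
            best, best_x = val, x

    for i in range(N):                 # queue update: Q <- max(Q + arrivals - served, 0)
        served = r_edge[i] if best_x[i] else r_local[i]
        Q[i] = max(Q[i] + arrival[i] - served, 0.0)

print("final queue backlogs:", [round(q, 2) for q in Q])
```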

68 citations


Journal ArticleDOI
01 Jan 2021
TL;DR: In this article, a Burgers-like equation with complex solutions is defined in Hilbert space and solved with an example in order to smooth the sonic processing of a simple turbulent flow.
Abstract: Emerging as a new field, quantum computation has reinvented the fundamentals of Computer Science and knowledge theory in a manner consistent with quantum physics. The fact that quantum computation has superior features and new phenomena compared with classical computation provides benefits in proving mathematical theories. With advances in technology, nonlinear partial differential equations are used in almost every area, and many difficulties have been overcome by the solutions of these equations. In particular, the complex solutions of the KdV and Burgers equations have been shown to be useful in modeling a simple turbulent flow. In this study, a Burgers-like equation with complex solutions is defined in Hilbert space and solved with an example. In addition, these solutions are analyzed. Thanks to the quantum Burgers-like equation, the nonlinear differential equation is solved by linearization. The change of pattern over time makes the result linear. This means that the quantum Burgers-like equation can be used to smooth sonic processing.

63 citations


Journal ArticleDOI
TL;DR: This work provides a fast algorithm for computing Jacobians for heterogeneous agents, a technique to substantially reduce dimensionality, a rapid procedure for likelihood-based estimation, a determinacy condition for the sequence space, and a method to solve nonlinear perfect-foresight transitions.
Abstract: We propose a general and highly efficient method for solving and estimating general equilibrium heterogeneous-agent models with aggregate shocks in discrete time. Our approach relies on the rapid computation of sequence-space Jacobians—the derivatives of perfect-foresight equilibrium mappings between aggregate sequences around the steady state. Our main contribution is a fast algorithm for calculating Jacobians for a large class of heterogeneous-agent problems. We combine this algorithm with a systematic approach to composing and inverting Jacobians to solve for general equilibrium impulse responses. We obtain a rapid procedure for likelihood-based estimation and computation of nonlinear perfect-foresight transitions. We apply our methods to three canonical heterogeneous-agent models: a neoclassical model, a New Keynesian model with one asset, and a New Keynesian model with two assets.
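
Numerically, the "compose and invert Jacobians" step is a single linear solve in the sequence space. The toy below uses made-up Jacobian blocks H_U and H_Z of an equilibrium condition H(U, Z) = 0 to compute a general-equilibrium impulse response dU = -H_U^{-1} H_Z dZ; the matrices are illustrative, not derived from an actual model.

```python
# Hedged numerical illustration of the sequence-space idea: once the Jacobians
# of the equilibrium condition H(U, Z) = 0 with respect to the endogenous path
# U and the shock path Z are known (here: toy matrices), the general-equilibrium
# impulse response is one linear solve: dU = -H_U^{-1} H_Z dZ.
import numpy as np

T = 50                                      # truncation horizon (time periods)
# Toy Jacobians: H_U near-identity with some intertemporal spillover,
# H_Z a simple lagged pass-through. These stand in for model-derived blocks.
H_U = np.eye(T) + 0.2 * np.eye(T, k=1) + 0.1 * np.eye(T, k=-1)
H_Z = 0.5 * np.eye(T) + 0.25 * np.eye(T, k=-1)

dZ = np.zeros(T)
dZ[0] = 1.0                                 # one-time shock at date 0
dU = -np.linalg.solve(H_U, H_Z @ dZ)        # GE impulse response of U to the shock

print("impulse response (first 5 periods):", np.round(dU[:5], 4))
```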

62 citations


Journal ArticleDOI
TL;DR: A quantum solver of contracted eigenvalue equations is introduced, the quantum analog of classical methods for the energies and reduced density matrices of ground and excited states and achieves an exponential speed-up over its classical counterpart.
Abstract: The accurate computation of ground and excited states of many-fermion quantum systems is one of the most consequential, contemporary challenges in the physical and computational sciences whose solution stands to benefit significantly from the advent of quantum computing devices. Existing methodologies using phase estimation or variational algorithms have potential drawbacks such as deep circuits requiring substantial error correction or nontrivial high-dimensional classical optimization. Here, we introduce a quantum solver of contracted eigenvalue equations, the quantum analog of classical methods for the energies and reduced density matrices of ground and excited states. The solver does not require deep circuits or difficult classical optimization and achieves an exponential speed-up over its classical counterpart. We demonstrate the algorithm through computations on both a quantum simulator and two IBM quantum processing units.

60 citations


Journal ArticleDOI
TL;DR: This work presents an optimization based method that can accurately compute the phase factors using standard double precision arithmetic operations and demonstrates the performance of this approach with applications to Hamiltonian simulation, eigenvalue filtering, and the quantum linear system problems.
Abstract: Quantum signal processing (QSP) is a powerful quantum algorithm to exactly implement matrix polynomials on quantum computers. Asymptotic analysis of quantum algorithms based on QSP has shown that asymptotically optimal results can in principle be obtained for a range of tasks, such as Hamiltonian simulation and the quantum linear system problem. A further benefit of QSP is that it uses a minimal number of ancilla qubits, which facilitates its implementation on near-to-intermediate term quantum architectures. However, there is so far no classically stable algorithm allowing computation of the phase factors that are needed to build QSP circuits. Existing methods require the use of variable precision arithmetic and can only be applied to polynomials of a relatively low degree. We present here an optimization-based method that can accurately compute the phase factors using standard double precision arithmetic operations. We demonstrate the performance of this approach with applications to Hamiltonian simulation, eigenvalue filtering, and quantum linear system problems. Our numerical results show that the optimization algorithm can find phase factors to accurately approximate polynomials of degree larger than $10\,000$ with errors below $10^{-12}$.
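
At tiny degree, the optimization idea can be sketched directly (hedged: the QSP convention below, the Chebyshev-node grid, and plain L-BFGS are illustrative choices; the paper's method uses a more careful formulation to reach degrees beyond 10,000).

```python
# Hedged, tiny-degree sketch of finding QSP phase factors by optimization.
# Convention assumed: W(x) = [[x, i*sqrt(1-x^2)], [i*sqrt(1-x^2), x]] and
# U(x, phi) = e^{i phi_0 Z} prod_k W(x) e^{i phi_k Z}; we match Re U_00 to f(x).
import numpy as np
from scipy.optimize import minimize

def qsp_value(x, phis):
    w = np.array([[x, 1j * np.sqrt(1 - x**2)],
                  [1j * np.sqrt(1 - x**2), x]])
    u = np.diag([np.exp(1j * phis[0]), np.exp(-1j * phis[0])])
    for p in phis[1:]:
        u = u @ w @ np.diag([np.exp(1j * p), np.exp(-1j * p)])
    return u[0, 0].real

degree = 3
target = lambda x: x**3                        # odd target polynomial with |f| <= 1
xs = np.cos((2 * np.arange(1, 20) - 1) * np.pi / 40)   # Chebyshev-node fitting grid

def residual(phis):
    return sum((qsp_value(x, phis) - target(x))**2 for x in xs)

phi0 = np.full(degree + 1, np.pi / 4)          # simple symmetric initial guess
res = minimize(residual, phi0, method="L-BFGS-B")
print("least-squares residual:", res.fun)
```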

59 citations


Proceedings ArticleDOI
Ang Li, Jingwei Sun, Xiao Zeng, Mi Zhang, Hai Li, Yiran Chen
15 Nov 2021
TL;DR: FedMask as discussed by the authors is a communication and computation efficient federated learning framework for on-device deep learning applications, where each device learns a sparse binary mask (i.e., 1 bit per network parameter) while keeping the parameters of each local model unchanged.
Abstract: Recent advancements in deep neural networks (DNN) enabled various mobile deep learning applications. However, it is technically challenging to locally train a DNN model due to limited data on devices like mobile phones. Federated learning (FL) is a distributed machine learning paradigm which allows for model training on decentralized data residing on devices without breaching data privacy. Hence, FL becomes a natural choice for deploying on-device deep learning applications. However, the data residing across devices is intrinsically statistically heterogeneous (i.e., non-IID data distribution) and mobile devices usually have limited communication bandwidth to transfer local updates. Such statistical heterogeneity and communication bandwidth limit are two major bottlenecks that hinder applying FL in practice. In addition, considering mobile devices usually have limited computational resources, improving computation efficiency of training and running DNNs is critical to developing on-device deep learning applications. In this paper, we present FedMask - a communication and computation efficient FL framework. By applying FedMask, each device can learn a personalized and structured sparse DNN, which can run efficiently on devices. To achieve this, each device learns a sparse binary mask (i.e., 1 bit per network parameter) while keeping the parameters of each local model unchanged; only these binary masks will be communicated between the server and the devices. Instead of learning a shared global model in classic FL, each device obtains a personalized and structured sparse model that is composed by applying the learned binary mask to the fixed parameters of the local model. Our experiments show that compared with status quo approaches, FedMask improves the inference accuracy by 28.47% and reduces the communication cost and the computation cost by 34.48X and 2.44X. FedMask also achieves 1.56X inference speedup and reduces the energy consumption by 1.78X.
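
The core trick, frozen local weights plus a learned 1-bit mask per parameter, can be sketched in PyTorch as below (hedged: a single masked layer with a straight-through estimator; FedMask's federated aggregation of masks is not shown).

```python
# Hedged sketch of FedMask's core mechanism: keep the local weights frozen and
# learn a 1-bit mask per parameter via real-valued scores with a
# straight-through estimator. Federated aggregation of masks is omitted.
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    def __init__(self, in_f, out_f):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_f, in_f) * 0.1,
                                   requires_grad=False)             # frozen weights
        self.score = nn.Parameter(0.01 * torch.randn(out_f, in_f))  # learnable mask scores

    def forward(self, x):
        soft = torch.sigmoid(self.score)
        hard = (soft > 0.5).float()
        mask = hard + soft - soft.detach()   # straight-through: hard forward, soft backward
        return torch.nn.functional.linear(x, self.weight * mask)

layer = MaskedLinear(16, 4)
opt = torch.optim.SGD([layer.score], lr=0.5)     # only the mask scores are trained
x, y = torch.randn(32, 16), torch.randint(0, 4, (32,))
for _ in range(50):
    opt.zero_grad()
    loss = nn.functional.cross_entropy(layer(x), y)
    loss.backward()
    opt.step()
print("kept fraction of weights:", (torch.sigmoid(layer.score) > 0.5).float().mean().item())
```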

Journal ArticleDOI
TL;DR: In this article, the authors proposed power control algorithms for parallel computation and for successive computation in the expanded compute-and-forward (ECF) framework, to exploit the performance gain and thereby improve the system performance.
Abstract: Cell-free massive multiple-input multiple-output (MIMO) employs a large number of distributed access points (APs) to serve a small number of user equipments (UEs) via the same time/frequency resource. Due to the strong macro diversity gain, cell-free massive MIMO can considerably improve the achievable sum-rate compared to conventional cellular massive MIMO. However, the performance of cell-free massive MIMO is upper limited by inter-user interference (IUI) when employing simple maximum ratio combining (MRC) at receivers. To harness IUI, the expanded compute-and-forward (ECF) framework is adopted. In particular, we propose power control algorithms for parallel computation and for successive computation in the ECF framework, to exploit the performance gain and thereby improve the system performance. Furthermore, we propose an AP selection scheme and the application of different decoding orders for the successive computation. Finally, numerical results demonstrate that ECF frameworks outperform the conventional CF and MRC frameworks in terms of achievable sum-rate.

Proceedings Article
03 May 2021
TL;DR: GShard as mentioned in this paper is a module composed of a set of lightweight annotation APIs and an extension to the XLA compiler to enable large scale models with up to trillions of parameters, which is critical for improving the model quality in many real-world machine learning applications with vast amounts of training data and compute.
Abstract: Neural network scaling has been critical for improving the model quality in many real-world machine learning applications with vast amounts of training data and compute. Although this trend of scaling is affirmed to be a sure-fire approach for better model quality, there are challenges on the path such as the computation cost, ease of programming, and efficient implementation on parallel devices. In this paper we demonstrate conditional computation as a remedy to the above-mentioned impediments, and demonstrate its efficacy and utility. We make extensive use of GShard, a module composed of a set of lightweight annotation APIs and an extension to the XLA compiler to enable large scale models with up to trillions of parameters. GShard and conditional computation enable us to scale up a multilingual neural machine translation Transformer model with Sparsely-Gated Mixture-of-Experts. We demonstrate that such a giant model with 600 billion parameters can efficiently be trained on 2048 TPU v3 cores in 4 days to achieve far superior quality for translation from 100 languages to English compared to the prior art.
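
Conditional computation with a Sparsely-Gated Mixture-of-Experts layer can be sketched as follows (hedged, single-device toy: GShard's sharding annotations, expert capacity limits, and load-balancing loss are omitted). Each token is routed to only two of the experts, so compute grows far more slowly than parameter count.

```python
# Hedged sketch of conditional computation with a top-2 Mixture-of-Experts
# layer (single device). Each token is processed by only 2 of the experts.
import torch
import torch.nn as nn

class Top2MoE(nn.Module):
    def __init__(self, d_model=64, d_ff=128, n_experts=8):
        super().__init__()
        self.gate = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts))

    def forward(self, x):                                 # x: (tokens, d_model)
        logits = self.gate(x)
        weights, idx = logits.softmax(-1).topk(2, dim=-1) # top-2 experts per token
        weights = weights / weights.sum(-1, keepdim=True) # renormalize the pair
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            for slot in range(2):
                sel = idx[:, slot] == e                   # tokens routed to expert e
                if sel.any():
                    out[sel] += weights[sel, slot:slot + 1] * expert(x[sel])
        return out

tokens = torch.randn(10, 64)
print(Top2MoE()(tokens).shape)      # torch.Size([10, 64])
```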

Journal ArticleDOI
TL;DR: This paper designs online computation offloading mechanisms to minimize the time average expected task execution delay under the constraint of average energy consumption, and combines the multi-armed bandit framework for an online learning based MEC server selection algorithm.
Abstract: By offloading tasks from the mobile device (MD) to its nearby deployed access points (APs), each of which is connected to one server for task processing, computation offloading can strike a balance between MD’s task execution delay and energy consumption in mobile edge computing (MEC) systems. Considering communication and computation dynamics in MEC systems, we aim to design online computation offloading mechanisms in this paper to minimize the time average expected task execution delay under the constraint of average energy consumption. Firstly, with known current channel gains between the MD and APs as well as available computing capability at MEC servers, we leverage the Lyapunov optimization framework to make an optimal one-slot decision on MD’s transmit power allocation and MEC server selection. On this basis, we then consider a more realistic scenario, where it is difficult to capture current available computing capability at MEC servers, and combine the multi-armed bandit framework for an online learning based MEC server selection algorithm. Finally, through theoretical analyses and extensive simulations, we demonstrate the near-optimality and feasibility of our proposed algorithms, and present that our proposed algorithms fully explore the interplay between communication and computation with enriched user experience and reduced energy consumption.
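
The bandit side of the server-selection problem can be sketched with UCB1 over servers whose service quality is initially unknown (hedged toy with made-up delays; the paper's Lyapunov-based power allocation is omitted).

```python
# Hedged sketch of online MEC server selection as a multi-armed bandit (UCB1).
# "Reward" is a negated, normalized task delay; true server speeds are hidden
# from the learner.
import math, random

random.seed(1)
true_delay = [0.8, 0.5, 1.2, 0.65]          # mean delay of each (hypothetical) server
counts, means = [0] * 4, [0.0] * 4

def pull(server):                            # observe a noisy delay, return reward in [0, 1]
    d = max(0.05, random.gauss(true_delay[server], 0.1))
    return max(0.0, 1.0 - d / 2.0)

for t in range(1, 2001):
    if 0 in counts:                          # play each server once first
        s = counts.index(0)
    else:                                    # UCB1 index: mean + exploration bonus
        s = max(range(4), key=lambda i: means[i] + math.sqrt(2 * math.log(t) / counts[i]))
    r = pull(s)
    counts[s] += 1
    means[s] += (r - means[s]) / counts[s]

print("pulls per server:", counts)           # should concentrate on the fastest server (index 1)
```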

Proceedings Article
Ziang Cao, Changhong Fu, Junjie Ye, Bowen Li, Yiming Li
31 Jul 2021
TL;DR: Cao et al. as mentioned in this paper proposed an efficient and effective hierarchical feature transformer (HiFT) for aerial tracking, where hierarchical similarity maps generated by multi-level convolutional layers are fed into the feature transformer to achieve the interactive fusion of spatial (shallow layers) and semantic cues (deep layers).
Abstract: Most existing Siamese-based tracking methods execute the classification and regression of the target object based on the similarity maps. However, they either employ a single map from the last convolutional layer which degrades the localization accuracy in complex scenarios or separately use multiple maps for decision making, introducing intractable computations for aerial mobile platforms. Thus, in this work, we propose an efficient and effective hierarchical feature transformer (HiFT) for aerial tracking. Hierarchical similarity maps generated by multi-level convolutional layers are fed into the feature transformer to achieve the interactive fusion of spatial (shallow layers) and semantics cues (deep layers). Consequently, not only the global contextual information can be raised, facilitating the target search, but also our end-to-end architecture with the transformer can efficiently learn the interdependencies among multi-level features, thereby discovering a tracking-tailored feature space with strong discriminability. Comprehensive evaluations on four aerial benchmarks have proven the effectiveness of HiFT. Real-world tests on the aerial platform have strongly validated its practicability with a real-time speed. Our code is available at this https URL.

Posted Content
TL;DR: In this article, the authors presented a method, classical entanglement forging, that harnesses classical resources to capture quantum correlations and double the size of the system that can be simulated on quantum hardware.
Abstract: Quantum computers are promising for simulations of chemical and physical systems, but the limited capabilities of today's quantum processors permit only small, and often approximate, simulations. Here we present a method, classical entanglement forging, that harnesses classical resources to capture quantum correlations and double the size of the system that can be simulated on quantum hardware. Shifting some of the computation to classical post-processing allows us to represent ten spin-orbitals on five qubits of an IBM Quantum processor to compute the ground state energy of the water molecule in the most accurate simulation to date. We discuss conditions for applicability of classical entanglement forging and present a roadmap for scaling to larger problems.


Proceedings ArticleDOI
TL;DR: MixFaceNets as discussed by the authors is a set of extremely efficient and high throughput models for accurate face verification, which are inspired by Mixed Depthwise Convolutional Kernels (MDCK).
Abstract: In this paper, we present a set of extremely efficient and high throughput models for accurate face verification, MixFaceNets, which are inspired by Mixed Depthwise Convolutional Kernels. Extensive experimental evaluations on the Labeled Faces in the Wild (LFW), AgeDB, MegaFace, and IARPA Janus Benchmarks IJB-B and IJB-C datasets have shown the effectiveness of our MixFaceNets for applications requiring extremely low computational complexity. Under the same level of computational complexity (≤ 500M FLOPs), our MixFaceNets outperform MobileFaceNets on all the evaluated datasets, achieving 99.60% accuracy on LFW, 97.05% accuracy on AgeDB-30, 93.60 TAR (at FAR1e-6) on MegaFace, 90.94 TAR (at FAR1e-4) on IJB-B and 93.08 TAR (at FAR1e-4) on IJB-C. With computational complexity between 500M and 1G FLOPs, our MixFaceNets achieved results comparable to the top-ranked models, while using significantly fewer FLOPs and less computational overhead, which proves the practical value of our proposed MixFaceNets. All training codes, pre-trained models, and training logs have been made available at https://github.com/fdbtrs/mixfacenets.
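
The building block named in the abstract, a mixed depthwise convolution, splits the channels into groups and applies depthwise kernels of different sizes to each group. A hedged PyTorch sketch (not the full MixFaceNets architecture):

```python
# Hedged sketch of a mixed depthwise convolution: channels are split into
# groups, each group gets a depthwise conv with a different kernel size, and
# the results are concatenated.
import torch
import torch.nn as nn

class MixDepthwiseConv(nn.Module):
    def __init__(self, channels, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.splits = [channels // len(kernel_sizes)] * len(kernel_sizes)
        self.splits[0] += channels - sum(self.splits)        # absorb the remainder
        self.convs = nn.ModuleList(
            nn.Conv2d(c, c, k, padding=k // 2, groups=c)     # depthwise: groups = channels
            for c, k in zip(self.splits, kernel_sizes))

    def forward(self, x):
        chunks = torch.split(x, self.splits, dim=1)
        return torch.cat([conv(c) for conv, c in zip(self.convs, chunks)], dim=1)

x = torch.randn(2, 64, 56, 56)
print(MixDepthwiseConv(64)(x).shape)         # torch.Size([2, 64, 56, 56])
```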

Journal ArticleDOI
TL;DR: In this article, the effect of buoyancy parameters along with radiation on magneto-hydrodynamic (MHD) micro-polar nano-fluid flow over a stretching/shrinking sheet is taken into consideration.

Journal ArticleDOI
TL;DR: A quantum primitive called fast inversion is introduced, which can be used as a preconditioner for solving quantum linear systems, along with two efficient approaches for computing matrix functions, based on the contour-integral formulation and the inverse transform, respectively.
Abstract: Preconditioning is the most widely used and effective way for treating ill-conditioned linear systems in the context of classical iterative linear system solvers. We introduce a quantum primitive called fast inversion, which can be used as a preconditioner for solving quantum linear systems. The key idea of fast inversion is to directly block encode a matrix inverse through a quantum circuit implementing the inversion of eigenvalues via classical arithmetics. We demonstrate the application of preconditioned linear system solvers for computing single-particle Green's functions of quantum many-body systems, which are widely used in quantum physics, chemistry, and materials science. We analyze the complexities in three scenarios: the Hubbard model, the quantum many-body Hamiltonian in the plane-wave-dual basis, and the Schwinger model. We also provide a method for performing Green's function calculation in second quantization within a fixed-particle manifold and note that this approach may be valuable for simulation more broadly. Aside from solving linear systems, fast inversion also allows us to develop fast algorithms for computing matrix functions, such as the efficient preparation of Gibbs states. We introduce two efficient approaches for such a task, based on the contour-integral formulation and the inverse transform, respectively.
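
A purely classical toy conveys why the fast-inversion preconditioner helps (hedged: no block encodings here): when A = M + C with M trivially invertible and C modest, M^{-1}A is far better conditioned than A, so a solver built on it converges much faster.

```python
# Hedged classical illustration of the preconditioning idea behind fast
# inversion: if A = M + C with M easy to invert (here: diagonal) and C small,
# then M^{-1} A is much better conditioned than A. The quantum block-encoding
# machinery of the paper is not represented.
import numpy as np

rng = np.random.default_rng(0)
n = 200
M = np.diag(np.geomspace(1.0, 1e4, n))       # ill-conditioned but trivially invertible
C = rng.standard_normal((n, n)) * 0.01       # small off-diagonal correction
A = M + C

print("cond(A)      =", f"{np.linalg.cond(A):.2e}")
print("cond(M^-1 A) =", f"{np.linalg.cond(np.linalg.solve(M, A)):.2e}")
```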

Posted ContentDOI
TL;DR: Li et al. as mentioned in this paper proposed a lightweight self-attentive network (LSAN) for sequential recommendation, where each item embedding is composed by merging a group of selected base embedding vectors derived from substantially smaller embedding matrices.
Abstract: Modern deep neural networks (DNNs) have greatly facilitated the development of sequential recommender systems by achieving state-of-the-art recommendation performance on various sequential recommendation tasks. Given a sequence of interacted items, existing DNN-based sequential recommenders commonly embed each item into a unique vector to support subsequent computations of the user interest. However, due to the potentially large number of items, the over-parameterised item embedding matrix of a sequential recommender has become a memory bottleneck for efficient deployment in resource-constrained environments, e.g., smartphones and other edge devices. Furthermore, we observe that the widely-used multi-head self-attention, though being effective in modelling sequential dependencies among items, heavily relies on redundant attention units to fully capture both global and local item-item transition patterns within a sequence. In this paper, we introduce a novel lightweight self-attentive network (LSAN) for sequential recommendation. To aggressively compress the original embedding matrix, LSAN leverages the notion of compositional embeddings, where each item embedding is composed by merging a group of selected base embedding vectors derived from substantially smaller embedding matrices. Meanwhile, to account for the intrinsic dynamics of each item, we further propose a temporal context-aware embedding composition scheme. Besides, we develop an innovative twin-attention network that alleviates the redundancy of the traditional multi-head self-attention while retaining full capacity for capturing long- and short-term (i.e., global and local) item dependencies. Comprehensive experiments demonstrate that LSAN significantly advances the accuracy and memory efficiency of existing sequential recommenders.
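
The compositional-embedding idea can be sketched with the quotient-remainder trick (hedged: this is one common composition scheme; LSAN's temporal context-aware composition and twin-attention network are not shown). Each item id indexes two small base tables whose vectors are merged, shrinking the table from n_items rows to roughly 2*sqrt(n_items).

```python
# Hedged sketch of compositional item embeddings: each item id indexes two
# small base tables (quotient and remainder of the id) and the two base
# vectors are merged element-wise.
import math
import torch
import torch.nn as nn

class CompositionalEmbedding(nn.Module):
    def __init__(self, n_items, dim):
        super().__init__()
        self.base = math.ceil(math.sqrt(n_items))
        self.quotient = nn.Embedding(self.base, dim)
        self.remainder = nn.Embedding(self.base, dim)

    def forward(self, item_ids):                       # (batch, seq_len) of item ids
        q = self.quotient(item_ids // self.base)
        r = self.remainder(item_ids % self.base)
        return q * r                                   # element-wise merge of base vectors

emb = CompositionalEmbedding(n_items=1_000_000, dim=64)
ids = torch.randint(0, 1_000_000, (8, 20))
print(emb(ids).shape)                                  # torch.Size([8, 20, 64])
# Parameter count: 2 * 1000 * 64 instead of 1_000_000 * 64.
```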

Journal ArticleDOI
TL;DR: A reduction method based on direct normal form computation for large finite element (FE) models is detailed, avoiding the computation of the complete eigenfunctions spectrum and making a direct link with the parametrisation of invariant manifolds.
Abstract: Dimensionality reduction in mechanical vibratory systems poses challenges for distributed structures including geometric nonlinearities, mainly because of the lack of invariance of the linear subspaces. A reduction method based on direct normal form computation for large finite element (FE) models is here detailed. The main advantage resides in operating directly from the physical space, hence avoiding the computation of the complete eigenfunctions spectrum. Explicit solutions are given, thus enabling a fully non-intrusive version of the reduction method. The reduced dynamics is obtained from the normal form of the geometrically nonlinear mechanical problem, free of non-resonant monomials, and truncated to the selected master coordinates, thus making a direct link with the parametrisation of invariant manifolds. The method is fully expressed with a complex-valued formalism by detailing the homological equations in a systematic manner, and the link with real-valued expressions is established. A special emphasis is put on the treatment of second-order internal resonances and the specific case of a 1:2 resonance is made explicit. Finally, applications to large-scale models of micro-electro-mechanical structures featuring 1:2 and 1:3 resonances are reported, along with considerations on computational efficiency.

Journal ArticleDOI
TL;DR: Both qualitative and quantitative evaluations were conducted to verify that the proposed I2I translation method can achieve better performance in terms of image quality, diversity and semantic similarity to the input and reference images compared to state-of-the-art works.
Abstract: Image-to-Image (I2I) translation is a hot topic in academia, and it has also been applied in real-world industry for tasks like image synthesis, super-resolution, and colorization. However, traditional I2I translation methods train on data from two or more domains together. This requires substantial computational resources. Moreover, the results are of lower quality, and they contain many more artifacts. The training process could be unstable when the data in different domains are not balanced, and mode collapse is more likely to happen. We propose a new I2I translation method that generates a new model in the target domain via a series of model transformations on a pretrained StyleGAN2 model in the source domain. After that, we propose an inversion method to achieve the conversion between an image and its latent vector. By feeding the latent vector into the generated model, we can perform I2I translation between the source domain and target domain. Both qualitative and quantitative evaluations were conducted to prove that the proposed method can achieve outstanding performance in terms of image quality, diversity and semantic similarity to the input and reference images compared to state-of-the-art works.

Journal ArticleDOI
TL;DR: In this article, a finite element model based on a new hyperbolic shear deformation theory was established to investigate the static bending, free vibration, and buckling of functionally graded sandwich plates with porosity.

Journal ArticleDOI
TL;DR: In this paper, the authors explore quantum phase estimation in its adaptive version, which exploits dynamic circuits, and compare the results to a nonadaptive implementation of the same algorithm, and demonstrate that the version of real-time quantum computing with dynamic circuits can yield results comparable to an approach involving classical asynchronous postprocessing.
Abstract: To date, quantum computation on real, physical devices has largely been limited to simple, time-ordered sequences of unitary operations followed by a final projective measurement. As hardware platforms for quantum computing continue to mature in size and capability, it is imperative to enable quantum circuits beyond their conventional construction. Here we break into the realm of dynamic quantum circuits on a superconducting-based quantum system. Dynamic quantum circuits not only involve the evolution of the quantum state throughout the computation but also periodic measurements of qubits midcircuit and concurrent processing of the resulting classical information on timescales shorter than the execution times of the circuits. Using noisy quantum hardware, we explore one of the most fundamental quantum algorithms, quantum phase estimation, in its adaptive version, which exploits dynamic circuits, and compare the results to a nonadaptive implementation of the same algorithm. We demonstrate that the version of real-time quantum computing with dynamic circuits can yield results comparable to an approach involving classical asynchronous postprocessing, thus opening the door to a new realm of available algorithms on real quantum systems.
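
The adaptive algorithm in question, iterative phase estimation with a single ancilla, measures one bit per round and feeds a classically computed phase correction into the next round, which is exactly what mid-circuit measurement and classical feedforward enable. A noiseless classical simulation (hedged sketch, no hardware or device SDK code):

```python
# Hedged classical simulation of adaptive (iterative) phase estimation: one
# ancilla, m rounds, each round measuring one bit of the eigenphase and feeding
# a classical phase correction forward. Noiseless, so each bit is deterministic.
import math, random

phi = 0.359375            # true eigenphase = 0.010111 in binary (m = 6 bits)
m = 6
bits = [0] * (m + 1)      # bits[k] will hold phi_k (1-indexed)

for k in range(m, 0, -1):                       # least-significant bit first
    # feedback angle cancels the already-measured lower bits
    feedback = -2 * math.pi * sum(bits[j] * 2 ** (k - 1 - j) for j in range(k + 1, m + 1))
    angle = 2 * math.pi * (2 ** (k - 1)) * phi + feedback
    p1 = math.sin(angle / 2) ** 2               # P(measure 1) after the final Hadamard
    bits[k] = 1 if random.random() < p1 else 0

estimate = sum(bits[k] * 2 ** (-k) for k in range(1, m + 1))
print("true phase:", phi, " estimate:", estimate)
```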

Book ChapterDOI
01 Jan 2021
TL;DR: The previous literature on quantum machine learning is reviewed and its current status is summarized, postulating that quantum computers may overtake classical computers on machine learning tasks.
Abstract: Quantum machine learning is at the intersection of two of the most sought-after research areas: quantum computing and classical machine learning. Quantum machine learning investigates how results from the quantum world can be used to solve problems from machine learning. The amount of data needed to reliably train a classical computation model is ever-growing and is reaching the limits of what normal computing devices can handle. In such a scenario, quantum computation can aid in continuing training with huge data. Quantum machine learning looks to devise learning algorithms faster than their classical counterparts. Classical machine learning is about trying to find patterns in data and using those patterns to predict further events. Quantum systems, on the other hand, produce atypical patterns which are not producible by classical systems, thereby postulating that quantum computers may overtake classical computers on machine learning tasks. Here, we review the previous literature on quantum machine learning and provide its current status.

Journal ArticleDOI
TL;DR: This work proposes a lightweight graph reordering methodology, incorporated with a GCN accelerator architecture that equips a customized cache design to fully utilize the graph-level data reuse, and proposes a mapping methodology aware of data reuse and task-level parallelism to handle various graphs inputs effectively.
Abstract: The graph convolutional network (GCN) emerges as a promising direction to learn the inductive representation in graph data commonly used in widespread applications, such as E-commerce, social networks, and knowledge graphs. However, learning from graphs is non-trivial because of its mixed computation model involving both graph analytics and neural network computing. To this end, we decompose the GCN learning into two hierarchical paradigms: graph-level and node-level computing. Such a hierarchical paradigm facilitates the software and hardware accelerations for GCN learning. We propose a lightweight graph reordering methodology, incorporated with a GCN accelerator architecture that equips a customized cache design to fully utilize the graph-level data reuse. We also propose a mapping methodology aware of data reuse and task-level parallelism to handle various graph inputs effectively. Results show that the Rubik accelerator design improves energy efficiency by 26.3x to 1375.2x compared with GPU platforms across different datasets and GCN models.

Proceedings Article
18 May 2021
TL;DR: In this paper, the complexity of computing the SHAP explanation is shown to be #P-hard for logistic regression models over fully-factorized data distributions, and even for naive Bayes distributions.
Abstract: SHAP explanations are a popular feature-attribution mechanism for explainable AI. They use game-theoretic notions to measure the influence of individual features on the prediction of a machine learning model. Despite a lot of recent interest from both academia and industry, it is not known whether SHAP explanations of common machine learning models can be computed efficiently. In this paper, we establish the complexity of computing the SHAP explanation in three important settings. First, we consider fully-factorized data distributions, and show that the complexity of computing the SHAP explanation is the same as the complexity of computing the expected value of the model. This fully-factorized setting is often used to simplify the SHAP computation, yet our results show that the computation can be intractable for commonly used models such as logistic regression. Going beyond fully-factorized distributions, we show that computing SHAP explanations is already intractable for a very simple setting: computing SHAP explanations of trivial classifiers over naive Bayes distributions. Finally, we show that even computing SHAP over the empirical distribution is #P-hard.
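
For intuition, exact SHAP values for a tiny logistic-regression model over three independent Bernoulli features can be computed by brute force, enumerating coalitions and their completions; this exponential enumeration illustrates the cost that, per the paper's #P-hardness results, cannot in general be avoided for such models. The weights and marginals below are made up.

```python
# Hedged brute-force SHAP computation for a tiny logistic-regression model over
# three independent (fully-factorized) Bernoulli features: the value of a
# coalition S is E[f(x_S, X_notS)], computed by exact enumeration.
import itertools, math

w, b = [1.5, -2.0, 0.8], 0.3                  # hypothetical logistic-regression model
p = [0.6, 0.3, 0.5]                           # marginals of the independent features
x = [1, 0, 1]                                 # instance being explained
n = 3

def f(z):
    return 1.0 / (1.0 + math.exp(-(sum(wi * zi for wi, zi in zip(w, z)) + b)))

def value(S):
    """E[f(X) | X_S = x_S] under the fully-factorized distribution."""
    free = [i for i in range(n) if i not in S]
    total = 0.0
    for combo in itertools.product([0, 1], repeat=len(free)):
        z, prob = list(x), 1.0
        for i, v in zip(free, combo):
            z[i] = v
            prob *= p[i] if v == 1 else 1 - p[i]
        total += prob * f(z)
    return total

shap = []
for i in range(n):
    others = [j for j in range(n) if j != i]
    total = 0.0
    for size in range(n):
        for S in itertools.combinations(others, size):
            weight = math.factorial(size) * math.factorial(n - size - 1) / math.factorial(n)
            total += weight * (value(set(S) | {i}) - value(set(S)))
    shap.append(total)

print("SHAP values:", [round(s, 4) for s in shap])
# Efficiency check: the values should sum to f(x) - E[f(X)].
print("sum + E[f] =", round(sum(shap) + value(set()), 4), " f(x) =", round(f(x), 4))
```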

Journal ArticleDOI
TL;DR: An event-triggered heuristic dynamic programming (HDP) (λ)-based optimal control strategy is proposed that takes a long-term prediction parameter λ into account in an iterative manner, accelerating the learning rate and reducing the computation complexity.
Abstract: The heuristic dynamic programming (HDP) (λ)-based optimal control strategy, which takes a long-term prediction parameter λ into account in an iterative manner, noticeably accelerates the learning rate. The computation complexity caused by the state-associated extra variable in the λ-return value computation of the traditional value-gradient learning method can be reduced. However, as the iteration number increases, calculation costs grow dramatically, which poses a huge challenge for the optimal control process with limited bandwidth and computational units. In this article, we propose an event-triggered HDP (ETHDP) (λ) optimal control strategy for nonlinear discrete-time (NDT) systems with unknown dynamics. The iterative relation for the λ-return of the final target value is derived first. The event-triggered condition ensuring system stability is designed to reduce the computation and communication requirements. Next, we build a model-actor-critic neural network (NN) structure, in which the model NN evaluates the system state to obtain the λ-return of the current-time target value, which is used to obtain the real-time update errors of the critic NN. The event-triggered optimal control signal and the one-step-return value are approximated by the actor and critic NNs, respectively. Then, the event-trigger-based uniformly ultimately bounded (UUB) stability of the system state and NN weight errors is demonstrated by applying the Lyapunov technique. Finally, we illustrate the effectiveness of our proposed ETHDP (λ) strategy with two cases.

Journal ArticleDOI
TL;DR: In this article, materials requirements for such integrated systems are considered, with a focus on problems that hinder current progress towards practical quantum computation, and suggestions are given for how materials scientists and trapped-ion technologists can work together to develop materials-based integration and noise-mitigation strategies to enable the next generation of trapped-ion quantum computers.
Abstract: Trapped-ion quantum information processors store information in atomic ions maintained in position in free space by electric fields. Quantum logic is enacted through manipulation of the ions’ internal and shared motional quantum states using optical and microwave signals. Although trapped ions show great promise for quantum-enhanced computation, sensing and communication, materials research is needed to design traps that allow for improved performance by means of integration of system components, including optics and electronics for ion-qubit control, while minimizing the near-ubiquitous electric-field noise produced by trap-electrode surfaces. In this Review, we consider the materials requirements for such integrated systems, with a focus on problems that hinder current progress towards practical quantum computation. We give suggestions for how materials scientists and trapped-ion technologists can work together to develop materials-based integration and noise-mitigation strategies to enable the next generation of trapped-ion quantum computers. Trapped-ion qubits have great potential for quantum computation, but materials improvements are needed. This Review surveys materials opportunities to improve the performance of trapped-ion qubits, from understanding the surface science that leads to electric-field noise to developing methods for building ion traps with integrated optics and electronics.