Showing papers on "Gate count published in 2022"



Journal ArticleDOI
TL;DR: In this paper, a generalized Toffoli gate is realized using higher-dimensional qudits to attain a logarithmic depth decomposition without ancilla qudit, and the circuit for Grover's algorithm has then been designed for any $d$-ary quantum system, where $d \ge 2$.
Abstract: The progress in building quantum computers to execute quantum algorithms has recently been remarkable. Grover's search algorithm in a binary quantum system provides a considerable speed-up over the classical paradigm. It can be extended to a $d$-ary (qudit) quantum system also for utilizing the advantage of larger state space, which helps to reduce the runtime of the algorithm as compared to the traditional binary quantum systems. In a qudit quantum system, an $n$-qudit Toffoli gate plays a significant role in the accurate implementation of Grover's algorithm. In this article, a generalized $n$-qudit Toffoli gate is realized using higher-dimensional qudits to attain a logarithmic depth decomposition without ancilla qudit. The circuit for Grover's algorithm has then been designed for any $d$-ary quantum system, where $d \ge 2$, with the proposed $n$-qudit Toffoli gate to obtain optimized depth compared to earlier approaches. The technique for decomposing an $n$-qudit Toffoli gate requires access to two immediately higher-energy levels, making the design susceptible to errors. Nevertheless, we show that the percentage decrease in the probability of error is significant with both gate count and circuit depth reduced as compared to that in state-of-the-art works.

7 citations
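As a rough illustration of the larger state space mentioned in the abstract above, the sketch below counts how many $d$-level qudits are needed to index a search space of a given size, next to the textbook Grover iteration estimate of about $(\pi/4)\sqrt{N}$ for a single marked item. The function names and the example size of 4096 are assumptions for illustration only; nothing here reproduces the paper's Toffoli decomposition or its depth analysis.

```python
import math

def qudits_needed(num_items, d):
    """Smallest n with d**n >= num_items (integer arithmetic, no float logs)."""
    n, capacity = 0, 1
    while capacity < num_items:
        capacity *= d
        n += 1
    return n

def grover_iterations(num_states):
    """Textbook estimate for one marked item: floor(pi/4 * sqrt(N))."""
    return math.floor(math.pi / 4 * math.sqrt(num_states))

# Hypothetical search space of 4096 items, indexed with qubits, qutrits, ququarts.
for d in (2, 3, 4):
    n = qudits_needed(4096, d)
    print(f"d={d}: {n} qudits, register holds {d**n} states")

print("Grover iterations for N=4096:", grover_iterations(4096))
```

The register gets narrower as $d$ grows, which is one reason a multi-controlled Toffoli gate over fewer, higher-dimensional carriers is attractive; the actual gate count and depth depend on the decomposition used.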


Journal ArticleDOI
TL;DR: In this paper, the authors propose a variant of quantum gate verification (QGV) that is robust to practical gate imperfections and experimentally realize efficient QGV on a 2-qubit controlled-not gate and a 3-qubit Toffoli gate using only local state preparations and measurements.
Abstract: Verifying the correct functioning of quantum gates is a crucial step toward reliable quantum information processing, but it becomes an overwhelming challenge as the system size grows due to the dimensionality curse. Recent theoretical breakthroughs show that it is possible to verify various important quantum gates with the optimal sample complexity of O(1/ε) using local operations only, where ε is the estimation precision. In this Letter, we propose a variant of quantum gate verification (QGV) that is robust to practical gate imperfections and experimentally realize efficient QGV on a 2-qubit controlled-not gate and a 3-qubit Toffoli gate using only local state preparations and measurements. The experimental results show that, by using only 1600 and 2600 measurements on average, we can verify with 95% confidence level that the implemented controlled-not gate and Toffoli gate have fidelities of at least 99% and 97%, respectively. Demonstrating the superior low sample complexity and experimental feasibility of QGV, our work promises a solution to the dimensionality curse in verifying large quantum devices in the quantum era.

6 citations


Journal ArticleDOI
TL;DR: This work presents a 500×500 SPAD image sensor that achieves 100% temporal aperture with two contiguous gates and can generate dual-gated binary images in rolling shutter at up to 49.8 kfps.
Abstract: In this article, we report on SwissSPAD3 (SS3), a 500 $\times$ 500 pixel single-photon avalanche diode (SPAD) array, fabricated in 0.18-$\mu$m CMOS technology. In this sensor, we introduce a novel dual-gate architecture with two contiguous temporal windows, or gates, guaranteed by the circuit architecture to be nonoverlapping and covering the totality of the sensor’s exposure period. The gates can be adjusted with a temporal resolution of 17.9 ps, and the minimum measured gate width is 0.99 ns; to our knowledge, the shortest reported to date among large-format SPAD imagers. The burst frame rate is 49.8 kframes/s in the dual-channel mode and 97.7 kframes/s in the single-channel mode. A 2690-MB/s PCI Express (PCIe) interface has been added to the data acquisition framework, enabling continuous operation at approximately 44 and 88 kframes/s. Due to optimizations of the gate-signal tree, we achieved a significant reduction in gate skew and gate width variation, which is negligible with respect to the SPAD temporal jitter. These improvements, along with sub-10-cps dark count rate (DCR) per pixel and 50% maximum photon detection probability (PDP), result in a sensor particularly well suited for fast acquisition fluorescence lifetime imaging microscopy (FLIM) experiments, for which we demonstrate reduced dispersion versus a single-gated sensor.

4 citations




Journal ArticleDOI
Haisheng Li1
01 Jan 2022
TL;DR: In this article, the authors report four 3-bit Hermitian gates named LI gates, whose realized circuits have the same T-count, T-depth, and CNOT-count as the Peres gate, and propose two decomposition methods of a multiple control Toffoli gate for different primary optimization goals.
Abstract: The well-known 3-bit Hermitian gate (a Toffoli gate) has been implemented using Clifford + T circuits. Compared with the Peres gate, its implementation circuit requires more controlled-NOT (CNOT) gates. However, the Peres gate is not Hermitian. This paper reports four 3-bit Hermitian gates named LI gates, whose realized circuits have the same T-count, T-depth, and CNOT-count as the Peres gate. Furthermore, two decomposition methods of a multiple control Toffoli (MCT) gate are proposed for different primary optimization goals. Then, we design the equality, less-than, and full comparators with the minimum circuit width using the proposed Hermitian gates and optimized MCT gates. A fault-tolerant circuit is required for robust quantum computing. Clifford + T circuits are accepted solutions for fault-tolerant implementation. Considering T-count, T-depth, CNOT-count, and circuit width as the primary optimization goals, we design the optimized Clifford + T circuits of three comparators using LI gates and optimized MCT gates. Comparison and analysis show that the proposed comparators have better overall performance for T-count, T-depth, CNOT-count, and circuit width than the best-known comparators without quantum measurements.

1 citation
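The statement that the Toffoli gate is Hermitian while the Peres gate is not can be checked on classical truth tables. Both gates permute the eight 3-bit basis states, so their matrices are real permutation matrices, and such a matrix is Hermitian exactly when the gate is its own inverse. The sketch below uses the standard definitions Toffoli $(a, b, c) \mapsto (a, b, c \oplus ab)$ and Peres $(a, b, c) \mapsto (a, a \oplus b, c \oplus ab)$; the helper names are illustrative, and the LI gates from the paper are not reproduced here.

```python
from itertools import product

def toffoli(a, b, c):
    """Toffoli (CCNOT): flip c when both controls a and b are 1."""
    return a, b, c ^ (a & b)

def peres(a, b, c):
    """Peres gate: (a, b, c) -> (a, a XOR b, c XOR ab)."""
    return a, a ^ b, c ^ (a & b)

def is_self_inverse(gate):
    """True if applying the gate twice restores every 3-bit input."""
    return all(gate(*gate(*bits)) == bits
               for bits in product((0, 1), repeat=3))

print("Toffoli self-inverse:", is_self_inverse(toffoli))  # True
print("Peres self-inverse:  ", is_self_inverse(peres))    # False
```

Applying the Peres gate twice maps $(a, b, c)$ to $(a, b, a \oplus c)$ rather than back to the input, which is why it fails the check.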


Posted ContentDOI
08 Mar 2022
TL;DR: In this paper, a novel quantum gate approximation algorithm based on the application of parametric two-qubit gates in the synthesis process is reported.
Abstract: In this work, we report on a novel quantum gate approximation algorithm based on the application of parametric two-qubit gates in the synthesis process. The utilization of these parametric two-qubit gates in the circuit design allows us to transform the discrete combinatorial problem of circuit synthesis into an optimization problem over continuous variables. The circuit is then compressed by a sequential removal of two-qubit gates from the design, while the remaining building blocks are continuously adapted to the reduced gate structure by iterated learning cycles. We implemented the developed algorithm in the SQUANDER software package and benchmarked it against several state-of-the-art quantum gate synthesis tools. Our numerical experiments revealed outstanding circuit compression capabilities of our compilation algorithm providing the most optimal gate count in the majority of the addressed quantum circuits.

1 citation


Journal ArticleDOI
TL;DR: It is determined from the analysis that the proposed square root circuit employing slow-division algorithms results in a T-count reduction and qubit cost savings of 80.51% and 72.65% over the existing work.

1 citation


Posted ContentDOI
21 Aug 2022
TL;DR: In this article, the authors present a space-efficient implementation of the quantum verification of matrix products (QVMP) algorithm and demonstrate its functionality by running it on the Aer simulator with two simulation methods: statevector and matrix product state (MPS).
Abstract: We present a space-efficient implementation of the quantum verification of matrix products (QVMP) algorithm and demonstrate its functionality by running it on the Aer simulator with two simulation methods: statevector and matrix product state (MPS). We report circuit metrics (gate count, qubit count, circuit depth), transpilation time, simulation time, and a proof of Grover oracle correctness. Our study concludes that while QVMP can be simulated on moderately sized inputs, it cannot scale to a degree where we can observe any quantum advantage on current quantum hardware due to circuit depth and qubit count constraints. Further, the choice of simulation method has a noticeable impact on the size of the transpiled circuit which slows down development.
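The three circuit metrics named in the abstract above (gate count, qubit count, circuit depth) can be defined on a plain gate list. The following is a minimal sketch that does not depend on Qiskit or Aer; the `(name, qubits)` tuple representation and the function name `circuit_metrics` are assumptions made for illustration, not the paper's tooling.

```python
def circuit_metrics(gates):
    """Compute basic metrics for a circuit given as an ordered gate list.

    gates: list of (name, qubits) tuples, where qubits is a tuple of
    integer qubit indices the gate acts on.
    """
    layer = {}  # qubit index -> depth of the last gate touching that qubit
    for _name, qubits in gates:
        # A gate sits one layer after the deepest gate on any of its qubits.
        depth_here = 1 + max((layer.get(q, 0) for q in qubits), default=0)
        for q in qubits:
            layer[q] = depth_here
    return {
        "gate_count": len(gates),
        "qubit_count": len(layer),
        "depth": max(layer.values(), default=0),
    }

# Hypothetical 3-qubit example circuit.
example = [("h", (0,)), ("cx", (0, 1)), ("cx", (1, 2)), ("x", (0,))]
print(circuit_metrics(example))
# {'gate_count': 4, 'qubit_count': 3, 'depth': 3}
```

Depth here is the length of the longest chain of gates that share qubits, so the final `x` on qubit 0 runs in parallel with the second `cx` and does not add a layer.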

Posted ContentDOI
08 Aug 2022
TL;DR: In this paper, an easily extendable 12-transistor 2-4 line decoder core is presented for random-access memory interfaces such as the translation lookaside buffer and the first-level data cache.
Abstract: An easily extendable 12-transistor 2-4 line decoder core is presented in this brief for random-access memory interfaces such as the translation lookaside buffer and the first-level data cache. The core idea is to design the line decoder directly from the truth table, without the assistance of basic gate circuits. The 3-8 line decoder and 4-16 line decoder can be constructed with three and seven of the proposed 2-4 decoder cores, respectively, resulting in a low transistor count and high power-delay performance. Simulation results show that the proposed decoder topologies have the minimum area overhead compared with the state of the art in a 65-nm CMOS process. Meanwhile, the delay of the 2-4 line decoder is reduced to 120.7 ps, 57.5 ps, and 37 ps at 0.8 V, 1 V, and 1.2 V, respectively, resulting in a better PNPD performance. In addition, the PNPD of the proposed 2-4 and 4-16 topologies is improved by 1.7% and 10.94%, respectively, compared with that of the HP topologies, while the PNPD of the 3-8 line decoder is improved by 32.59% compared with that of the predecoder structure at a 1-V supply voltage.

Posted ContentDOI
20 Dec 2022
TL;DR: In this article, the authors present Reqomp, a method to automatically synthesize correct and efficient uncomputation of ancillae while respecting hardware constraints, which can offer a wide range of trade-offs between tightly constraining qubit count or gate count.
Abstract: Quantum circuits must run on quantum computers with tight limits on qubit and gate counts. To generate circuits respecting both limits, a promising opportunity is exploiting uncomputation to trade qubits for gates. We present Reqomp, a method to automatically synthesize correct and efficient uncomputation of ancillae while respecting hardware constraints. For a given circuit, Reqomp can offer a wide range of trade-offs between tightly constraining qubit count or gate count. Our evaluation demonstrates that Reqomp can significantly reduce the number of required ancilla qubits by up to 96%. On 80% of our benchmarks, the ancilla qubits required can be reduced by at least 25% while never incurring a gate count increase beyond 28%.

Posted ContentDOI
27 Jun 2022
TL;DR: TopAS is a topology-aware synthesis tool built with the BQSKit framework that preconditions quantum circuits before mapping, and it can be used to reduce the depth and gate count of wide quantum circuits.
Abstract: Unitary synthesis is an optimization technique that can achieve optimal multi-qubit gate counts while mapping quantum circuits to restrictive qubit topologies. Because synthesis algorithms are limited in scalability by their exponentially growing run time and memory requirements, application to circuits wider than 5 qubits requires divide-and-conquer partitioning of circuits into smaller components. In this work, we will explore methods to reduce the depth (program run time) and multi-qubit gate instruction count of wide (16-100 qubit) mapped quantum circuits optimized with synthesis. Reducing circuit depth and gate count directly impacts program performance and the likelihood of successful execution for quantum circuits on parallel quantum machines. We present TopAS, a topology-aware synthesis tool built with the BQSKit framework that preconditions quantum circuits before mapping. Partitioned subcircuits are optimized and fitted to sparse qubit subtopologies in a way that balances the often opposing demands of synthesis and mapping algorithms. This technique can be used to reduce the depth and gate count of wide quantum circuits mapped to the sparse qubit topologies of Google and IBM. Compared to large-scale synthesis algorithms which focus on optimizing quantum circuits after mapping, TopAS is able to reduce depth by an average of 35.2% and CNOT gate count by an average of 11.5% when targeting a 2D mesh topology. When compared with traditional quantum compilers using peephole optimization and mapping algorithms from the Qiskit or $t|ket\rangle$ toolkits, our approach is able to provide significant improvements in performance, reducing CNOT counts by 30.3% and depth by 38.2% on average.

Journal ArticleDOI
TL;DR: In this paper, an area-optimized and power-efficient implementation of the Cipher Block Chaining (CBC) mode for an ultra-lightweight block cipher, PRESENT, and the Keyed-Hash Message Authentication Code (HMAC)-expanded PHOTON by using a feedback path for a single block in the scheme is introduced.
Abstract: This paper introduces an area-optimized and power-efficient implementation of the Cipher Block Chaining (CBC) mode for an ultra-lightweight block cipher, PRESENT, and the Keyed-Hash Message Authentication Code (HMAC)-expanded PHOTON by using a feedback path for a single block in the scheme. The proposed scheme is designed, taped out, and integrated as a System-on-a-Chip (SoC) in a 65-nm CMOS process. An experimental analysis and comparison between a conventional implementation of CBC-PRESENT/HMAC-PHOTON with the proposed feedback basis is performed. The proposed CBC-PRESENT/HMAC-PHOTON has 128-bit plaintext/text and a 128-bit secret key, which have a gate count of 5683/20,698 and low power consumption of 1.03/2.62 mW with a throughput of 182.9/14.9 Mbps at the maximum clock frequency of 100 MHz, respectively. The overall improvement in area and power dissipation is 13/50.34% and 14.87/75.28% when compared to a conventional design.

Posted ContentDOI
15 Mar 2022
TL;DR: In this article, an $n$-qubit quantum Fourier transform (QFT) circuit approximated to error $O(\varepsilon)$ is presented with a T-count of $\sim 4n\log_2(n/\varepsilon)$, constructed without the Toffoli gates used in the previous approach.
Abstract: The quantum Fourier transform (QFT) is a ubiquitous quantum operation that is used in numerous quantum computing applications. The major obstacle to constructing a QFT circuit is that numerous elementary gates are required. Among the elementary gates, T gates dominate the cost of fault-tolerant implementation. Currently, the smallest-known T-count required to construct an $n$-qubit QFT circuit approximated to error $O(\varepsilon)$ is $\sim 8n\log_2(n/\varepsilon)$. Moreover, the depth of T gates (T-depth) in the approximate QFT circuit is $\sim 2n\log_2(n/\varepsilon)$. This approximate QFT circuit was constructed using Toffoli gates and quantum adders. In this study, we present a new $n$-qubit QFT circuit approximated to error $O(\varepsilon)$. Our approximate QFT circuit shows a T-count of $\sim 4n\log_2(n/\varepsilon)$ and a T-depth of $\sim n\log_2(n/\varepsilon)$. Toffoli gates, which account for half of the T-count in the approximate QFT circuit reported in the previous study, are unnecessary in our construction. Quantum adders, which dominate the leading order term of T-depth in our approximate QFT circuit, are arranged in parallel to reduce T-depth.
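To get a feel for the leading-order figures quoted in the abstract above, the sketch below evaluates $c \cdot n\log_2(n/\varepsilon)$ for the previous construction (coefficients 8 for T-count and 2 for T-depth) and the new one (4 and 1). The function name and the sample values of $n$ and $\varepsilon$ are assumptions chosen for illustration, and lower-order terms that the abstract does not state are ignored.

```python
import math

def approx_qft_cost(n, eps, t_count_coeff, t_depth_coeff):
    """Leading-order T-count and T-depth: coeff * n * log2(n / eps)."""
    log_term = math.log2(n / eps)
    return t_count_coeff * n * log_term, t_depth_coeff * n * log_term

# Hypothetical problem size and target error.
n, eps = 64, 1e-6

prev_count, prev_depth = approx_qft_cost(n, eps, 8, 2)  # previous best
new_count, new_depth = approx_qft_cost(n, eps, 4, 1)    # proposed circuit

print(f"T-count: {prev_count:.0f} -> {new_count:.0f}")
print(f"T-depth: {prev_depth:.0f} -> {new_depth:.0f}")
print("ratio (count, depth):", new_count / prev_count, new_depth / prev_depth)
```

At leading order both ratios are 0.5 regardless of $n$ and $\varepsilon$, which matches the abstract's remark that Toffoli gates accounted for half of the T-count in the earlier circuit.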

Posted ContentDOI
02 Dec 2022
TL;DR: In this paper, the authors propose a depth-optimized synthesis algorithm that automatically produces a quantum circuit for any given diagonal unitary matrix, which not only ensures the asymptotically optimal gate-count, but also nearly halves the total circuit depth compared with the previous method.
Abstract: Current noisy intermediate-scale quantum (NISQ) devices can only execute small circuits with shallow depth, as they are still constrained by the presence of noise: quantum gates have error rates and quantum states are fragile due to decoherence. Hence, it is of great importance to optimize the depth/gate-count when designing quantum circuits for specific tasks. Diagonal unitary matrices are well-known to be key building blocks of many quantum algorithms or quantum computing procedures. Prior work has discussed the synthesis of diagonal unitary matrices over the primitive gate set $\{\text{CNOT}, R_Z\}$. However, the problem has not yet been fully understood, since the existing synthesis methods have not optimized the circuit depth. In this paper, we propose a depth-optimized synthesis algorithm that automatically produces a quantum circuit for any given diagonal unitary matrix. Specifically, it not only ensures the asymptotically optimal gate-count, but also nearly halves the total circuit depth compared with the previous method. Technically, we discover a uniform circuit rewriting rule well-suited for reducing the circuit depth. The performance of our synthesis algorithm is both theoretically analyzed and experimentally validated by evaluations on two examples. First, we achieve a nearly 50% depth reduction over Welch's method for synthesizing random diagonal unitary matrices with up to 16 qubits. Second, we achieve an average of 22.05% depth reduction for resynthesizing the diagonal part of specific quantum approximate optimization algorithm (QAOA) circuits with up to 14 qubits.

Posted ContentDOI
06 Apr 2022
TL;DR: In this article, the authors propose techniques such as predecoding logic and qubit reset to reduce the depth and gate count of QROM circuits and to target wider address ranges such as 8 bits.
Abstract: Quantum computing is a rapidly expanding field with applications ranging from optimization all the way to complex machine learning tasks. Quantum memories, while lacking in practical quantum computers, have the potential to bring quantum advantage. In quantum machine learning applications, for example, a quantum memory can simplify the data loading process and potentially accelerate the learning task. Quantum memory can also store the intermediate quantum states of qubits so that they can be reused for computation. However, the depth, gate count, and compilation time of quantum memories such as Quantum Read Only Memory (QROM) scale exponentially with the number of address lines, making them impractical in state-of-the-art Noisy Intermediate-Scale Quantum (NISQ) computers beyond 4-bit addresses. In this paper, we propose techniques such as predecoding logic and qubit reset to reduce the depth and gate count of QROM circuits and to target wider address ranges such as 8 bits. The proposed approach reduces the number of gates and depth count by at least 2X compared to the naive implementation at only 36% qubit overhead. A reduction in circuit depth and gate count as high as 75X and compilation time by 85X at the cost of a maximum of 2.28X qubit overhead is observed. Experimentally, the fidelity with the proposed predecoding circuit compared to the existing optimization approach is also higher (as much as 73% compared to 40.8%) under reduced error rates.

Journal ArticleDOI
TL;DR: A hardware-agnostic circuit optimization algorithm is developed to reduce the overall circuit cost for Hamiltonian simulation problems; it employs a novel sub-circuit synthesis in intermediate representation and a greedy ordering scheme for gate cancellation to minimize the gate count and circuit depth.
Abstract: Simulating quantum systems is believed to be one of the most important applications of quantum computers. On noisy intermediate-scale quantum (NISQ) devices, the high-level circuit designed by quantum algorithms for Hamiltonian simulation needs to consider hardware limitations such as gate errors and circuit depth before it can be efficiently executed. In this work, we develop a hardware-agnostic circuit optimization algorithm to reduce the overall circuit cost for Hamiltonian simulation problems. Our method employs a novel sub-circuit synthesis in intermediate representation and proposes a greedy ordering scheme for gate cancellation to minimize the gate count and circuit depth. To quantify the benefits of this approach, we benchmark the proposed algorithm on different Hamiltonian models. Compared with state-of-the-art generic quantum compilers and a specific quantum simulation compiler, the benchmarking results of our algorithm show an average reduction in circuit depth by 16.5× (up to 64.1×) and in gate count by 7.8× (up to 23.7×). This significant improvement helps enhance the performance of Hamiltonian simulation in the NISQ era.

Posted ContentDOI
06 Sep 2022
TL;DR: In this paper, the authors present a procedure for combining the strengths of analytical native gate-level optimization with numerical optimization to optimize Toffoli gates on the IBMQ native gate set, which is generalizable to any gate and superconducting qubit architecture.
Abstract: While quantum computing holds great potential in combinatorial optimization, electronic structure calculation, and number theory, the current era of quantum computing is limited by noisy hardware. Many quantum compilation approaches can mitigate the effects of imperfect hardware by optimizing quantum circuits for objectives such as critical path length. Few approaches consider quantum circuits in terms of the set of vendor-calibrated operations (i.e., native gates) available on target hardware. This manuscript expands the analytical and numerical approaches for optimizing quantum circuits at this abstraction level. We present a procedure for combining the strengths of analytical native gate-level optimization with numerical optimization. Although we focus on optimizing Toffoli gates on the IBMQ native gate set, the methods presented are generalizable to any gate and superconducting qubit architecture. Our optimized Toffoli gate implementation demonstrates an 18% reduction in infidelity compared with the canonical implementation as benchmarked on IBM Jakarta with quantum process tomography. Assuming the inclusion of multi-qubit cross-resonance (MCR) gates in the IBMQ native gate set, we produce Toffoli implementations with only six multi-qubit gates, a 25% reduction from the canonical implementation with eight multi-qubit gates for linearly connected qubits.