How many independent snapshots can be used to reduce the prediction error?

To reduce the prediction error, the authors use $N$ independent snapshots and symmetrize over all possible pairs: $\frac{1}{N(N-1)} \sum_{i \neq j} \mathrm{tr}(O \hat{\rho}_i \otimes \hat{\rho}_j)$.

What is the advantage of the derandomized version of classical shadows?

Due to the dependence on the specific set of observables for choosing the measurement bases, the derandomized version can exploit advantageous structures in the set of observables the authors want to measure.

What is the widely used conjecture for building post-quantum cryptography?

One of the most widely used conjectures for building post-quantum cryptography is the hardness of learning with errors (LWE) [63].

What is the prominent example of a stabilizer sketching approach?

To circumvent the exponential scaling in representing quantum states, Gosset and Smolin [30] have proposed a stabilizer sketching approach that compresses a classical description of quantum states to an accurate sketch of subexponential size.

What is the way to test the performance of feature prediction with classical shadows?

To test the performance of feature prediction with classical shadows the authors first have to simulate the (quantum) data acquisition phase.

How many random Pauli measurements do the authors need to predict a collection of M quadratic functions?

Hence the authors only need a total number of $N_{\mathrm{tot}} = O(\log(M)\, 4^k / \epsilon^2)$ random Pauli basis measurements to predict $M$ quadratic functions $\mathrm{tr}(O_i \rho \otimes \rho)$.

What are the main reasons why classical shadows are useful in near-term experiments?

The authors therefore anticipate that classical shadows will be useful in near-term experiments characterizing noise in quantum devices and exploring variational quantum algorithms for optimization, materials science, and chemistry.

(Open Access) Predicting many properties of a quantum system from very few measurements (2020) | Hsin-Yuan Huang

Q: What are the contributions in "Predicting many properties of a quantum system from very few measurements" ?

The authors present an efficient method for constructing an approximate classical description of a quantum state using very few measurements of the state.

Predicting Many Properties of a Quantum System from Very Few Measurements

Hsin-Yuan Huang (1, 2, ∗), Richard Kueng (1, 2, 3), and John Preskill (1, 2, 4)

1 Institute for Quantum Information and Matter, Caltech, Pasadena, CA, USA
2 Department of Computing and Mathematical Sciences, Caltech, Pasadena, CA, USA
3 Institute for Integrated Circuits, Johannes Kepler University Linz, Austria
4 Walter Burke Institute for Theoretical Physics, Caltech, Pasadena, CA, USA

(Dated: April 23, 2020)

Predicting properties of complex, large-scale quantum systems is essential for developing quantum technologies. We present an efficient method for constructing an approximate classical description of a quantum state using very few measurements of the state. This description, called a classical shadow, can be used to predict many different properties: order $\log M$ measurements suffice to accurately predict $M$ different functions of the state with high success probability. The number of measurements is independent of the system size, and saturates information-theoretic lower bounds. Moreover, target properties to predict can be selected after the measurements are completed. We support our theoretical findings with extensive numerical experiments. We apply classical shadows to predict quantum fidelities, entanglement entropies, two-point correlation functions, expectation values of local observables, and the energy variance of many-body local Hamiltonians. The numerical results highlight the advantages of classical shadows relative to previously known methods.

Making predictions based on empirical observations is a central topic in statistical learning theory and is at the heart of many scientific disciplines, including quantum physics. There, predictive tasks, like estimating target fidelities, verifying entanglement, and measuring correlations, are essential for building, calibrating and controlling quantum systems. Recent advances in the size of quantum platforms [59] have pushed traditional prediction techniques, like quantum state tomography, to the limit of their capabilities. This is mainly due to a curse of dimensionality: the number of parameters needed to describe a quantum system scales exponentially with the number of its constituents. Moreover, these parameters cannot be accessed directly, but must be estimated by measuring the system. An informative quantum mechanical measurement is both destructive (wave-function collapse) and only yields probabilistic outcomes (Born's rule). Hence, many identically prepared samples are required to estimate accurately even a single parameter of the underlying quantum state. Furthermore, all of these measurement outcomes must be processed and stored in memory for subsequent prediction of relevant features. In summary, reconstructing a full description of a quantum system with $n$ constituents (e.g. qubits) necessitates a number of measurement repetitions exponential in $n$, as well as an exponential amount of classical memory and computing power.

Several approaches have been proposed to overcome this fundamental scaling problem. These include matrix product state (MPS) tomography [18] and neural network tomography [15, 69]. Both only require a polynomial number of samples, provided that the underlying state has suitable properties. However, for general quantum systems, these techniques still require an exponential number of samples. We refer to the related work section (Supplementary Section 3) for details.

Pioneering a conceptually very different line of research, Aaronson [1] pointed out that demanding full classical descriptions of quantum systems may be excessive for many concrete tasks. Instead it is often sufficient to accurately predict certain properties of the quantum system. In quantum mechanics, interesting properties are often linear functions of the underlying density matrix $\rho$, such as the expectation values $\{o_i\}$ of a set of observables $\{O_i\}$:

$$o_i(\rho) = \mathrm{tr}(O_i \rho), \qquad 1 \le i \le M. \tag{1}$$

The fidelity with a pure target state, entanglement witnesses, and the probability distribution governing the possible outcomes of a measurement are all examples that fit this framework. A nonlinear function of $\rho$, such as entanglement entropy, may also be of interest. Aaronson coined the term [1, 3] shadow tomography¹ for the task of predicting properties without necessarily fully characterizing the quantum state, and he showed that a polynomial number of state copies already suffices to predict an exponential number of target functions. While very efficient in terms of samples, Aaronson's procedure is very demanding in terms of quantum hardware: a concrete implementation of the proposed protocol requires exponentially long quantum circuits that act collectively on all the copies of the unknown state stored in a quantum memory.

In this work, we combine the mindset of shadow tomography [1] (predict target functions, not the full state) with recent insights from quantum state tomography [35] (rigorous statistical convergence guarantees) and the stabilizer formalism [31] (efficient implementation). The result is a highly efficient protocol that learns a minimal classical sketch $S_\rho$ (the classical shadow) of an unknown quantum state $\rho$ that can be used to predict arbitrary linear function values (1) by a simple median-of-means protocol. A classical shadow is created by repeatedly performing a simple procedure: apply a unitary transformation $\rho \mapsto U \rho U^\dagger$, and then measure all the qubits in the computational basis. The number of times this procedure is repeated is called the size of the classical shadow. The transformation $U$ is randomly selected from an ensemble of unitaries, and different ensembles lead to different versions of the procedure that have characteristic strengths and weaknesses. In a practical scheme, each ensemble unitary should be realizable as an efficient quantum circuit. We consider random $n$-qubit Clifford circuits and tensor products of random single-qubit Clifford circuits as important special cases. These two procedures turn out to complement each other nicely. We refer to Figure 1 for a visualization and a list of important properties that can be predicted efficiently.

[Figure 1 diagram. Panel titles: Data Acquisition Phase; Prediction Phase. Labels: Quantum System, Random Unitary, Unitary Evolution, Measurements, Few Repetitions, Classical Representation, Predicting …, Possible Properties, Quantum Fidelity, Entanglement Witness, Entanglement Entropy, 2-point Correlations, Local Observables, Hamiltonian.]

Figure 1: An illustration for constructing a classical representation, the classical shadow, of a quantum system from randomized measurements. In the data acquisition phase, we perform a random unitary evolution and measurements on independent copies of an $n$-qubit system to obtain a classical representation of the quantum system, the classical shadow. Such classical shadows facilitate accurate prediction of a large number of different properties using a simple median-of-means protocol.

∗ Electronic address: hsinyuan@caltech.edu

¹ According to Ref. [1] it was actually S.T. Flammia who originally suggested the name shadow tomography.

arXiv:2002.08953v2 [quant-ph] 22 Apr 2020

Our main theoretical contribution equips this procedure with rigorous performance guarantees. Classical shadows with size of order $\log(M)$ suffice to predict $M$ target functions in Eq. (1) simultaneously. Most importantly, the actual system size (number of qubits) does not enter directly. Instead, the number of measurement repetitions $N$ is determined by a (squared) norm $\|O_i\|^2_{\mathrm{shadow}}$. This norm depends on the target functions and the particular measurement procedure used to produce the classical shadow. For example, random $n$-qubit Clifford circuits lead to the Hilbert-Schmidt norm. On the other hand, random single-qubit Clifford circuits produce a norm that scales exponentially in the locality of target functions, but is independent of system size. The resulting prediction technique is applicable to current laboratory experiments and facilitates the efficient prediction of few-body properties, such as two-point correlation functions, entanglement entropy of small subsystems, and expectation values of local observables.

In some cases, this scaling may seem unfavorable. However, we rigorously prove that this is not a flaw of the method, but an unavoidable limitation rooted in quantum information theory. By relating the prediction task to a communication task [25], we establish fundamental lower bounds highlighting that classical shadows are (asymptotically) optimal.

We support our theoretical findings by conducting numerical simulations for predicting various physically relevant properties over a wide range of system sizes. These include quantum fidelity, two-point correlation functions, entanglement entropy, and local observables. We confirm that prediction via classical shadows scales favorably and improves on powerful existing techniques, such as machine learning, in a variety of well-motivated test cases. An open source release for predicting many properties from very few measurements is available at https://github.com/momohuang/predicting-quantum-properties.

Algorithm 1 Median of means prediction based on a classical shadow $S(\rho; N)$.

function LinearPredictions($O_1, \ldots, O_M$, $S(\rho; N)$, $K$)
    Import $S(\rho; N) = [\hat{\rho}_1, \ldots, \hat{\rho}_N]$    (load classical shadow)
    Split the shadow into $K$ equally-sized parts and set    (construct $K$ estimators of $\rho$)
        $\hat{\rho}_{(k)} = \frac{1}{\lfloor N/K \rfloor} \sum_{i=(k-1)\lfloor N/K \rfloor + 1}^{k \lfloor N/K \rfloor} \hat{\rho}_i$
    for $i = 1$ to $M$ do
        Output $\hat{o}_i(N, K) = \mathrm{median}\{ \mathrm{tr}(O_i \hat{\rho}_{(1)}), \ldots, \mathrm{tr}(O_i \hat{\rho}_{(K)}) \}$    (median of means estimation)
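As a minimal sketch of Algorithm 1 (not the authors' released code), the median-of-means step can be written on scalars: by linearity of the trace, $\mathrm{tr}(O_i \hat{\rho}_{(k)})$ is the mean of the per-snapshot values $\mathrm{tr}(O_i \hat{\rho}_j)$ in the $k$-th group. The function name `median_of_means` and the assumption that these per-snapshot values have already been computed are illustrative choices, not part of the paper.

```python
import statistics


def median_of_means(single_shot_values, num_groups):
    """Median-of-means estimate from per-snapshot values tr(O rho_hat_j).

    single_shot_values: list of N scalars, one per classical snapshot
    num_groups: K, the number of equally sized groups
    """
    group_size = len(single_shot_values) // num_groups  # floor(N / K)
    if group_size == 0:
        raise ValueError("need at least one snapshot per group")
    group_means = []
    for k in range(num_groups):
        # Snapshots (k*group_size) .. ((k+1)*group_size - 1) form group k;
        # leftover snapshots beyond K * floor(N / K) are ignored.
        chunk = single_shot_values[k * group_size:(k + 1) * group_size]
        group_means.append(sum(chunk) / group_size)
    return statistics.median(group_means)
```

A single extreme value shifts only one group mean, so the median stays put: `median_of_means([0, 0, 0, 0, 0, 900], 3)` is `0`, whereas the plain sample mean would be `150`.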

PROCEDURE

Throughout this work we restrict attention to $n$-qubit systems and $\rho$ is a fixed, but unknown, quantum state in $d = 2^n$ dimensions. To extract meaningful information, we repeatedly perform a simple measurement procedure: apply a random unitary to rotate the state ($\rho \mapsto U \rho U^\dagger$) and perform a computational-basis measurement. The unitary $U$ is selected randomly from a fixed ensemble. Upon receiving the $n$-bit measurement outcome $|\hat{b}\rangle \in \{0, 1\}^n$, we store an (efficient) classical description of $U^\dagger |\hat{b}\rangle\langle\hat{b}| U$ in classical memory. It is instructive to view the average (over both the choice of unitary and the outcome distribution) mapping from $\rho$ to its classical snapshot $U^\dagger |\hat{b}\rangle\langle\hat{b}| U$ as a quantum channel:

$$\mathbb{E}\big[ U^\dagger |\hat{b}\rangle\langle\hat{b}| U \big] = \mathcal{M}(\rho) \quad \Longrightarrow \quad \rho = \mathbb{E}\big[ \mathcal{M}^{-1}\big( U^\dagger |\hat{b}\rangle\langle\hat{b}| U \big) \big]. \tag{2}$$

This quantum channel $\mathcal{M}$ depends on the ensemble of (random) unitary transformations. Although the inverted channel $\mathcal{M}^{-1}$ is not physical (it is not completely positive), we can still apply $\mathcal{M}^{-1}$ to the (classically stored) measurement outcome $U^\dagger |\hat{b}\rangle\langle\hat{b}| U$ in a completely classical post-processing step.² In doing so, we produce a single classical snapshot $\hat{\rho} = \mathcal{M}^{-1}\big( U^\dagger |\hat{b}\rangle\langle\hat{b}| U \big)$ of the unknown state $\rho$ from a single measurement. By construction, this snapshot exactly reproduces the underlying state in expectation (over both unitaries and measurement outcomes): $\mathbb{E}[\hat{\rho}] = \rho$. Repeating this procedure $N$ times results in an array of $N$ independent, classical snapshots of $\rho$:

$$S(\rho; N) = \Big\{ \hat{\rho}_1 = \mathcal{M}^{-1}\big( U_1^\dagger |\hat{b}_1\rangle\langle\hat{b}_1| U_1 \big), \ldots, \hat{\rho}_N = \mathcal{M}^{-1}\big( U_N^\dagger |\hat{b}_N\rangle\langle\hat{b}_N| U_N \big) \Big\}. \tag{3}$$

We call this array the classical shadow of $\rho$. Classical shadows of sufficient size $N$ are expressive enough to predict many properties of the unknown quantum state efficiently. To avoid outlier corruption, we split the classical shadow up into equally-sized chunks and construct several, independent sample mean estimators. Subsequently, we predict linear function values (1) via median of means estimation [41, 55]. This procedure is summarized in Algorithm 1. For many physically relevant properties $O_i$ and measurement channels $\mathcal{M}$, Algorithm 1 can be carried out very efficiently without explicitly constructing the large matrix $\hat{\rho}_i$.

Median of means prediction with classical shadows can be defined for any distribution of random unitary transformations. Two prominent examples are: (i) random $n$-qubit Clifford circuits; and (ii) tensor products of random single-qubit Clifford circuits. Example (i) results in a clean and powerful theory, but also practical drawbacks, because $n^2/\log(n)$ entangling gates are needed to sample from $n$-qubit Clifford unitaries. The corresponding inverted quantum channel is $\mathcal{M}_n^{-1}(X) = (2^n + 1)X - \mathbb{I}$. Example (ii) is equivalent to measuring each qubit independently in a random Pauli basis. Such measurements can be routinely carried out in many experimental platforms. The corresponding inverted quantum channel is $\mathcal{M}_P^{-1} = \bigotimes_{i=1}^{n} \mathcal{M}_1^{-1}$. We refer to examples (i) / (ii) as random Clifford / Pauli measurements, respectively. In both cases, the resulting classical shadow can be stored efficiently in a classical memory using the stabilizer formalism.
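To make the snapshot construction concrete, the following is a small simulation sketch for a single qubit under random Pauli measurements. It assumes the single-qubit inverse channel $X \mapsto 3X - \mathbb{I}$ (the $n = 1$ case of the Clifford formula above) and represents states by Bloch vectors, so a snapshot $3|s\rangle\langle s| - \mathbb{I} = (\mathbb{I} + 3 s W)/2$ has Bloch vector $3s$ along the measured axis $W$. The function names, the chosen test state, and the use of a known Bloch vector to simulate outcomes are illustrative assumptions.

```python
import random


def sample_snapshot_bloch(bloch, rng):
    """One random-Pauli snapshot of a single qubit, as a Bloch vector.

    bloch is the (x, y, z) Bloch vector of the true state; it is used
    only to simulate the measurement outcome via the Born rule.
    """
    axis = rng.randrange(3)              # random Pauli basis: 0=X, 1=Y, 2=Z
    p_plus = (1.0 + bloch[axis]) / 2.0   # probability of outcome +1
    s = 1 if rng.random() < p_plus else -1
    snapshot = [0.0, 0.0, 0.0]
    snapshot[axis] = 3.0 * s             # 3|s><s| - I = (I + 3 s W) / 2
    return tuple(snapshot)


def average_bloch(snapshots):
    """Component-wise mean of a list of snapshot Bloch vectors."""
    n = len(snapshots)
    return tuple(sum(v[a] for v in snapshots) / n for a in range(3))


rng = random.Random(0)                   # fixed seed for reproducibility
true_bloch = (0.6, 0.0, 0.8)             # a pure state, chosen arbitrarily
shadow = [sample_snapshot_bloch(true_bloch, rng) for _ in range(20000)]
estimate = average_bloch(shadow)
```

Each individual snapshot is far from a valid state (its Bloch vector has length 3), yet the average converges to the true Bloch vector, which is the single-qubit instance of $\mathbb{E}[\hat{\rho}] = \rho$.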

RIGOROUS PERFORMANCE GUARANTEES

Theorem 1 (informal version). Classical shadows of size $N$ suffice to predict $M$ arbitrary linear target functions $\mathrm{tr}(O_1 \rho), \ldots, \mathrm{tr}(O_M \rho)$ up to additive error $\epsilon$ given that $N \ge (\text{order}) \log(M) \max_i \|O_i\|^2_{\mathrm{shadow}} / \epsilon^2$. The definition of the norm $\|O_i\|_{\mathrm{shadow}}$ depends on the ensemble of unitary transformations used to create the classical shadow.

We refer to Section 1 in the Supplementary Information for background, a detailed statement and proofs. Theorem 1 is most powerful when the linear functions have a bounded norm that is independent of system size. In this case, classical shadows allow for predicting a large number of properties from only a logarithmic number of quantum measurements.

² $\mathcal{M}$ is invertible if the ensemble of unitary transformations defines a tomographically complete set of measurements. See Supplementary Section 1.
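The scaling in Theorem 1 can be explored numerically with a small helper. This is only an order-of-magnitude sketch: the informal statement does not fix the prefactor, so the `constant` parameter (default 1.0) is a hypothetical placeholder, and the function name is likewise an assumption.

```python
import math


def shadow_size(num_observables, max_shadow_norm_sq, epsilon, constant=1.0):
    """Order-of-magnitude shadow size N ~ C * log(M) * max ||O||^2 / eps^2.

    constant is a hypothetical prefactor C; the informal theorem only
    specifies the scaling, not its value.
    """
    n = constant * math.log(num_observables) * max_shadow_norm_sq / epsilon ** 2
    return math.ceil(n)
```

With unit shadow norm and $\epsilon = 0.1$, going from $M = 10^3$ to $M = 10^6$ target functions roughly doubles the estimate (691 versus 1382 for `constant=1.0`), which illustrates the logarithmic dependence on $M$.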

The norm $\|O_i\|_{\mathrm{shadow}}$ in Theorem 1 plays an important role in defining the space of linear functions that can be predicted efficiently. For random Clifford measurements, $\|O\|^2_{\mathrm{shadow}}$ is closely related to the Hilbert-Schmidt norm $\mathrm{tr}(O^2)$. As a result, a large collection of (global) observables with a bounded Hilbert-Schmidt norm can be predicted efficiently. For random Pauli measurements, the norm scales exponentially in the locality of the observable, not the actual number of qubits. For an observable $O_i$ that acts non-trivially on (at most) $k$ qubits, $\|O_i\|^2_{\mathrm{shadow}} \le 4^k \|O_i\|^2_\infty$, where $\|\cdot\|_\infty$ denotes the operator norm.³ This guarantees the accurate prediction of many local observables from only a much smaller number of measurements.
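As a worked check of the exponential dependence on locality, consider a Pauli observable $P = \bigotimes_j P_j$ that acts non-trivially on $k$ qubits, measured with random Pauli measurements. The sketch below assumes the single-qubit inverse channel $X \mapsto 3X - \mathbb{I}$; the symbols $W_j$ (the randomly chosen measurement basis on qubit $j$) and $s_j = \pm 1$ (the observed outcome) are introduced here for illustration.

```latex
% Single-snapshot estimate: each non-identity factor contributes 3 s_j if the
% measured basis matches P_j and 0 otherwise; identity factors contribute 1.
\hat{o} \;=\; \mathrm{tr}(P \hat{\rho})
        \;=\; \prod_{j :\, P_j \neq \mathbb{I}} 3\, s_j\, \delta_{W_j, P_j}

% Second moment: s_j^2 = 1 and each basis matches with probability 1/3.
\mathbb{E}\big[\hat{o}^{\,2}\big]
        \;=\; \prod_{j :\, P_j \neq \mathbb{I}} 9 \cdot \Pr[W_j = P_j]
        \;=\; 9^k \cdot 3^{-k}
        \;=\; 3^k \;\le\; 4^k
```

The second moment of a single-snapshot estimate therefore grows as $3^k$, independent of the total number of qubits, which is compatible with the $4^k \|O_i\|^2_\infty$ bound since $\|P\|_\infty = 1$.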

ILLUSTRATIVE EXAMPLE APPLICATIONS

Quantum fidelity estimation. Suppose we wish to certify that an experimental device prepares a desired $n$-qubit state. Typically, this target state $|\psi\rangle\langle\psi|$ is pure and highly structured, e.g. a GHZ state [32] for quantum communication protocols, or a toric code ground state [21] for fault-tolerant quantum computation. Theorem 1 asserts that a classical shadow (Clifford measurements) of dimension-independent size suffices to accurately predict the fidelity of any state in the lab with any pure target state. This improves on the best existing result on direct fidelity estimation [27] which requires $O(2^n/\epsilon^4)$ samples in the worst case. Moreover, a classical shadow of polynomial size allows for estimating an exponential number of (pure) target fidelities all at once.

Entanglement verification. Fidelities with pure target states can also serve as (bipartite) entanglement witnesses [36]. For every (bipartite) entangled state $\rho$, there exists a constant $\alpha$ and an observable $O = |\psi\rangle\langle\psi|$ such that $\mathrm{tr}(O\rho) > \alpha \ge \mathrm{tr}(O\rho_s)$, for all (bipartite) separable states $\rho_s$. Establishing $\mathrm{tr}(O\rho) > \alpha$ verifies the existence of entanglement in the state $\rho$. Any $O = |\psi\rangle\langle\psi|$ that satisfies the above condition is known as an entanglement witness for the state $\rho$. Classical shadows (Clifford measurements) of logarithmic size allow for checking a large number of potential entanglement witnesses simultaneously.

Predicting expectation values of local observables. Many near-term applications of quantum devices rely on repeatedly estimating a large number of local observables. For example, low-energy eigenstates of a many-body Hamiltonian may be prepared and studied using a variational method, in which the Hamiltonian, a sum of local terms, is measured many times. Classical shadows constructed from a logarithmic number of random Pauli measurements can efficiently estimate polynomially many such local observables. Because only single-qubit Pauli measurements suffice, this measurement procedure is highly efficient. Potential applications include quantum chemistry [43] and lattice gauge theory [46].
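For Pauli observables, the prediction step does not require building any matrices. The sketch below is a hedged illustration under the assumption that each snapshot is stored as the measured Pauli bases plus the $\pm 1$ outcomes, and that the single-qubit inverse channel is $X \mapsto 3X - \mathbb{I}$. The string encoding and function names are hypothetical conventions, not the interface of the authors' released code.

```python
def pauli_snapshot_value(observable, bases, outcomes):
    """tr(P rho_hat) for one random-Pauli snapshot.

    observable: string over 'IXYZ', one letter per qubit (the Pauli P)
    bases:      string over 'XYZ', the basis measured on each qubit
    outcomes:   sequence of +1 / -1 measurement results, one per qubit
    """
    if not (len(observable) == len(bases) == len(outcomes)):
        raise ValueError("observable, bases and outcomes must have equal length")
    value = 1
    for p, w, s in zip(observable, bases, outcomes):
        if p == "I":
            continue      # tr(3|s><s| - I) = 1: qubit does not contribute
        if p != w:
            return 0      # tr(P (3|s><s| - I)) = 0 when the basis differs
        value *= 3 * s    # tr(P (3|s><s| - I)) = 3 s when the basis matches
    return value


def estimate_pauli(observable, records):
    """Sample mean of single-snapshot values over (bases, outcomes) records."""
    values = [pauli_snapshot_value(observable, b, o) for b, o in records]
    return sum(values) / len(values)
```

Only the qubits on which the observable acts non-trivially enter the product, so the cost per snapshot depends on the locality of the observable rather than on the system size. In practice the per-snapshot values would be combined by median of means instead of the plain mean used in `estimate_pauli`.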

Predicting expectation values of global observables (non-example). Classical shadows are not without limitations. In our examples, the size of classical shadows must either scale with $\mathrm{tr}(O_i^2)$ (Clifford measurements) or must scale exponentially in the locality of $O_i$ (Pauli measurements). Both quantities can simultaneously become exponentially large for nonlocal observables with large Hilbert-Schmidt norm. A concrete example is the Pauli expectation value of a spin chain: $\langle P_{i_1} \otimes \cdots \otimes P_{i_n} \rangle_\rho = \mathrm{tr}(O_1 \rho)$, where $\mathrm{tr}(O_1^2) = 2^n$ and $k = n$ (non-local observable). In this case, classical shadows of exponential size may be required to accurately predict a single expectation value. In contrast, a direct spin measurement achieves the same accuracy with only of order $1/\epsilon^2$ copies of the state $\rho$.

MATCHING INFORMATION-THEORETIC LOWER BOUNDS

The non-example above raises an important question: does the scaling of the required number of measurements with Hilbert-Schmidt norm or with the locality of observables arise from a fundamental limitation, or is it merely an artifact of prediction with classical shadows? A rigorous analysis reveals that this scaling is no mere artifact; rather it stems from information-theoretic reasons.

Theorem 2 (informal version). Any procedure based on single-copy measurements, that can predict any $M$ linear functions $\mathrm{tr}(O_i \rho)$ up to additive error $\epsilon$, requires at least (order) $\log(M) \max_i \|O_i\|^2_{\mathrm{shadow}} / \epsilon^2$ measurements.

Here, $\|O_i\|_{\mathrm{shadow}}$ could be taken as the Hilbert-Schmidt norm $\mathrm{tr}(O_i^2)$ or as a function scaling exponentially in the locality of $O_i$. The proof results from embedding the abstract prediction procedure into a communication protocol. Quantum information theory imposes fundamental restrictions on any quantum communication protocol and allows us to deduce stringent lower bounds. We refer to Supplementary Section 7 and 8 for details and proofs.

³ This scaling can be further improved to $3^k$ if $O_i$ is a tensor product of $k$ single-qubit observables.

The two main technical results complement each other nicely. Theorem 1 equips classical shadows with a constructive performance guarantee: an order of $\log(M) \max_i \|O_i\|^2_{\mathrm{shadow}} / \epsilon^2$ single-copy measurements suffice to accurately predict an arbitrary collection of $M$ target functions. Theorem 2 highlights that this number of measurements is unavoidable in general.

PREDICTING NONLINEAR FUNCTIONS

The classical shadow $S(\rho; N) = \{\hat{\rho}_1, \ldots, \hat{\rho}_N\}$ of the unknown quantum state $\rho$ may also be used to predict non-linear functions $f(\rho)$. We illustrate this with a quadratic function $f(\rho) = \mathrm{tr}(O \rho \otimes \rho)$, where $O$ acts on two copies of the state. Because $\hat{\rho}_i$ is equal to the quantum state $\rho$ in expectation, one could predict $\mathrm{tr}(O \rho \otimes \rho)$ using two independent snapshots $\hat{\rho}_i, \hat{\rho}_j$, $i \neq j$. Because of independence, $\mathrm{tr}(O \hat{\rho}_i \otimes \hat{\rho}_j)$ correctly predicts the quadratic function in expectation:

$$\mathbb{E}\, \mathrm{tr}(O \hat{\rho}_i \otimes \hat{\rho}_j) = \mathrm{tr}(O\, \mathbb{E}\hat{\rho}_i \otimes \mathbb{E}\hat{\rho}_j) = \mathrm{tr}(O \rho \otimes \rho). \tag{4}$$

To reduce the prediction error, we use $N$ independent snapshots and symmetrize over all possible pairs: $\frac{1}{N(N-1)} \sum_{i \neq j} \mathrm{tr}(O \hat{\rho}_i \otimes \hat{\rho}_j)$. We then repeat this procedure several times and form their median to further reduce the likelihood of outlier corruption (similar to median of means). Rigorous performance guarantees are given in Supplementary Section 6. This approach readily generalizes to higher order polynomials using U-statistics [38].
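The symmetrized pair average and the subsequent median can be sketched generically. In the illustration below the two-snapshot quantity $\mathrm{tr}(O \hat{\rho}_i \otimes \hat{\rho}_j)$ is abstracted as a `kernel` function. The example kernel assumes single-qubit snapshots stored as Bloch vectors and evaluates $\mathrm{tr}(\hat{\rho}_i \hat{\rho}_j) = (1 + \vec{v}_i \cdot \vec{v}_j)/2$; all names and this encoding are illustrative assumptions.

```python
import statistics
from itertools import permutations


def pair_u_statistic(snapshots, kernel):
    """Average kernel(a, b) over all ordered pairs of distinct snapshots."""
    n = len(snapshots)
    if n < 2:
        raise ValueError("need at least two snapshots")
    total = sum(kernel(a, b) for a, b in permutations(snapshots, 2))
    return total / (n * (n - 1))


def median_of_u_statistics(snapshots, kernel, num_batches):
    """Split the shadow into batches, symmetrize within each, take the median."""
    batch_size = len(snapshots) // num_batches
    estimates = [
        pair_u_statistic(snapshots[k * batch_size:(k + 1) * batch_size], kernel)
        for k in range(num_batches)
    ]
    return statistics.median(estimates)


def purity_kernel(v, w):
    """tr(rho_v rho_w) for single-qubit operators (I + v.sigma)/2, (I + w.sigma)/2."""
    return (1.0 + sum(a * b for a, b in zip(v, w))) / 2.0
```

`permutations(snapshots, 2)` enumerates pairs by position, so repeated snapshot values are still treated as distinct samples and exactly $N(N-1)$ terms are averaged. The double loop costs $O(N^2)$ kernel evaluations per batch, which is acceptable for a sketch but may need restructuring for large shadows.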

One particularly interesting nonlinear function is the second-order Rényi entanglement entropy: $-\log(\mathrm{tr}(\rho_A^2))$, where $A$ is a subsystem of the $n$-qubit quantum system. We can rewrite the argument in the log as $\mathrm{tr}(\rho_A^2) = \mathrm{tr}(S_A\, \rho \otimes \rho)$, where $S_A$ is the local swap operator of two copies of the subsystem $A$, and use classical shadows to obtain very accurate predictions. The required number of measurements scales exponentially in the size of the subsystem $A$, but is independent of total system size. Probing this entanglement entropy is a useful task and a highly efficient specialized approach has been proposed in [12]. We compare this Brydges et al. method to classical shadows in the numerical experiments.
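The swap identity used here can be verified in a few lines. The derivation below is a standard calculation, written with an orthonormal basis $\{|a\rangle\}$ of subsystem $A$ introduced for illustration; $X$ and $Y$ denote arbitrary operators on $A$.

```latex
% Swap operator on two copies of A, written in an orthonormal basis:
S_A \;=\; \sum_{a,b} |a\rangle\langle b| \otimes |b\rangle\langle a|

% Its action under the trace, for arbitrary operators X, Y on A:
\mathrm{tr}\big(S_A\, X \otimes Y\big)
  \;=\; \sum_{a,b} \langle b| X |a\rangle \langle a| Y |b\rangle
  \;=\; \sum_{b} \langle b| X Y |b\rangle
  \;=\; \mathrm{tr}(XY)

% S_A acts as the identity outside A, so the complement is traced out:
\mathrm{tr}\big(S_A\, \rho \otimes \rho\big)
  \;=\; \mathrm{tr}\big(S_A\, \rho_A \otimes \rho_A\big)
  \;=\; \mathrm{tr}\big(\rho_A^2\big)
```

The purity of the reduced state is thus a quadratic function of the form $\mathrm{tr}(O \rho \otimes \rho)$ with $O = S_A$, so the pair-symmetrized estimator applies directly.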

For nonlinear functions, unlike linear ones, we have not derived an information-theoretic lower bound on the number of measurements needed, though it may be possible to do so by generalizing our methods.

NUMERICAL EXPERIMENTS

One of the key features of prediction with classical shadows is scalability. The data acquisition phase is designed to be tractable for state of the art platforms (Pauli measurements) and future quantum computers (Clifford measurements), respectively. The resulting classical shadow can be stored efficiently in classical memory. For many important features, such as local observables or global features with efficient stabilizer decompositions, scalability moreover extends to the computational cost associated with median of means prediction.

These design features allowed us to conduct numerical experiments for a wide range of problems and system sizes (up to 160 qubits). The computational bottleneck is not feature prediction with classical shadows, but generating synthetic data, i.e. classically generating target states and simulating quantum measurements. Needless to say, this classical bottleneck does not occur in actual experiments. We then use this synthetic data to learn a classical representation of $\rho$ and use this representation to predict various interesting properties.

Machine learning based approaches [15, 69] are among the most promising alternative methods that have applications in this regime, where the Hilbert space dimension is roughly comparable to the total number of silicon atoms on earth ($2^{160} \simeq 10^{48}$). For example, a recent version of neural network quantum state tomography (NNQST) is a generative model that is based on a deep neural network trained on independent quantum measurement outcomes (local SIC/tetrahedral POVMs [64]). In this section, we consider the task of learning a classical representation of an unknown quantum state, and using the representation to predict various properties, addressing the relative merit of classical shadows and alternative methods.
