
Measuring Information Leakage using Generalized Gain Functions
Mário S. Alvim, Konstantinos Chatzikokolakis, Catuscia Palamidessi, and Geoffrey Smith

University of Pennsylvania, USA, msalvim@sas.upenn.edu
INRIA, CNRS and LIX, École Polytechnique, France, {kostas,catuscia}@lix.polytechnique.fr
Florida International University, USA, smithg@cis.fiu.edu
Abstract—This paper introduces g-leakage, a rich generalization of the min-entropy model of quantitative information flow. In g-leakage, the benefit that an adversary derives from a certain guess about a secret is specified using a gain function g. Gain functions allow a wide variety of operational scenarios to be modeled, including those where the adversary benefits from guessing a value close to the secret, guessing a part of the secret, guessing a property of the secret, or guessing the secret within some number of tries. We prove important properties of g-leakage, including bounds between min-capacity, g-capacity, and Shannon capacity. We also show a deep connection between a strong leakage ordering on two channels, C1 and C2, and the possibility of factoring C1 into C2C3, for some C3. Based on this connection, we propose a generalization of the Lattice of Information from deterministic to probabilistic channels.
I. INTRODUCTION
A fundamental concern in computer security is to control
information flow, whether to protect confidential information
from being leaked, or to protect trusted information from
being tainted. In view of the pragmatic difficulty of prevent-
ing undesirable flows completely, there is now much interest
in theories that allow information flow to be quantified, so
that “small” leaks can be tolerated. (See, for example, [1],
[2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12].) For
any leakage measure, a key challenge is to establish its
operational significance, so that a certain amount of leakage
implies a definite security guarantee.
Min-entropy leakage [10], [13] is a leakage measure based on the amount by which a channel increases the vulnerability of a secret to being guessed correctly in one try by an adversary (the precise definition is reviewed in Section II). This clear operational significance is a strength
of min-entropy, but it also leads to questions about whether
min-entropy leakage is relevant across the wide range of
possible application scenarios. For instance, what if the
adversary is allowed to make multiple guesses? Or what if
the adversary could gain some benefit by guessing the secret
only partially or approximately?
With respect to guessing the secret partially, we can note that we could in fact analyze a sub-channel that models the processing of whatever piece of a larger secret we wish to consider. While this can be useful, it is clumsy to
need to analyze multiple sub-channels of the same channel.
Also, such an analysis is misleading in the case of a channel
that poses little threat to any particular piece of the secret,
yet is very likely to leak some piece of the secret. To
illustrate, suppose that the secret is an array X containing
10-bit, uniformly-distributed passwords for 1000 users. Now
consider the following probabilistic channel, which leaks
some randomly-chosen user’s password:
    u ?← {0..999};
    Y = (u, X[u]);                          (Ex1)
If we analyze the min-entropy leakage of (Ex1), we find that the prior vulnerability is 2^{-10000}, since there are 10000 completely unknown bits, while the posterior vulnerability is 2^{-9990}, since Y reveals 10 of the bits. The min-entropy leakage is the logarithm of the ratio of the posterior and prior vulnerabilities:

$$ L = \log \frac{2^{-9990}}{2^{-10000}} = \log 2^{10} = 10 \text{ bits.} $$
If we instead analyze the sub-channel focused on any particular user i's password, the prior vulnerability is 2^{-10}, and the posterior vulnerability is 0.001 · 1 + 0.999 · 2^{-10} ≈ 0.00198, since with probability 0.001 the adversary learns user i's password from Y, and with probability 0.999 he must still make a blind guess. Thus the min-entropy leakage of the sub-channel is log 2.023 ≈ 1.016 bits. Hence we see that the threat of (Ex1) is not well described by min-entropy leakage: the whole channel leaks just 10 bits out of 10000, and the sub-channel just 1.016 bits out of 10, even though some user's password is always leaked completely.
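These figures can be checked with a minimal Python sketch (ours); exact rational arithmetic avoids floating-point underflow on the 2^{-10000} terms:

    from fractions import Fraction
    from math import log2

    # Whole channel: work directly with exponents, since 2**-10000
    # underflows a float. log2(2**-9990 / 2**-10000) = -9990 + 10000.
    leak_whole = -9990 + 10000            # 10 bits

    # Sub-channel for one fixed user i, in exact arithmetic.
    prior = Fraction(1, 2**10)
    post = Fraction(1, 1000) + Fraction(999, 1000) * Fraction(1, 2**10)
    leak_user = log2(post / prior)        # ratio is exactly 2.023, ~1.016 bits

    print(leak_whole, float(post), leak_user)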
In light of the wide range of possible operational threat
scenarios, there is growing appreciation that no single leak-
age measure is likely to be appropriate in all cases. For
this reason, in this paper we introduce a generalization of
min-entropy leakage, called g-leakage. The key idea is to
generalize the notion of vulnerability to incorporate what
we call a gain function g that models the benefit that the
adversary gains by making a certain guess about the secret. If
the adversary makes guess w when the secret’s actual value
is x, then g(w, x) models the benefit that the adversary gains
from this guess, ranging from 0 (if w has no value at all) to 1 (if w is ideal). Given gain function g, g-vulnerability is
defined as the maximum expected gain over all guesses.
As we will see in Section III, gain functions let us
model a wide variety of scenarios, including those where
the adversary benefits from guessing a value close to the
secret, guessing a part of the secret, guessing a property of
the secret, or guessing the secret within k tries. We can also
model the case when there is a penalty for incorrect guesses.
Thus g-leakage seems fruitful in addressing a great number
of practical situations.
In addition to introducing the new concept of g-leakage,
we also make significant technical contributions, principally
in Sections V and VI.
In Section V, we establish important bounds on capacity,
the maximum leakage over all prior distributions. We prove
that min-capacity is an upper bound on g-capacity, for any
gain function g—this means that a channel with small min-
capacity is (in a sense) safe in every possible scenario.
Moreover, we prove that min-capacity is also an upper bound
on Shannon capacity, settling a conjecture in [14].
In Section VI, we consider the problem of comparing two channels, C1 and C2, asking whether on every prior the leakage of C1 is less than or equal to that of C2. Yasuoka and Terauchi [15] and Malacaria [16] recently explored this strong ordering in the case where C1 and C2 are deterministic, focusing on the fact that deterministic channels induce partitions on the space of secrets. They showed that the orderings produced by min-entropy leakage and Shannon leakage are the same and, moreover, they coincide with the partition refinement ordering in the Lattice of Information [17]. Since partition refinement applies only to deterministic channels but leakage ordering makes sense for any channels, this equivalence suggests an approach to generalizing the Lattice of Information to probabilistic channels.
Our first result in Section VI identifies a promising generalization of partition refinement. We show that on deterministic channels, C1 ⊑ C2 iff there exists a factorization of C1 into a cascade: C1 = C2C3, for some channel C3. In this case we say that C1 is composition refined by C2, written C1 ⊑∘ C2. In the most technically challenging part of our paper, we show a deep connection between ⊑∘ and leakage ordering. We show first in Theorem 6.2 that C1 ⊑∘ C2 implies that C1's g-leakage is less than or equal to C2's, for every prior and every g; we denote this by C1 ≤G C2. We conjecture that the converse implication, C1 ≤G C2 implies C1 ⊑∘ C2, is also true, but it turns out to be extremely subtle and we have been unable so far to prove it in full generality. We have proved it in important special cases (e.g. when C2's columns are linearly independent), even limiting to a very restricted kind of gain function; we have also shown that the unproved case is inherently harder, in that much richer gain functions are required.
The rest of the paper is structured as follows. Sections II,
III, and IV present preliminaries, define g-leakage, and show
its basic properties. Sections V and VI present our results on
capacity and on comparing channels. Finally, Sections VII
and VIII discuss related work and conclude.
II. PRELIMINARIES
In this section, we briefly recall the basic definitions of
information-theoretic channels [18], vulnerability, and min-
entropy leakage [10], introducing the non-standard notation
that we will use.
A channel is a triple (X, Y, C), where X and Y are finite
sets (of secret input values and observable output values) and
C is a channel matrix, an |X|×|Y| matrix whose entries are
between 0 and 1 and whose rows each sum to 1; the intent
is that C[x, y] is the probability of getting output y when
the input is x. Channel C is deterministic if each entry of C
is either 0 or 1, implying that each row contains exactly one
1, which means that each input produces a unique output.
Given a prior distribution π on X, we can define a joint distribution p on X × Y by p(x, y) = π[x]C[x, y]. This gives jointly distributed random variables X and Y with marginal probabilities p(x) = Σ_y p(x, y), conditional probabilities p(y|x) = p(x, y)/p(x) (if p(x) is nonzero), and similarly p(y) and p(x|y). As shown in [19], p is the unique joint distribution that recovers π and C, in that p(x) = π[x] and p(y|x) = C[x, y] (if p(x) is nonzero).
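These constructions translate directly into a few lines of NumPy; the following is a minimal sketch (ours), assuming an invented 3×2 channel for illustration:

    import numpy as np

    # An invented channel matrix: 3 secrets, 2 observations; rows sum to 1.
    C = np.array([[1.0, 0.0],
                  [0.5, 0.5],
                  [0.0, 1.0]])
    pi = np.array([0.5, 0.25, 0.25])      # prior distribution on X

    p = pi[:, None] * C                   # joint: p(x, y) = pi[x] * C[x, y]
    p_x = p.sum(axis=1)                   # marginal p(x): recovers pi
    p_y = p.sum(axis=0)                   # marginal p(y)
    print(np.allclose(p_x, pi))           # True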
We now define vulnerability, introducing a new notation. (We deviate from the standard notation V(X) and V(X|Y) used in [14] and elsewhere, because we wish to express explicitly the dependence on X's prior distribution.)

Definition 2.1: Given prior π and channel C, the prior vulnerability is given by

$$ V(\pi) = \max_{x \in X} \pi[x], $$

and the posterior vulnerability is given by

$$ V(\pi, C) = \sum_{y \in Y} \max_{x \in X} \pi[x]\, C[x, y]. $$
We assume in this paper that the prior distribution π and channel C are known to the adversary A. Then V(π) is the prior probability that A could guess the value of X correctly in one try. To understand posterior vulnerability, note that

$$ V(\pi, C) = \sum_y \max_x p(x, y) = \sum_y p(y) \max_x p(x|y) = \sum_y p(y)\, V(p_{X|y}), $$

making it the (weighted) average of the vulnerabilities of the posterior distributions p_{X|y}.
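Both quantities are one-liners in the NumPy sketch above (ours; same invented channel):

    import numpy as np

    def V_prior(pi):
        return pi.max()                   # V(pi) = max_x pi[x]

    def V_post(pi, C):
        # V(pi, C) = sum over y of max over x of pi[x] * C[x, y]
        return (pi[:, None] * C).max(axis=0).sum()

    C = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
    pi = np.array([0.5, 0.25, 0.25])
    print(V_prior(pi), V_post(pi, C))     # 0.5 and 0.75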
We convert from vulnerability to min-entropy by taking the negative logarithm (to base 2):

Definition 2.2:

$$ H_\infty(\pi) = -\log V(\pi), \qquad H_\infty(\pi, C) = -\log V(\pi, C). $$

Note that vulnerability is a probability, while min-entropy
is a measure of bits of uncertainty.
Next we define min-entropy leakage L(π, C) and min-capacity ML(C):

Definition 2.3:

$$ L(\pi, C) = H_\infty(\pi) - H_\infty(\pi, C) = \log \frac{V(\pi, C)}{V(\pi)}, \qquad ML(C) = \sup_\pi L(\pi, C). $$
The min-entropy leakage L(π, C) is the amount by which
channel C decreases the uncertainty about the secret; equiv-
alently, it is the logarithm of the factor by which C increases
the vulnerability. The min-capacity ML(C) is the maximum
min-entropy leakage over all priors π; it can be seen as the
worst-case leakage of C.
Finally, we recall [13] that the min-capacity of C is easy
to calculate, as it is simply the logarithm of the sum of the
column maximums of C:
Theorem 2.1: ML(C) = log Σ_y max_x C[x, y], and it is realized on a uniform prior π.
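Theorem 2.1 gives an immediate algorithm: sum the column maxima of C and take the logarithm. A minimal sketch (ours), reusing the invented channel from above:

    import numpy as np
    from math import log2

    def min_capacity(C):
        # ML(C) = log2 of the sum of the column maxima (Theorem 2.1)
        return log2(C.max(axis=0).sum())

    C = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
    print(min_capacity(C))                # log2(1.0 + 1.0) = 1.0 bit

    # Check: on the uniform prior, L(pi, C) attains this bound.
    pi_u = np.ones(3) / 3
    V, VC = pi_u.max(), (pi_u[:, None] * C).max(axis=0).sum()
    print(log2(VC / V))                   # 1.0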
III. GAIN FUNCTIONS, g-VULNERABILITY, AND g-LEAKAGE
We now develop the theory of gain functions and the
leakage measures that they give.
Implicit in the definition of prior and posterior vulnerabil-
ity V (π) and V (π , C) is the assumption that the adversary
benefits only by guessing the entire secret exactly. But,
as motivated in Section I, there are certainly situations
where this assumption is not appropriate. This leads us to
introduce what we call gain functions as abstract models of
the particular operational scenario. The idea is that in any
such scenario, there will be some set of guesses that the
adversary could make about the secret, and for any guess w
and secret value x, there will be some gain that the adversary
gets by choosing w when the secret’s actual value is x. A
gain function g will specify this gain as g(w, x), using scores
that range from 0 to 1.
A first question, however, is what should be the set of
allowable guesses. One might be tempted to assume that this
should just be X, the set of possible values of the secret.
But given our desire to model scenarios where the adversary
gains by guessing a piece of the secret, or a value close to
the secret, or some property of the secret, we instead let a
gain function use an arbitrary set W of allowable guesses.
Definition 3.1: Given a set X of possible secrets and a finite, nonempty set W of allowable guesses, a gain function is a function g : W × X → [0, 1].
Sometimes it is convenient to represent a gain function g
as a |W|×|X| matrix G, where G[w, x] = g(w, x); the rows
of G correspond to guesses and the columns to secrets.
We now adapt the definition of vulnerability to take account of the gain function:

Definition 3.2: Given gain function g and prior π, the prior g-vulnerability is

$$ V_g(\pi) = \max_{w \in W} \sum_{x \in X} \pi[x]\, g(w, x). $$

The idea is that adversary A should make a guess w that maximizes the expected gain; we therefore take the weighted average of g(w, x), for every possible value x of X. (We remark that our assumption that gain values are between 0 and 1 is unimportant: allowing g to return a value in [0, a], for some constant a, just scales all g-vulnerabilities by a factor of a and therefore has no effect on g-leakage.)
Definition 3.3: Given gain function g, prior π, and channel C, the posterior g-vulnerability is

$$ V_g(\pi, C) = \sum_{y \in Y} \max_{w \in W} \sum_{x \in X} \pi[x]\, C[x, y]\, g(w, x) = \sum_{y \in Y} \max_{w \in W} \sum_{x \in X} p(x, y)\, g(w, x) = \sum_{y \in Y} p(y)\, V_g(p_{X|y}). $$
Now we define g-entropy, g-leakage, and g-capacity in exactly the same way as in Section II:

Definition 3.4:

$$ H_g(\pi) = -\log V_g(\pi), \qquad H_g(\pi, C) = -\log V_g(\pi, C), $$
$$ L_g(\pi, C) = H_g(\pi) - H_g(\pi, C) = \log \frac{V_g(\pi, C)}{V_g(\pi)}, \qquad ML_g(C) = \sup_\pi L_g(\pi, C). $$
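These definitions are also directly computable: a gain function over guesses W and secrets X is just a |W| × |X| matrix. A minimal sketch (ours; the function names and the two-guess gain matrix are invented for illustration, and the channel is the one from Section II):

    import numpy as np
    from math import log2

    def Vg_prior(pi, G):
        # V_g(pi) = max_w sum_x pi[x] * g(w, x)
        return (G @ pi).max()

    def Vg_post(pi, C, G):
        # V_g(pi, C) = sum_y max_w sum_x pi[x] * C[x, y] * g(w, x)
        return (G @ (pi[:, None] * C)).max(axis=0).sum()

    def g_leakage(pi, C, G):
        return log2(Vg_post(pi, C, G) / Vg_prior(pi, G))

    C = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])
    pi = np.array([0.5, 0.25, 0.25])
    # Two guesses: w0 pays off on secrets x0 and x1, w1 on secret x2.
    G = np.array([[1.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
    print(g_leakage(pi, C, G))            # log2(0.875 / 0.75) ~ 0.222 bits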
In Section IV, we will explore the mathematical properties
of g-leakage. But first we present a number of example gain
functions that illustrate the usefulness of g-leakage.
A. The identity gain function
One obvious (and often appropriate) gain function is the
one that says that a correct guess is worth 1 and an incorrect
guess is worth 0:
Definition 3.5: The identity gain function g_id : X × X → [0, 1] is given by

$$ g_{id}(w, x) = \begin{cases} 1, & \text{if } w = x, \\ 0, & \text{if } w \neq x. \end{cases} $$

Note that for g_id we assume that W = X, since there is no gain to be had from a guess outside of X. In terms of representing a gain function as a matrix, g_id corresponds to the identity matrix I_{|X|}. Also notice that g_id is the Kronecker delta, since g_id(w, x) = δ_{wx}.
Now we can show that g-vulnerability is a generalization of ordinary vulnerability:

Proposition 3.1: Vulnerability under g_id coincides with vulnerability:

$$ V_{g_{id}}(\pi) = V(\pi). $$

Proof: Note that for any w, Σ_x π[x] g_id(w, x) = π[w]. So

$$ V_{g_{id}}(\pi) = \max_w \pi[w] = V(\pi). $$

This means that g_id-leakage coincides with min-entropy leakage.
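In matrix terms, Proposition 3.1 just says that taking G to be the identity matrix makes V_g collapse to V; a one-line check with the sketch above (ours):

    import numpy as np

    pi = np.array([0.5, 0.25, 0.25])
    G_id = np.eye(3)                      # g_id as the identity matrix
    print((G_id @ pi).max() == pi.max())  # True: V_gid(pi) = V(pi)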
B. Gain functions induced from metrics or other distance functions

Exploring other gain functions, one quite natural kind of structure that X may exhibit is a notion of distance between secrets. That is, there may be a metric d on X, which is a function

$$ d : X \times X \to [0, \infty) $$

satisfying the properties

(identity of indiscernibles) d(x1, x2) = 0 iff x1 = x2,
(symmetry) d(x1, x2) = d(x2, x1), and
(triangle inequality) d(x1, x3) ≤ d(x1, x2) + d(x2, x3).

Given a metric d, we can first form a normalized metric d̄ by dividing all distances by the maximum value of d, and then we can define a gain function g_d by

$$ g_d(w, x) = 1 - \bar{d}(w, x). $$

(Note that here we are taking W = X.) In this case we say that g_d is the gain function induced from metric d. (It is also rather natural to define a gain function from a metric by g(w, x) = e^{-d(w, x)}; note that here we would actually want d to be an extended metric, so that a gain of 0 becomes possible.)
Metrics induce a large class of gain functions—note in
particular that the identity gain function is induced by the
discrete metric, which assigns distance 1 to any two distinct
values. However, there are several reasons why it is useful
to allow more generality.
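As a small illustration (ours; the one-dimensional secret space and the absolute-difference metric are assumptions for the example), the induced gain function g_d is easy to tabulate as a matrix:

    import numpy as np

    X = np.array([0.0, 1.0, 3.0])         # invented secrets on a line
    d = np.abs(X[:, None] - X[None, :])   # absolute-difference metric
    G = 1.0 - d / d.max()                 # g_d(w, x) = 1 - normalized d(w, x)
    print(G)                              # 1s on the diagonal, 0 at max distance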
For one thing, it may make sense to generalize to a metric on a set W that is a superset of X. To see why, suppose that the space of secrets is the set of corner points of a unit square: X = {(0, 0), (0, 1), (1, 0), (1, 1)}. Suppose that we use the gain function g(w, x) = 1 − d̄(w, x), where the metric d̄ is the normalized Euclidean distance:

$$ \bar{d}((x_1, y_1), (x_2, y_2)) = \sqrt{\frac{(x_1 - x_2)^2 + (y_1 - y_2)^2}{2}}. $$

Now,

$$ V_{g_d}(\pi) = \max_w \sum_x \pi[x] (1 - \bar{d}(w, x)), $$

and if π is uniform, then it is easy to see that any of the four corner points is an equally good guess, giving

$$ V_{g_d}(\pi) = \frac{1}{4}\left(1 + 2\left(1 - \frac{1}{\sqrt{2}}\right) + 0\right) \approx 0.3964. $$

But the adversary could actually do better by guessing (1/2, 1/2), a value that is not in X, since that guess has normalized distance 1/2 from each of the four corner points, giving V_{g_d}(π) = 1/2, which is larger than the previous vulnerability.
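Both vulnerabilities in this example are easy to verify numerically; a short check (ours):

    import numpy as np

    corners = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)

    def d_bar(w, x):
        # normalized Euclidean distance on the unit square
        return np.sqrt(((w - x) ** 2).sum() / 2)

    def expected_gain(w):                 # uniform prior over the corners
        return sum(0.25 * (1 - d_bar(w, x)) for x in corners)

    print(expected_gain(corners[0]))              # ~0.3964 for any corner
    print(expected_gain(np.array([0.5, 0.5])))    # 0.5 for the center guess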
Moreover, the assumption of symmetry is sometimes inappropriate. Suppose that the secret is the time (rounded to the nearest minute) that the last RER B train will depart from Lozère back to Paris. (It is well known that RATP uses sophisticated techniques, such as the droit de retrait, to make this time as unpredictable as possible.) The adversary (i.e. the weary traveler) wants to guess this time as accurately as possible, but note that guessing 23:44 when the actual time is 23:47 is completely different from guessing 23:47 when the actual time is 23:44! If we normalize so that a wait of an hour or more is considered intolerable, then we would want the distance function

$$ d(w, x) = \begin{cases} \frac{x - w}{60} & \text{if } x - 60 < w \le x, \\ 1 & \text{otherwise,} \end{cases} $$

and the gain function

$$ g(w, x) = 1 - d(w, x). $$

But d(w, x) is not a metric, because it is not symmetric. (Such a function is sometimes called a quasimetric.)
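A tiny check of the asymmetry (ours, with times in minutes since midnight):

    def d(w, x):
        # waiting-time distance, normalized so an hour's wait counts as 1
        return (x - w) / 60 if x - 60 < w <= x else 1

    print(d(23*60 + 44, 23*60 + 47))      # guess 23:44, actual 23:47: 3/60 = 0.05
    print(d(23*60 + 47, 23*60 + 44))      # guess 23:47, actual 23:44: missed, 1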
C. Binary gain functions

The family of gain functions that return either 0 or 1 (and no values in between) is of particular interest, since we can characterize such functions concretely: given such a gain function, each guess corresponds exactly to the subset of X for which that guess gives gain 1. (Moreover, we can assume without loss of generality that no two guesses correspond to the same subset of X, since such guesses may as well be merged into one.) Hence we can use the subsets themselves as the guesses, leading to the following definition:

Definition 3.6: Given W ⊆ 2^X, W nonempty, the binary gain function g_W is given by

$$ g_W(W, x) = \begin{cases} 1, & \text{if } x \in W, \\ 0, & \text{otherwise.} \end{cases} $$
Now we can identify a number of interesting gain functions by considering different choices of W.

1) 2-block gain functions: If W = {W, X \ W} then we can see W as a property that the secret X might or might not satisfy, and g_W is the gain function corresponding to an adversary that just wants to decide whether or not X satisfies that property.

Such 2-block gain functions are reminiscent of the cryptographic notion of indistinguishability, which demands that from a ciphertext an adversary should not be able to decide any property of the corresponding plaintext.

2) Partition gain functions: More generally, W could be any partition of X into one or more disjoint blocks, where the adversary just wants to determine which block the secret belongs to. This is equivalent to saying that W = X/≡, where ≡ is an equivalence relation on X.
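A partition gain function is likewise easy to tabulate as a matrix, with one row per block and a 1 in the columns of the secrets that block contains. A short sketch (ours, with an invented partition of four secrets):

    import numpy as np

    blocks = [[0, 1], [2, 3]]             # a partition of X = {x0, x1, x2, x3}
    G = np.zeros((len(blocks), 4))
    for w, block in enumerate(blocks):
        G[w, block] = 1.0                 # g_W(W, x) = 1 iff x is in block W
    print(G)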
