Journal ArticleDOI

Optimal simultaneous detection and estimation under a false alarm constraint

01 May 1995-IEEE Transactions on Information Theory (IEEE)-Vol. 41, Iss: 3, pp 688-703
TL;DR: A multihypothesis testing framework for studying the tradeoffs between detection and parameter estimation (classification) for a finite discrete parameter set is developed and it is observed that Rissanen's order selection penalty method is nearly min-max optimal in some nonasymptotic regimes.
Abstract: This paper addresses the problem of finite sample simultaneous detection and estimation which arises when estimation of signal parameters is desired but signal presence is uncertain. In general, a joint detection and estimation algorithm cannot simultaneously achieve optimal detection and optimal estimation performance. We develop a multihypothesis testing framework for studying the tradeoffs between detection and parameter estimation (classification) for a finite discrete parameter set. Our multihypothesis testing problem is based on the worst case detection and worst case classification error probabilities of the class of joint detection and classification algorithms which are subject to a false alarm constraint. This framework leads to the evaluation of greatest lower bounds on the worst case decision error probabilities and a construction of decision rules which achieve these lower bounds. For illustration, we apply these methods to signal detection, order selection, and signal classification for a multicomponent signal in noise model. For two or fewer signals, an SNR of 3 dB, and signal space dimension of N=10, numerical results are obtained which establish the existence of fundamental tradeoffs between three performance criteria: probability of signal detection, probability of correct order selection, and probability of correct classification. Furthermore, based on numerical performance comparisons between our optimal decision rule and other suboptimal penalty function methods, we observe that Rissanen's (1978) order selection penalty method is nearly min-max optimal in some nonasymptotic regimes.

Summary (3 min read)

I. INTRODUCTION

  • Many statistical decision problems in engineering applications fall into one of two categories: detection and point estimation.
  • The first is the simple uncoupled design strategy, where detection performance is optimized under the false alarm constraint and the estimator is gated by this optimal detector.
  • This gives the form of the optimal estimator and optimal detector, and yields tight lower bounds on the worst case estimation and detection error probabilities which can be used to study tradeoffs.
  • The authors show that the optimal constrained classifier in the multiple-component signal example (1) has an equivalent form: compare the maximum of the sum of the log-likelihood function and an optimal penalty function of p to a threshold; if the threshold is exceeded, use this penalized log-likelihood to perform maximum-likelihood estimation.
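The penalized-likelihood structure in the last bullet can be sketched in a few lines. The snippet below is a toy illustration, not the paper's optimal rule: the linear penalty `beta * p`, the unit noise variance, and the projection-based log-likelihood are assumptions made for the sketch.

```python
import numpy as np

def penalized_detect_classify(y, signals, beta=1.0, thresh=5.0):
    """Toy penalized log-likelihood rule (illustrative, not the optimal penalty).

    y       : observation vector
    signals : list of candidate signal matrices, one per hypothesized order p
              (columns are the hypothesized components)
    beta    : hypothetical linear penalty weight on the order p
    thresh  : detection threshold
    Returns (detected, best_p), where best_p is the winning order, or None.
    """
    sigma2 = 1.0                                  # known noise variance (assumption)
    scores = []
    for p, S in enumerate(signals, start=1):
        # log-likelihood of y under "components S present", up to constants:
        # project y onto span(S) and measure the energy captured
        proj = S @ np.linalg.lstsq(S, y, rcond=None)[0]
        scores.append(np.dot(proj, proj) / (2 * sigma2) - beta * p)  # penalize order
    best = int(np.argmax(scores))
    if scores[best] > thresh:                     # gate: detect only above threshold
        return True, best + 1
    return False, None
```

A strong observation aligned with one component yields a detection with order 1; a zero observation is rejected and no classification is produced.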

A. Relation to Previous Work

  • Optimal coupled design strategies for detection and estimation have been studied by only a few authors.
  • Pioneering works along the lines of coupled design in simultaneous detection and estimation include the papers by Middleton and Esposito [14], [15], Fredriksen et al. [7], and Birdsall and Gobien [3].
  • Kelly et al. noted that the combined GLRT/maximum-likelihood strategy is optimal only for certain cases; the authors' work reinforces this point by specifying conditions for optimality of that strategy.
  • The min-max multiple hypothesis testing strategy presented in the paper can also be interpreted as a sequence of binary composite hypothesis tests, thereby providing a link to Stuller's paper and establishing the structure of optimal sequential binary tests.

probability of miss P_θ(M); and iii) probability of erroneous classification P_θ(EC)

  • Since P_θ(M) and P_θ(EC) generally vary as a function of θ, θ-uniform minimization of these probabilities is in general impossible and a different approach must be taken.
  • The weights {b_θ}_{θ∈Θ₀} can be regarded as unit normalized weights on the null states of nature θ ∈ Θ₀; the weights {q_j}_{j=1}^J can be regarded as unit normalized weights on the composite states of nature {Θ_j}_{j=1}^J; and the weights {c_θ/q_j}_{θ∈Θ_j} can be regarded as unit normalized weights on the states of nature θ ∈ Θ_j.

H₀^(b): X ~ f₀^(b)

  • The condition (18) says that for a specific b* the level-α constrained min-max test φ^(b*) for the reduced hypotheses (16) must also be of level α for the original hypotheses (2).
  • Under this condition, Theorem 1 states that the composite null hypothesis can be reduced to a simple null hypothesis.


  • Once such a reduction is achieved, the authors need merely consider constrained min-max tests for the hypotheses (16) with simple null hypothesis H₀^(b) and then select an appropriate b* to satisfy condition (18).
  • The following theorem specifies the form of constrained min-max tests for the set of hypotheses.
  • It is useful to compare the min-max optimal detector to the popular [26] ad hoc generalized likelihood ratio test (GLRT). The GLRT (41) is not a min-max optimal detector except in the unlikely event that the ratio of max_{θ∈Θ₀} f_θ and max_{θ∈Θ₁} f_θ is equivalent to the ratio of weighted average densities in (40).
  • More specifically, these conditions state that φ^(b,c) is min-max optimal if it equalizes the decision error probabilities over all of the alternatives.
  • In some cases, equalization of all of the decision error probabilities is not possible.
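The equalization idea can be illustrated numerically. The setup below is a hypothetical toy problem (two Gaussian alternatives against a standard normal null), not the paper's example: a grid search over the alternative weights shows that the min-max weight vector over-weights the weaker alternative so as to approximately equalize the two decision error probabilities.

```python
import numpy as np

rng = np.random.default_rng(0)
mus = np.array([1.0, 2.0])   # two simple alternatives N(mu_j, 1); the null is N(0, 1)
alpha = 0.1                  # false alarm level
n = 200_000
x0 = rng.normal(0.0, 1.0, n)               # Monte Carlo samples under H0
xs = [rng.normal(m, 1.0, n) for m in mus]  # samples under each alternative

def pdf(x, m):
    return np.exp(-0.5 * (x - m) ** 2) / np.sqrt(2.0 * np.pi)

def worst_error(c):
    """Worst-case decision error of the weighted rule
    'decide argmax_j c_j f_j(x) whenever max_j c_j f_j(x) > lam f_0(x)',
    with lam calibrated empirically to meet the level-alpha constraint."""
    stat = lambda x: np.maximum(c[0] * pdf(x, mus[0]), c[1] * pdf(x, mus[1])) / pdf(x, 0.0)
    lam = np.quantile(stat(x0), 1.0 - alpha)
    errs = []
    for j, x in enumerate(xs):
        f = [c[0] * pdf(x, mus[0]), c[1] * pdf(x, mus[1])]
        correct = (stat(x) > lam) & (f[j] >= f[1 - j])  # detected and classified as j
        errs.append(1.0 - correct.mean())               # miss + misclassification
    return max(errs)

grid = [(w, 1.0 - w) for w in np.linspace(0.05, 0.95, 19)]
best = min(grid, key=worst_error)
# the weaker alternative (mu = 1) receives the larger weight under the min-max criterion
```

With equal weights the error at the weak alternative dominates; tilting the weights toward it lowers the worst-case error, which is the equalization behavior the bullet describes.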

Remark 6:

  • The dimension of the weight space over which a search must be performed to determine the optimal weights is the sum of the number of simple alternative hypotheses plus the number of the simple hypotheses composing the null hypothesis.
  • For a composite null hypothesis, this latter number can be very large, which severely complicates the computation of the value function.
  • Such use of invariance principles was described in previous work [1].
  • In some applications it is possible to efficiently parameterize the weights and significantly reduce the number of unknowns in the weight space, facilitating the search for optimal weights satisfying the conditions of Corollary 2.
  • One important case where such a reduction is possible is the case where the decision problem is permutation-invariant [2], in which case the distribution of the likelihood ratio is invariant to permutations in the indices of the hypotheses.

A. Detection and Classification of Changes in a Distribution

  • The objective is to detect and identify any outliers.
  • It would be very interesting to compare the error performance of the algorithms proposed in these papers to the achievable lower bounds specified by the finite sample min-max decision rules described below.
  • We will assume that the likelihood ratios have continuous distributions under H₀, so that randomization is not needed to achieve the false alarm constraint.
  • Furthermore, since the above decision rules equalize P_θ(M) and P_θ(EC), respectively, Corollary 2 asserts that these two rules are in fact the DO and CO rules of level α.
  • The DO rule in (50) is a weighted average likelihood ratio test and is not equivalent to the generalized likelihood ratio test (GLRT).
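The distinction in the last bullet can be seen on a toy problem. The two-alternative Gaussian setup and the weights below are assumptions for illustration: with asymmetric weights, the weighted-average likelihood ratio test and the GLRT carve out different acceptance regions, and the weighted test has higher power on the heavily weighted alternative.

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = 0.1
mus = np.array([-2.0, 2.0])      # two simple alternatives N(mu_j, 1); null is N(0, 1)
q = np.array([0.7, 0.3])         # hypothetical weights favoring theta = -2
n = 500_000
x0 = rng.normal(size=n)          # samples under H0
x1 = rng.normal(mus[0], 1.0, n)  # samples under the heavily weighted alternative

def stats(x):
    f = np.exp(-0.5 * (x[:, None] - mus) ** 2)  # likelihood of each alternative
    f0 = np.exp(-0.5 * x ** 2)
    return (f @ q) / f0, f.max(axis=1) / f0     # weighted-average LR, GLRT statistic

a0, g0 = stats(x0)
t_avg = np.quantile(a0, 1.0 - alpha)  # level-alpha threshold for the averaged test
t_glr = np.quantile(g0, 1.0 - alpha)  # level-alpha threshold for the GLRT
a1, g1 = stats(x1)
power_avg = (a1 > t_avg).mean()
power_glr = (g1 > t_glr).mean()
```

Both tests are calibrated to the same false alarm level, yet the weighted-average rule detects the favored alternative more often, so the two rules are genuinely different.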

Proof of Theorem 2

  • But given assertion i), assertion ii) (and thus the existence of $*) follows directly due to the fact that the convex set Chf-is compact.
  • Hence it remains to justify assertion i).
  • The authors must show that there exists a decision rule -4* E D, that achieves the infimum value incurred by using the test function.
  • To conclude that the min-max problem on the right-hand side of (76) admits a solution, the authors will need to use the min-max theorem [6, sec.
  • Hence the convexity and compactness of the constrained risk set S,.


IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 41, NO. 3, MAY 1995

Optimal Simultaneous Detection and Estimation Under a False Alarm Constraint

Bulent Baygun, Member, IEEE, and Alfred O. Hero III, Member, IEEE

Abstract- This paper addresses the problem of finite sample simultaneous detection and estimation which arises when estimation of signal parameters is desired but signal presence is uncertain. In general, a joint detection and estimation algorithm cannot simultaneously achieve optimal detection and optimal estimation performance. In this paper we develop a multihypothesis testing framework for studying the tradeoffs between detection and parameter estimation (classification) for a finite discrete parameter set. Our multihypothesis testing problem is based on the worst case detection and worst case classification error probabilities of the class of joint detection and classification algorithms which are subject to a false alarm constraint. This framework leads to the evaluation of greatest lower bounds on the worst case decision error probabilities and a construction of decision rules which achieve these lower bounds. For illustration, we apply these methods to signal detection, order selection, and signal classification for a multicomponent signal in noise model. For two or fewer signals, an SNR of 3 dB, and signal space dimension of N = 10, numerical results are obtained which establish the existence of fundamental tradeoffs between three performance criteria: probability of signal detection, probability of correct order selection, and probability of correct classification. Furthermore, based on numerical performance comparisons between our optimal decision rule and other suboptimal penalty function methods, we observe that Rissanen's order selection penalty method is nearly min-max optimal in some nonasymptotic regimes.

Index Terms- Simultaneous decisions, fundamental tradeoffs, min-max criterion, order selection, signal classification, signal detection, likelihood ratio.
I. INTRODUCTION

Many statistical decision problems in engineering applications fall into one of two categories: detection and point estimation. In the detection problem an observed random quantity may consist of "noise alone" or "signal masked by noise;" the objective is to decide if there is a signal in the observation subject to a constraint on false alarm. In the point estimation problem a signal which is known to be present in the observations has an unknown feature represented by a parameter; the objective is to decide on the parameter value. However, one frequently encounters applications where estimation has to be performed under uncertainty of signal presence. These include applications such as fault detection and diagnosis in dynamical system control [24], target detection and direction finding with an array of sensors [27], image and speech segmentation [13], and digital communications [18]. The associated decision problem is called simultaneous or joint detection and estimation.

If we constrain the probability of false alarm to be equal to α, one can consider two approaches to the design of decision rules for joint detection and estimation. The first is the simple uncoupled design strategy where detection performance is optimized under the false alarm constraint and the estimator is gated by this optimal detector. In this case, one can implement a conditionally optimal estimator which produces an estimate only if the optimal detector decides that the signal is present. While this uncoupled strategy guarantees optimal detection performance, in general there is no guarantee that the gated estimation performance will be acceptable. The second approach is the coupled design strategy where estimation performance is directly optimized under the false alarm constraint. As in the uncoupled design, the false alarm constraint prescribes a gated estimator. However, while this gating is optimal for estimation, unlike the uncoupled design it is generally not optimal for detection. Note that under both the coupled and uncoupled strategies the false alarm probabilities are identical. However, while in the uncoupled case the false alarms are generated in such a way as to minimize their impact on detection performance, in the coupled case these false alarms are generated to minimize their impact on estimation performance. The uncoupled strategy provides an upper bound on the detection performance while the coupled strategy provides an upper bound on estimation performance. By comparing the detection/estimation performance of the uncoupled detection-optimal strategy to the detection/estimation performance of the coupled estimation-optimal strategy we can study the fundamental tradeoff between optimal detection and optimal estimation subject to a false alarm constraint.

This paper provides a framework for studying the tradeoffs between detection and estimation based on the worst case detection and worst case estimation error probabilities of the class of simultaneous detection and estimation rules for a finite discrete parameter space. We then formulate and solve a constrained min-max multihypothesis testing problem with nonstandard cost structure. This gives the form for the optimal estimator and optimal detector and gives tight lower bounds on the worst case estimation and detection error probabilities which can be used to study tradeoffs.

Manuscript received August 16, 1993; revised November 30, 1994. The work of one of the authors (B. Baygun) was supported in part by a graduate fellowship from Mikes, Inc., throughout this research. The material in this paper was presented in part at ICASSP-92, San Francisco, CA, March 23-26, 1992. B. Baygun is with Schlumberger-Doll Research, Ridgefield, CT 06877 USA. A. O. Hero III is with the Department of Electrical Engineering and Computer Science, The University of Michigan, Ann Arbor, MI 48109 USA. IEEE Log Number 9410399.
0018-9448/95$04.00 © 1995 IEEE
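The uncoupled "detect, then estimate" strategy described above can be sketched as follows. The scalar Gaussian model, the one-sided alternative, and the level are assumptions for the sketch, not the paper's signal model: a level-α Neyman-Pearson detector gates a maximum-likelihood estimate, so an estimate is released only when the detector fires.

```python
import math

def norm_ppf(p):
    """Standard normal quantile via bisection (stdlib only)."""
    lo, hi = -10.0, 10.0
    while hi - lo > 1e-9:
        mid = 0.5 * (lo + hi)
        if 0.5 * (1.0 + math.erf(mid / math.sqrt(2.0))) < p:
            lo = mid
        else:
            hi = mid
    return lo

def gated_estimate(x, alpha=0.05):
    """Uncoupled rule for X ~ N(theta, 1), H0: theta = 0 vs theta > 0:
    detect with the one-sided level-alpha test, and only then return the
    (gated) maximum-likelihood estimate theta_hat = x."""
    threshold = norm_ppf(1.0 - alpha)  # z_{1-alpha}
    if x > threshold:
        return True, x                 # signal declared present: estimate released
    return False, None                 # no detection: no estimate produced
```

For example, `gated_estimate(3.0)` returns `(True, 3.0)`, while `gated_estimate(0.5)` suppresses the estimate, which is exactly the gating behavior whose estimation quality the paper shows can be suboptimal.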

To illustrate our results, we focus on the following multicomponent signal in noise model. A measured waveform Y consists of either a compound signal in additive noise, or noise alone. If present, the signal is the sum of p randomly scaled waveforms (components), out of a possible N equal-power orthogonal waveforms {S₁, ..., S_N} which are known a priori. That is, the signal is known to lie in an N-dimensional subspace, called the signal space, whose basis is {S₁, ..., S_N}. Hence the observation model has the form

  Y = Σ_{l=1}^{p} a_l S_{i_l} + W    (1)

where W denotes the additive noise and the a_l are the random scale factors. Here both the number p and the identity (indices) of the p signal components {S_{i₁}, ..., S_{i_p}} are unknown. Assume that it is known a priori that p is upper-bounded by some given constant P, P ≤ N. We define three related objectives: i) signal detection, which is to decide if p > 0; ii) signal power estimation (order selection) which, if p > 0, is to specify the actual number p ∈ {1, ..., P} of signal components; and iii) signal component estimation (classification) which, if p > 0, is to identify the p signal components present. These objectives arise in a number of applications including telecommunications, harmonic retrieval, surveillance, and air-traffic control.

In the context of the multicomponent signal model (1), our results yield the following structure for the optimal constrained rules. The optimal constrained classifier uses a set of

  M = Σ_{p=1}^{P} (N choose p)

likelihood ratios (one for each hypothesized set {S_{i₁}, ..., S_{i_p}} of signal components, i₁, ..., i_p ∈ {1, ..., N}, p = 1, ..., P) to implement a weighted generalized-likelihood ratio test, with randomized threshold, followed by a weighted maximum-likelihood estimator. The optimal constrained order selector uses a set of P weighted averages of (N choose p) likelihood ratios, p = 1, ..., P, each average corresponding to a fixed number p of signal components. The optimal constrained detector compares a weighted average of all M likelihood ratios to a threshold. In each of the above three cases the weights and the detection threshold are determined by 1) the solution to a related nonlinear optimization problem; and 2) the false alarm constraint α.

We show that the optimal constrained classifier in the multiple-component signal example (1) has an equivalent form: compare the maximum of the sum of the log-likelihood function and an optimal penalty function of p to a threshold, and if the threshold is exceeded use this penalized log-likelihood to perform maximum-likelihood estimation. This penalized likelihood structure is closely related to Akaike's AIC [27] and Rissanen's MDL [19] order selection criteria. The common feature is that the optimal constrained classifier, AIC, and MDL all penalize the log-likelihood for overestimation of p. Unlike the AIC and MDL penalties, the penalty associated with the optimal constrained classifier ensures optimal worst case estimation performance in the finite sample regime. Furthermore, this "optimal penalty" takes specific account of a false alarm constraint. We perform a numerical study in which we construct the optimal weight functions for optimal detection, order selection, and classification, implement the optimal likelihood ratio tests, and analyze the relative performances for the case of p = 2 or fewer signal components. In this manner, we establish the existence of significant tradeoffs between optimal detection, optimal estimation, and optimal order selection. This study also establishes the remarkable result that the MDL order selection penalty is nearly optimal, in the sense of achieving the finite sample min-max constrained classification performance attained with our optimal penalty function, when SNR is 3 dB, signal space dimension is N = 10, and the number of independent snapshots is between 18 and 26.

A. Relation to Previous Work

Optimal coupled design strategies for detection and estimation have been studied by only a few authors. Pioneering works along the lines of coupled design in simultaneous detection and estimation include the papers by Middleton and Esposito [14], [15], Fredriksen et al. [7], and Birdsall and Gobien [3]. The common ground in each of these studies is the Bayesian viewpoint; that is, the parameters are assigned prior probabilities so that average performance can be optimized. Kelly et al. [10], [11] studied the problem of simultaneous detection and estimation using a combination of a generalized-likelihood ratio test and a maximum-likelihood classifier. They noted that this strategy is optimal only for certain cases; our work reinforces this point by specifying conditions for optimality of their strategy. Stuller [23] extended the generalized-likelihood ratio test approach to multiple composite hypothesis testing, by breaking the problem into a sequence of binary composite hypothesis tests. He provided rather stringent sufficient conditions for min-max optimality of this strategy, pointing out that the question of min-max optimality in the general case is yet to be investigated. The min-max multiple hypothesis testing strategy presented in our paper can also be interpreted as a sequence of binary composite hypothesis tests, thereby providing a link to Stuller's paper and establishing the structure of optimal sequential binary tests.

An outline of the paper is as follows. Section II introduces the statistical framework that will be used in this paper. Section III provides theoretical results whose proofs are contained in the Appendix. In Section V, we specialize the theory to three different problems: outlier detection and identification, detection and classification of a step change, and detection and parameter estimation of a multicomponent signal in noise.

II. PROBLEM STATEMENT

A parametric statistical experiment [9] is defined as the indexed probability space (Ω, 𝒜, P_θ), where θ is a parameter lying in a parameter space Θ, Ω is the set of possible outcomes of the experiment, 𝒜 is a sigma algebra consisting of subsets of Ω, and P_θ is a probability measure defined on 𝒜. The parameter space Θ summarizes all of the uncertainty in the probability model P_θ for the experiment. It is important to emphasize that θ is a fixed nonrandom parameter.
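The number M of candidate component sets in the paper's numerical example can be checked directly; for N = 10 and P = 2 the optimal constrained detector averages M = 55 likelihood ratios.

```python
from math import comb
from itertools import combinations

N, P = 10, 2  # signal space dimension and maximum order from the paper's example

# one likelihood ratio per hypothesized component set {S_i1, ..., S_ip}
M = sum(comb(N, p) for p in range(1, P + 1))

# equivalently, enumerate the hypothesized sets themselves
sets = [s for p in range(1, P + 1) for s in combinations(range(N), p)]
assert len(sets) == M  # 10 singletons + 45 pairs = 55 hypothesized sets
```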

Define the finite partition, called a (J+1)-ary partition, {Θ₀, ..., Θ_J} of Θ. For fixed θ = θ_true, denoted the "true θ," let X be a random variable defined on Ω and taking values in a set 𝒳 called the observation space. We assume that X has a probability density function f_θ(x) with respect to some dominating measure μ. Let θ_true be contained in partition element Θ_j for a particular j ∈ {0, ..., J}. The objective is to correctly decide on the partition element Θ_j containing θ_true based on a realization X(ω) = x of X. We can express this classification problem in terms of testing between the J+1 exhaustive and mutually exclusive hypotheses [25]

  H₀: X ~ f_θ, θ ∈ Θ₀
  ⋮
  H_J: X ~ f_θ, θ ∈ Θ_J.    (2)

When θ_true is contained in Θ_j, the hypothesis H_j is said to be true and the other hypotheses are said to be false. In this case, H_j is said to be the "true state of nature." If the partition elements Θ₀, ..., Θ_J are single-point sets, then the hypotheses (2) are called simple hypotheses. Otherwise, if a partition element Θ_l consists of more than one point θ, then specification of H_l does not specify a unique distribution P_θ and H_l is called a composite hypothesis. In the paper's notation a simple hypothesis is identified by the absence of an underscore, e.g., H_l.

We specialize our treatment to the case of a discrete parameter space Θ with K+1 elements denoted by indices {0, ..., K}. We will assume that Θ₀ corresponds to the set of K−M+1 parameters Θ₀ = {0, ..., K−M}, where M is a positive integer less than or equal to K. We identify two special partitions which will play an important role in the sequel. The binary partition {Θ₀, Θ₁}, where Θ₁ = {K−M+1, ..., K}, specifies a composite detection problem

  H₀: X ~ f_θ, θ ∈ {0, ..., K−M}
  H₁: X ~ f_θ, θ ∈ {K−M+1, ..., K}    (3)

where H₀ is called the null hypothesis and H₁ is called the alternative hypothesis. The (M+1)-ary partition {Θ₀, Θ₁, ..., Θ_M}, where Θ₁, ..., Θ_M are the single-point sets {K−M+1}, ..., {K}, respectively, specifies a joint detection-classification problem with simple alternatives:

  H₀: X ~ f_θ, θ ∈ {0, ..., K−M}
  H₁: X ~ f_θ, θ = K−M+1
  ⋮
  H_M: X ~ f_θ, θ = K.    (4)

The primary difference between detection (3) and joint detection-classification (4) is that decision strategies for detection can only be penalized for erroneously deciding on the composite alternative H₁, while decision strategies for joint detection-classification can bear an additional penalty for erroneous classification among the alternatives H₁, ..., H_M.

The set of decision strategies for the general (J+1)-ary hypothesis testing problem (2) is specified by the set of test functions [25].

Definition 1: A test function φ = [φ₀, ..., φ_J]ᵀ for the multiple hypotheses H₀, ..., H_J is a (J+1)-dimensional vector function on 𝒳 such that

  φ(x) ∈ [0,1]^(J+1) and Σ_{j=0}^{J} φ_j(x) = 1, ∀x ∈ 𝒳.

For a given realization X = x, φ_j(x) is the conditional probability of deciding H_j. Consequently, 1 − φ_j(x) is the conditional probability of not deciding H_j, and φ_j(x) + φ_i(x) is the conditional probability of deciding either H_j or H_i. The summation condition Σ_{j=0}^{J} φ_j(x) = 1 ensures that exactly one of H₀, ..., H_J must be decided.

Let φ = [φ₀, φ₁, ..., φ_M]ᵀ be an arbitrary test function for testing among the hypotheses H₀, H₁, ..., H_M. This test function defines a simultaneous detection-classification rule. Specifically, since detection is a binary decision between H₀: θ ∈ Θ₀ and H₁: θ ∈ Θ − Θ₀, where

  Θ − Θ₀ = ∪_{k=1}^{M} Θ_k

the first element φ₀ of φ specifies a binary test function φ^D for detection

  φ^D = [φ₀, 1 − φ₀]ᵀ.

On the other hand, define 𝒳_{H₁} as the set of x for which φ₀(x) ≠ 1; that is, for X = x ∈ 𝒳_{H₁} the decision H₁ occurs with nonzero probability. Then φ specifies an M-ary test function φ^C on 𝒳_{H₁} for classification, where φ_j(x)/(1 − φ₀(x)) is the conditional probability of classifying θ into Θ_j = {θ_j} given that X = x ∈ 𝒳_{H₁}. Conversely, if a test function φ^D = [φ₀^D, 1 − φ₀^D]ᵀ for detection and a test function φ^C = [φ₁^C, ..., φ_M^C]ᵀ for classification are available, a simultaneous detection-classification rule φ = [φ₀, ..., φ_M]ᵀ is easily constructed via the identification

  φ = [φ₀^D, (1 − φ₀^D)φ₁^C, ..., (1 − φ₀^D)φ_M^C]ᵀ.    (7)

We call φ a "gated" classification rule since the classification rule φ^C is enabled by the detection rule φ^D when 1 − φ₀^D ≠ 0, i.e., when signal detection can occur with nonzero probability.
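The identification (7) is easy to state in code. The sketch below builds a simultaneous detection-classification test function from a binary detection rule and an M-ary classification rule; the numeric values are arbitrary placeholders, not quantities from the paper.

```python
import numpy as np

def gate(phi0_D, phi_C):
    """Construct phi = [phi0_D, (1 - phi0_D) phi1_C, ..., (1 - phi0_D) phiM_C]
    as in (7).

    phi0_D : conditional probability of deciding H0 under the detection rule
    phi_C  : length-M vector of classification probabilities (sums to 1)
    """
    phi_C = np.asarray(phi_C, dtype=float)
    return np.concatenate(([phi0_D], (1.0 - phi0_D) * phi_C))

phi = gate(0.25, [0.5, 0.3, 0.2])
# phi is a valid test function: entries lie in [0, 1] and sum to 1, and
# classification is "enabled" only with probability 1 - phi0_D = 0.75
```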
The average performance of a particular test function φ is determined by i) probability of false alarm P_θ(FA); ii) probability of miss P_θ(M); and iii) probability of erroneous classification P_θ(EC):

  P_θ(FA) = E_θ[1 − φ₀], θ ∈ Θ₀
  P_θ(M) = E_θ[φ₀], θ ∉ Θ₀
  P_θ(EC) = E_θ[1 − φ_{r_J(θ)}], θ ∉ Θ₀    (8)

where r_J(θ) ∈ {0, ..., J} is the set partition function which takes the value j if θ ∈ Θ_j.

We will be interested in those test functions whose false alarm probability P_θ(FA) is less than or equal to a prespecified constant α ∈ [0,1] [25].

Definition 2: A test function φ is of level α if

  max_{θ∈Θ₀} E_θ[1 − φ₀] ≤ α    (9)

for a specified α ∈ [0,1].

The classical Neyman-Pearson criterion of signal detection [12] states that it is desirable to minimize the miss probability P_θ(M), θ ∉ Θ₀, subject to the constraint (9). On the other hand, in terms of signal classification, minimizing P_θ(EC), θ ∉ Θ₀, is desirable. However, since P_θ(M) and P_θ(EC) generally vary as a function of θ, θ-uniform minimization of these probabilities is in general impossible and a different approach must be taken.

III. CONSTRAINED MIN-MAX TESTS

For the purposes of establishing θ-uniform lower bounds on P_θ(M) and P_θ(EC) it makes sense to consider the form and performance of constrained min-max test functions of level α. Define the set D_α of all test functions φ = [φ₀, ..., φ_J]ᵀ of level α

  D_α = {φ: 𝒳 → [0,1]^(J+1), Σ_{j=0}^{J} φ_j = 1, max_{θ∈Θ₀} E_θ[1 − φ₀] ≤ α}.

Definition 3: A test function φ* = [φ₀*, ..., φ_J*]ᵀ is a constrained min-max test of level α between the hypotheses H₀, ..., H_J if φ* ∈ D_α, and if for any other test function φ = [φ₀, ..., φ_J]ᵀ ∈ D_α

  max_{θ∉Θ₀} E_θ[1 − φ*_{r_J(θ)}] ≤ max_{θ∉Θ₀} E_θ[1 − φ_{r_J(θ)}].    (12)

Observe that, if a constrained min-max test φ* of level α can be found, the left-hand side of (12) provides an achievable lower bound on the maximum error probability max_{θ∉Θ₀} E_θ[1 − φ_{r_J(θ)}] of any level-α test.

The first step in deriving the form of constrained min-max tests φ* for the hypotheses in (2) is to show that a composite null hypothesis H₀ can be reduced to an equivalent simple null hypothesis. Define the K-dimensional unit simplex

  C_K = {p ∈ [0,1]^K: Σ_j p_j = 1}.

The weights {b_θ}_{θ∈Θ₀} can be regarded as unit normalized weights on the null states of nature θ ∈ Θ₀; the weights {q_j}_{j=1}^J can be regarded as unit normalized weights on the composite states of nature {Θ_j}_{j=1}^J; and the weights {c_θ/q_j}_{θ∈Θ_j} can be regarded as unit normalized weights on the states of nature θ ∈ Θ_j. Consider the following reduced hypotheses:

  H₀^(b): X ~ f₀^(b)
  H₁: X ~ f_θ, θ ∈ Θ₁
  ⋮
  H_J: X ~ f_θ, θ ∈ Θ_J.    (16)

Note that relative to (2) the null hypothesis in (16) has been reduced to a simple null hypothesis. Define the expectation E₀^(b)[g(X)] of g(X) under the simple hypothesis H₀^(b). The following theorem is proven in the Appendix.

Theorem 1: For arbitrary b ∈ C_{K−M+1}, let φ^(b) be a constrained min-max test of level α for testing among the hypotheses (16) with simple null hypothesis H₀^(b). If there exists a weight vector b = b* such that

  max_{θ∈Θ₀} E_θ[1 − φ₀^(b*)] = α    (18)

then φ* ≜ φ^(b*) is a constrained min-max test of level α for testing among the hypotheses (2) with composite null hypothesis H₀. Furthermore, such a b* exists if

  E₀^(b*)[1 − φ₀^(b*)] = α    (19)
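Theorem 1's reduction replaces the composite null by the b-weighted mixture f₀^(b) = Σ_{θ∈Θ₀} b_θ f_θ, and condition (18) is exactly what can fail for a poorly chosen b. The Gaussian null set and the weights below are assumptions for illustration: calibrating a threshold under an arbitrary mixture leaves the worst null state above level α, whereas concentrating the weight on the worst state (the least favorable choice in this toy case) would not.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha = 0.1
null_means = [0.0, 0.25, 0.5]   # hypothetical composite null Theta_0 (Gaussian means)
b = np.array([0.2, 0.3, 0.5])   # an arbitrary weight vector on Theta_0
n = 400_000

# samples from the reduced simple null f0^(b) = sum_theta b_theta f_theta
mix = rng.choice(null_means, size=n, p=b) + rng.normal(0.0, 1.0, n)
t = np.quantile(mix, 1.0 - alpha)  # threshold of the test "declare signal if x > t"

# per-state false alarm rates over the original composite null
fa = {m: (rng.normal(m, 1.0, n) > t).mean() for m in null_means}
# the worst state (mean 0.5) exceeds level alpha, so this b violates (18);
# putting all weight on the worst state restores max_theta P_theta(FA) = alpha
```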

and b* is a "least favorable prior distribution" in the sense that for any other b ∈ C_{K−M+1}

  ∫ [1 − φ₀^(b*)(x)] f₀^(b)(x) dμ(x) ≤ ∫ [1 − φ₀^(b*)(x)] f₀^(b*)(x) dμ(x).    (20)

The condition (18) says that for a specific b* the level-α constrained min-max test φ^(b*) for the reduced hypotheses (16) must also be of level α for the original hypotheses (2). Under this condition Theorem 1 states that the composite null hypothesis H₀ can be reduced to a simple null hypothesis H₀^(b) by a b-weighting of the f_θ over θ ∈ Θ₀. Once such a reduction is achieved we need merely consider constrained min-max tests for the hypotheses (16) with simple null hypothesis H₀^(b) and then select an appropriate b* to satisfy condition (18). The existence of a weight vector b* which satisfies the sufficient conditions (19), (20) is related to the existence of a detector having constant false alarm rate (CFAR) [21].

The following theorem, proven in the Appendix, specifies the form of constrained min-max tests for the set of hypotheses

  H₀: X ~ f₀
  H₁: X ~ f_θ, θ ∈ Θ₁
  ⋮
  H_J: X ~ f_θ, θ ∈ Θ_J    (21)

where f₀ is an arbitrary pdf, e.g., f₀ = f₀^(b*).

Theorem 2: Fix the level α ∈ [0,1]. For arbitrary c ∈ C_M, define γ_j and f_j^(c) as in (14), (15), let

  j_max = arg max_{j>0} γ_j f_j^(c)(x)

and define the test function φ^(c) by

  φ₀(x) = 1 if max_{j>0} γ_j f_j^(c)(x) < λ f₀(x); ξ if equal; 0 otherwise    (22)

and, for j = 1, ..., J,

  φ_j(x) = 1 − φ₀(x) if j = j_max; 0 else    (23)

where λ ≥ 0 and ξ ∈ [0,1] are functions of c selected to satisfy the constraint on the false alarm probability

  E₀[1 − φ₀] = α.    (24)

Then there exists a weight vector c = c*, called the "optimal weight vector," satisfying (25), and φ* ≜ φ^(c*) defined by (22)-(25) is a constrained min-max test of level α for testing among the hypotheses H₀, H₁, ..., H_J.

Next we give a corollary which specifies the form of the constrained min-max tests for the composite hypotheses H₀, H₁, ..., H_J by combining the results of Theorems 1 and 2.

Corollary 1: Fix the level α ∈ [0,1]. For arbitrary c ∈ C_M and b ∈ C_{K−M+1}, let f₀^(b), γ_j, and f_j^(c), j = 1, ..., J, be as defined in (13)-(15). Let

  j_max = arg max_{j>0} γ_j f_j^(c)(x)

and define the test function φ^(b,c) by the assignments (26)-(30). Then φ^(b,c*) is a constrained min-max test of level α for testing among the hypotheses (16) with simple null hypothesis H₀^(b). Furthermore, if there exists a weight vector b = b* for which (18) holds, then φ* ≜ φ^(b*,c*) defined by (26)-(30) is a constrained min-max test of level α for testing among the hypotheses (2) with composite null hypothesis H₀.
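The randomization pair (λ, ξ) in (22)-(24) matters when the test statistic has atoms under the null. The helper below is a generic sketch (not code from the paper): it picks λ as the smallest threshold whose strict exceedance probability is at most α and then randomizes on the atom at λ so that the false alarm probability equals α exactly.

```python
import numpy as np

def randomized_threshold(stat0, alpha):
    """Given values (or an empirical sample) of the statistic
    max_j gamma_j f_j^(c)(X) / f0(X) under H0, return (lam, xi) such that
    P0(stat > lam) + xi * P0(stat == lam) = alpha, as required by (24)."""
    stat0 = np.sort(np.asarray(stat0, dtype=float))
    n = stat0.size
    lam = stat0[int(np.ceil((1.0 - alpha) * n)) - 1]  # smallest lam with P0(stat > lam) <= alpha
    p_gt = (stat0 > lam).mean()    # mass strictly above the threshold
    p_eq = (stat0 == lam).mean()   # atom at the threshold
    xi = 0.0 if p_eq == 0.0 else (alpha - p_gt) / p_eq
    return lam, xi

lam, xi = randomized_threshold([0, 0, 1, 1, 1, 2, 2, 2, 2, 3], 0.25)
# here lam = 2 and xi = 0.375: rejecting for stat > 2, and with probability
# 0.375 when stat == 2, gives false alarm 0.1 + 0.375 * 0.4 = 0.25 exactly
```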

Citations
Journal ArticleDOI
TL;DR: Intensive simulations in the MIMO radar example demonstrate that by using jointly optimum schemes, the authors can achieve significant improvement in estimation quality, as compared to the generalized likelihood ratio test or the test that treats the two subproblems separately, with only small sacrifices in detection power.
Abstract: We consider a well-defined joint detection and parameter estimation problem. By combining the Bayesian formulation of the estimation subproblem with suitable constraints on the detection subproblem, we develop optimum one- and two-step tests for the joint detection/estimation setup. The proposed combined strategies have the very desirable characteristic of allowing a trade-off between detection power and estimation quality. Our theoretical developments are then applied to the problems of retrospective changepoint detection and multiple-input multiple-output (MIMO) radar. In the former case, we are interested in detecting a change in the statistics of a set of available data and provide an estimate for the time of change, while in the latter in detecting a target and estimating its location. Intensive simulations in the MIMO radar example demonstrate that by using jointly optimum schemes, we can achieve significant improvement in estimation quality, as compared to the generalized likelihood ratio test or the test that treats the two subproblems separately, with only small sacrifices in detection power.

109 citations
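The GLRT baseline that the entry above compares against can be sketched for a toy model. This assumes a linear signal with unknown amplitude in white Gaussian noise (not the cited paper's MIMO radar setup):

```python
# Illustrative GLRT sketch under an assumed model:
# H1: x = A*phi + w  vs  H0: x = w,  w ~ N(0, I), amplitude A unknown.
import numpy as np

def glrt_statistic(x, phi):
    """Maximizing the likelihood over A gives A_hat = <x,phi>/||phi||^2,
    and 2*log GLR = <x,phi>^2 / ||phi||^2 (chi-squared(1) under H0)."""
    return np.dot(x, phi)**2 / np.dot(phi, phi)

rng = np.random.default_rng(0)
phi = np.ones(10)                          # assumed known signal shape
x0 = rng.standard_normal(10)               # noise only
x1 = 2.0 * phi + rng.standard_normal(10)   # signal present, A = 2
print(glrt_statistic(x1, phi) > glrt_statistic(x0, phi))
```

Because the maximization over A is in closed form here, the GLRT reduces to an energy detector on the matched-filter output; joint schemes trade some of this detection power for estimation quality.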

Journal ArticleDOI
TL;DR: This article represents an endeavor by the members of the SSAP-TC to review all the significant developments in the field of SSAP and introduces the recent reorganization of three technical committees of the Signal Processing Society.
Abstract: The Statistical Signal and Array Processing Technical Committee (SSAP-TC) deals with signals that are random and with processing an array of signals simultaneously. The field of SSAP represents both solid theory and practical applications. Starting with research in spectrum estimation and statistical modeling, study in this field has always been full of elegant mathematical tools such as statistical analysis and matrix theory. The area of statistical signal processing expands into estimation and detection algorithms, time-frequency domain analysis, system identification, and channel modeling and equalization. The area of array signal processing also extends into multichannel filtering, source localization and separation, and so on. This article represents an endeavor by the members of the SSAP-TC to review all the significant developments in the field of SSAP. To provide readers with pointers for further study of the field, this article includes a very impressive bibliography; close to 500 references are cited. This is just one of the indications that the field of statistical signals has been an extremely active one in the signal processing community. The article also introduces the recent reorganization of three technical committees of the Signal Processing Society.

84 citations

Proceedings ArticleDOI
09 Jul 2007
TL;DR: Theoretical results of the optimal joint decision and estimation that minimizes the new Bayes risk are presented and the power of the new approach is illustrated by applications in target tracking and classification.
Abstract: Many problems involve joint decision and estimation, where qualities of decision and estimation affect each other. This paper proposes an integrated approach based on a new Bayes risk, which is a generalization of those for decision and estimation separately. Theoretical results of the optimal joint decision and estimation that minimizes the new Bayes risk are presented. The power of the new approach is illustrated by applications in target tracking and classification.

75 citations


Cites background from "Optimal simultaneous detection and ..."


Posted Content
TL;DR: The spectral scan statistic, a tractable relaxation of the GLR statistic based on the combinatorial Laplacian of the graph, is proposed; its performance as a testing procedure depends directly on the spectrum of the graph, and this result is used to explicitly derive its asymptotic properties on a few significant graph topologies.
Abstract: We consider the change-point detection problem of deciding, based on noisy measurements, whether an unknown signal over a given graph is constant or is instead piecewise constant over two connected induced subgraphs of relatively low cut size. We analyze the corresponding generalized likelihood ratio (GLR) statistic and relate it to the problem of finding a sparsest cut in a graph. We develop a tractable relaxation of the GLR statistic based on the combinatorial Laplacian of the graph, which we call the spectral scan statistic, and analyze its properties. We show how its performance as a testing procedure depends directly on the spectrum of the graph, and use this result to explicitly derive its asymptotic properties on a few significant graph topologies. Finally, we demonstrate both theoretically and by simulations that the spectral scan statistic can outperform naive testing procedures based on edge thresholding and $\chi^2$ testing.

66 citations
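A minimal sketch of the combinatorial Laplacian on which the spectral scan statistic above is built (toy path graph for illustration; the scan statistic itself is not implemented here):

```python
# Combinatorial Laplacian L = D - A for a small undirected graph.
# Its spectrum (eigenvalues of L) is what governs the spectral scan statistic.
import numpy as np

def combinatorial_laplacian(adj):
    """Degree matrix minus adjacency matrix."""
    adj = np.asarray(adj, dtype=float)
    return np.diag(adj.sum(axis=1)) - adj

# Path graph on 4 vertices: 0 - 1 - 2 - 3
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
L = combinatorial_laplacian(A)
eigvals = np.linalg.eigvalsh(L)
print(bool(np.allclose(L.sum(axis=1), 0)))  # rows of L sum to zero
print(bool(np.isclose(eigvals[0], 0.0)))    # smallest eigenvalue is 0
```

For a connected graph the zero eigenvalue is simple, and the second-smallest eigenvalue (the algebraic connectivity) controls how hard low-cut-size changes are to detect.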

Journal ArticleDOI
TL;DR: A statistical model of such an ensemble is employed and the majority voting rule is replaced with a likelihood ratio test, allowing the ensemble to be trained to guarantee desired statistical properties, such as the false-alarm probability and the detection power, while preserving the high detection accuracy of the original ensemble classifier.
Abstract: The machine learning paradigm currently predominantly used for steganalysis of digital images works on the principle of fusing the decisions of many weak base learners. In this paper, we employ a statistical model of such an ensemble and replace the majority voting rule with a likelihood ratio test. This allows us to train the ensemble to guarantee desired statistical properties, such as the false-alarm probability and the detection power, while preserving the high detection accuracy of the original ensemble classifier. It also turns out that the proposed test is linear. Moreover, by replacing the conventional total probability of error with an alternative criterion of optimality, the ensemble can be extended to detect messages of an unknown length to address composite hypotheses. Finally, the proposed well-founded statistical formulation allows us to extend the ensemble to multi-class classification with an appropriate criterion of optimality and an optimal associated decision rule. This is useful when a digital image is tested for the presence of secret data hidden by more than one steganographic method. Numerical results on real images show the sharpness of the theoretically established results and the relevance of the proposed methodology.

59 citations
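The vote-count likelihood-ratio idea above can be sketched under a simple assumed model: base-learner votes as i.i.d. Bernoulli, so the LRT on the vote count is monotone in the count and reduces to a threshold chosen for a target false-alarm rate (all parameters here are illustrative, not from the cited paper):

```python
# Assumed model: k "stego" votes out of L learners is Binomial(L, p0)
# under cover and Binomial(L, p1) under stego. The LRT is then linear in k,
# so it reduces to the rule "decide stego iff k >= k_star".
from math import comb, log

def lr_vote_count(k, L, p0, p1):
    """Log-likelihood ratio of k positive votes out of L learners."""
    return k * log(p1 / p0) + (L - k) * log((1 - p1) / (1 - p0))

L_learners, p0, p1 = 21, 0.2, 0.8
alpha = 0.01  # target false-alarm probability

# Majority voting would fix k_star = 11 regardless of alpha; the LRT
# instead picks the largest upper tail of Binomial(L, p0) not exceeding alpha.
pmf0 = [comb(L_learners, k) * p0**k * (1 - p0)**(L_learners - k)
        for k in range(L_learners + 1)]
tail = 0.0
k_star = L_learners + 1
for k in range(L_learners, -1, -1):
    if tail + pmf0[k] > alpha:
        break
    tail += pmf0[k]
    k_star = k
print(k_star)
```

The point of the construction is the guarantee: the false-alarm probability of the resulting rule is at most alpha by design, instead of being an uncontrolled byproduct of majority voting.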

References
Book
01 Jan 1983

25,017 citations

01 Nov 1985
TL;DR: This month's guest columnist, Steve Bible, N7HPR, is completing a master’s degree in computer science at the Naval Postgraduate School in Monterey, California, and his research area closely follows his interest in amateur radio.
Abstract: Spread Spectrum It’s not just for breakfast anymore! Don't blame me, the title is the work of this month's guest columnist, Steve Bible, N7HPR (n7hpr@tapr.org). While cruising the net recently, I noticed a sudden bump in the number of times Spread Spectrum (SS) techniques were mentioned in the amateur digital areas. While QEX has discussed SS in the past, we haven't touched on it in this forum. Steve was a frequent cogent contributor, so I asked him to give us some background. Steve enlisted in the Navy in 1977 and became a Data Systems Technician, a repairman of shipboard computer systems. In 1985 he was accepted into the Navy’s Enlisted Commissioning Program and attended the University of Utah where he studied computer science. Upon graduation in 1988 he was commissioned an Ensign and entered Nuclear Power School. His subsequent assignment was onboard the USS Georgia, a trident submarine stationed in Bangor, Washington. Today Steve is a Lieutenant and he is completing a master’s degree in computer science at the Naval Postgraduate School in Monterey, California. His areas of interest are digital communications, amateur satellites, VHF/UHF contesting, and QRP. His research area closely follows his interest in amateur radio. His thesis topic is Multihop Packet Radio Routing Protocol Using Dynamic Power Control. Steve is also the AMSAT Area Coordinator for the Monterey Bay area. Here's Steve, I'll have some additional comments at the end.

8,781 citations

Book
01 Jan 1959
TL;DR: The General Decision Problem, the Probability Background, Uniformly Most Powerful Tests, Unbiasedness: Theory and First Applications, Unbiasedness: Applications to Normal Distributions, Invariance, and Linear Hypotheses, as discussed by the authors.
Abstract: The General Decision Problem.- The Probability Background.- Uniformly Most Powerful Tests.- Unbiasedness: Theory and First Applications.- Unbiasedness: Applications to Normal Distributions.- Invariance.- Linear Hypotheses.- The Minimax Principle.- Multiple Testing and Simultaneous Inference.- Conditional Inference.- Basic Large Sample Theory.- Quadratic Mean Differentiable Families.- Large Sample Optimality.- Testing Goodness of Fit.- General Large Sample Methods.

6,480 citations

Journal ArticleDOI
Jorma Rissanen
TL;DR: The number of digits it takes to write down an observed sequence x1,...,xN of a time series depends on the model with its parameters that one assumes to have generated the observed data.

6,254 citations
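Rissanen's description-length penalty, noted in the abstract of this page as nearly min-max optimal for order selection in some nonasymptotic regimes, amounts to minimizing the negative log-likelihood plus roughly (k/2)·log N over model orders k. A hedged sketch of that criterion (the Gaussian polynomial-fit model and the synthetic data here are assumed for illustration):

```python
# MDL-style order selection sketch: pick the order k minimizing
#   -log-likelihood + (number of parameters)/2 * log(N).
# Model and data are illustrative, not from Rissanen's paper.
import numpy as np

def mdl_order(x, y, max_order=5):
    N = len(y)
    best_k, best_cost = None, np.inf
    for k in range(max_order + 1):
        coeffs = np.polyfit(x, y, k)
        resid = y - np.polyval(coeffs, x)
        sigma2 = max(float(np.mean(resid**2)), 1e-12)
        # Gaussian negative log-likelihood at the ML noise variance
        neg_loglik = 0.5 * N * np.log(2 * np.pi * sigma2) + 0.5 * N
        cost = neg_loglik + 0.5 * (k + 1) * np.log(N)
        if cost < best_cost:
            best_k, best_cost = k, cost
    return best_k

rng = np.random.default_rng(0)
x = np.linspace(-1, 1, 200)
y = 1.0 + 2.0 * x + 0.1 * rng.standard_normal(200)  # true order is 1
print(mdl_order(x, y))
```

The log N penalty grows with the sample size, which is what keeps the criterion from overfitting the way an unpenalized likelihood comparison would.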

Journal ArticleDOI

4,805 citations