An Efficient Test Vector Compression Scheme
Using Selective Huffman Coding
Abhijit Jas, Jayabrata Ghosh-Dastidar, Mom-Eng Ng, and
Nur A. Touba
Abstract—This paper presents a compression/decompression scheme
based on selective Huffman coding for reducing the amount of test
data that must be stored on a tester and transferred to each core in
a system-on-a-chip (SOC) during manufacturing test. The test data
bandwidth between the tester and the SOC is a bottleneck that can
result in long test times when testing complex SOCs that contain many
cores. In the proposed scheme, the test vectors for the SOC are stored in
compressed form in the tester memory and transferred to the chip where
they are decompressed and applied to the cores. A small amount of on-chip
circuitry is used to decompress the test vectors. Given the set of test vectors
for a core, a modified Huffman code is carefully selected so that it satisfies
certain properties. These properties guarantee that the codewords can be
decoded by a simple pipelined decoder (placed at the serial input of the
core’s scan chain) that requires very small area. Results indicate that the
proposed scheme can provide test data compression nearly equal to that of
an optimum Huffman code with much less area overhead for the decoder.
Index Terms—Automatic test equipment, compression, decompression
architecture, embedded core testing, testing time, test set encoding.
I. INTRODUCTION
One of the key concerns in any design project is to meet
time-to-market constraints. In order to accomplish this goal, chip
designers often use predesigned and preverified cores to develop
systems-on-a-chip (SOC) devices. With time, these devices have
become extremely complex. This high level of integration has allowed
vendors to drive down the effective manufacturing costs. However, it
has also rapidly increased the complexity of testing these chips. One
of the increasingly difficult challenges in testing SOCs is dealing with
the large amount of test data that must be transferred between the
tester and the chip [9], [34]. Each core in an SOC has a given set of test
vectors that must be applied to it (usually through a test wrapper that is
provided around a core). The test vectors must be stored on the tester
and then transferred to the inputs of the core during modular testing.
As more and more cores (each with its own test set) are placed on a
single chip, the amount of total test data for the chip increases rapidly.
This poses a serious problem because of the cost and limitations of
automated test equipment (ATE). Testers have limited speed, channel
capacity, and memory. In general, the amount of time required to test a
chip depends on how much test data needs to be transferred to the chip
Manuscript received July 6, 2002; revised September 30, 2002. This work
was supported in part by the National Science Foundation under Grant no. MIP-
9702236 and in part by the Texas Advanced Research Program under Grant
no. 1997-003658-369. This paper was recommended by Associate Editor K.
Chakrabarty.
A. Jas was with the Department of Electrical and Computer Engineering, Uni-
versity of Texas, Austin, TX 78712-1084. He is now with the Intel Corporation,
Austin, TX 78746 USA.
J. Ghosh-Dastidar was with the Department of Electrical and Computer En-
gineering, University of Texas, Austin, TX 78712-1084 USA. He is now with
the Altera Corporation, San Jose, CA 95134 USA.
M.-E. Ng was with the Department of Electrical and Computer Engineering,
University of Texas, Austin, TX 78712-1084. She is now with Advanced Micro
Devices, Austin, TX 78741 USA.
N. A. Touba is with the Computer Engineering Research Center, Depart-
ment of Electrical and Computer Engineering, University of Texas, Austin, TX
78712-1084 USA (e-mail: touba@ece.utexas.edu).
Digital Object Identifier 10.1109/TCAD.2003.811452
and how fast the data can be transferred (i.e., the test data bandwidth
to the chip). This depends on the speed and channel capacity of the
tester and the organization and characteristics of the scan chains on
the chip. Both test time and test storage are major concerns for SOCs
from a test economics point of view.
This paper presents a statistical compression/decompression scheme
to reduce the amount of test data that needs to be stored on the tester
and transferred to the chip (preliminary results were published in [24]).
The idea is to store the test vectors for a core in the tester memory in
compressed form, and then transfer the compressed vectors to the chip,
where a small amount of on-chip circuitry is used to decompress the test
vectors. Instead of having to transfer each entire test vector from the tester to the core, a smaller amount of compressed data is transferred. The approach presented here significantly reduces both test
storage requirements and the overall test time.
Transferring compressed test vectors takes less time than transferring
the full vectors at a given bandwidth. However, in order to guarantee
a reduction in the overall test time, the decompression process should
not add delay (which would subtract from the time saved in transferring
the test data). Moreover, the on-chip decompression circuitry must be
small so that it does not add significant area overhead. Given a set of
test vectors, a method is presented here for choosing a statistical code
(a modified form of Huffman coding) that can be decoded with a simple
pipelined decoder. The properties of the code are chosen such that the
pipelined decoder has a very small area and is guaranteed to be able to
decode the test data as fast as the tester can transfer it.
The compression/decompression scheme presented in this paper can
be used for generating any set of deterministic scan vectors. It preserves
the sequence of the vectors and requires no modifications to the circuit-
under-test (CUT). It does not require any knowledge of the internal
design of the CUT, and, thus, is suitable for testing intellectual property
cores where the core supplier does not provide any information about
the internal structure of the core.
II. RELATED WORK
The problem of reducing test time and test data for core-based
SOCs has been attacked from several different angles in recent
literature. Novel approaches for compressing test data using the
Burrows–Wheeler transform and run-length coding were presented
in [21], [33]. These schemes were developed for reducing the time
to transfer test data from a workstation across a network to a tester
(not for use on chips). Scan chain architectures for core-based
designs that maximize bandwidth utilization are presented in [1].
A technique for compression/decompression of scan vectors using
cyclical decompressors and run-length coding is described in [23].
A modular built-in self-test (BIST) approach that allows sharing of
BIST control logic among multiple cores is presented in [30]. A novel
technique for combining BIST and external testing across multiple
cores is described in [32]. The idea of statistically encoding test data
was presented in [22]. They described a BIST scheme for nonscan
circuits based on statistical coding using comma codes (very similar to
Huffman codes) and run-length coding. An approach called “parallel serial full scan” (PSFS) for reducing test time in cores is presented in
[16]. A technique to reduce test data and test time by using specially
designed cores (cores with virtual scan chains) is presented in [26]. An
approach that uses a linear combinational expander circuit is described
in [2]. The use of Golomb codes and frequency-directed run-length
(FDR) codes for compressing test data has been demonstrated
in [5]–[7], respectively. The use of variable length input Huffman
codes for SOC test data compression has been proposed in [14]. A
fixed-to-fixed block encoding scheme is described in [28]. Techniques
for reusing scan chains from other cores in an SOC to increase the
test data bandwidth have been described in [11], and automatic test
pattern generation (ATPG) techniques for producing test cubes that are
suitable for encoding, using the above technique, have been described
in [12]. A fault simulation-based technique to reduce the entropy
of the test vector set by pattern transformation is described in [19].
Such transformations increase the amount of compression that can be
achieved on the transformed test set using statistical coding. ATPG
algorithms for producing test vectors that can more effectively be com-
pressed using statistical codes have been described in [20]. Test vector
compression based on hybrid BIST techniques has been described
in [10], [29], and [27]. A novel scheme of test vector compression
using an embedded processor is described in [25]. In [13], a test vector
compression technique based on geometric primitives is proposed.
Very recently, a new line of research has focused on compressing test
data volume while optimizing other factors like test power [8], [31].
Although all the techniques mentioned above have the same high-
level objective (that of reducing test data and/or test application time),
the different approaches present different design alternatives and their
applicability to a particular design situation varies from case to case.
This paper presents a selective Huffman coding scheme for testing
cores with internal scan. One of the features of this approach is that
the code that is used for a particular core is carefully chosen such that
only a small decoder circuit is required. There are no restrictions on
the order of the test set, and no modifications need to be made to the
core-under-test. The small decoder circuit is simply placed at the serial
input of the core’s scan chain. As will be shown, the decoder provides a
significant reduction in the amount of test data that must be transported
from the tester to the core.
III. STATISTICAL CODING
The compression/decompression scheme described in this paper is
based on statistical coding. In statistical coding, variable length code-
words are used to represent fixed-length blocks of bits in a data set. For
example, if a data set is divided into four-bit blocks, then there are 2^4, or 16, unique four-bit blocks. Each of the 16 possible four-bit blocks
can be represented by a binary codeword. The size of each codeword
is variable (it need not be four bits). The idea is to make the codewords that occur most frequently have a smaller number of bits, and those that occur least frequently have a larger number of bits. This minimizes
the average length of a codeword. The goal is to obtain a coded repre-
sentation of the original data set that has the smallest number of bits.
A Huffman code [18] is an optimal statistical code that is proven
to provide the shortest average codeword length among all uniquely
decodable variable length codes. A Huffman code is obtained by con-
structing a Huffman tree. The path from the root to each leaf gives the
codeword for the binary string corresponding to the leaf. An example
of constructing a Huffman code can be seen in Table I and Figs. 1 and
2. An example of a test set divided into four-bit blocks is shown in
Fig. 1. Table I shows the frequency of occurrence of each of the pos-
sible blocks (referred to as symbols). There are a total of 60 four-bit
blocks in the example in Fig. 1. Fig. 2 shows the Huffman tree for this
frequency distribution and the corresponding codewords are shown in
Table I.
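
As a concrete illustration, the following sketch builds a Huffman code for four-bit blocks from a frequency map using the standard tree-building procedure described above. The frequencies below are illustrative placeholders (summing to 60 blocks, with 0010, 0100, and 0110 as the most frequent symbols, as in the example); they are not the actual counts of Table I, which is not reproduced here.

```python
import heapq
from itertools import count

def huffman_code(freq):
    """Build a Huffman code (symbol -> bit string) from a frequency map.

    Standard construction: repeatedly merge the two lowest-frequency
    nodes; the path from the root to each leaf gives the codeword.
    """
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        # Prepend 0 to codewords in one subtree and 1 in the other.
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

# Illustrative four-bit block frequencies (placeholders, not Table I's values).
freq = {"0010": 22, "0100": 13, "0110": 7, "0000": 4, "1011": 3,
        "0001": 3, "1000": 2, "0111": 2, "1111": 1, "0011": 1,
        "0101": 1, "1110": 1}
code = huffman_code(freq)
# Frequent blocks receive short codewords; infrequent blocks receive long ones.
for sym in sorted(code, key=lambda s: -freq[s]):
    print(sym, freq[sym], code[sym])
```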
An important property of Huffman codes is that they are prefix-free.
No codeword is a prefix of another codeword. This greatly simplifies
the decoding process. The decoder can instantaneously recognize the
end of a codeword uniquely without any look ahead.
TABLE I
STATISTICAL CODING BASED ON SYMBOL FREQUENCIES FOR TEST SET IN FIG. 1
Fig. 1. Example of test set divided into four-bit blocks.
Fig. 2. Huffman tree for the code shown in Table I.
The amount of compression that can be achieved with statistical
coding depends on how skewed the frequency of occurrence is for the
different codewords. If all of the codewords occur with equal frequency,
then no compression can be achieved. It is well known, however, that
the test vectors in a test set tend to have a lot of correlations. This arises
from the fact that faults in the CUT that are structurally related require
similar input value assignments in order to be provoked and sensitized
to an output. This often results in skewed frequency of occurrence for
different codewords. Moreover, for test cubes, the compression can be
very large. The don’t care bits (X's) provide flexibility to allow a block
to be encoded with more than one possible codeword. The shortest pos-
sible codeword can be chosen for each block to maximize the compres-
sion. Algorithms for filling test cubes for maximizing compression are
described in Section VI.

Fig. 3. Block diagram illustrating compression/decompression scheme for a slower tester clock.
Fig. 4. Block diagram illustrating compression/decompression scheme using single tester channel to feed multiple scan chains.
To fully exploit the correlations in a test set, the number of bits in
each scan vector should be a multiple of the fixed-length block size
used for the statistical code. When dividing the test set into b-bit blocks for coding, if the size of the scan vectors is not a multiple of b, then X's can be added to pad the start of the vectors (first bits shifted into the scan chain) to make the length a multiple of b. Shifting some extra bits (at the start of the vector) into the scan chain does not matter provided the final contents of the scan chain form the correct test vector when it is applied to the core-under-test. Having each scan vector be a multiple of the block size aligns the blocks within the vectors so that the correlations between the bits will skew the frequencies.
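
A minimal sketch of this padding rule, assuming each scan vector is given as a string over '0', '1', and 'X' whose leftmost character is the first bit shifted into the scan chain:

```python
def pad_to_block_multiple(vector: str, b: int) -> str:
    """Prepend don't-care bits so that len(vector) is a multiple of b.

    The pad goes at the start of the vector (the first bits shifted into
    the scan chain), so the final scan-chain contents are unchanged.
    """
    pad = (-len(vector)) % b
    return "X" * pad + vector

print(pad_to_block_multiple("1011010", 4))  # -> 'X1011010'
```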
IV. OVERVIEW OF THE PROPOSED SCHEME
The hardware architecture for the proposed scheme is explained in
this section for a single scan chain. The compression/decompression
scheme proposed here involves statistically coding the scan vectors and
then placing an on-chip decoder at the serial input of the scan chain
to decompress the vectors. A block diagram illustrating the scheme is
shown in Fig. 3. The tester channel shifts a constant stream of variable
length codewords (corresponding to compressed scan data) to the de-
coder. The decoder generates the corresponding fixed-length blocks. At every tester clock cycle the decoder receives one bit from the tester. It takes the decoder L clock cycles to decode a codeword, where L is the length of the codeword. Once the decoder has decoded the codeword, it
has to shift the decoded source data into the scan chain. It is not desir-
able for the tester to wait for the decoder to finish shifting the decoded
output into the scan chain. This is because any such wait time induced
by the decoder will reduce the test time reduction that can otherwise be
obtained by compressing the test data. For this reason, in this scheme, a
serializer is used to provide some degree of parallelism between the two
operations, one being the receiving of the input bits from the tester and
decoding them, and the other being the shifting of the decoded output
into the scan chain. Note that the serializer can provide the necessary
parallelism in the shift operation because the decoder produces all the
bits of the decoded output in parallel (at the same time). If the serializer
can shift the decoded output into the scan chain within the time it takes
the decoder to decode the next codeword, then the decoder can imme-
diately load the next decoded output into the serializer and continue
with the decoding process without having to stop the tester. Since, in
many cases, the number of bits b in the fixed-length decoded block is
greater than the number of bits in the codeword, the rate at which data
needs to be shifted out of the decoder is higher than the rate at which
the data is coming into the decoder. There are two ways to achieve this.
1) Use scan chain with faster clock than tester clock. This is illus-
trated in Fig. 3. If the system clock rate is faster than the tester
clock rate, then it may be possible to clock the scan chain at a
faster clock rate than the tester’s clock rate (as described in [17]).
The serializer placed between the decoder and the scan chain
is then also clocked at the faster system clock rate. The serial-
izer is loaded in parallel by the decoder (allowing the decoder
to generate multiple bits of data in a slower tester clock cycle)
and serially shifted out into the scan chain at a faster clock rate.
One advantage of this approach is that it can be used to provide
at-speed scan with a slow tester [17].
2) Use single tester channel to feed multiple scan chains. This is il-
lustrated in Fig. 4. If it is not possible to clock the scan chain with
a faster clock than the tester clock, then another approach is to
have the tester channel rotate between n scan chains (each scan chain has its own decoder). At each clock cycle, the tester shifts in a bit for a different decoder for each of the n scan chains. Each of the n decoders simply samples its input once every n clock cycles in a different phase from the other decoders. For example, if there are two scan chains (n = 2), then the decoder for scan chain 1 would sample its input on even tester clock cycles, and the decoder for scan chain 2 would sample its input on odd tester clock cycles. With this approach, the “effective clock rate” for each of the decoders is divided by n. However, the scan chain corresponding to each decoder is still clocked at the normal tester clock rate and, thus, its clock rate is n times faster than the decoder. Each time the decoder is clocked once, the scan chain is clocked n times.
In the remainder of this paper, without loss of generality, it will be
assumed that the scan clock is faster than the tester clock (i.e., cor-
responding to scenario 1 above). However, all of the concepts apply
equally as well for scenario 2 where the tester channel feeds multiple
scan chains such that the “effective clock rate” seen by each decoder is
slower than the clock rate of the scan chain.
To illustrate how the decoder and serializer work, consider the fol-
lowing example. Suppose the scan vectors are divided into four-bit
blocks, and each four-bit block is replaced by a variable length code-
word. The compressed test data stored on the tester consists of the vari-
able length codewords. These codewords are shifted into the decoder
as a continuous stream of bits. If the codewords are prefix-free, then the decoder can easily recognize when it has received a complete codeword. When the decoder has received a complete codeword, it loads the
corresponding four-bit block in parallel into the serializer. The contents
of the serializer are then shifted into the scan chain. If the scan chain is
clocked at twice the clock rate that the tester operates at, then after two
tester clock periods the entire contents of the serializer will be shifted
into the scan chain. During the two tester clock periods that the serial-
izer is in operation, the decoder can be receiving the next codeword.
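
The decoder's behavior can be sketched in software as follows: a prefix-free codeword table is walked one bit per tester cycle, and a complete b-bit block is emitted as soon as a codeword is recognized. The codebook below is hypothetical, and the actual on-chip decoder is a small FSM feeding a serializer, not a dictionary lookup; this is only a functional model of the decoding step.

```python
def decode_stream(bits, codebook, b=4):
    """Decode a continuous bit stream of prefix-free codewords.

    codebook maps codeword strings to b-bit blocks. Because the code is
    prefix-free, the end of each codeword is recognized without lookahead.
    """
    blocks, current = [], ""
    for bit in bits:
        current += bit
        if current in codebook:           # complete codeword received
            blocks.append(codebook[current])
            current = ""                   # start accumulating the next one
    assert current == "", "stream ended inside a codeword"
    return blocks

# Hypothetical prefix-free codebook for three four-bit blocks.
codebook = {"0": "0010", "10": "0100", "11": "0110"}
print(decode_stream("0101100", codebook))  # -> ['0010', '0100', '0110', '0010']
```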
The key to making the scheme work is careful selection of the sta-
tistical code that is used for compressing the test set. There are two
important issues that must be considered in selecting the code: one is
that the decoder must be small in order to keep the area overhead down,
and the other is that the decoder must not output the decompressed bits
into the serializer faster than they can be shifted out into the scan chain.
While a Huffman code gives the optimum compression for a test set di-
vided into a particular fixed-length block size, it generally requires a
very large decoder. A Huffman code for a fixed-length block size of b bits requires a finite state machine (FSM) decoder with 2^b - 1 states.
Thus, the size of the decoder for a Huffman code grows exponentially
as the block size is increased. A method for selecting an efficient statis-
tical code for the proposed scheme is described in the following section.
In this scheme, the output response is assumed to be fully com-
pacted on-chip using standard response compaction hardware struc-
tures such as a multiple-input signature register (MISR). Test response
compaction is an extensively researched topic and several well-defined
techniques exist for doing so [4].
V. STATISTICAL CODE SELECTION FOR PROPOSED SCHEME
Given the test set for a core, a statistical code for compressing the test
set must be selected. There is a tradeoff in selecting the code between
the amount of compression that is achieved and the complexity of the
decoder. Moreover, if the clock frequency of the tester is f_T and the clock frequency of the scan chain is f_sys (system clock frequency), then the ratio of the system clock frequency and the tester clock frequency, f_sys/f_T, limits the minimum size of a codeword. If the test set is divided into fixed-length blocks of b bits, then the serializer will hold b bits, and, thus, it takes b scan-clock cycles to shift the buffer's contents into the scan chain.
Fig. 5. Huffman tree for the three highest frequency symbols in Table I.
During the time that the contents of the serializer are being shifted into the scan chain, the tester is shifting bits into the decoder. When the decoder receives a complete codeword, it needs to output the corresponding block of b bits into the serializer. If the codeword is too short, then the serializer may not have been emptied yet, which would cause a problem for the decoder. So, in order to ensure that the serializer is always empty when the decoder finishes decoding a codeword, the minimum size of a codeword L_min must be no smaller than the ratio of the tester and scan-clock rates times the size of each block

L_min >= b * (f_T / f_sys).    (1)
tester clock rate, then the minimum size of a codeword is 4. Note that
if it is not possible to have the scan clock rate be faster then the tester
clock rate, then an alternative solution (as previously described) is to
make the scan clock rate be twice as fast as the “effective clock rate” as
seen by the decoder by simply having the tester channel feed two scan
chains so that the rate that the decoder receives data from the tester is
half as fast as the rate at which data can be shifted into the scan chain.
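
The constraint in (1) can be evaluated directly; the small helper below (an illustrative function, not something defined in the paper) reproduces the example above.

```python
import math

def min_codeword_length(b: int, f_tester: float, f_scan: float) -> int:
    """Smallest allowed codeword length: L_min >= b * f_T / f_sys."""
    return math.ceil(b * f_tester / f_scan)

print(min_codeword_length(b=8, f_tester=1.0, f_scan=2.0))  # -> 4
```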
Using a Huffman code would provide the maximum compression;
however, it would require a complex decoder and may not satisfy the
constraint on the minimum size of a codeword. Therefore, some al-
ternative statistical code must be selected. The approach taken here
involves using a selective coding approach for which a very simple
decoder can be constructed. Consider the case where the test set is di-
vided into fixed-length blocks of b bits. There will be 2^b codewords. The first bit of each codeword will be used to indicate whether the following bits are coded or not. If the first bit of the codeword is a 0, then the next b bits are not coded and can simply be passed through the decoder as is (hence, the complete codeword has b + 1 bits). If the first bit of the codeword is a 1, then the next variable number of bits form a prefix-free code that will be translated by the decoder into a b-bit block. The idea is to only code the most frequently occurring b-bit blocks using codewords with small numbers of bits (less than b, but greater than or equal to L_min). Compression is achieved by having the most common b-bit blocks be represented by codewords with less than b bits. The decoder is simple because only a small number of blocks are coded. The vast majority of the blocks are not coded and can be simply passed through the decoder. If n blocks are coded, then the decoder can be implemented with an FSM having no more than n + b states (compared with a Huffman code, which requires 2^b - 1 states).
An example to illustrate the proposed approach for selecting a sta-
tistical code is shown in Fig. 5. Consider the test set in Fig. 1. If the
entire test set is divided into four-bit blocks then the frequency distri-
bution obtained is shown in the second column of Table I. As can be
seen from Table I, the symbols having the highest frequencies are 0010, 0100, and 0110. So, these are the symbols that are coded while the rest
of them will be left unchanged. A Huffman tree for the three patterns is
constructed to get their codewords (as shown in Fig. 5). The codewords
for the remaining 13 symbols are simply a 0 followed by the symbol
itself (as shown in the last column of Table I).
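
A software sketch of this selective code construction is given below: the n most frequent b-bit blocks get a '1' followed by a Huffman codeword built over just those n blocks, and every other block is sent uncoded as a '0' followed by the block itself. Function names and the frequency counts in the usage example are illustrative placeholders, not the actual Table I values.

```python
import heapq
from collections import Counter
from itertools import count

def selective_huffman_code(blocks, n):
    """Selective Huffman code: encode only the n most frequent blocks.

    Returns a dict mapping every distinct block to its codeword.
    Coded blocks:   '1' + Huffman codeword (tree built over the n blocks).
    Uncoded blocks: '0' + the block itself (b + 1 bits in total).
    """
    freq = Counter(blocks)
    coded = [sym for sym, _ in freq.most_common(n)]

    # Huffman tree over the n selected blocks only.
    tie = count()
    heap = [(freq[s], next(tie), {s: ""}) for s in coded]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    huff = heap[0][2] if heap else {}

    table = {s: "1" + w for s, w in huff.items()}
    for s in freq:                     # all remaining blocks stay uncoded
        table.setdefault(s, "0" + s)
    return table

# Illustrative usage: 0010/0100/0110 are taken as the three most frequent
# blocks (as in the example above); the counts themselves are placeholders.
blocks = ["0010"] * 22 + ["0100"] * 13 + ["0110"] * 7 + ["0001"] * 3 + ["1111"]
table = selective_huffman_code(blocks, n=3)
encoded = "".join(table[blk] for blk in blocks)
print(table["0010"], table["0001"], len(encoded))
```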
Fig. 6. Example test set to illustrate Alg1 for filling test cubes.
The two important parameters in selecting the code are the block size b and the number of coded blocks n. Once those have been chosen, then the procedure for constructing the code is mechanical. A Huffman tree
is formed for the n most frequently occurring b-bit blocks. The codewords for the most frequently occurring blocks are simply a 1, followed by the Huffman code obtained from the Huffman tree. The codewords for the remaining blocks are simply a 0, followed by the b-bit block itself. The amount of area overhead for the decoder can be controlled by placing an upper bound on the values of n and b. An increase in n implies an increase in the number of states of the decoder, where in the limiting case when all patterns are encoded the decoder becomes a full Huffman decoder. An increase in the block size b, on the other hand, implies an increase in the serializer area that is required for this scheme. In this case, the limit is to make each test vector a pattern, which results in a large hardware overhead to regenerate the test vectors from the codewords. The effect of n and b on the amount of compression will be discussed in greater detail in the experimental results section. For a particular value of b, the amount of compression that will be achieved can be computed in linear time with respect to the number of bits in the test set. Thus, the best value of b can be efficiently determined through experimentation. Several values of b can be tried for a particular test set to determine which gives the best compression. Similarly, the best value for n can also be efficiently determined through experimentation.
VI. ALGORITHMS FOR FILLING TEST CUBES
One of the advantages of implementing this selective Huffman en-
coding scheme on test cubes is that the unspecified bits can be filled
with 1’s and 0’s in a way that the frequency distribution of the patterns
becomes skewed. This helps in maximizing the compression. There are
several algorithms that can be used to fill the X's. In this section, two
will be discussed.
When the block size is sufficiently small, an exact analysis can be
done by considering all binary combinations (minterms) contained in
the unspecified blocks. This algorithm (henceforth, referred to as Alg1)
is illustrated with an example in the following paragraphs.
Fig. 6 shows an example test set consisting of three test cubes,
each of length 12. Let the block size be b = 4. Hence, the three test cubes shown above are partitioned into a set of 9 four-bit blocks, B = {10X1, XX10, 1XXX, X011, 10X1, 10X1, 0X10, 101X, 1XXX}. Each unspecified block can contain from 1 (if fully specified) to 2^4 = 16 (if completely unspecified)
possible binary combinations (minterms). For each of the 16 possible
minterms for a block, the frequency of occurrence is determined
by seeing how many of the unspecified blocks (in set B) contain that minterm. For example, the minterm 1111 is contained in two of the unspecified blocks in the set B, while the minterm 0000 is not contained in any of the unspecified blocks. The minterm that occurs most frequently (i.e., is contained in the largest number of unspecified blocks in set B) is selected first. The X's in each unspecified block that contains the most frequent minterm are specified so that it matches that minterm, and the unspecified block is then removed from the set B. The frequency of occurrence for each of the remaining minterms is then recomputed since the set B has been changed, and the procedure repeats until the set B is empty. This procedure maximizes the frequency of occurrence of the codewords, thereby increasing the encoding efficiency of the statistical encoding.
Fig. 7. Example test set from Fig. 6 after X's are filled using Alg1.
In the example in Fig. 6, the most frequently occurring minterm is 1011. Seven of the unspecified blocks in B contain 1011, so after the first iteration, the set B will contain only {XX10 and 0X10}. The most frequently occurring minterm in the second iteration is 0010, which is contained in both remaining unspecified blocks. The test cubes after specifying all the X's are shown in Fig. 7.
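
The following is a minimal sketch of Alg1, assuming blocks are strings over '0', '1', and 'X': it repeatedly picks the minterm contained in the largest number of unspecified blocks and specifies every block that contains it. The enumeration over all 2^b minterms makes the runtime exponential in b, so this is only practical for small block sizes.

```python
from itertools import product

def contains(block, minterm):
    """True if the cube 'block' (over 0/1/X) covers the fully specified minterm."""
    return all(c in ("X", m) for c, m in zip(block, minterm))

def alg1_fill(blocks):
    """Alg1 sketch: exact minterm-frequency X-filling (exponential in block size b)."""
    b = len(blocks[0])
    minterms = ["".join(bits) for bits in product("01", repeat=b)]
    filled = list(blocks)
    remaining = set(range(len(blocks)))      # indices of blocks not yet specified
    while remaining:
        # Pick the minterm contained in the largest number of remaining blocks.
        best = max(minterms,
                   key=lambda m: sum(contains(filled[i], m) for i in remaining))
        hit = {i for i in remaining if contains(filled[i], best)}
        for i in hit:                         # specify the X's to match 'best'
            filled[i] = best
        remaining -= hit
    return filled

# The nine blocks of the Fig. 6 example.
B = ["10X1", "XX10", "1XXX", "X011", "10X1", "10X1", "0X10", "101X", "1XXX"]
print(alg1_fill(B))   # seven blocks become 1011; the remaining two become 0010
```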
Alg1 provides an exact analysis of the frequency distribution of the
minterms by considering all possibilities. However, this comes at a cost
in terms of the runtime of the algorithm. It is easy to see that the algo-
rithm is exponential in block size b and, hence, the use of this algorithm
should be limited to small block sizes only. However, there are alter-
nate ways to specify the don’t care bits to maximize the compression
which trade off accuracy for faster runtime. The next algorithm (hence-
forth, referred to as Alg2) is extremely fast and in most cases produces
results comparable to the first algorithm. Alg2 is illustrated next with
the same example used in the previous case.
In Alg2, the most frequently occurring unspecified block is identi-
fied. It is then compared with the next most frequently occurring un-
specified block to see if there is a conflict in any bit position (i.e., one
has a 1 and the other has a 0, or vice versa). If there is no conflict, then they are merged by specifying all bit positions in which either block has a specified value. For example, if block X0X1 is merged with block X01X, then the resulting block is X011. Note that merging blocks
can only increase the number of specified bits. The most frequently
occurring unspecified block is compared with all the other unspecified
blocks in decreasing order of frequency and whenever merging is pos-
sible, it is done. This is done until no more merging can be done with
the most frequently occurring unspecified block. This process is then
repeated for the second most frequently occurring unspecified block.
This continues until there are no more blocks that can be merged. At
this point, all the remaining blocks are unique and cannot share any
minterms. Any remaining X's can now be randomly filled with 0's and 1's as they will have no impact on the amount of compression. Alg2 fills the X's by greedily merging unspecified blocks based on their
frequency of occurrence. This is a heuristic that skews the frequency
of occurrence, however, unlike Alg1, it is not guaranteed to maximize
the encoding efficiency since the greedy procedure may miss a better
merging order. However, it is a much faster procedure than Alg1 as the
number of operations is much less because merging is done right away
to reduce the set of blocks.
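
A compact sketch of Alg2 follows, again assuming blocks are strings over '0', '1', and 'X'. It orders the distinct unspecified blocks by frequency, greedily merges compatible blocks (no conflicting specified bit), and then maps each original block to a compatible merged cube, filling leftover X's with 0 (the paper allows an arbitrary fill). Helper names are illustrative.

```python
from collections import Counter

def compatible(a, b):
    """Two cubes can be merged if no position has a 1 in one and a 0 in the other."""
    return all(x == y or "X" in (x, y) for x, y in zip(a, b))

def merge(a, b):
    """Merge two compatible cubes by keeping every specified bit."""
    return "".join(y if x == "X" else x for x, y in zip(a, b))

def alg2_fill(blocks):
    """Alg2 sketch: greedy frequency-ordered merging of unspecified blocks."""
    freq = Counter(blocks)
    uniq = [blk for blk, _ in freq.most_common()]   # decreasing frequency
    i = 0
    while i < len(uniq):
        j = i + 1
        while j < len(uniq):
            if compatible(uniq[i], uniq[j]):
                uniq[i] = merge(uniq[i], uniq[j])   # absorb the less frequent block
                del uniq[j]
            else:
                j += 1
        i += 1
    # Map each original block to a compatible merged cube; fill leftover X's with 0.
    result = []
    for blk in blocks:
        rep = next(u for u in uniq if compatible(blk, u))
        result.append(merge(blk, rep).replace("X", "0"))
    return result

B = ["10X1", "XX10", "1XXX", "X011", "10X1", "10X1", "0X10", "101X", "1XXX"]
print(alg2_fill(B))   # e.g. the 10X1/1XXX/X011/101X blocks all map to 1011
```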
Consider applying Alg2 to the example test data shown in Fig. 6.
The set B as described earlier has 6 unique unspecified blocks: 10X1, XX10, 1XXX, X011, 0X10, and 101X. Let the set of these 6 unique blocks be denoted by B_uniq. Of these six unique blocks, the frequency of occurrence of block 10X1 is 3, that of block 1XXX is 2, and, for the rest, the frequency is 1. In the first step of the algorithm, since the block 10X1 is the most frequently occurring, it is compared with the next most frequently occurring block, which is 1XXX. Since there are no conflicts, they are merged, thereby reducing the set B_uniq. The merged block 10X1 is then compared with the other blocks that have frequency 1, and is merged with X011 and 101X. At this point, the set B_uniq = {1011, XX10, 0X10}. The procedure is then repeated again, starting with the next most frequently occurring unspecified block. In the end, B_uniq = {1011, 0X10} and no more merging can be done. The final test vector set is shown in Fig. 8.
Note that unlike the previous algorithm, in this case it is possible to
have some don’t care bits left over in the transformed test set which
can now be randomly filled with 1’s or 0’s without having any impact
on the amount of compression. The amount of compression obtained
