An Efficient Test Vector Compression Scheme
Using Selective Huffman Coding
Abhijit Jas, Jayabrata Ghosh-Dastidar, Mom-Eng Ng, and
Nur A. Touba
Abstract—This paper presents a compression/decompression scheme
based on selective Huffman coding for reducing the amount of test
data that must be stored on a tester and transferred to each core in
a system-on-a-chip (SOC) during manufacturing test. The test data
bandwidth between the tester and the SOC is a bottleneck that can
result in long test times when testing complex SOCs that contain many
cores. In the proposed scheme, the test vectors for the SOC are stored in
compressed form in the tester memory and transferred to the chip where
they are decompressed and applied to the cores. A small amount of on-chip
circuitry is used to decompress the test vectors. Given the set of test vectors
for a core, a modified Huffman code is carefully selected so that it satisfies
certain properties. These properties guarantee that the codewords can be
decoded by a simple pipelined decoder (placed at the serial input of the
core’s scan chain) that requires very small area. Results indicate that the
proposed scheme can provide test data compression nearly equal to that of
an optimum Huffman code with much less area overhead for the decoder.
Index Terms—Automatic test equipment, compression, decompression
architecture, embedded core testing, testing time, test set encoding.
I. INTRODUCTION
One of the key concerns in any design project is to meet
time-to-market constraints. In order to accomplish this goal, chip
designers often use predesigned and preverified cores to develop
systems-on-a-chip (SOC) devices. With time, these devices have
become extremely complex. This high level of integration has allowed
vendors to drive down the effective manufacturing costs. However, it
has also rapidly increased the complexity of testing these chips. One
of the increasingly difficult challenges in testing SOCs is dealing with
the large amount of test data that must be transferred between the
tester and the chip [9], [34]. Each core in an SOC has a given set of test
vectors that must be applied to it (usually through a test wrapper that is
provided around a core). The test vectors must be stored on the tester
and then transferred to the inputs of the core during modular testing.
As more and more cores (each with its own test set) are placed on a
single chip, the amount of total test data for the chip increases rapidly.
This poses a serious problem because of the cost and limitations of
automated test equipment (ATE). Testers have limited speed, channel
capacity, and memory. In general, the amount of time required to test a
chip depends on how much test data needs to be transferred to the chip
Manuscript received July 6, 2002; revised September 30, 2002. This work
was supported in part by the National Science Foundation under Grant no. MIP-
9702236 and in part by the Texas Advanced Research Program under Grant
no. 1997-003658-369. This paper was recommended by Associate Editor K.
Chakrabarty.
A. Jas was with the Department of Electrical and Computer Engineering, Uni-
versity of Texas, Austin, TX 78712-1084. He is now with the Intel Corporation,
Austin, TX 78746 USA.
J. Ghosh-Dastidar was with the Department of Electrical and Computer En-
gineering, University of Texas, Austin, TX 78712-1084 USA. He is now with
the Altera Corporation, San Jose, CA 95134 USA.
M.-E. Ng was with the Department of Electrical and Computer Engineering,
University of Texas, Austin, TX 78712-1084. She is now with Advanced Micro
Devices, Austin, TX 78741 USA.
N. A. Touba is with the Computer Engineering Research Center, Depart-
ment of Electrical and Computer Engineering, University of Texas, Austin, TX
78712-1084 USA (e-mail: touba@ece.utexas.edu).
Digital Object Identifier 10.1109/TCAD.2003.811452
and how fast the data can be transferred (i.e., the test data bandwidth
to the chip). This depends on the speed and channel capacity of the
tester and the organization and characteristics of the scan chains on
the chip. Both test time and test storage are major concerns for SOCs
from a test economics point of view.
This paper presents a statistical compression/decompression scheme
to reduce the amount of test data that needs to be stored on the tester
and transferred to the chip (preliminary results were published in [24]).
The idea is to store the test vectors for a core in the tester memory in
compressed form, and then transfer the compressed vectors to the chip,
where a small amount of on-chip circuitry is used to decompress the test
vectors. Instead of having to transfer each entire test vector from the tester to the core, a smaller amount of compressed data is transferred. The approach presented here significantly reduces both test
storage requirements and the overall test time.
Transferring compressed test vectors takes less time than transferring
the full vectors at a given bandwidth. However, in order to guarantee
a reduction in the overall test time, the decompression process should
not add delay (which would subtract from the time saved in transferring
the test data). Moreover, the on-chip decompression circuitry must be
small so that it does not add significant area overhead. Given a set of
test vectors, a method is presented here for choosing a statistical code
(a modified form of Huffman coding) that can be decoded with a simple
pipelined decoder. The properties of the code are chosen such that the
pipelined decoder has a very small area and is guaranteed to be able to
decode the test data as fast as the tester can transfer it.
The compression/decompression scheme presented in this paper can
be used for generating any set of deterministic scan vectors. It preserves
the sequence of the vectors and requires no modifications to the circuit-
under-test (CUT). It does not require any knowledge of the internal
design of the CUT, and, thus, is suitable for testing intellectual property
cores where the core supplier does not provide any information about
the internal structure of the core.
II. RELATED WORK
The problem of reducing test time and test data for core-based
SOCs has been attacked from several different angles in recent
literature. Novel approaches for compressing test data using the
Burrows–Wheeler transform and run-length coding were presented
in [21], [33]. These schemes were developed for reducing the time
to transfer test data from a workstation across a network to a tester
(not for use on chips). Scan chain architectures for core-based
designs that maximize bandwidth utilization are presented in [1].
A technique for compression/decompression of scan vectors using
cyclical decompressors and run-length coding is described in [23].
A modular built-in self-test (BIST) approach that allows sharing of
BIST control logic among multiple cores is presented in [30]. A novel
technique for combining BIST and external testing across multiple
cores is described in [32]. The idea of statistically encoding test data
was presented in [22]. They described a BIST scheme for nonscan
circuits based on statistical coding using comma codes (very similar to
Huffman codes) and run-length coding. An approach called “parallel serial full scan” (PSFS) for reducing test time in cores is presented in
[16]. A technique to reduce test data and test time by using specially
designed cores (cores with virtual scan chains) is presented in [26]. An
approach that uses a linear combinational expander circuit is described
in [2]. The use of Golomb codes and frequency-directed run-length
(FDR) codes for compressing test data has been demonstrated
in [5]–[7], respectively. The use of variable length input Huffman
codes for SOC test data compression has been proposed in [14]. A
fixed-to-fixed block encoding scheme is described in [28]. Techniques
for reusing scan chains from other cores in an SOC to increase the
test data bandwidth have been described in [11], and automatic test
pattern generation (ATPG) techniques for producing test cubes that are
suitable for encoding, using the above technique, have been described
in [12]. A fault simulation-based technique to reduce the entropy
of the test vector set by pattern transformation is described in [19].
Such transformations increase the amount of compression that can be
achieved on the transformed test set using statistical coding. ATPG
algorithms for producing test vectors that can more effectively be com-
pressed using statistical codes have been described in [20]. Test vector
compression based on hybrid BIST techniques has been described
in [10], [29], and [27]. A novel scheme of test vector compression
using an embedded processor is described in [25]. In [13], a test vector
compression technique based on geometric primitives is proposed.
Very recently, a new line of research has focused on compressing test
data volume while optimizing other factors like test power [8], [31].
Although all the techniques mentioned above have the same high-
level objective (that of reducing test data and/or test application time),
the different approaches present different design alternatives and their
applicability to a particular design situation varies from case to case.
This paper presents a selective Huffman coding scheme for testing
cores with internal scan. One of the features of this approach is that
the code that is used for a particular core is carefully chosen such that
only a small decoder circuit is required. There are no restrictions on
the order of the test set, and no modifications need to be made to the
core-under-test. The small decoder circuit is simply placed at the serial
input of the core’s scan chain. As will be shown, the decoder provides a
significant reduction in the amount of test data that must be transported
from the tester to the core.
III. STATISTICAL CODING
The compression/decompression scheme described in this paper is
based on statistical coding. In statistical coding, variable length code-
words are used to represent fixed-length blocks of bits in a data set. For
example, if a data set is divided into four-bit blocks, then there are 2^4, or 16, unique four-bit blocks. Each of the 16 possible four-bit blocks
can be represented by a binary codeword. The size of each codeword
is variable (it need not be four bits). The idea is to make the codewords that occur most frequently have a smaller number of bits, and those that occur least frequently have a larger number of bits. This minimizes
the average length of a codeword. The goal is to obtain a coded repre-
sentation of the original data set that has the smallest number of bits.
A Huffman code [18] is an optimal statistical code that is proven
to provide the shortest average codeword length among all uniquely
decodable variable length codes. A Huffman code is obtained by con-
structing a Huffman tree. The path from the root to each leaf gives the
codeword for the binary string corresponding to the leaf. An example
of constructing a Huffman code can be seen in Table I and Figs. 1 and
2. An example of a test set divided into four-bit blocks is shown in
Fig. 1. Table I shows the frequency of occurrence of each of the pos-
sible blocks (referred to as symbols). There are a total of 60 four-bit
blocks in the example in Fig. 1. Fig. 2 shows the Huffman tree for this
frequency distribution and the corresponding codewords are shown in
Table I.
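
As a concrete illustration, the following sketch builds a Huffman code for four-bit blocks from a frequency map using the standard tree-building procedure described above. The frequencies below are illustrative placeholders (summing to 60 blocks, with 0010, 0100, and 0110 as the most frequent symbols, as in the example); they are not the actual counts of Table I, which is not reproduced here.

```python
import heapq
from itertools import count

def huffman_code(freq):
    """Build a Huffman code (symbol -> bit string) from a frequency map.

    Standard construction: repeatedly merge the two lowest-frequency
    nodes; the path from the root to each leaf gives the codeword.
    """
    tiebreak = count()  # keeps heap entries comparable when frequencies tie
    heap = [(f, next(tiebreak), {sym: ""}) for sym, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        # Prepend 0 to codewords in one subtree and 1 in the other.
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

# Illustrative four-bit block frequencies (placeholders, not Table I's values).
freq = {"0010": 22, "0100": 13, "0110": 7, "0000": 4, "1011": 3,
        "0001": 3, "1000": 2, "0111": 2, "1111": 1, "0011": 1,
        "0101": 1, "1110": 1}
code = huffman_code(freq)
# Frequent blocks receive short codewords; infrequent blocks receive long ones.
for sym in sorted(code, key=lambda s: -freq[s]):
    print(sym, freq[sym], code[sym])
```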
An important property of Huffman codes is that they are prefix-free.
No codeword is a prefix of another codeword. This greatly simplifies
the decoding process. The decoder can instantaneously recognize the
end of a codeword uniquely without any look ahead.
TABLE I
STATISTICAL CODING BASED ON SYMBOL FREQUENCIES FOR TEST SET IN FIG. 1
Fig. 1. Example of test set divided into four-bit blocks.
Fig. 2. Huffman tree for the code shown in Table I.
The amount of compression that can be achieved with statistical
coding depends on how skewed the frequency of occurrence is for the
different codewords. If all of the codewords occur with equal frequency,
then no compression can be achieved. It is well known, however, that
the test vectors in a test set tend to have a lot of correlations. This arises
from the fact that faults in the CUT that are structurally related require
similar input value assignments in order to be provoked and sensitized
to an output. This often results in skewed frequency of occurrence for
different codewords. Moreover, for test cubes, the compression can be
very large. The don’t care bits (X's) provide flexibility to allow a block
to be encoded with more than one possible codeword. The shortest pos-
sible codeword can be chosen for each block to maximize the compres-
sion. Algorithms for filling test cubes for maximizing compression are
described in Section VI.

Fig. 3. Block diagram illustrating compression/decompression scheme for a slower tester clock.
Fig. 4. Block diagram illustrating compression/decompression scheme using single tester channel to feed multiple scan chains.
To fully exploit the correlations in a test set, the number of bits in
each scan vector should be a multiple of the fixed-length block size
used for the statistical code. When dividing the test set into b-bit blocks for coding, if the size of the scan vectors is not a multiple of b, then X's can be added to pad the start of the vectors (first bits shifted into the scan chain) to make the length a multiple of b. Shifting some extra bits (at the start of the vector) into the scan chain does not matter provided the final contents of the scan chain form the correct test vector when it is applied to the core-under-test. Having each scan vector be a multiple of the block size aligns the blocks within the vectors so that the correlations between the bits will skew the frequencies.
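
A minimal sketch of this padding rule, assuming each scan vector is given as a string over '0', '1', and 'X' whose leftmost character is the first bit shifted into the scan chain:

```python
def pad_to_block_multiple(vector: str, b: int) -> str:
    """Prepend don't-care bits so that len(vector) is a multiple of b.

    The pad goes at the start of the vector (the first bits shifted into
    the scan chain), so the final scan-chain contents are unchanged.
    """
    pad = (-len(vector)) % b
    return "X" * pad + vector

print(pad_to_block_multiple("1011010", 4))  # -> 'X1011010'
```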
IV. OVERVIEW OF THE PROPOSED SCHEME
The hardware architecture for the proposed scheme is explained in
this section for a single scan chain. The compression/decompression
scheme proposed here involves statistically coding the scan vectors and
then placing an on-chip decoder at the serial input of the scan chain
to decompress the vectors. A block diagram illustrating the scheme is
shown in Fig. 3. The tester channel shifts a constant stream of variable
length codewords (corresponding to compressed scan data) to the de-
coder. The decoder generates the corresponding fixed-length blocks. At every tester clock cycle the decoder receives one bit from the tester. It takes the decoder L clock cycles to decode a codeword, where L is the length of the codeword. Once the decoder has decoded the codeword, it
has to shift the decoded source data into the scan chain. It is not desir-
able for the tester to wait for the decoder to finish shifting the decoded
output into the scan chain. This is because any such wait time induced
by the decoder will reduce the test time reduction that can otherwise be
obtained by compressing the test data. For this reason, in this scheme, a
serializer is used to provide some degree of parallelism between the two
operations, one being the receiving of the input bits from the tester and
decoding them, and the other being the shifting of the decoded output
into the scan chain. Note that the serializer can provide the necessary
parallelism in the shift operation because the decoder produces all the
bits of the decoded output in parallel (at the same time). If the serializer
can shift the decoded output into the scan chain within the time it takes
the decoder to decode the next codeword, then the decoder can imme-
diately load the next decoded output into the serializer and continue
with the decoding process without having to stop the tester. Since, in
many cases, the number of bits b in the fixed-length decoded block is
greater than the number of bits in the codeword, the rate at which data
needs to be shifted out of the decoder is higher than the rate at which
the data is coming into the decoder. There are two ways to achieve this.
1) Use scan chain with faster clock than tester clock. This is illus-
trated in Fig. 3. If the system clock rate is faster than the tester
clock rate, then it may be possible to clock the scan chain at a
faster clock rate than the tester’s clock rate (as described in [17]).
The serializer placed between the decoder and the scan chain
is then also clocked at the faster system clock rate. The serial-
izer is loaded in parallel by the decoder (allowing the decoder
to generate multiple bits of data in a slower tester clock cycle)
and serially shifted out into the scan chain at a faster clock rate.
One advantage of this approach is that it can be used to provide
at-speed scan with a slow tester [17].
2) Use single tester channel to feed multiple scan chains. This is il-
lustrated in Fig. 4. If it is not possible to clock the scan chain with
a faster clock than the tester clock, then another approach is to
have the tester channel rotate between n scan chains (each scan chain has its own decoder). At each clock cycle, the tester shifts in a bit for a different decoder for each of the n scan chains. Each of the n decoders simply samples its input once every n clock cycles in a different phase from the other decoders. For example, if there are two scan chains (n = 2), then the decoder for scan chain 1 would sample its input on even tester clock cycles, and the decoder for scan chain 2 would sample its input on odd tester clock cycles. With this approach, the “effective clock rate” for each of the decoders is divided by n. However, the scan chain corresponding to each decoder is still clocked at the normal tester clock rate and, thus, its clock rate is n times faster than the decoder. Each time the decoder is clocked once, the scan chain is clocked n times.
In the remainder of this paper, without loss of generality, it will be
assumed that the scan clock is faster than the tester clock (i.e., cor-
responding to scenario 1 above). However, all of the concepts apply
equally as well for scenario 2 where the tester channel feeds multiple
scan chains such that the “effective clock rate” seen by each decoder is
slower than the clock rate of the scan chain.
To illustrate how the decoder and serializer work, consider the fol-
lowing example. Suppose the scan vectors are divided into four-bit
blocks, and each four-bit block is replaced by a variable length code-
word. The compressed test data stored on the tester consists of the vari-
able length codewords. These codewords are shifted into the decoder
as a continuous stream of bits. If the codewords are prefix-free, then the decoder can easily recognize when it has received a complete codeword. When the decoder has received a complete codeword, it loads the
corresponding four-bit block in parallel into the serializer. The contents
of the serializer are then shifted into the scan chain. If the scan chain is
clocked at twice the clock rate that the tester operates at, then after two
tester clock periods the entire contents of the serializer will be shifted
into the scan chain. During the two tester clock periods that the serial-
izer is in operation, the decoder can be receiving the next codeword.
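
The decoder's behavior can be sketched in software as follows: a prefix-free codeword table is walked one bit per tester cycle, and a complete b-bit block is emitted as soon as a codeword is recognized. The codebook below is hypothetical, and the actual on-chip decoder is a small FSM feeding a serializer, not a dictionary lookup; this is only a functional model of the decoding step.

```python
def decode_stream(bits, codebook, b=4):
    """Decode a continuous bit stream of prefix-free codewords.

    codebook maps codeword strings to b-bit blocks. Because the code is
    prefix-free, the end of each codeword is recognized without lookahead.
    """
    blocks, current = [], ""
    for bit in bits:
        current += bit
        if current in codebook:           # complete codeword received
            blocks.append(codebook[current])
            current = ""                   # start accumulating the next one
    assert current == "", "stream ended inside a codeword"
    return blocks

# Hypothetical prefix-free codebook for three four-bit blocks.
codebook = {"0": "0010", "10": "0100", "11": "0110"}
print(decode_stream("0101100", codebook))  # -> ['0010', '0100', '0110', '0010']
```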
The key to making the scheme work is careful selection of the sta-
tistical code that is used for compressing the test set. There are two
important issues that must be considered in selecting the code: one is
that the decoder must be small in order to keep the area overhead down,
and the other is that the decoder must not output the decompressed bits
into the serializer faster than they can be shifted out into the scan chain.
While a Huffman code gives the optimum compression for a test set di-
vided into a particular fixed-length block size, it generally requires a
very large decoder. A Huffman code for a fixed-length block size of b bits requires a finite state machine (FSM) decoder with 2^b - 1 states.
Thus, the size of the decoder for a Huffman code grows exponentially
as the block size is increased. A method for selecting an efficient statis-
tical code for the proposed scheme is described in the following section.
In this scheme, the output response is assumed to be fully com-
pacted on-chip using standard response compaction hardware struc-
tures such as a multiple-input signature register (MISR). Test response
compaction is an extensively researched topic and several well-defined
techniques exist for doing so [4].
V. STATISTICAL CODE SELECTION FOR PROPOSED SCHEME
Given the test set for a core, a statistical code for compressing the test
set must be selected. There is a tradeoff in selecting the code between
the amount of compression that is achieved and the complexity of the
decoder. Moreover, if the clock frequency of the tester is f_T and the clock frequency of the scan chain is f_sys (system clock frequency), then the ratio of the system clock frequency and the tester clock frequency, f_sys/f_T, limits the minimum size of a codeword. If the test set is divided into fixed-length blocks of b bits, then the serializer will hold b bits, and, thus, it takes b scan-clock cycles to shift the buffer's contents into the scan chain.
Fig. 5. Huffman tree for the three highest frequency symbols in Table I.
During the time that the contents of the serializer are being shifted into the scan chain, the tester is shifting bits into the decoder. When the decoder receives a complete codeword, it needs to output the corresponding block of b bits into the serializer. If the codeword is too short, then the serializer may not have been emptied yet, which would cause a problem for the decoder. So, in order to ensure that the serializer is always empty when the decoder finishes decoding a codeword, the minimum size of a codeword L_min must be no smaller than the ratio of the tester and scan-clock rates times the size of each block

L_min >= b * (f_T / f_sys).    (1)
tester clock rate, then the minimum size of a codeword is 4. Note that
if it is not possible to have the scan clock rate be faster then the tester
clock rate, then an alternative solution (as previously described) is to
make the scan clock rate be twice as fast as the “effective clock rate” as
seen by the decoder by simply having the tester channel feed two scan
chains so that the rate that the decoder receives data from the tester is
half as fast as the rate at which data can be shifted into the scan chain.
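
The constraint in (1) can be evaluated directly; the small helper below (an illustrative function, not something defined in the paper) reproduces the example above.

```python
import math

def min_codeword_length(b: int, f_tester: float, f_scan: float) -> int:
    """Smallest allowed codeword length: L_min >= b * f_T / f_sys."""
    return math.ceil(b * f_tester / f_scan)

print(min_codeword_length(b=8, f_tester=1.0, f_scan=2.0))  # -> 4
```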
Using a Huffman code would provide the maximum compression;
however, it would require a complex decoder and may not satisfy the
constraint on the minimum size of a codeword. Therefore, some al-
ternative statistical code must be selected. The approach taken here
involves using a selective coding approach for which a very simple
decoder can be constructed. Consider the case where the test set is di-
vided into fixed-length blocks of b bits. There will be 2^b codewords. The first bit of each codeword will be used to indicate whether the following bits are coded or not. If the first bit of the codeword is a 0, then the next b bits are not coded and can simply be passed through the decoder as is (hence, the complete codeword has b + 1 bits). If the first bit of the codeword is a 1, then the next variable number of bits form a prefix-free code that will be translated by the decoder into a b-bit block. The idea is to only code the most frequently occurring b-bit blocks using codewords with small numbers of bits (less than b, but greater than or equal to L_min). Compression is achieved by having the most common b-bit blocks be represented by codewords with less than b bits. The decoder is simple because only a small number of blocks are coded. The vast majority of the blocks are not coded and can be simply passed through the decoder. If n blocks are coded, then the decoder can be implemented with an FSM having no more than n + b states (compared with a Huffman code, which requires 2^b - 1 states).
An example to illustrate the proposed approach for selecting a sta-
tistical code is shown in Fig. 5. Consider the test set in Fig. 1. If the
entire test set is divided into four-bit blocks then the frequency distri-
bution obtained is shown in the second column of Table I. As can be
seen from Table I, the symbols having the highest frequencies are 0010, 0100, and 0110. So, these are the symbols that are coded while the rest
of them will be left unchanged. A Huffman tree for the three patterns is
constructed to get their codewords (as shown in Fig. 5). The codewords
for the remaining 13 symbols are simply a 0 followed by the symbol
itself (as shown in the last column of Table I).
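
A software sketch of this selective code construction is given below: the n most frequent b-bit blocks get a '1' followed by a Huffman codeword built over just those n blocks, and every other block is sent uncoded as a '0' followed by the block itself. Function names and the frequency counts in the usage example are illustrative placeholders, not the actual Table I values.

```python
import heapq
from collections import Counter
from itertools import count

def selective_huffman_code(blocks, n):
    """Selective Huffman code: encode only the n most frequent blocks.

    Returns a dict mapping every distinct block to its codeword.
    Coded blocks:   '1' + Huffman codeword (tree built over the n blocks).
    Uncoded blocks: '0' + the block itself (b + 1 bits in total).
    """
    freq = Counter(blocks)
    coded = [sym for sym, _ in freq.most_common(n)]

    # Huffman tree over the n selected blocks only.
    tie = count()
    heap = [(freq[s], next(tie), {s: ""}) for s in coded]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    huff = heap[0][2] if heap else {}

    table = {s: "1" + w for s, w in huff.items()}
    for s in freq:                     # all remaining blocks stay uncoded
        table.setdefault(s, "0" + s)
    return table

# Illustrative usage: 0010/0100/0110 are taken as the three most frequent
# blocks (as in the example above); the counts themselves are placeholders.
blocks = ["0010"] * 22 + ["0100"] * 13 + ["0110"] * 7 + ["0001"] * 3 + ["1111"]
table = selective_huffman_code(blocks, n=3)
encoded = "".join(table[blk] for blk in blocks)
print(table["0010"], table["0001"], len(encoded))
```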
Fig. 6. Example test set to illustrate Alg1 for filling test cubes.
The two important parameters in selecting the code are the block size b and the number of coded blocks n. Once those have been chosen, then the procedure for constructing the code is mechanical. A Huffman tree
is formed for the n most frequently occurring b-bit blocks. The codewords for the most frequently occurring blocks are simply a 1, followed by the Huffman code obtained from the Huffman tree. The codewords for the remaining blocks are simply a 0, followed by the b-bit block itself. The amount of area overhead for the decoder can be controlled by placing an upper bound on the values of n and b. An increase in n implies an increase in the number of states of the decoder, where in the limiting case when all patterns are encoded the decoder becomes a full Huffman decoder. An increase in the block size b, on the other hand, implies an increase in the serializer area that is required for this scheme. In this case, the limit is to make each test vector a pattern, which results in a large hardware overhead to regenerate the test vectors from the codewords. The effect of n and b on the amount of compression will be discussed in greater detail in the experimental results section. For a particular value of b, the amount of compression that will be achieved can be computed in linear time with respect to the number of bits in the test set. Thus, the best value of b can be efficiently determined through experimentation. Several values of b can be tried for a particular test set to determine which gives the best compression. Similarly, the best value for n can also be efficiently determined through experimentation.
VI. ALGORITHMS FOR FILLING TEST CUBES
One of the advantages of implementing this selective Huffman en-
coding scheme on test cubes is that the unspecified bits can be filled
with 1’s and 0’s in a way that the frequency distribution of the patterns
becomes skewed. This helps in maximizing the compression. There are
several algorithms that can be used to fill the X's. In this section, two
will be discussed.
When the block size is sufficiently small, an exact analysis can be
done by considering all binary combinations (minterms) contained in
the unspecified blocks. This algorithm (henceforth, referred to as Alg1)
is illustrated with an example in the following paragraphs.
Fig. 6 shows an example test set consisting of three test cubes,
each of length 12. Let the block size be b = 4. Hence, the three test cubes shown above are partitioned into a set of 9 four-bit blocks, B = {10X1, XX10, 1XXX, X011, 10X1, 10X1, 0X10, 101X, 1XXX}. Each unspecified block can contain from 1 (if fully specified) to 2^4 = 16 (if completely unspecified)
possible binary combinations (minterms). For each of the 16 possible
minterms for a block, the frequency of occurrence is determined
by seeing how many of the unspecified blocks (in set B) contain that minterm. For example, the minterm 1111 is contained in two of the unspecified blocks in the set B, while the minterm 0000 is not contained in any of the unspecified blocks. The minterm that occurs most frequently (i.e., is contained in the largest number of unspecified blocks in set B) is selected first. The X's in each unspecified block that contains the most frequent minterm are specified so that it matches that minterm, and the unspecified block is then removed from the set B. The frequency of occurrence for each of the remaining minterms is then recomputed since the set B has been changed, and the procedure repeats until the set B is empty. This procedure maximizes the frequency of occurrence of the codewords, thereby increasing the encoding efficiency of the statistical encoding.
Fig. 7. Example test set from Fig. 6 after X's are filled using Alg1.
In the example in Fig. 6, the most frequently occurring minterm is 1011. Seven of the unspecified blocks in B contain 1011, so after the first iteration, the set B will contain only {XX10 and 0X10}. The most frequently occurring minterm in the second iteration is 0010, which is contained in both remaining unspecified blocks. The test cubes after specifying all the X's are shown in Fig. 7.
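
The following is a minimal sketch of Alg1, assuming blocks are strings over '0', '1', and 'X': it repeatedly picks the minterm contained in the largest number of unspecified blocks and specifies every block that contains it. The enumeration over all 2^b minterms makes the runtime exponential in b, so this is only practical for small block sizes.

```python
from itertools import product

def contains(block, minterm):
    """True if the cube 'block' (over 0/1/X) covers the fully specified minterm."""
    return all(c in ("X", m) for c, m in zip(block, minterm))

def alg1_fill(blocks):
    """Alg1 sketch: exact minterm-frequency X-filling (exponential in block size b)."""
    b = len(blocks[0])
    minterms = ["".join(bits) for bits in product("01", repeat=b)]
    filled = list(blocks)
    remaining = set(range(len(blocks)))      # indices of blocks not yet specified
    while remaining:
        # Pick the minterm contained in the largest number of remaining blocks.
        best = max(minterms,
                   key=lambda m: sum(contains(filled[i], m) for i in remaining))
        hit = {i for i in remaining if contains(filled[i], best)}
        for i in hit:                         # specify the X's to match 'best'
            filled[i] = best
        remaining -= hit
    return filled

# The nine blocks of the Fig. 6 example.
B = ["10X1", "XX10", "1XXX", "X011", "10X1", "10X1", "0X10", "101X", "1XXX"]
print(alg1_fill(B))   # seven blocks become 1011; the remaining two become 0010
```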
Alg1 provides an exact analysis of the frequency distribution of the
minterms by considering all possibilities. However, this comes at a cost
in terms of the runtime of the algorithm. It is easy to see that the algo-
rithm is exponential in block size b and, hence, the use of this algorithm
should be limited to small block sizes only. However, there are alter-
nate ways to specify the don’t care bits to maximize the compression
which trade off accuracy for faster runtime. The next algorithm (hence-
forth, referred to as Alg2) is extremely fast and in most cases produces
results comparable to the first algorithm. Alg2 is illustrated next with
the same example used in the previous case.
In Alg2, the most frequently occurring unspecified block is identi-
fied. It is then compared with the next most frequently occurring un-
specified block to see if there is a conflict in any bit position (i.e., one
has a 1 and the other has a 0, or vice versa). If there is no conflict, then they are merged by specifying all bit positions in which either block has a specified value. For example, if block X0X1 is merged with block X01X, then the resulting block is X011. Note that merging blocks
can only increase the number of specified bits. The most frequently
occurring unspecified block is compared with all the other unspecified
blocks in decreasing order of frequency and whenever merging is pos-
sible, it is done. This is done until no more merging can be done with
the most frequently occurring unspecified block. This process is then
repeated for the second most frequently occurring unspecified block.
This continues until there are no more blocks that can be merged. At
this point, all the remaining blocks are unique and cannot share any
minterms. Any remaining X's can now be randomly filled with 0's and 1's as they will have no impact on the amount of compression. Alg2 fills the X's by greedily merging unspecified blocks based on their
frequency of occurrence. This is a heuristic that skews the frequency
of occurrence, however, unlike Alg1, it is not guaranteed to maximize
the encoding efficiency since the greedy procedure may miss a better
merging order. However, it is a much faster procedure than Alg1 as the
number of operations is much less because merging is done right away
to reduce the set of blocks.
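
A compact sketch of Alg2 follows, again assuming blocks are strings over '0', '1', and 'X'. It orders the distinct unspecified blocks by frequency, greedily merges compatible blocks (no conflicting specified bit), and then maps each original block to a compatible merged cube, filling leftover X's with 0 (the paper allows an arbitrary fill). Helper names are illustrative.

```python
from collections import Counter

def compatible(a, b):
    """Two cubes can be merged if no position has a 1 in one and a 0 in the other."""
    return all(x == y or "X" in (x, y) for x, y in zip(a, b))

def merge(a, b):
    """Merge two compatible cubes by keeping every specified bit."""
    return "".join(y if x == "X" else x for x, y in zip(a, b))

def alg2_fill(blocks):
    """Alg2 sketch: greedy frequency-ordered merging of unspecified blocks."""
    freq = Counter(blocks)
    uniq = [blk for blk, _ in freq.most_common()]   # decreasing frequency
    i = 0
    while i < len(uniq):
        j = i + 1
        while j < len(uniq):
            if compatible(uniq[i], uniq[j]):
                uniq[i] = merge(uniq[i], uniq[j])   # absorb the less frequent block
                del uniq[j]
            else:
                j += 1
        i += 1
    # Map each original block to a compatible merged cube; fill leftover X's with 0.
    result = []
    for blk in blocks:
        rep = next(u for u in uniq if compatible(blk, u))
        result.append(merge(blk, rep).replace("X", "0"))
    return result

B = ["10X1", "XX10", "1XXX", "X011", "10X1", "10X1", "0X10", "101X", "1XXX"]
print(alg2_fill(B))   # e.g. the 10X1/1XXX/X011/101X blocks all map to 1011
```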
Consider applying Alg2 to the example test data shown in Fig. 6.
The set B as described earlier has 6 unique unspecified blocks: 10X1, XX10, 1XXX, X011, 0X10, and 101X. Let the set of these 6 unique blocks be denoted by B_uniq. Of these six unique blocks, the frequency of occurrence of block 10X1 is 3, that of block 1XXX is 2, and, for the rest, the frequency is 1. In the first step of the algorithm, since the block 10X1 is the most frequently occurring, it is compared with the next most frequently occurring block, which is 1XXX. Since there are no conflicts, they are merged, thereby reducing the set B_uniq. The merged block 10X1 is then compared with the other blocks that have frequency 1, and is merged with X011 and 101X. At this point, the set B_uniq = {1011, XX10, 0X10}. The procedure is then repeated again, starting with the next most frequently occurring unspecified block. In the end, B_uniq = {1011, 0X10} and no more merging can be done. The final test vector set is shown in Fig. 8.
Note that unlike the previous algorithm, in this case it is possible to
have some don’t care bits left over in the transformed test set which
can now be randomly filled with 1’s or 0’s without having any impact
on the amount of compression. The amount of compression obtained
