
A* Algorithm Inspired Memory-Efficient Detection for MIMO Systems

Ronald Y. Chang, Wei-Ho Chung, Sian-Jheng Lin

18 Jul 2012-IEEE Wireless Communications Letters (IEEE)-Vol. 1, Iss: 5, pp 508-511

TL;DR: Modified best-first detection algorithms in which the order of nodes is determined by both the original cost and the estimated future cost associated with each node are proposed, as inspired by an improved shortest path algorithm (A* algorithm).


Abstract: Implementation of a best-first detection algorithm for multiple-input multiple-output (MIMO) systems requires large amounts of memory especially in large systems with high-order modulation. In this letter, we propose modified best-first detection algorithms in which the order of nodes is determined by both the original cost and the estimated future cost associated with each node, as inspired by an improved shortest path algorithm (A* algorithm). The modified algorithms maintain the detection optimality, reduce the memory requirement and sorting complexity, and achieve improved detection performance in memory-constrained scenarios.


Summary (1 min read)


Introduction

Best-first search (BFS) detection schemes [2]–[6] based on Dijkstra's (or stack) algorithm maintain a list of nodes sorted by some defined cost and explore the nodes in that order.
Imposing a memory constraint [6] facilitates hardware implementation and reduces the search complexity at the cost of some performance degradation.
The proposed methods are described in Sec. III, with complexity and performance results presented in Sec. IV.

II. TRANSMISSION SYSTEM AND BEST-FIRST DETECTION

The transmitted symbol vector $\tilde{x}_c$ contains uncorrelated entries selected equiprobably from the squared quadrature amplitude modulation (QAM) alphabet $S = \{a + ib \mid a, b \in \mathcal{Q}\}$ and has zero mean and covariance matrix $\sigma_x^2 I_{N_T}$, where $\mathcal{Q}$ is the pulse amplitude modulation (PAM) alphabet and $I_{N_T}$ is the $N_T \times N_T$ identity matrix.
The channel matrix $H_c$ has independent and identically distributed (i.i.d.) Gaussian entries with zero mean and covariance matrix $\sigma_H^2 I_{N_R}$, where $\sigma_H^2 = 1$.
The channel information is assumed perfectly known to the receiver.
The authors reach (9) by rewriting the objective function, where the second and third terms do not depend on $x_1^{k-1}$.

A. Complexity Evaluation

Here, the authors evaluate the overall computational complexity of the proposed algorithms in comparison with conventional methods.
Since all processing is conducted on real values based on (2), all the calculations below refer to real operations.
The complexity of a tree-search detection scheme is evaluated in terms of the number of nodes visited and expanded (defined respectively by nodes that ever occupy a position and become the best node in the node list).
Similar calculations can be carried out for the BFS-LA2 algorithm.

B. Simulation Results

Here, the authors present the simulation results: symbol error rate (SER) performance in Fig. 1, memory usage in Fig. 2, and complexity in terms of floating-point operations (flops) in Table I (each real multiplication or addition counts as one flop).
Similar observations can be made in Fig. 1(b).
Fig. 2 illustrates the memory-reduction capability of the proposed schemes.

V. CONCLUSION

Modified BFS-based MIMO detection algorithms incorporating an efficient look-ahead mechanism have been presented.
Simulation results demonstrated that the proposed algorithms maintain exact ML detection capability while achieving memory savings and enhanced performance in memory-constrained scenarios.
Complexity analysis was conducted to confirm the computational feasibility of the proposed algorithms.


Figures (3)

TABLE I COMPLEXITY (FLOPS) COMPARISONS OF MIMO DETECTION SCHEMES (LIST-SIZE CONSTRAINT L = 4 FOR 4× 4 16-QAM AND L = 8 FOR 4× 4 64-QAM; BIAS = 0.01)

Fig. 2. Memory usage for BFS-based detection schemes. (a) 4 × 4 MIMO with 16-QAM and 64-QAM. (b) 8 × 8 MIMO with 16-QAM and 64-QAM. Notations follow those in Table I.

Fig. 1. SER performance of MIMO detection schemes. (a) 4×4 MIMO with 16-QAM. (b) 4× 4 MIMO with 64-QAM. Notations follow those in Table I.


508 IEEE WIRELESS COMMUNICATIONS LETTERS, VOL. 1, NO. 5, OCTOBER 2012

A* Algorithm Inspired Memory-Efficient Detection for MIMO Systems

Ronald Y. Chang, Wei-Ho Chung, Member, IEEE, and Sian-Jheng Lin

Abstract: Implementation of a best-first detection algorithm for multiple-input multiple-output (MIMO) systems requires large amounts of memory, especially in large systems with high-order modulation. In this letter, we propose modified best-first detection algorithms in which the order of nodes is determined by both the original cost and the estimated future cost associated with each node, as inspired by an improved shortest path algorithm (A* algorithm). The modified algorithms maintain the detection optimality, reduce the memory requirement and sorting complexity, and achieve improved detection performance in memory-constrained scenarios.

Index Terms: Maximum likelihood (ML) decoding, multiple-input multiple-output (MIMO) systems, tree-search detection, Dijkstra's algorithm, A* algorithm, memory efficiency.

I. INTRODUCTION

The multiple-input multiple-output (MIMO) detection problem can be viewed as a tree-search problem [1]. Best-first search (BFS) detection schemes [2]–[6] based on Dijkstra's (or stack) algorithm maintain a list (stack) of nodes sorted by some defined cost and explore the nodes in that order. While BFS detection can minimize the number of searched nodes needed to establish the maximum likelihood (ML) solution [4], it has a prohibitively large memory requirement. Imposing a memory constraint [6] facilitates hardware implementation and reduces the search complexity at the cost of some performance degradation. Modifying the sorting criterion, for example by using a biased cost [3], can improve the error and complexity performance of a BFS scheme in memory-constrained scenarios, yet at the loss of detection optimality. The choice of the bias is also heuristic, requiring empirical efforts to determine proper values. Optimal detection performance in memory-constrained scenarios is guaranteed by the scheme proposed in [7], which combines the memory-efficient sphere decoder and the computationally efficient Dijkstra's algorithm. However, it generally searches more nodes than the original BFS scheme.

In this letter, we propose an optimal BFS detection scheme inspired by the A* algorithm [8], which speeds up the original Dijkstra's algorithm without losing algorithm optimality (the shortest path is guaranteed). By adding a new term to the original cost of nodes, the proposed modified BFS scheme demonstrates memory efficiency and improved error performance over conventional BFS detectors in memory-constrained scenarios. Simulation and complexity studies also show the tradeoff between error and memory performance and the computations required for obtaining the added term.

This letter is organized as follows. Sec. II presents the system model and BFS detection. The proposed methods are described in Sec. III, with complexity and performance results presented in Sec. IV. Conclusion is given in Sec. V.

Manuscript received June 18, 2012. The associate editor coordinating the review of this letter and approving it for publication was G. Vitetta.

This work was supported by the National Science Council of Taiwan under Grant NSC 100-2221-E-001-004.

The authors are with the Research Center for Information Technology Innovation, Academia Sinica, Taipei, Taiwan (e-mail: yjrchang@gmail.com, {whc, sjlin}@citi.sinica.edu.tw).

Digital Object Identifier 10.1109/WCL.2012.071612.120450

II. TRANSMISSION SYSTEM AND BEST-FIRST DETECTION

We consider an uncoded MIMO transmission system with $N_T$ ($N_R$) transmit (receive) antennas (denoted by an $N_T \times N_R$ system). The baseband signal model is given by

$$y_c = H_c \tilde{x}_c + v_c \tag{1}$$

where $y_c$ is the $N_R \times 1$ received signal containing the $N_T \times 1$ transmitted signal $\tilde{x}_c$ perturbed by the $N_R \times N_T$ uncorrelated flat-fading channel $H_c$ and the $N_R \times 1$ noise $v_c$. The transmitted symbol vector $\tilde{x}_c$ contains uncorrelated entries selected equiprobably from the squared quadrature amplitude modulation (QAM) alphabet $S = \{a + ib \mid a, b \in \mathcal{Q}\}$ and has zero mean and covariance matrix $\sigma_x^2 I_{N_T}$, where $\mathcal{Q}$ is the pulse amplitude modulation (PAM) alphabet and $I_{N_T}$ is the $N_T \times N_T$ identity matrix. The complex-valued channel matrix $H_c$ has independent and identically distributed (i.i.d.) Gaussian entries with zero mean and covariance matrix $\sigma_H^2 I_{N_R}$, where $\sigma_H^2 = 1$. The channel information is assumed perfectly known to the receiver. Noise $v_c$ is additive white Gaussian noise (AWGN) with i.i.d. complex elements and has zero mean and covariance matrix $\sigma_v^2 I_{N_R}$.

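The complex signal model in (1) can be transformed into an equivalent real signal model by defining $\tilde{y} = [\Re(y_c)^T \; \Im(y_c)^T]^T$, $x = [\Re(\tilde{x}_c)^T \; \Im(\tilde{x}_c)^T]^T$, $v = [\Re(v_c)^T \; \Im(v_c)^T]^T$, and

$$H = \begin{bmatrix} \Re(H_c) & -\Im(H_c) \\ \Im(H_c) & \Re(H_c) \end{bmatrix}$$

where $\Re(\cdot)$ and $\Im(\cdot)$ denote the real and imaginary parts of their argument, respectively. The real signal model is given by

$$\tilde{y} = Hx + v \tag{2}$$

where $\tilde{y} \in \mathbb{R}^n$, $H \in \mathbb{R}^{n \times m}$, $x \in \mathcal{Q}^m$, and $v \in \mathbb{R}^n$, with $n = 2N_R$ and $m = 2N_T$. We hereafter assume $m = n$ for presentation brevity.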
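The real-valued decomposition above can be illustrated with a minimal NumPy sketch. This is not code from the letter: the function name `complex_to_real`, the 2 × 2 dimensions, and the noiseless test signal are assumptions chosen only to show that stacking real and imaginary parts reproduces the complex product.

```python
import numpy as np

def complex_to_real(y_c, H_c):
    """Stack real and imaginary parts to form the equivalent real model."""
    y_real = np.concatenate([y_c.real, y_c.imag])
    H_real = np.block([[H_c.real, -H_c.imag],
                       [H_c.imag,  H_c.real]])
    return y_real, H_real

# Hypothetical 2 x 2 example with 16-QAM symbols (PAM alphabet {-3, -1, 1, 3}).
rng = np.random.default_rng(0)
N_T = N_R = 2
pam = np.array([-3, -1, 1, 3])
x_c = rng.choice(pam, N_T) + 1j * rng.choice(pam, N_T)
H_c = (rng.standard_normal((N_R, N_T))
       + 1j * rng.standard_normal((N_R, N_T))) / np.sqrt(2)
y_c = H_c @ x_c                      # noiseless, for illustration only

y_real, H_real = complex_to_real(y_c, H_c)
x_real = np.concatenate([x_c.real, x_c.imag])   # length m = 2 * N_T
```

Here `H_real @ x_real` equals `y_real` up to rounding, and `H_real` has shape (2·N_R, 2·N_T), matching n = 2N_R and m = 2N_T.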

Given the model in (2), the ML symbol detection is to solve

$$\hat{x}_{\mathrm{ML}} = \arg\min_{x \in \mathcal{Q}^m} \|\tilde{y} - Hx\|^2 \tag{3}$$

where $\|\cdot\|$ denotes the $\ell_2$-norm of a vector. By performing the QR decomposition on $H$ ($H = QR$), we formulate (3) into an equivalent expression $\hat{x}_{\mathrm{ML}} = \arg\min_{x \in \mathcal{Q}^m} \|y - Rx\|^2$, where $y = Q^T \tilde{y}$. The upper-triangular structure of $R$ enables the expansion of $\|y - Rx\|^2$ in the form

$$(y_m - r_{m,m} x_m)^2 + \cdots + \Big(y_1 - \sum_{i=1}^{m} r_{1,i} x_i\Big)^2 \tag{4}$$

where $y_i$ is the $i$th element of $y$, $x_i$ is the $i$th element of $x$, and $r_{i,j}$ is the $(i,j)$-entry of $R$. We denote the $(m-k+1)$th term in (4) by $b(x_k^m)$ and the summation of the first $m-k+1$ terms by $d(x_k^m)$ ($k = 1, 2, \ldots, m$), where $x_k^m \triangleq (x_k, \ldots, x_m)^T \in \mathcal{Q}^{m-k+1}$ represents the partial symbol vector. A (rooted) detection tree is created from (4), which consists of a virtual root node, the nonleaf nodes in layers $1, \ldots, m-1$ each having $|\mathcal{Q}|$ child nodes, and the leaf nodes in layer $m$, where $|\cdot|$ is the cardinality of a set. Each node in layer $m-k+1$ uniquely represents an $x_k^m$ and has an associated path metric $d(x_k^m)$ and branch metric $b(x_k^m)$. Since $d(x_1^m)$ of a leaf node equals $\|y - Rx\|^2$ evaluated for $x = x_1^m$ represented by the node, the objective of optimal detection is to find the leaf node with the smallest path metric among all leaf nodes.

The BFS detection algorithm maintains a list of nodes sorted in ascending order of their defined cost (denoted by $c$). The cost can be a node's path metric $d$ [4]–[6], or its biased path metric [3], obtained by subtracting $k$ times the bias from $d$ if this node is in layer $k$, where the bias is a positive constant.

The conventional BFS algorithm with cost $c$ and list-size constraint $L$ consists of the following iterative steps:

0) Initially, the node list $\mathcal{N}$ contains only the root node.

1) Select the best (first) node from $\mathcal{N}$; if this node is in layer $m$, terminate the algorithm and output it as the solution.

2) Expand the best node by adding all its child nodes to $\mathcal{N}$ and removing itself from $\mathcal{N}$.

3) Order the nodes in $\mathcal{N}$ in ascending order of the cost $c$ and discard nodes beyond the first $\min(|\mathcal{N}|, L)$ nodes.

Slightly abusing the notation, we use BFS($L$) to denote the above algorithm with $c = d$ and BFS($L$, bias) to denote the algorithm with the biased path metric as the cost.
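Steps 0) to 3) with the unbiased cost c = d can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `bfs_detect`, the plain sorted list used as the node list, and the small random 4 × 4 real-valued test system are all hypothetical choices. The result is compared against an exhaustive ML search for the unlimited-list case.

```python
import itertools
import numpy as np

def bfs_detect(y, R, alphabet, L=None):
    """Best-first search with cost c = d (path metric).

    A node is (cost, symbols), where symbols = (x_k, ..., x_m) is the
    partial vector decided so far, starting from the last element x_m.
    L is the list-size constraint; None means unlimited.
    """
    m = R.shape[0]
    nodes = [(0.0, ())]                      # Step 0: only the root node
    while nodes:
        cost, syms = nodes.pop(0)            # Step 1: select the best node
        if len(syms) == m:                   # leaf reached: output it
            return np.array(syms), cost
        row = m - len(syms) - 1              # 0-based row of the next symbol
        for s in alphabet:                   # Step 2: expand into |Q| children
            child = (s,) + syms
            resid = y[row] - R[row, row:] @ np.array(child, dtype=float)
            nodes.append((cost + resid ** 2, child))
        nodes.sort(key=lambda n: n[0])       # Step 3: sort by cost ...
        if L is not None:
            del nodes[L:]                    # ... and keep the first L nodes
    return None, float("inf")

# Hypothetical test system (real-valued model with m = n = 4).
rng = np.random.default_rng(1)
m = 4
alphabet = [-3.0, -1.0, 1.0, 3.0]
H = rng.standard_normal((m, m))
y_tilde = H @ rng.choice(alphabet, m) + 0.3 * rng.standard_normal(m)
Q, R = np.linalg.qr(H)
y = Q.T @ y_tilde

x_bfs, d_bfs = bfs_detect(y, R, alphabet)            # unlimited list
x_ml = min(itertools.product(alphabet, repeat=m),    # exhaustive ML search
           key=lambda x: np.sum((y_tilde - H @ np.array(x)) ** 2))
```

With an unlimited list, `x_bfs` matches the exhaustive-search result `x_ml`, and `d_bfs` equals the squared residual of that solution. Passing a finite `L` truncates the list after each expansion, which is where optimality can be lost.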

III. THE PROPOSED BEST-FIRST DETECTION ALGORITHMS

The A* algorithm [8] speeds up the search of the shortest path in a graph by considering both the travelled distance thus far and the estimated distance ahead (the heuristic). If the heuristic is admissible (not over-estimating the real distance), the shortest path is guaranteed. The more accurate the estimate is, the better the performance of the algorithm that can be achieved [9]. In a graph where the edge length represents the geographic distance, an admissible heuristic can be the straight-line distance from a node to the destination.

The idea of including a heuristic may be applied to enhance a BFS detection scheme which finds the shortest path from the source (the root node) to the destination (the grouping of all leaf nodes). Since there is no notion of straight-line distances in the detection tree, the heuristic can only be obtained by calculation. The novelty of this work is that two methods of finding admissible heuristics are developed without involving an exhaustive search of the unexplored part of the tree (requiring exponential complexity) and without simply precomputing the BFS iterations (trivial modification). The inclusion of the heuristic is termed "look-ahead (LA)."

A. Look-Ahead One Layer

Consider at some point of the BFS algorithm a node in layer $m-k+1$ that represents a specific $\breve{x}_k^m = (\breve{x}_k, \ldots, \breve{x}_m)^T$ visited. The existing (known) cost from the source to this node is given by the path metric of this node, $d(\breve{x}_k^m) = \sum_{j=k}^{m} b(\breve{x}_j^m)$, and the future (unknown) cost from this node to the destination is given by $\sum_{j=1}^{k-1} b\big((x_j^{k-1}, \breve{x}_k^m)\big)$, where $(\cdot, \cdot)$ denotes the concatenation of two column vectors by placing the second one under the first one. Any lower bound on this future cost constitutes an admissible heuristic for this node. One lower bound is given by the minimum of the $|\mathcal{Q}|$ immediate branch metrics under this node, i.e.,

$$\sum_{j=1}^{k-1} b\big((x_j^{k-1}, \breve{x}_k^m)\big) \ge \min_{x_{k-1} \in \mathcal{Q}} b\big((x_{k-1}, \breve{x}_k^m)\big) = \min_{x_{k-1} \in \mathcal{Q}} \big(y_{k-1}^{(k)} - r_{k-1,k-1}\, x_{k-1}\big)^2 \triangleq h_1(\breve{x}_k^m) \tag{5}$$

where $y_{k-1}^{(k)}$ is the $(k-1)$th element of $y^{(k)} = y - \sum_{i=k}^{m} \breve{x}_i \cdot r_i$ with $r_k$ being the $k$th column of $R$. Note that if finding the minimum in (5) requires an exhaustive search of $x_{k-1} \in \mathcal{Q}$, then this look-ahead presents little advantage, as it reduces to performing node expansion in BFS one layer ahead and requires the same amount of computation. Fortunately, the minimizing $x_{k-1}$ in (5) is directly given by the one-dimensional zero-forcing (ZF) solution after slicing, i.e., $\lfloor y_{k-1}^{(k)} / r_{k-1,k-1} \rceil$ (with $\lfloor \cdot \rceil$ denoting slicing to the nearest element of $\mathcal{Q}$), which can be obtained without actually computing the division and slicing. For example, for 4-QAM with $\mathcal{Q} = \{-1, 1\}$, the minimizing $x_{k-1}$ is given by $\mathrm{sgn}\big(y_{k-1}^{(k)}\big)$, where $\mathrm{sgn}(x) = 1$ if $x \ge 0$ and $-1$ if $x < 0$ (note that $r_{k-1,k-1} > 0$); for 16-QAM with $\mathcal{Q} = \{-3, -1, 1, 3\}$, the minimizing $x_{k-1}$ is given by $2\,\mathrm{sgn}\big(y_{k-1}^{(k)}\big) + \mathrm{sgn}\big(y_{k-1}^{(k)} - 2\,\mathrm{sgn}(y_{k-1}^{(k)})\, r_{k-1,k-1}\big)$. As a result, $\big(y_{k-1}^{(k)} - r_{k-1,k-1}\, x_{k-1}\big)^2$ in (5) needs to be computed just once rather than $|\mathcal{Q}|$ times. The first heuristic is given by

$$h_1(\breve{x}_k^m) = \left(y_{k-1}^{(k)} - r_{k-1,k-1} \left\lfloor \frac{y_{k-1}^{(k)}}{r_{k-1,k-1}} \right\rceil \right)^2. \tag{6}$$

The new algorithm, referred to as the BFS-LA1($L$) algorithm, modifies the conventional BFS($L$) algorithm by adopting the cost $c = d + h_1$ in Step 3 and terminating the algorithm when the best node selected is in layer $m-1$ in Step 1.

Footnote: In this letter, we use $d$ to denote the path metric generally and use $d(x_k^m)$ to denote the path metric of a specific node; same for $c$ and, later, $h_1$ and $h_2$.
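The division-free slicing rule quoted for 16-QAM, and the resulting one-layer look-ahead heuristic, can be sketched as below. The function names (`sgn`, `slice_16qam`, `h1`) and the scalar arguments `y_res` (standing for the interference-cancelled element of y at position k−1) and `r_diag` (standing for the diagonal entry of R at position k−1, assumed positive) are hypothetical names for this illustration, not identifiers from the letter.

```python
def sgn(v):
    """sgn(v) = 1 if v >= 0 and -1 if v < 0, as defined in the text."""
    return 1.0 if v >= 0 else -1.0

def slice_16qam(y_res, r_diag):
    """Nearest element of {-3, -1, 1, 3} to y_res / r_diag, for r_diag > 0.

    Uses only sign tests and one product; no division is performed.
    """
    s = sgn(y_res)
    return 2.0 * s + sgn(y_res - 2.0 * s * r_diag)

def h1(y_res, r_diag):
    """One-layer look-ahead: the smallest branch metric among the children."""
    return (y_res - r_diag * slice_16qam(y_res, r_diag)) ** 2

# Brute-force reference: evaluate all |Q| = 4 branch metrics and take the minimum.
def h1_brute(y_res, r_diag):
    return min((y_res - r_diag * q) ** 2 for q in (-3.0, -1.0, 1.0, 3.0))
```

The closed form evaluates a single branch metric, whereas `h1_brute` evaluates four; both return the same value, which is the point made in the text about computing the squared term once rather than |Q| times.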

B. Look-Ahead Multiple Layers

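The second and tighter lower bound on the future cost that can be obtained without an exhaustive constellation search and that can construct a meaningful modification is derived by relaxing the constraint on the minimization variable from $x_1^{k-1} \in \mathcal{Q}^{k-1}$ to $x_1^{k-1} \in \mathbb{R}^{k-1}$. Specifically,

$$\sum_{j=1}^{k-1} b\big((x_j^{k-1}, \breve{x}_k^m)\big) \ge \min_{x_1^{k-1} \in \mathcal{Q}^{k-1}} \sum_{j=1}^{k-1} b\big((x_j^{k-1}, \breve{x}_k^m)\big) \tag{7}$$

$$= \min_{x_1^{k-1} \in \mathcal{Q}^{k-1}} \big\| y^{(k)} - R^{(k)} x_1^{k-1} \big\|^2 \tag{8}$$

$$= \min_{x_1^{k-1} \in \mathcal{Q}^{k-1}} \big(x_1^{k-1} - \hat{x}_{1,\mathrm{ZF}}^{k-1}\big)^T (R^{(k)})^T R^{(k)} \big(x_1^{k-1} - \hat{x}_{1,\mathrm{ZF}}^{k-1}\big) + \big\|y^{(k)}\big\|^2 - \big\|R^{(k)} \hat{x}_{1,\mathrm{ZF}}^{k-1}\big\|^2 \tag{9}$$

$$\ge \min_{\substack{x_1^{k-1} \in \mathbb{R}^{k-1} \\ \|x_1^{k-1} - \hat{x}_{1,\mathrm{ZF}}^{k-1}\|^2 = \alpha^2}} \big(x_1^{k-1} - \hat{x}_{1,\mathrm{ZF}}^{k-1}\big)^T (R^{(k)})^T R^{(k)} \big(x_1^{k-1} - \hat{x}_{1,\mathrm{ZF}}^{k-1}\big) + \big\|y^{(k)}\big\|^2 - \big\|R^{(k)} \hat{x}_{1,\mathrm{ZF}}^{k-1}\big\|^2 \triangleq h_2(\breve{x}_k^m) \tag{10}$$

where $R^{(k)}$ is an $m \times (k-1)$ submatrix of $R$ with columns $r_k, r_{k+1}, \ldots, r_m$ removed, $\hat{x}_{1,\mathrm{ZF}}^{k-1} \triangleq \big((R^{(k)})^T R^{(k)}\big)^{-1} (R^{(k)})^T y^{(k)}$ is the unconstrained ZF solution for the reduced-dimension detection problem in (8), and $\alpha^2 \triangleq \big\|\bar{x}_{1,\mathrm{ZF}}^{k-1} - \hat{x}_{1,\mathrm{ZF}}^{k-1}\big\|^2$ is the squared distance between the constrained ZF solution $\bar{x}_{1,\mathrm{ZF}}^{k-1} \triangleq \lfloor \hat{x}_{1,\mathrm{ZF}}^{k-1} \rceil \in \mathcal{Q}^{k-1}$ and $\hat{x}_{1,\mathrm{ZF}}^{k-1}$. We reach (9) by rewriting the objective function, where the second and third terms do not depend on $x_1^{k-1}$. From (9) to (10) we have used the fact that the first term in (9) is a convex function in $x_1^{k-1}$ with the minimum value of zero achieved at $x_1^{k-1} = \hat{x}_{1,\mathrm{ZF}}^{k-1}$ when there is no constraint on $x_1^{k-1}$ (i.e., $x_1^{k-1} \in \mathbb{R}^{k-1}$). By restricting $x_1^{k-1}$ to the hypersphere of $\big\|x_1^{k-1} - \hat{x}_{1,\mathrm{ZF}}^{k-1}\big\|^2 = \alpha^2$, we are guaranteed to find a positive-valued minimum no greater than that yielded by any $x_1^{k-1}$ outside the hypersphere (due to convexity) as well as that yielded by $\bar{x}_{1,\mathrm{ZF}}^{k-1}$ (one special point on the hypersphere). Since $\bar{x}_{1,\mathrm{ZF}}^{k-1}$ is the nearest lattice point to $\hat{x}_{1,\mathrm{ZF}}^{k-1}$ in $\mathcal{Q}^{k-1}$, we have obtained a lower bound.

The first term in (10) is given by $\alpha^2 \lambda_{\min}$, achieved at $x_1^{k-1} = \alpha v_{\min} + \hat{x}_{1,\mathrm{ZF}}^{k-1}$, where $\lambda_{\min}$ and $v_{\min}$ are the minimum eigenvalue of $(R^{(k)})^T R^{(k)}$ and the corresponding unit-length eigenvector, respectively. Thus, the second heuristic is given by

$$h_2(\breve{x}_k^m) = \alpha^2 \lambda_{\min} + \big\|y^{(k)}\big\|^2 - \big\|R^{(k)} \hat{x}_{1,\mathrm{ZF}}^{k-1}\big\|^2. \tag{11}$$

As will be verified in Sec. IV, $h_2$ requires more computations than $h_1$ but yields better performance and memory efficiency. When the considered node $\breve{x}_k^m$ is in layer $m-1$, "look-ahead multiple layers" reduces to "look-ahead one layer," and $h_1$ and $h_2$ become identical. Substituting the new cost $c = d + h_2$ in the BFS-LA1($L$) algorithm gives the BFS-LA2($L$) algorithm.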
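The relaxation step behind the second heuristic rests on one fact: over every lattice point, the quadratic form centered at the unconstrained ZF solution is at least the squared slicing distance times the minimum eigenvalue. The sketch below checks that fact numerically. It is an illustration only and covers just the quadratic (first) term of (10): the function name `relaxed_quadratic_bound`, the random 5 × 3 stand-in for the reduced matrix, and the use of `numpy.linalg.eigvalsh` (instead of the power method the letter uses in its complexity count) are all assumptions.

```python
import itertools
import numpy as np

def relaxed_quadratic_bound(R_k, y_k, alphabet):
    """Return (alpha^2 * lambda_min, unconstrained ZF solution, Gram matrix).

    R_k: tall matrix with full column rank (stand-in for the reduced matrix);
    y_k: vector with as many elements as R_k has rows.
    """
    G = R_k.T @ R_k
    x_zf = np.linalg.solve(G, R_k.T @ y_k)            # unconstrained ZF
    q = np.asarray(alphabet, dtype=float)
    # Constrained ZF: slice each entry to its nearest alphabet element.
    x_sliced = q[np.abs(x_zf[:, None] - q[None, :]).argmin(axis=1)]
    alpha_sq = np.sum((x_sliced - x_zf) ** 2)
    lam_min = np.linalg.eigvalsh(G)[0]                # eigenvalues ascending
    return alpha_sq * lam_min, x_zf, G

# Hypothetical example: 3 undecided symbols, 16-QAM PAM alphabet.
rng = np.random.default_rng(3)
alphabet = [-3.0, -1.0, 1.0, 3.0]
R_k = rng.standard_normal((5, 3))
y_k = 4.0 * rng.standard_normal(5)
bound, x_zf, G = relaxed_quadratic_bound(R_k, y_k, alphabet)

# Smallest value of the quadratic form over all lattice points.
lattice_min = min((np.array(x) - x_zf) @ G @ (np.array(x) - x_zf)
                  for x in itertools.product(alphabet, repeat=3))
```

In this sketch `bound` never exceeds `lattice_min`, which is what makes the relaxed value usable as a lower bound without searching the lattice.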

IV. RESULTS AND DISCUSSIONS

A. Complexity Evaluation

Here, we evaluate the overall computational complexity of the proposed algorithms in comparison with conventional methods. Since all processing is conducted on real values based on (2), all the calculations below refer to real operations.

The complexity of a tree-search detection scheme is evaluated in terms of the number of nodes visited and expanded (defined respectively by nodes that ever occupy a position and become the best node in the node list). We let $I_k^{(*)}$ and $J_k^{(*)}$ be the number of visited and expanded nodes in layer $m-k+1$, respectively, for scheme $*$ with some $L$ and, if applicable, bias. Note that visiting a node in layer $m-k+1$ entails computing $b(x_k^m)$ ($m-k+2$ multiplications and $m-k+1$ additions) and summing up $d(x_{k+1}^m)$ and $b(x_k^m)$ (one addition). Therefore, the total complexity of the BFS algorithm is given by $\sum_{k=1}^{m} I_k^{(\mathrm{BFS})}(m-k+2)$ multiplications and additions. The complexity of the BFS-LA1 algorithm includes the complexity of running the regular search iterations from layer 1 to $m-1$, which requires $\sum_{k=2}^{m} I_k^{(\mathrm{BFS\text{-}LA1})}(m-k+2)$ multiplications and additions, and the complexity of look-ahead (one layer). The complexity of look-ahead for a node in layer $m-k+1$ is equal to the sum of the complexity of computing $b(x_{k-1}^m)$ once ($m-k+3$ multiplications and $m-k+2$ additions) and the complexity of adding up $d$ and $h_1$ (one addition). The number of nodes that require such computations is equal to the number of visited but nonexpanded nodes, which is $I_k^{(\mathrm{BFS\text{-}LA1})} - J_k^{(\mathrm{BFS\text{-}LA1})}$ for layer $1, \ldots, m-2$ and $I_2^{(\mathrm{BFS\text{-}LA1})}$ for layer $m-1$. Collecting these results, the total complexity of the BFS-LA1 algorithm is given by $\sum_{k=2}^{m} I_k^{(\mathrm{BFS\text{-}LA1})}(2m-2k+5) - \sum_{k=3}^{m} J_k^{(\mathrm{BFS\text{-}LA1})}(m-k+3)$ multiplications and additions.

Similar calculations can be carried out for the BFS-LA2 algorithm. Here, the complexity of look-ahead (multiple layers) includes the computation of $\lambda_{\min}$, $\hat{x}_{1,\mathrm{ZF}}^{k-1}$, and some matrix/vector manipulations (note that $y^{(k)}$ is already available given $d(\breve{x}_k^m)$). Note that $\big((R^{(k)})^T R^{(k)}\big)^{-1} (R^{(k)})^T$ and $\lambda_{\min}$ can be precomputed, one time per layer, but $\alpha^2$, $\|y^{(k)}\|^2$, and $\|R^{(k)} \hat{x}_{1,\mathrm{ZF}}^{k-1}\|^2$ need to be computed for each node visited. We compute matrix/vector computations by direct multiplications and accumulations, matrix inverse by the efficient $LDL^T$ decomposition method [10], and $\lambda_{\min}$ by the power method [11] applied on $\big((R^{(k)})^T R^{(k)}\big)^{-1}$ to obtain its dominant (largest) eigenvalue. Summing numbers up, the total computation counts for the BFS-LA2 algorithm are $\sum_{k=2}^{m} \big[ I_k^{(\mathrm{BFS\text{-}LA2})}(km - k + m + 3) + (7/6)k^3 - (8/3)k + (3/2) \big]$ multiplications and $\sum_{k=2}^{m} \big[ I_k^{(\mathrm{BFS\text{-}LA2})}(km - k + m + 2) + (7/6)k^3 - (5/2)k^2 + (17/6)k - (3/2) \big]$ additions, where we have assumed equal numbers of multiplications and additions in the complexity of the power method approximated by $4(k-1)^2 + 3(k-1)$ [11]. The complexity of ML detection, considered for comparison as a baseline scheme, is given by $|\mathcal{Q}|^m (m^2 + m)$ multiplications and $|\mathcal{Q}|^m (m^2 + m - 1)$ additions from (3).
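As a quick arithmetic check of the ML baseline, the flop count can be evaluated for the two 4 × 4 configurations used in the simulations. The sketch below assumes the counts are $|\mathcal{Q}|^m(m^2+m)$ multiplications plus $|\mathcal{Q}|^m(m^2+m-1)$ additions, with one flop per real operation; the function name `ml_flops` is hypothetical.

```python
def ml_flops(m, q_size):
    """Total real multiplications and additions of exhaustive ML search.

    m: real-valued dimension (m = 2 * N_T); q_size: PAM alphabet size |Q|.
    """
    candidates = q_size ** m
    mults = candidates * (m * m + m)
    adds = candidates * (m * m + m - 1)
    return mults + adds

flops_16qam = ml_flops(8, 4)   # 4 x 4 MIMO, 16-QAM: m = 8, |Q| = 4
flops_64qam = ml_flops(8, 8)   # 4 x 4 MIMO, 64-QAM: m = 8, |Q| = 8
print(f"{flops_16qam:.1e}, {flops_64qam:.1e}")  # prints "9.4e+06, 2.4e+09"
```

Under these assumptions the totals come to roughly 9.4 × 10^6 and 2.4 × 10^9 flops, which agrees with the mantissas 9.4 and 2.4 shown in the ML row of Table I.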

B. Simulation Results

Here, we present the simulation results: symbol error rate (SER) performance in Fig. 1, memory usage in Fig. 2, and complexity in terms of floating-point operations (flops) in Table I (each real multiplication or addition counts as one flop). A standard minimum-mean-square-error (MMSE) linear detector is adopted in Fig. 1 for comparison. The memory usage counts the number of memory units required for running an algorithm, where each unit is used to store the (partial) symbol vector represented by a node and the cost associated with a node. Let $I$ and $J$ be the total number of visited and expanded nodes, respectively, where $I = J|\mathcal{Q}|$. Then, the memory usage for a BFS-based detection scheme is given by $J(|\mathcal{Q}|-1)+1$ or $I(1-1/|\mathcal{Q}|)+1$ units for the case of unlimited memory, and $\min\{J(|\mathcal{Q}|-1)+1,\ (|\mathcal{Q}|-1)+L\}$ units for the case of limited memory with list-size constraint $L$. The signal-to-noise ratio (SNR) in the plots is defined as $E[\|H_c \tilde{x}_c\|^2]/E[\|v_c\|^2] = N_T \sigma_x^2/\sigma_v^2$.
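The memory-usage expressions can be written as a small helper, shown here as a sketch. The function name `memory_units` and the example values of J are assumptions for illustration; the letter itself does not give code.

```python
def memory_units(J, q_size, L=None):
    """Memory units used by a BFS-based detector.

    J: total number of expanded nodes; q_size: PAM alphabet size |Q|;
    L: list-size constraint, or None for unlimited memory.
    Each expansion removes one node and adds |Q| children, so the list
    grows by |Q| - 1 per expansion, starting from the single root node.
    """
    unlimited = J * (q_size - 1) + 1
    if L is None:
        return unlimited
    return min(unlimited, (q_size - 1) + L)

# Hypothetical example: m = 8 (4 x 4 MIMO) with 16-QAM, so |Q| = 4.
units_m_expansions = memory_units(8, 4)       # J = m expansions
units_m_minus_1 = memory_units(7, 4)          # J = m - 1 expansions
units_limited = memory_units(8, 4, L=4)       # list-size constraint L = 4
```

For J = m and J = m − 1 the helper returns m(|Q| − 1) + 1 and (m − 1)(|Q| − 1) + 1 units, the two high-SNR limits discussed with Fig. 2.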

Fig. 1 shows that, in the unlimited-memory setting, BFS, BFS-LA1, and BFS-LA2 all achieve optimal performance. BFS(∞, bias) has various degrees of performance degradation but generally achieves memory savings (Fig. 2) and reduced complexity (Table I). The performance penalty for BFS(∞, bias) is moderate in Fig. 1(a) and very high in Fig. 1(b) with the same bias of 0.01. This shows that the selection of bias is scenario-dependent and requires a manual effort. In the memory-constrained setting, the proposed BFS-LA1 and BFS-LA2 schemes demonstrate significant SER performance advantage over conventional schemes. The improved memory efficiency of the proposed schemes when there is no memory constraint (Fig. 2) results in a smaller SER performance degradation when a memory constraint is imposed. In Fig. 1(a), BFS-LA1(4) achieves a 3–4 dB gain over BFS(4) at SER = 3 × 10^{-3}, at moderate additional cost (e.g., 560 vs. 364 flops at SNR = 34 dB). BFS-LA2(4) achieves another 1–2 dB gain at high additional cost (e.g., 5,132 flops at SNR = 34 dB). Similar observations can be made in Fig. 1(b). Clearly, there is a tradeoff between the computations required in obtaining the heuristic and the SER and memory performances yielded as a result of using the heuristic.

[Fig. 1 plots SER versus SNR (dB). Panel (a), SNR 22 to 34 dB: MMSE, BFS(4), BFS(4, 0.01), BFS-LA1(4), BFS-LA2(4), BFS(∞, 0.01), and BFS(∞), BFS-LA1(∞), BFS-LA2(∞) = ML. Panel (b), SNR 30 to 42 dB: MMSE, BFS(8), BFS(8, 0.01), BFS-LA1(8), BFS-LA2(8), BFS(∞, 0.01), and BFS(∞), BFS-LA1(∞), BFS-LA2(∞) = ML.]

Fig. 1. SER performance of MIMO detection schemes. (a) 4 × 4 MIMO with 16-QAM. (b) 4 × 4 MIMO with 64-QAM. Notations follow those in Table I.

TABLE I
COMPLEXITY (FLOPS) COMPARISONS OF MIMO DETECTION SCHEMES (LIST-SIZE CONSTRAINT L = 4 FOR 4 × 4 16-QAM AND L = 8 FOR 4 × 4 64-QAM; BIAS = 0.01)

MIMO system: 4 × 4

| Scheme | 16-QAM, 22 dB | 16-QAM, 28 dB | 16-QAM, 34 dB | 64-QAM, 30 dB | 64-QAM, 36 dB | 64-QAM, 42 dB |
|---|---|---|---|---|---|---|
| BFS(L, bias) | 618 | 396 | 357 | 1,102 | 758 | 719 |
| BFS(L) | 655 | 421 | 364 | 1,411 | 842 | 727 |
| BFS-LA1(L) | 930 | 618 | 560 | 2,102 | 1,303 | 1,184 |
| BFS-LA2(L) | 6,666 | 5,432 | 5,132 | 11,515 | 8,316 | 7,734 |
| BFS(∞, bias) | 747 | 398 | 357 | 1,153 | 758 | 719 |
| BFS(∞) | 846 | 436 | 366 | 1,679 | 861 | 730 |
| BFS-LA1(∞) | 1,084 | 629 | 560 | 2,340 | 1,318 | 1,185 |
| BFS-LA2(∞) | 7,183 | 5,466 | 5,133 | 11,695 | 8,355 | 7,736 |
| ML | 9.4 × 10^6 | 9.4 × 10^6 | 9.4 × 10^6 | 2.4 × 10^9 | 2.4 × 10^9 | 2.4 × 10^9 |

Fig. 2 illustrates the memory-reduction capability of the proposed schemes. Comparing different schemes without a list-size constraint (i.e., $L = \infty$) shows the nature of the algorithms in terms of memory performance. As SNR increases, the memory usage converges to $m(|\mathcal{Q}|-1)+1$ for BFS with/without the bias, and $(m-1)(|\mathcal{Q}|-1)+1$ for BFS-LA1/BFS-LA2. This suggests that the proposed schemes are asymptotically more memory-efficient than conventional schemes at high SNR. In other SNR regions, various degrees of memory saving are achieved for the proposed schemes since fewer iterations are needed due to look-ahead. The reduced memory usage also leads to reduced sorting complexity, since generating a sorted list (or finding the minimum-cost node in the case of $L = \infty$) is required at each iteration of the algorithm.

[Fig. 2 plots memory usage (units) versus SNR (dB) for BFS(∞), BFS(∞, 0.01), BFS-LA1(∞), and BFS-LA2(∞). Panel (a): 4x4 MIMO with 16-QAM and with 64-QAM. Panel (b): 8x8 MIMO with 16-QAM and with 64-QAM.]

Fig. 2. Memory usage for BFS-based detection schemes. (a) 4 × 4 MIMO with 16-QAM and 64-QAM. (b) 8 × 8 MIMO with 16-QAM and 64-QAM. Notations follow those in Table I.

V. CONCLUSION

Modified BFS-based MIMO detection algorithms incorporating an efficient look-ahead mechanism have been presented. Simulation results demonstrated that the proposed algorithms maintain exact ML detection capability while achieving memory savings and enhanced performance in memory-constrained scenarios. Complexity analysis was conducted to confirm the computational feasibility of the proposed algorithms.

REFERENCES

[1] E. G. Larsson, “MIMO detection methods: How they work,” IEEE Signal Process. Mag., vol. 26, no. 3, pp. 91–95, May 2009.

[2] F. Jelinek, “Fast sequential decoding algorithm using a stack,” IBM J. Research and Development, vol. 13, no. 6, pp. 675–685, Nov. 1969.

[3] A. D. Murugan, H. El Gamal, M. O. Damen, and G. Caire, “A unified framework for tree search decoding: rediscovering the sequential decoder,” IEEE Trans. Inf. Theory, vol. 52, no. 3, pp. 933–953, Mar. 2006.

[4] K. Su, “Efficient maximum likelihood detection for communication over multiple input multiple output channels,” Ph.D. dissertation, Univ. of Cambridge, 2005.

[5] T. Fukatani, R. Matsumoto, and T. Uyematsu, “Two methods for decreasing the computational complexity of the MIMO ML decoder,” IEICE Trans. Fundamentals, vol. E87–A, no. 10, pp. 2571–2576, Oct. 2004.

[6] A. Okawado, R. Matsumoto, and T. Uyematsu, “Near ML detection using Dijkstra’s algorithm with bounded list size over MIMO channels,” in Proc. 2008 IEEE International Symp. on Inform. Theory, pp. 2022–2025.

[7] Y. Dai and Z. Yan, “Memory-constrained tree search detection and new ordering schemes,” IEEE J. Sel. Topics Signal Process., vol. 3, no. 6, pp. 1026–1037, Dec. 2009.

[8] P. E. Hart, N. J. Nilsson, and B. Raphael, “A formal basis for the heuristic determination of minimum cost paths,” IEEE Trans. Systems Science and Cybernetics, vol. 4, no. 2, pp. 100–107, July 1968.

[9] R. Dechter and J. Pearl, “Generalized best-first search strategies and the optimality of A*,” J. ACM, vol. 32, no. 3, pp. 505–536, July 1985.

[10] T.-H. Liu and Y.-L. Y. Liu, “Modified fast recursive algorithm for efficient MMSE-SIC detection of the V-BLAST system,” IEEE Trans. Wireless Commun., vol. 7, no. 10, pp. 3713–3717, Oct. 2008.

[11] I. Dimov and A. Karaivanova, “A power method with Monte Carlo iterations,” in Recent Advances in Numerical Methods and Applications, World Scientific, Singapore, 1999, pp. 239–247.


Frequently Asked Questions (1)

Q1. What are the contributions in this paper?

In this letter, the authors propose modified best-first detection algorithms in which the order of nodes is determined by both the original cost and the estimated future cost associated with each node, as inspired by an improved shortest path algorithm (A* algorithm).
