What is the meaning of the SESD?

The SESD constrains the search to nodes which lie within a radius r around ỹ and traverses the tree depth-first, visiting the children of a given node in ascending order of their PEDs.

What is the performance of the sphere decoder?

In the region where Lmax is small, the performance is dominated by aggressive LLR clipping rather than by the run-time constraint.

What is the d(s) of a tree?

The Euclidean distances $d(s) = \|\tilde{y} - Rs\|^2$ in (6) and (7) can be computed recursively as $d(s) = d_1$ with the partial Euclidean distances (PEDs) $d_i = d_{i+1} + |e_i|^2$, $i = M_T, M_T - 1, \ldots, 1$ (8) and the distance increments (DIs) $|e_i|^2 = \big|\tilde{y}_i - \sum_{j=i}^{M_T} R_{i,j} s_j\big|^2$ (9). Since the dependence of the PEDs $d_i$ on the symbol vector $s$ is only through $s^{(i)}$, the authors have transformed ML detection and the computation of the max-log LLRs into a weighted tree-search problem: PEDs and PSVs are associated with nodes, branches correspond to DIs.

What is the key to achieving low complexity?

The keys to achieving low complexity are the single tree-search strategy in Section III-B, MMSE-SQRD preprocessing, LLR clipping, and imposing run-time constraints with MF scheduling.

What is the metric for determining the LLRs?

Computing the LLRs as in (5) requires determining the metric $\lambda^{\overline{\mathrm{ML}}}_{j,b}$, which is achieved by traversing only those parts of the tree that have leaves in $\mathcal{X}_{j,b}^{(\overline{x^{\mathrm{ML}}_{j,b}})}$.

What is the corresponding LLR for the MR MT channel matrix?

To this end, the channel matrix $H$ is first QR-decomposed according to $H = QR$, where $Q$ is unitary of dimension $M_R \times M_T$ and the upper-triangular $M_T \times M_T$ matrix $R$ has real-valued positive entries on its main diagonal.

How can the authors initialize the search radius rj,b?

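It is therefore important to realize that, without compromising max-log optimality, the authors can initialize the search radius $r_{j,b}$ by setting it equal to the minimum value of $\|\tilde{y} - Rs\|$ over all $s \in \mathcal{X}_{j,b}^{(\overline{x^{\mathrm{ML}}_{j,b}})}$ found during preceding tree traversals.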

What is the main idea behind the single tree search?

The authors formulate update rules and a pruning criterion based on a list containing the metrics $\lambda^{\mathrm{ML}}$ and $\lambda^{\overline{\mathrm{ML}}}_{j,b}$. The main concept is to have a list containing the metric $\lambda^{\mathrm{ML}}$ along with the corresponding bit sequence $x^{\mathrm{ML}}$ and the metrics $\lambda^{\overline{\mathrm{ML}}}_{j,b}$ of all counter-hypotheses and to search the subtree originating from a given node only if the result can lead to an update of either $\lambda^{\mathrm{ML}}$ or one of the $\lambda^{\overline{\mathrm{ML}}}_{j,b}$.

(Open Access) Soft-Output Sphere Decoding: Performance and Implementation Aspects (2006) | Christoph Studer

Q: What are the contributions mentioned in the paper "Soft-output sphere decoding: performance and implementation aspects" ?

In this paper, the authors show how sphere decoding can be used as an efficient tool to implement soft-output MIMO detection with flexible trade-offs between computational complexity and (error rate) performance. In particular, the authors demonstrate that single tree search, ordered QR decomposition, channel matrix regularization, and log-likelihood ratio clipping are the key ingredients for realizing soft-output MIMO detectors with near max-log performance at a computational complexity that is reasonably close to that of hard-output sphere decoding.

Q: What is the way to prune a search tree?

More efficient pruning of the search tree closer to the root is obtained if “stronger streams” correspond to the levels closer to the root, i.e., P is chosen such that the main diagonal entries of R in HP = QR are sorted in ascending order.

Q: What is the key to a more efficient tree-search strategy?

The key to a more efficient (compared to RTS) tree-search strategy is to ensure that every node in the tree is visited at most once.

Soft-Output Sphere Decoding: Performance and Implementation Aspects

C. Studer∗, M. Wenk∗, A. Burg∗, and H. Bölcskei†

∗Integrated Systems Laboratory, ETH Zurich, Switzerland
email: {studer, mawenk, apburg}@iis.ee.ethz.ch

†Communication Technology Laboratory, ETH Zurich, Switzerland
email: boelcskei@nari.ee.ethz.ch

Abstract: Multiple-input multiple-output (MIMO) detection algorithms providing soft information for a subsequent channel decoder pose significant implementation challenges due to their high computational complexity. In this paper, we show how sphere decoding can be used as an efficient tool to implement soft-output MIMO detection with flexible trade-offs between computational complexity and (error rate) performance. In particular, we demonstrate that single tree search, ordered QR decomposition, channel matrix regularization, and log-likelihood ratio clipping are the key ingredients for realizing soft-output MIMO detectors with near max-log performance at a computational complexity that is reasonably close to that of hard-output sphere decoding.

I. INTRODUCTION

Multiple-input multiple-output (MIMO) wireless systems employ multiple antennas on both sides of the wireless link and offer increased spectral efficiency (compared to single-antenna systems) by transmitting multiple data streams concurrently and in the same frequency band (spatial multiplexing). MIMO technology constitutes the basis for upcoming wireless communication standards, such as IEEE 802.11n and IEEE 802.16e.

The main challenge in the practical realization of MIMO wireless systems lies in the efficient implementation of the detector, which needs to separate the spatially multiplexed data streams. To this end, a wide range of algorithms offering various trade-offs between performance and computational complexity have been developed [1]. Linear detection producing hard-decision outputs constitutes one extreme of the complexity/performance trade-off region, while computationally demanding a posteriori probability (APP) detection algorithms result in the opposite extreme. The computational complexity of a MIMO detection algorithm depends on the symbol constellation size and the number of spatially multiplexed data streams, but often also on the instantaneous MIMO channel realization and the signal-to-noise ratio (SNR). On the other hand, the overall decoding effort is typically constrained by system bandwidth, latency requirements, and limitations on power consumption. Implementing different algorithms, each optimized for a maximum allowed decoding effort and/or a particular system configuration, would entail a considerable hardware overhead and in addition be highly inefficient since large portions of the chip would remain idle most of the time. A practical MIMO receiver design must therefore be able to cover a wide range of complexity/performance trade-offs using a single tunable detection algorithm.

This work was supported by the STREP project No. IST-026905 (MASCOT) within the sixth framework programme (FP6) of the European Commission.

Contributions: In this (predominantly tutorial) paper, we provide a formulation of the sphere decoder [2], [3] as a tunable MIMO detector with performance ranging from that of successive interference cancellation (SIC) to that of max-log APP detection. Tuning of the detector is achieved through log-likelihood ratio (LLR) clipping, preprocessing, and imposing constraints on the maximum computational complexity of the decoder. We formulate a framework for systematically characterizing the resulting complexity/performance trade-offs. Finally, we elaborate on, and provide some refinements of, the tree-search algorithm introduced in [4] and the LLR clipping approach proposed in [5].

Outline: The remainder of this paper is organized as follows. Section II reviews the transformation of the MIMO detection and LLR computation problems into a tree-search problem. Section III reviews max-log APP sphere decoding and proposes some refinements of existing algorithms. In Section IV, we describe methods for reducing the tree-search complexity. A framework for evaluating the complexity/performance trade-offs of the resulting class of detectors is introduced in Section V. We conclude in Section VI.

II. SOFT-OUTPUT SPHERE DECODING

Consider a MIMO system with $M_T$ transmit and $M_R \geq M_T$ receive antennas. The coded bit-stream is mapped to $M_T$-dimensional transmit vector symbols $s \in \mathcal{O}^{M_T}$, where $\mathcal{O}$ stands for the underlying complex-valued scalar constellation of cardinality $2^Q$. The individual coded bits are denoted by $x_{j,b}$, where the indices $j$ and $b$ refer to the $b$th bit in the binary label of the $j$th entry of $s$. The resulting complex baseband input-output relation is given by

$$y = Hs + n \tag{1}$$

where $H$ denotes the $M_R \times M_T$ channel matrix and $n$ is an i.i.d. proper complex Gaussian distributed $M_R$-dimensional noise vector with variance $N_o$ per complex entry.

A. Max-Log Soft-Output Computation

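Soft-output MIMO detection requires the computation of LLRs for all coded bits. In order to reduce the corresponding computational complexity, we employ the max-log approximation [6]

$$L(x_{j,b}) = \min_{s \in \mathcal{X}_{j,b}^{(0)}} \|y - Hs\|^2 - \min_{s \in \mathcal{X}_{j,b}^{(1)}} \|y - Hs\|^2 \tag{2}$$

where $\mathcal{X}_{j,b}^{(0)}$ and $\mathcal{X}_{j,b}^{(1)}$ are the disjoint sets of vector symbols that have the $b$th bit in the label of the $j$th scalar symbol equal to 0 and 1, respectively, and the LLRs in (2) are normalized to avoid dependence on the noise variance. For each bit, one of the two minima in (2) is given by $\lambda^{\mathrm{ML}} = \|y - Hs^{\mathrm{ML}}\|^2$, where

$$s^{\mathrm{ML}} = \arg\min_{s \in \mathcal{O}^{M_T}} \|y - Hs\|^2 \tag{3}$$

is the maximum likelihood (ML) solution. The other minimum in (2) is given by

$$\lambda^{\overline{\mathrm{ML}}}_{j,b} = \min_{s \in \mathcal{X}_{j,b}^{(\overline{x^{\mathrm{ML}}_{j,b}})}} \|y - Hs\|^2 \tag{4}$$

where the counter-hypothesis $\overline{x^{\mathrm{ML}}_{j,b}}$ denotes the binary complement of the $b$th bit in the binary label of the $j$th entry of $s^{\mathrm{ML}}$. With (3) and (4) the max-log LLRs can be written as

$$L(x_{j,b}) = \begin{cases} \lambda^{\mathrm{ML}} - \lambda^{\overline{\mathrm{ML}}}_{j,b}, & x^{\mathrm{ML}}_{j,b} = 0 \\ \lambda^{\overline{\mathrm{ML}}}_{j,b} - \lambda^{\mathrm{ML}}, & x^{\mathrm{ML}}_{j,b} = 1. \end{cases} \tag{5}$$

From (5) we can conclude that efficient max-log APP MIMO detection reduces to efficiently identifying $s^{\mathrm{ML}}$, $\lambda^{\mathrm{ML}}$, and $\lambda^{\overline{\mathrm{ML}}}_{j,b}$ for $j = 1, 2, \ldots, M_T$ and $b = 1, 2, \ldots, Q$ [7].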
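
To make (2) to (5) concrete, the following is a minimal brute-force sketch in plain Python. It enumerates every candidate vector, so it is only usable for toy dimensions and serves as a reference against which a tree search can be checked. The QPSK bit labelling and the helper names `sq_dist` and `max_log_llrs` are assumptions introduced for illustration; they do not come from the paper.

```python
from itertools import product

# Hypothetical toy constellation: QPSK with a 2-bit label per symbol (Q = 2).
QPSK = {(0, 0): 1 + 1j, (0, 1): 1 - 1j, (1, 0): -1 + 1j, (1, 1): -1 - 1j}

def sq_dist(y, H, s):
    """Return ||y - H s||^2 for a matrix H given as a list of rows."""
    return sum(abs(y_r - sum(h * s_j for h, s_j in zip(row, s))) ** 2
               for y_r, row in zip(y, H))

def max_log_llrs(y, H, constellation=QPSK):
    """Exhaustive evaluation of (2) and (3); returns (x_ml, lam_ml, llr[j][b])."""
    m_t = len(H[0])
    labels = list(constellation)
    q = len(labels[0])
    best = {}                      # (j, b, bit value) -> smallest metric seen
    lam_ml, x_ml = float("inf"), None
    for x in product(labels, repeat=m_t):
        d = sq_dist(y, H, [constellation[label] for label in x])
        if d < lam_ml:
            lam_ml, x_ml = d, x
        for j in range(m_t):
            for b in range(q):
                key = (j, b, x[j][b])
                best[key] = min(best.get(key, float("inf")), d)
    # (2): minimum over "bit = 0" minus minimum over "bit = 1"
    llr = [[best[(j, b, 0)] - best[(j, b, 1)] for b in range(q)]
           for j in range(m_t)]
    return x_ml, lam_ml, llr
```

With this sign convention, a bit that equals 0 in the ML solution receives a non-positive LLR and a bit that equals 1 receives a non-negative one, which is exactly the case split in (5).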

B. Max-Log APP MIMO Detection as a Tree Search

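Transforming (3) and (4) into tree-search problems and using the sphere decoding algorithm [2], [3] allows the LLRs (5) to be computed efficiently. To this end, the channel matrix $H$ is first QR-decomposed according to $H = QR$, where $Q$ is unitary of dimension $M_R \times M_T$ (i.e., $Q^H Q = I$) and the upper-triangular $M_T \times M_T$ matrix $R$ has real-valued positive entries on its main diagonal. Left-multiplying (1) by $Q^H$ leads to the modified input-output relation $\tilde{y} = Rs + Q^H n$ with $\tilde{y} = Q^H y$ and hence, noting that $Q^H n$ has the same statistics as $n$, to the equivalent formulation of $\lambda^{\mathrm{ML}}$ and $\lambda^{\overline{\mathrm{ML}}}_{j,b}$:

$$\lambda^{\mathrm{ML}} = \min_{s \in \mathcal{O}^{M_T}} \|\tilde{y} - Rs\|^2 \tag{6}$$

$$\lambda^{\overline{\mathrm{ML}}}_{j,b} = \min_{s \in \mathcal{X}_{j,b}^{(\overline{x^{\mathrm{ML}}_{j,b}})}} \|\tilde{y} - Rs\|^2. \tag{7}$$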
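
The step from (3) to (6) can be checked numerically. Below is a minimal sketch in plain Python, assuming a square, full-rank channel ($M_R = M_T$) so that both metrics coincide exactly; for $M_R > M_T$ they differ by a term that does not depend on $s$, so the minimizer is unchanged. The helper names `qr_gram_schmidt` and `rotate` are hypothetical, and modified Gram-Schmidt is used only for brevity; a numerically more robust QR routine would be preferable in practice.

```python
def qr_gram_schmidt(H):
    """Thin QR via modified Gram-Schmidt.

    H is a list of rows of complex numbers. Returns (Q_cols, R), where Q_cols
    holds the orthonormal columns of Q and R is upper triangular with a real,
    positive diagonal (assuming H has full column rank).
    """
    m_r, m_t = len(H), len(H[0])
    cols = [[H[r][c] for r in range(m_r)] for c in range(m_t)]
    Q_cols, R = [], [[0j] * m_t for _ in range(m_t)]
    for c in range(m_t):
        v = cols[c][:]
        for k, q in enumerate(Q_cols):
            R[k][c] = sum(q_r.conjugate() * v_r for q_r, v_r in zip(q, v))  # q^H v
            v = [v_r - R[k][c] * q_r for v_r, q_r in zip(v, q)]
        norm = sum(abs(v_r) ** 2 for v_r in v) ** 0.5
        R[c][c] = norm
        Q_cols.append([v_r / norm for v_r in v])
    return Q_cols, R

def rotate(Q_cols, y):
    """Return y_tilde = Q^H y."""
    return [sum(q_r.conjugate() * y_r for q_r, y_r in zip(q, y)) for q in Q_cols]
```

Because $R$ is upper triangular, the last entry of $\tilde{y}$ depends only on the last symbol, the second-to-last entry on the last two symbols, and so on. This is what makes the level-by-level tree search possible.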

We next define the partial symbol vectors (PSVs) $s^{(i)} = [\, s_i \;\; s_{i+1} \;\; \cdots \;\; s_{M_T} \,]^T$ and note that the $s^{(i)}$ can be arranged in a tree that has its root just above level $i = M_T$ and leaves, which correspond to possible candidate symbol vectors, on level $i = 1$. After initializing $d_{M_T+1} = 0$, the Euclidean distances $d(s) = \|\tilde{y} - Rs\|^2$ in (6) and (7) can be computed recursively as $d(s) = d_1$ with the partial Euclidean distances (PEDs)

$$d_i = d_{i+1} + |e_i|^2, \quad i = M_T, M_T - 1, \ldots, 1 \tag{8}$$

and the distance increments (DIs)

$$|e_i|^2 = \Big| \tilde{y}_i - \sum_{j=i}^{M_T} R_{i,j} s_j \Big|^2. \tag{9}$$

Footnote: The superscript $H$ stands for conjugate transposition.
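
The recursion (8) and (9) translates almost line by line into code. The sketch below is illustrative only: the function name `peds` and the choice to return the PEDs in a dictionary keyed by the 1-based level $i$ are assumptions made here for readability.

```python
def peds(y_tilde, R, s):
    """Partial Euclidean distances d_i of (8), keyed by the 1-based level i.

    y_tilde: list of M_T complex numbers, R: upper-triangular list of rows,
    s: list of M_T complex symbols. d[1] equals ||y_tilde - R s||^2.
    """
    m_t = len(s)
    d = {m_t + 1: 0.0}                       # initialization d_{M_T + 1} = 0
    for i in range(m_t, 0, -1):              # i = M_T, M_T - 1, ..., 1
        e_i = y_tilde[i - 1] - sum(R[i - 1][j - 1] * s[j - 1]
                                   for j in range(i, m_t + 1))   # (9)
        d[i] = d[i + 1] + abs(e_i) ** 2      # (8)
    return d
```

Changing $s_1$ while keeping $s_2, \ldots, s_{M_T}$ fixed leaves `d[2]` untouched, which illustrates that $d_i$ depends on $s$ only through the PSV $s^{(i)}$.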

Since the dependence of the PEDs $d_i$ on the symbol vector $s$ is only through $s^{(i)}$, we have transformed ML detection and the computation of the max-log LLRs into a weighted tree-search problem: PEDs and PSVs are associated with nodes, branches correspond to DIs. Each path from the root down to a leaf corresponds to a symbol vector $s \in \mathcal{O}^{M_T}$. The leaf associated with the smallest metric in $\mathcal{O}^{M_T}$ and $\mathcal{X}_{j,b}^{(\overline{x^{\mathrm{ML}}_{j,b}})}$ corresponds to the solution of (6) and (7), respectively. The basic building block underlying the two tree-traversal strategies described in the next section is the Schnorr-Euchner sphere decoder (SESD) with radius reduction [8], briefly summarized as follows: The SESD constrains the search to nodes which lie within a radius $r$ around $\tilde{y}$ and traverses the tree depth-first, visiting the children of a given node in ascending order of their PEDs. The basic idea of radius reduction is to start the algorithm with $r = \infty$ and to update the radius according to $r^2 \leftarrow d(s)$ whenever a leaf $s$ has been reached. This avoids the problem of selecting a suitable (initial) radius and leads to efficient pruning of the tree.
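
The SESD description above can be sketched as a short recursive search. This is a hard-output illustration under stated assumptions, not the authors' implementation: the function name `sesd` is hypothetical, children are ordered by explicitly sorting all PEDs (hardware implementations typically enumerate them without a full sort), and a node is counted as visited when its PED lies inside the current sphere.

```python
import math

def sesd(y_tilde, R, alphabet):
    """Depth-first Schnorr-Euchner search with radius reduction.

    Returns (s_ml, lam_ml, visited) for an upper-triangular R (list of rows).
    """
    m_t = len(R)
    best = {"s": None, "r2": math.inf, "visited": 0}

    def search(i, partial, d_next):
        # partial holds s_{i+1}, ..., s_{M_T}; i is the 1-based level to expand
        children = []
        for sym in alphabet:
            cand = [sym] + partial
            e = y_tilde[i - 1] - sum(R[i - 1][i - 1 + k] * cand[k]
                                     for k in range(len(cand)))
            children.append((d_next + abs(e) ** 2, cand))
        children.sort(key=lambda c: c[0])        # ascending PED order
        for d_i, cand in children:
            if d_i >= best["r2"]:
                break                            # sorted, so the rest fail too
            best["visited"] += 1
            if i == 1:
                best["s"], best["r2"] = cand, d_i    # radius reduction
            else:
                search(i - 1, cand, d_i)

    search(m_t, [], 0.0)
    return best["s"], best["r2"], best["visited"]
```

Because the first leaf reached is the successive-interference-cancellation style decision, the radius shrinks from infinity after only $M_T$ node visits, and all later branches are tested against that tighter bound.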

Throughout this paper, computational complexity is defined as the number of visited nodes. This complexity measure is directly related to the throughput of corresponding VLSI implementations [9].

III. TREE-TRAVERSAL STRATEGIES

Computing the LLRs as in (5) requires determining the metric $\lambda^{\overline{\mathrm{ML}}}_{j,b}$, which is achieved by traversing only those parts of the tree that have leaves in $\mathcal{X}_{j,b}^{(\overline{x^{\mathrm{ML}}_{j,b}})}$. Since this computation has to be carried out for every coded bit, the resulting need for repeated tree traversals can lead to a major computational burden. In the following, we review two alternative tree-traversal strategies, proposed in [7] and [4], respectively, for solving (6) and (7). In addition, we propose some minor refinements of the tree-search algorithm introduced in [4].

A. Repeated Tree Search

An algorithm for computing the LLRs based on repeated tree search (RTS) was described in [7]. The basic idea is to start by solving (6) (using the SESD) and to rerun the SESD to solve (7) for each coded bit (i.e., $QM_T$ times) in the vector symbol. When rerunning the SESD to determine $\lambda^{\overline{\mathrm{ML}}}_{j,b}$, the search tree is prepruned by forcing the decoder to exclude all nodes (and the corresponding subtrees) from the search for which $x_{j,b} = x^{\mathrm{ML}}_{j,b}$. This prepruning procedure is illustrated in Fig. 1. Initializing the SESD with $r = \infty$ in each of the $QM_T$ runs required to obtain $\lambda^{\overline{\mathrm{ML}}}_{j,b}$ will lead to high computational complexity. It is therefore important to realize that, without compromising max-log optimality, we can initialize the search radius $r_{j,b}$ by setting it equal to the minimum value of $\|\tilde{y} - Rs\|$ over all $s \in \mathcal{X}_{j,b}^{(\overline{x^{\mathrm{ML}}_{j,b}})}$ found during preceding tree traversals.

Fig. 1. Example of the prepruning procedure in the RTS approach. Counter-hypotheses to the ML solution are found by forcing the algorithm through the dashed branches.

The main advantage of the RTS strategy lies in the fact that each traversal of the tree can be performed using a hard-decision SESD with minimal modifications to account for the search being carried out on a prepruned tree. The main disadvantage is the repeated traversal of large parts of the tree. As noted in [10], this problem can be mitigated somewhat by changing the detection order in each run. Unfortunately, the resulting need for multiple QR-decompositions typically leads to prohibitive overall computational complexity.

B. Single Tree Search

The key to a more efficient (compared to RTS) tree-search strategy is to ensure that every node in the tree is visited at most once. This can be accomplished by searching for the ML solution and all counter-hypotheses concurrently. The basic idea behind such a single tree search (STS) approach has been outlined in [4]. In the following, we shall elaborate on the idea presented in [4] and describe some minor refinements. Specifically, we formulate update rules and a pruning criterion based on a list containing the metrics $\lambda^{\mathrm{ML}}$ and $\lambda^{\overline{\mathrm{ML}}}_{j,b}$.

The main concept is to have a list containing the metric $\lambda^{\mathrm{ML}}$ along with the corresponding bit sequence $x^{\mathrm{ML}}$ and the metrics $\lambda^{\overline{\mathrm{ML}}}_{j,b}$ of all counter-hypotheses and to search the subtree originating from a given node only if the result can lead to an update of either $\lambda^{\mathrm{ML}}$ or one of the $\lambda^{\overline{\mathrm{ML}}}_{j,b}$.

List administration: The algorithm is initialized with $\lambda^{\mathrm{ML}} = \lambda^{\overline{\mathrm{ML}}}_{j,b} = \infty$ ($\forall\, j, b$). Whenever a leaf with corresponding binary label $x$ has been reached, the decoder distinguishes between two cases:

1) If a new ML hypothesis is found, i.e., $d(x) < \lambda^{\mathrm{ML}}$, all $\lambda^{\overline{\mathrm{ML}}}_{j,b}$ for which $x_{j,b} = \overline{x^{\mathrm{ML}}_{j,b}}$ are set to $\lambda^{\mathrm{ML}}$, followed by the updates $\lambda^{\mathrm{ML}} \leftarrow d(x)$ and $x^{\mathrm{ML}} \leftarrow x$. In other words, for each bit in the ML hypothesis that is changed in the process of the update, the metric of the former ML hypothesis becomes the metric of the new counter-hypothesis, followed by an update of the ML hypothesis. This procedure ensures that all $\lambda^{\overline{\mathrm{ML}}}_{j,b}$ always contain the metric associated with a valid counter-hypothesis to the current ML hypothesis.

2) In the case where $d(x) \geq \lambda^{\mathrm{ML}}$, only the counter-hypotheses have to be checked. For all $j$ and $b$ for which $d(x) < \lambda^{\overline{\mathrm{ML}}}_{j,b}$ and $x_{j,b} = \overline{x^{\mathrm{ML}}_{j,b}}$, the decoder updates $\lambda^{\overline{\mathrm{ML}}}_{j,b} \leftarrow d(x)$.
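
The two list-administration cases can be sketched as a single update function called at every leaf. The dictionary layout and the names `init_list` and `update_list` are assumptions chosen for illustration; bit labels are stored as `x[j][b]` with 0-based indices.

```python
import math

def init_list(m_t, q):
    """Initial state: lambda_ML and all counter-hypothesis metrics are infinite."""
    return {"lam_ml": math.inf, "x_ml": None,
            "lam_bar": [[math.inf] * q for _ in range(m_t)]}

def update_list(state, x, d):
    """Apply Case 1 or Case 2 for a leaf with bit label x and metric d."""
    lam_bar = state["lam_bar"]
    if d < state["lam_ml"]:
        # Case 1: new ML hypothesis. For every bit that changes, the former
        # ML metric becomes the metric of the new counter-hypothesis.
        if state["x_ml"] is not None:
            for j, bits in enumerate(x):
                for b, bit in enumerate(bits):
                    if bit != state["x_ml"][j][b]:
                        lam_bar[j][b] = state["lam_ml"]
        state["lam_ml"], state["x_ml"] = d, x
    else:
        # Case 2: only counter-hypotheses can improve, and only for bits
        # in which the leaf differs from the current ML hypothesis.
        for j, bits in enumerate(x):
            for b, bit in enumerate(bits):
                if bit != state["x_ml"][j][b] and d < lam_bar[j][b]:
                    lam_bar[j][b] = d
    return state
```

Feeding every leaf of a small tree through `update_list`, in any order, should leave the same ML metric and counter-hypothesis metrics as an exhaustive evaluation of (6) and (7), which is a convenient way to test an STS implementation.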

Pruning criterion: The key aspect of this algorithm is the following pruning criterion. A given node $s^{(i)}$ on level $i$ and the subtree originating from that node have the partial binary label $x^{(i)}$ consisting of the bits $x_{j,b}$ ($b = 1, 2, \ldots, Q$ and $j = i, i + 1, \ldots, M_T$). The remaining bits $x_{j,b}$ ($j = 1, 2, \ldots, i - 1$) corresponding to the subtree are unknown at this point. The pruning criterion for $s^{(i)}$ along with its subtree is compiled from two conditions. First, the bits in the partial binary label $x^{(i)}$ are compared with the corresponding bits in the binary label of the current ML hypothesis. In this comparison, for all $j, b$ with $x_{j,b} = \overline{x^{\mathrm{ML}}_{j,b}}$ the corresponding counter-hypotheses $\lambda^{\overline{\mathrm{ML}}}_{j,b}$ might be affected when further searching the node's subtree. Second, all counter-hypotheses corresponding to the subtree of $s^{(i)}$ with the associated metrics $\lambda^{\overline{\mathrm{ML}}}_{j,b}$ ($j = 1, 2, \ldots, i - 1$) may also be updated since the corresponding bits are not yet known. In summary, the metrics which may be affected during further search in the subtree emanating from a node $s^{(i)}$ are given by the set

$$\mathcal{A} = \{a_l\} = \Big\{ \lambda^{\overline{\mathrm{ML}}}_{j,b} \;\Big|\; x_{j,b} = \overline{x^{\mathrm{ML}}_{j,b}},\ j \geq i \Big\} \cup \Big\{ \lambda^{\overline{\mathrm{ML}}}_{j,b} \;\Big|\; j < i \Big\}.$$

The node $s^{(i)}$ along with its subtree is pruned if its PED $d\big(s^{(i)}\big)$ satisfies

$$d\big(s^{(i)}\big) > \max_{a_l \in \mathcal{A}} a_l. \tag{10}$$

This pruning criterion (illustrated in Fig. 2) ensures that the subtree of a given node is explored only if it can lead to an update of either the ML hypothesis or of at least one of the counter-hypotheses. Note that $\lambda^{\mathrm{ML}}$ does not appear in (10) as $\lambda^{\mathrm{ML}} \leq \lambda^{\overline{\mathrm{ML}}}_{j,b}$ ($\forall\, j, b$).
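
The pruning test (10) can be sketched as follows. The function names and argument layout are hypothetical. Two corner cases are not spelled out in the text and are handled here by assumption: before any leaf has been found the threshold is infinite, and if the set of affected metrics is empty the function falls back to $\lambda^{\mathrm{ML}}$.

```python
import math

def prune_threshold(lam_ml, x_ml, lam_bar, partial_label, i):
    """Right-hand side of (10) for a node on the 1-based level i.

    partial_label[j - i] holds the known bits of symbol j for j = i, ..., M_T;
    x_ml[j - 1][b] and lam_bar[j - 1][b] are indexed by symbol and bit.
    """
    if x_ml is None:                     # no leaf reached yet (assumption)
        return math.inf
    m_t = len(lam_bar)
    affected = []
    for j in range(i, m_t + 1):          # levels whose bits are already fixed
        for b, bit in enumerate(partial_label[j - i]):
            if bit != x_ml[j - 1][b]:    # bit is the complement of the ML bit
                affected.append(lam_bar[j - 1][b])
    for j in range(1, i):                # levels still open below the node
        affected.extend(lam_bar[j - 1])
    return max(affected, default=lam_ml)

def should_prune(lam_ml, x_ml, lam_bar, partial_label, i, ped):
    """True if the node and its subtree can be discarded according to (10)."""
    return ped > prune_threshold(lam_ml, x_ml, lam_bar, partial_label, i)
```

The threshold only tightens as counter-hypothesis metrics shrink, so nodes deep in the tree (small $i$, few open levels) are the ones that benefit most from the bit comparison.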

IV. METHODS FOR COMPLEXITY REDUCTION

So far we have discussed tree-search strategies which solve (2) exactly and hence do not compromise the performance of the max-log APP decoder. The goal of this section is to describe methods that make it possible to trade decoder complexity against (error rate) performance.

A. LLR Clipping

The dynamic range of LLRs is typically not bounded. However, practical systems need to constrain the maximum LLR value to enable fixed-point implementations. Evidently this will lead to a performance degradation. A straightforward way of ensuring that LLR values are bounded is to clip them after the detection stage so that

$$\big| L(x_{j,b}) \big| \leq L_{\max} \quad \forall\, j, b. \tag{11}$$

We emphasize that the constraint in (11) refers to the normalized LLRs $L(x_{j,b})$ as defined in (2). It has been noted in [5] that (11) can be built into the tree-search algorithm such that it leads to a reduction in search complexity. In the following, we briefly describe the application of the idea proposed in [5] to the RTS and the STS tree-traversal strategies.

Fig. 2. Example of the STS pruning criterion ($M_T = 5$ and two bits per symbol): The partial binary label $x^{(i)}$ determines which counter-hypotheses may be affected during the search of the subtree emanating from the current node.

a) LLR Clipping for RTS: Whenever the RTS algorithm starts to search for a counter-hypothesis, with the search radius $r_{j,b}$ initialized as described in Section III-A, we first update

$$r_{j,b}^2 \leftarrow \min\big\{ r_{j,b}^2,\ \lambda^{\mathrm{ML}} + L_{\max} \big\} \tag{12}$$

which ensures that (11) is satisfied. Metrics associated with counter-hypotheses for which no valid lattice point can be found are set to $\lambda^{\mathrm{ML}} + L_{\max}$.

b) LLR Clipping for STS: Whenever a leaf has been reached and a new ML hypothesis has been found after carrying out the steps in Case 1 in Section III-B, the counter-hypotheses have to be updated according to

$$\lambda^{\overline{\mathrm{ML}}}_{j,b} \leftarrow \min\big\{ \lambda^{\overline{\mathrm{ML}}}_{j,b},\ \lambda^{\mathrm{ML}} + L_{\max} \big\} \quad \forall\, j, b. \tag{13}$$

For $L_{\max} = \infty$, we obviously get the exact max-log solution, whereas for $L_{\max} \to 0$, the decoder performance approaches that of a hard-output ML detector. On the other hand, a smaller $L_{\max}$ leads to a reduction in complexity, as more aggressive pruning is performed. The parameter $L_{\max}$ can therefore be used to adjust the complexity/performance trade-off (cf. Section V).
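
A minimal sketch of the clipping update (13), together with the final LLR evaluation from (5), is given below. The function names are hypothetical, and the metrics are passed in as plain lists indexed by symbol and bit.

```python
def clip_counter_hypotheses(lam_ml, lam_bar, l_max):
    """Update (13): cap every counter-hypothesis metric at lam_ml + l_max."""
    return [[min(m, lam_ml + l_max) for m in row] for row in lam_bar]

def llrs_from_metrics(lam_ml, x_ml, lam_bar):
    """Evaluate (5) from the ML metric, ML bit labels, and counter-hypotheses."""
    return [[(lam_ml - m) if bit == 0 else (m - lam_ml)
             for bit, m in zip(bits, row)]
            for bits, row in zip(x_ml, lam_bar)]
```

After clipping, every counter-hypothesis metric lies in the interval between $\lambda^{\mathrm{ML}}$ and $\lambda^{\mathrm{ML}} + L_{\max}$, so the magnitude of each LLR is at most $L_{\max}$, as required by (11). The same cap also lowers the pruning threshold in (10), which is where the complexity saving comes from.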

B. Ordering and Regularization

Ordering: A common approach to reduce complexity in sphere decoding without compromising the decoder's performance is to adapt the detection ordering of the spatial streams to the geometry of the instantaneous channel realization by performing a QR-decomposition on $HP$ (rather than $H$), where $P$ is a suitably chosen permutation matrix. More efficient pruning of the search tree closer to the root is obtained if "stronger streams" correspond to the levels closer to the root, i.e., $P$ is chosen such that the main diagonal entries of $R$ in $HP = QR$ are sorted in ascending order. In the following, this approach is termed sorted QR-decomposition (SQRD) [11].

Regularization: Poorly conditioned channel realizations $H$ lead to significant search complexity due to the low effective SNR on one or multiple of the effective spatial streams. An efficient way to counter this problem is to perform the tree search on a regularized channel matrix by computing

$$\begin{bmatrix} H \\ \alpha I \end{bmatrix} P = QR = \begin{bmatrix} Q_a \\ Q_b \end{bmatrix} R$$

where $I$ is the $M_T \times M_T$ identity matrix, $Q_a$ is of dimension $M_R \times M_T$, $Q_b$ and $R$ are of dimension $M_T \times M_T$, and $\alpha > 0$ is a suitably chosen regularization parameter. Note that $Q_a$ is, in general, not unitary. LLRs are then computed according to

$$L(x_{j,b}) = \min_{s \in \mathcal{X}_{j,b}^{(0)}} \|\tilde{y} - R\tilde{s}\|^2 - \min_{s \in \mathcal{X}_{j,b}^{(1)}} \|\tilde{y} - R\tilde{s}\|^2 \tag{14}$$

where $\tilde{y} = Q_a^H y$ and $\tilde{s} = P^T s$. Note that the LLRs in (14) need to be reordered at the end of the decoding process to account for the permutation induced by $P$. Operating on a regularized version of the channel matrix clearly entails an (error rate) performance loss. However, we shall see in Section V that choosing $\alpha$ according to the minimum mean squared error (MMSE) criterion (resulting in MMSE-SQRD) as outlined in [12] degrades the performance only slightly while leading to considerable savings in terms of search complexity.
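
As a side derivation (not part of the original text), the following identities show what the regularized decomposition does. Here $Q_a$ and $Q_b$ denote the top $M_R$ and bottom $M_T$ rows of $Q$; these labels are an assumption of this sketch. The only facts used are that $Q$ has orthonormal columns and that $P$ is a real permutation matrix.

```latex
\begin{align}
\begin{bmatrix} H \\ \alpha I \end{bmatrix}^{H}
\begin{bmatrix} H \\ \alpha I \end{bmatrix}
  &= H^{H}H + \alpha^{2} I, \\
R^{H}R \;=\; R^{H}Q^{H}QR
  &= P^{T}\left(H^{H}H + \alpha^{2} I\right)P, \\
Q_a^{H}Q_a + Q_b^{H}Q_b = I
  \quad&\Longrightarrow\quad
  Q_a^{H}Q_a = I - Q_b^{H}Q_b, \qquad Q_b = \alpha P R^{-1}.
\end{align}
```

The second line shows that $R$ is a Cholesky-type factor of the regularized Gram matrix, so its diagonal entries are bounded away from zero even when $H$ is poorly conditioned. The third line makes explicit why $Q_a$ is not unitary for $\alpha > 0$, which is the source of the performance loss mentioned above.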

C. Run-Time Constraints

A disadvantage of all sphere decoders (SDs) is that the computational complexity required to find the ML solution (and the LLR values) depends on the realization of the channel matrix and the noise; the worst-case complexity corresponds to an exhaustive search. On the other hand, in order to meet the practically important requirement of a fixed throughput, the algorithm run-time must be constrained, which leads to a constraint on the maximum detection effort. This, in turn, generally prevents the detector from achieving ML or max-log APP performance.

A straightforward way of enforcing a run-time constraint is to terminate the search, on a symbol vector by symbol vector basis, after a maximum number of visited nodes. The detector then returns the best solution found so far, i.e., the current ML and counter-hypotheses. A better solution is to impose an aggregate run-time constraint of $N D_{\mathrm{avg}}$ visited nodes for an entire block of $N$ vector symbols. (In an OFDM-based MIMO system, $N$ would, for example, be the number of OFDM tones.) The maximum complexity allocated to the detection of the $k$th vector symbol can, for example, be chosen according to the maximum-first (MF) scheduling strategy [13] as

$$D_{\max}(k) = N D_{\mathrm{avg}} - \sum_{i=1}^{k-1} D(i) - (N - k) M_T \tag{15}$$

where $D(i)$ denotes the actual number of visited nodes for the $i$th vector symbol. The concept behind (15) is that a vector symbol is allowed to use up all of the remaining run-time within the block up to a safety margin of $(N - k) M_T$ visited nodes, which makes it possible to find at least the decision feedback solution for the remaining vector symbols. Setting $D_{\mathrm{avg}} = M_T$ maximizes the throughput but reduces the performance to that of hard-decision SIC.
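
The budget rule (15) is simple enough to simulate directly. The sketch below uses hypothetical names (`mf_budget`, `run_block`) and assumes that an unconstrained search for symbol $k$ would visit `demands[k - 1]` nodes, with every demand being at least $M_T$.

```python
def mf_budget(k, n, d_avg, m_t, used):
    """D_max(k) from (15); `used` lists D(1), ..., D(k - 1)."""
    return n * d_avg - sum(used[:k - 1]) - (n - k) * m_t

def run_block(demands, d_avg, m_t):
    """Simulate one block: symbol k visits min(demand, D_max(k)) nodes."""
    n, used = len(demands), []
    for k, want in enumerate(demands, start=1):
        used.append(min(want, mf_budget(k, n, d_avg, m_t, used)))
    return used
```

For example, with $N = 4$, $D_{\mathrm{avg}} = 10$ and $M_T = 4$, a first symbol that would like 100 nodes is cut off at $40 - 12 = 28$, after which every remaining symbol is left with exactly the $M_T = 4$ nodes needed to reach one leaf. This shows both the strength of MF scheduling (a hard symbol can borrow heavily) and its bias toward early symbols in the block.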

V. PERFORMANCE/COMPLEXITY TRADE-OFFS

In practice, system engineers are typically faced with the problem of designing a receiver that achieves a given target frame error rate (FER) at a given throughput. The quality of the receiver implementation can then be measured by the minimum SNR required to achieve this target FER. In the following, we assess the complexity/performance trade-offs of the concepts described in Sections III and IV by plotting the average (over independent channel and noise realizations) number of visited nodes as a function of this minimum SNR. Since the number of visited nodes translates directly to the required chip area per throughput [9], the corresponding charts make it possible to associate an SNR penalty with a reduction in hardware complexity.

All simulation results are for a rate 1/2 (generator polynomials $[133_o\ 171_o]$ and constraint length 7) convolutionally encoded $4 \times 4$ MIMO-OFDM system with 16-QAM constellation (using Gray mapping) and $N = 64$ tones. A soft-in Viterbi decoder [14] is employed. One frame consists of 1024 randomly interleaved (across space and frequency) bits and a TGn type C channel model [15] is used.

A. Comparison of Tree-Search Strategies

Fig. 3 compares the performance of RTS and STS max-log APP decoders, and the list sphere decoder (LSD) [6] for different target FERs, different values of $L_{\max}$ and, in the case of the LSD, for different list sizes. Changing the list size makes it possible to adjust the complexity/performance trade-off.

The STS approach is seen to clearly outperform the RTS strategy in terms of average complexity. We can furthermore see that for this setup max-log APP performance is achieved for $L_{\max} = 0.2$. Increasing the LLR clipping level beyond this value only increases complexity without improving performance.

The implementation of the LSD requires additional memory and logic for the administration of the candidate list, which is not accounted for in this comparison. Fig. 3 shows that even when this additional complexity is ignored, the LSD is still inferior to the STS algorithm.

B. Impact of Preprocessing and Regularization

Fig. 4 compares the impact of SQRD, MMSE-SQRD, and standard (unordered) QRD-based preprocessing on the complexity/performance trade-off of the STS algorithm at a target FER of 0.01. It can be seen that the improvement resulting from SQRD compared to unordered QRD becomes significant in the low (but realistic) complexity region. Further (minor) improvements are obtained from regularization using MMSE-SQRD. In the region where the average complexity is very high, the performance penalty resulting from regularization eventually renders MMSE-SQRD inferior to SQRD.

[Figure 3: average number of visited nodes versus minimum SNR for a given FER. Curves: RTS, STS, and LSD [6], each at FER = 0.04 and FER = 0.01.]

Fig. 3. Comparison of repeated tree search (RTS), single tree search (STS) and the list sphere decoder (LSD) as proposed in [6]. The numbers next to the curves correspond to $L_{\max}$ for RTS and STS and to the list size in the case of the LSD.

[Figure 4: average number of visited nodes versus minimum SNR for a given FER. Curves: QRD, SQRD, and MMSE-SQRD; the hard-output SESD operating point is marked.]

Fig. 4. Comparison of unordered QRD, SQRD and MMSE-SQRD preprocessing applied to STS at a target FER of 0.01. The numbers next to the curves correspond to $L_{\max}$. For $L_{\max} \to 0$, the performance approaches that of hard-output SESD.

C. LLR Clipping

Both Fig. 3 and Fig. 4 show that adjusting the LLR clipping level $L_{\max}$ makes it possible to sweep an entire family of sphere decoders ranging from the exact max-log APP SESD (obtained, in our setup, for $L_{\max} \geq 0.2$) to hard-output SESD ($L_{\max} = 0$). The LLR clipping level is therefore an important design parameter which can be used to conveniently adjust the decoder at runtime to a given complexity constraint.


References

Iterative decoding of binary block and convolutional codes

Achieving near-capacity on a multiple-antenna channel

Closest point search in lattices

Improved methods for calculating vectors of short length in a lattice, including a complexity analysis

Lattice basis reduction: improved practical algorithms and solving subset sum problems
