Proceedings ArticleDOI

A multi-standard flexible turbo/LDPC decoder via ASIC design

TL;DR: This paper describes the first complete design of a single-core multi-standard flexible Turbo/LDPC decoder using an ASIC approach and provides a proof-of-concept implementation compliant with the 3GPP-HSDPA, DVB-SH, IEEE 802.16e and IEEE 802.11n standards.
Abstract: This paper describes the first complete design of a single-core multi-standard flexible Turbo/LDPC decoder using an ASIC approach. Such a solution outperforms other state-of-the-art implementations based on application-specific instruction-set processors (ASIPs), which are shown to suffer from impaired throughput and power consumption. In this paper, we describe in detail the flexible VLSI architecture of a decoder coping with all the modern communication standards defining LDPC and Turbo codes, and provide a proof-of-concept implementation compliant with the 3GPP-HSDPA, DVB-SH, IEEE 802.16e and IEEE 802.11n standards. The decoder, running at only 150 MHz to reduce power, occupies an area of 0.9 mm2 with a maximum power consumption of only 86.1 mW.
Citations
Journal ArticleDOI
TL;DR: This work concentrates on the design of a reconfigurable architecture for both turbo and LDPC codes decoding, tackling the reconfiguration issue and introducing a formal and systematic treatment that was not previously addressed.
Abstract: Flexible and reconfigurable architectures have gained wide popularity in the communications field. In particular, reconfigurable architectures for the physical layer are an attractive solution not only to switch among different coding modes but also to achieve interoperability. This work concentrates on the design of a reconfigurable architecture for both turbo and LDPC code decoding. The novel contributions of this paper are: i) tackling the reconfiguration issue by introducing a formal and systematic treatment that, to the best of our knowledge, was not previously addressed and ii) proposing a reconfigurable NoC-based turbo/LDPC decoder architecture and showing that wide flexibility can be achieved with a small complexity overhead. The results show that dynamic switching between most of the considered communication standards is possible without pausing the decoding activity. Moreover, post-layout results show that tailoring the proposed architecture to the WiMAX standard leads to an area occupation of 2.75 mm2 and a power consumption of 101.5 mW in the worst case.

57 citations


Cites background or methods from "A multi-standard flexible turbo/LDP..."

  • ...When comparing WiFi results [12] guarantees a higher than A, B and C, even though aiming for a lower throughput than B....

    [...]

  • ...In [9]–[11], flexibility is achieved through the design of processing elements (PEs) based on application-specific-instruction-set-processor (ASIP) architectures, whereas in [12]–[14] PEs rely on application-specific-integrated-circuit (ASIC) solutions....

    [...]

  • ...The multi-standard decoder designed in [12] supports 3GPP-HSDPA, WiFi, WiMAX, and DVB-SH....

    [...]

Proceedings ArticleDOI
14 Mar 2011
TL;DR: A multi-core architecture which supports convolutional codes, binary/duo-binary turbo codes, and LDPC codes, based on Application Specific Instruction-set Processors (ASIP) and avoids the use of dedicated interleave/deinterleave address lookup memories is presented.
Abstract: In order to address the large variety of channel coding options specified in existing and future digital communication standards, there is an increasing need for flexible solutions. This paper presents a multi-core architecture which supports convolutional codes, binary/duo-binary turbo codes, and LDPC codes. The proposed architecture is based on Application Specific Instruction-set Processors (ASIP) and avoids the use of dedicated interleave/deinterleave address lookup memories. Each ASIP consists of two datapaths, one optimized for turbo and the other for LDPC mode, while efficiently sharing memories and communication resources. The logic synthesis results yield an overall area of 2.6 mm2 using 90 nm technology. Payload throughputs of up to 312 Mbps in LDPC mode and of 173 Mbps in Turbo mode are possible at 520 MHz, faring better than existing solutions.

36 citations


Cites background from "A multi-standard flexible turbo/LDP..."

  • ...In [1], a flexible architecture is presented that supports Viterbi (for CC decoding), LDPC and turbo decoding (SBTC and DBTC)....

    [...]

  • ...In spite of good throughput achieved, this architecture does not support DBTC....

    [...]

  • ...A high throughput of 257Mbps is achieved for LDPC mode while a limited throughput of 37.2Mbps in DBTC and 18.6Mbps in SBTC modes are achieved at 400MHz....

    [...]

  • ...The supported types in turbo are usually Single Binary and/ Double Binary Turbo Codes (SBTC and DBTC)....

    [...]

  • ...First we initialize the ASIP mode (SBTC, DBTC), current iteration number (iter = 0), number of windows (N ) per ASIP, length of windows (L) and the length of last window (Llast)....

    [...]
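
The initialization step quoted in the last excerpt above can be illustrated with a minimal Python sketch. The excerpt only names the quantities (mode, iter, N, L, Llast); how they relate is not stated there, so the rule used here (N is the ceiling of the block length over L, and the last window takes the remainder) is an assumption, and `WindowConfig`, `init_windows` and `block_len` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class WindowConfig:
    mode: str             # "SBTC" or "DBTC", as named in the excerpt
    iteration: int        # current iteration number (iter = 0 at start)
    num_windows: int      # N, number of windows per ASIP
    window_len: int       # L, length of a full window
    last_window_len: int  # Llast, length of the final window


def init_windows(mode: str, block_len: int, window_len: int) -> WindowConfig:
    """Assumed rule: split block_len symbols into windows of length L."""
    if mode not in ("SBTC", "DBTC"):
        raise ValueError("mode must be 'SBTC' or 'DBTC'")
    if block_len <= 0 or window_len <= 0:
        raise ValueError("block_len and window_len must be positive")
    # Ceiling division gives the number of windows (assumption).
    num_windows = -(-block_len // window_len)
    # The last window holds whatever the full windows leave over.
    last_window_len = block_len - (num_windows - 1) * window_len
    return WindowConfig(mode, 0, num_windows, window_len, last_window_len)
```

Under this assumed rule, a block of 100 symbols with L = 32 gives N = 4 and Llast = 4, while a block that divides evenly keeps Llast equal to L.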

Patent
08 Sep 2010
TL;DR: A configurable Turbo-LDPC decoder is presented in this article, where a set of P > 1 Soft-Input-Soft-Output decoding units (DP 0 - DP P-1; DP i) are used for iteratively decoding both Turbo- and LDPC-encoded input data.
Abstract: A configurable Turbo-LDPC decoder comprising: A set of P> 1 Soft-Input-Soft-Output decoding units (DP 0 -DP P-1 ; DP i ) for iteratively decoding both Turbo- and LDPC-encoded input data, each of said decoding units having first (I 1 i ) and second (I 2 i ) input ports and first (O 1 i ) and second (O 2 i ) output ports for intermediate data; First and second memories (M 1 , M 2 ) for storing said intermediate data, each of said first and second memories comprising P independently readable and writable memory blocks having respective input and output ports; and A configurable switching network (SN) for connecting the first input and output ports of said decoding units to the output and input ports of said first memory, and the second input and output ports of said decoding units to the output and input ports of said second memory

22 citations

Proceedings ArticleDOI
12 Mar 2012
TL;DR: This contribution focuses on one of the most important baseband processing units in wireless receivers, the forward error correction unit, and proposes a Network-on-Chip (NoC) based approach to the design of multi-standard decoders.
Abstract: The current convergence process in wireless technologies demands strong efforts in conceiving highly flexible and interoperable equipment. This contribution focuses on one of the most important baseband processing units in wireless receivers, the forward error correction unit, and proposes a Network-on-Chip (NoC) based approach to the design of multi-standard decoders. High-level modeling is exploited to drive the NoC optimization for a given set of both turbo and Low-Density-Parity-Check (LDPC) codes to be supported. Moreover, synthesis results prove that the proposed approach can offer a fully compliant WiMAX decoder, supporting the whole set of turbo and LDPC codes with higher throughput and an occupied area comparable to or lower than previously reported flexible implementations. In particular, the mentioned design case achieves a worst-case throughput higher than 70 Mb/s at the area cost of 3.17 mm2 on a 90 nm CMOS technology.

20 citations


Cites background from "A multi-standard flexible turbo/LDP..."

  • ...However, it does not reach a high enough throughput for the WiMAX standard, while our decoder has both smaller area and higher throughput than [7]....

    [...]

  • ...As in [7], the number of bits to represent λk[c], αk[s], βk[s] and λk[u] is set to 7, whereas 5 bits are sufficient for Rlk and λk[c(e)]....

    [...]

  • ...Few recent works [7], [9], [15] tried to exploit the intra-IP NoC approach to design flexible turbo/LDPC decoder architectures....

    [...]

Journal ArticleDOI
TL;DR: This paper derives for the first time the closed-form expressions for the exact Cramér-Rao lower bounds (CRLBs) of these estimators over turbo-coded square-QAM-modulated single- or multi-carrier transmissions, and introduces a new recursive process that enables the construction of arbitrary Gray-coded square-QAM constellations.
Abstract: In this paper, we consider the problem of joint phase and carrier frequency offset (CFO) estimation for turbo-coded systems. We derive for the first time the closed-form expressions for the exact Cramér-Rao lower bounds (CRLBs) of these estimators over turbo-coded square-QAM-modulated single- or multi-carrier transmissions. In the latter case, the derived bounds remain valid in the general case of adaptive modulation and coding (AMC) where the coding rate and modulation order vary from one subcarrier to another depending on the corresponding channel quality information (CQI). In particular, we introduce a new recursive process that enables the construction of arbitrary Gray-coded square-QAM constellations. Some hidden properties of such constellations will be revealed, owing to this recursive process, and carefully handled to decompose the system's likelihood function (LF) into the sum of two analogous terms. This decomposition makes it possible to carry out analytically all the statistical expectations involved in the Fisher information matrix (FIM). The new analytical CRLB expressions corroborate the previous attempts to evaluate the underlying bounds empirically. In the low-to-medium signal-to-noise ratio (SNR) region, the CRLB for code-aided (CA) estimation lies between the bounds for completely blind [non-data-aided (NDA)] and completely data-aided (DA) estimation schemes, thereby highlighting the effect of the coding gain. Most interestingly, in contrast to the NDA case, the CA CRLBs start to decay rapidly and reach the DA bounds at relatively small SNR thresholds. It will also be shown that contrary to the CRLB of the phase shift, the CRLB of the CFO improves in a multi-carrier system as compared to its counterpart in a single-carrier system. The derived bounds are also valid for LDPC-coded systems and they can be evaluated in the same way when the latter are decoded using the turbo principle.

17 citations


Cites methods from "A multi-standard flexible turbo/LDP..."

  • ...In fact, we plot the CA CRLBs for the carrier phase estimation using two different coding rates R1 = 0.3285 ≈ 1/3 and R2 = 0.4892 ≈ 1/2....

    [...]

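  • ...In fact, by defining $\rho_q = S_q^2/2\sigma^2$ to be the signal-to-noise-plus-interference ratio (SINR) on the $q$th subcarrier, we obtain for $q, q' = 1, 2, \cdots, Q$: $[I(\alpha)]_{\varphi_q,\varphi_q} = -E\left\{ \frac{\partial^2 \ln(p[Y;\alpha])}{\partial\varphi_q^2} \right\} = \sum_{k=k_0}^{k_0+K-1} \Omega_{p,k}(\rho_q)$, (76) $[I(\alpha)]_{\nu,\nu} = -E\left\{ \frac{\partial^2 \ln(p[Y;\alpha])}{\partial\nu\,\partial\nu} \right\} = (2\pi)^2 \sum_{q=1}^{Q} \sum_{k=k_0}^{k_0+K-1}$…...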

    [...]

References
Book
01 Jan 1963
TL;DR: A simple but nonoptimum decoding scheme operating directly from the channel a posteriori probabilities is described and the probability of error using this decoder on a binary symmetric channel is shown to decrease at least exponentially with a root of the block length.
Abstract: A low-density parity-check code is a code specified by a parity-check matrix with the following properties: each column contains a small fixed number j ≥ 3 of 1's and each row contains a small fixed number k > j of 1's. The typical minimum distance of these codes increases linearly with block length for a fixed rate and fixed j. When used with maximum likelihood decoding on a sufficiently quiet binary-input symmetric channel, the typical probability of decoding error decreases exponentially with block length for a fixed rate and fixed j. A simple but nonoptimum decoding scheme operating directly from the channel a posteriori probabilities is described. Both the equipment complexity and the data-handling capacity in bits per second of this decoder increase approximately linearly with block length. For j > 3 and a sufficiently low rate, the probability of error using this decoder on a binary symmetric channel is shown to decrease at least exponentially with a root of the block length. Some experimental results show that the actual probability of decoding error is much smaller than this theoretical bound.
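
The matrix property stated in this abstract (every column has j ones, every row has k ones) can be checked on a small example. The Python sketch below builds a regular parity-check matrix by stacking j column-permuted copies of a simple base block; this stacked-permutation construction, the function names, and the parameters n = 20, j = 3, k = 4 are illustrative assumptions, not details taken from the abstract.

```python
import random


def regular_parity_check(n, j, k, seed=0):
    """Build a parity-check matrix with column weight j and row weight k.

    Assumed construction: a base block whose row r has ones in columns
    r*k .. r*k + k - 1, followed by j - 1 random column permutations of it.
    """
    if n % k != 0:
        raise ValueError("n must be a multiple of k")
    rows_per_block = n // k
    base = [[1 if c // k == r else 0 for c in range(n)]
            for r in range(rows_per_block)]
    rng = random.Random(seed)
    H = [row[:] for row in base]
    for _ in range(j - 1):
        perm = list(range(n))
        rng.shuffle(perm)
        # Permuting columns keeps one 1 per column within each block.
        H.extend([[row[perm[c]] for c in range(n)] for row in base])
    return H


def column_weights(H):
    return [sum(col) for col in zip(*H)]


def row_weights(H):
    return [sum(row) for row in H]


def design_rate(j, k):
    """Design rate 1 - j/k; the true rate can be higher if rows are dependent."""
    return 1 - j / k


H = regular_parity_check(n=20, j=3, k=4)
```

With these parameters the matrix has 15 rows and 20 columns, and the design rate is 1 - 3/4 = 0.25.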

11,592 citations

Proceedings Article
01 Jan 1993

7,742 citations

Proceedings ArticleDOI
23 May 1993
TL;DR: In this article, a new class of convolutional codes called turbo-codes, whose performances in terms of bit error rate (BER) are close to the Shannon limit, is discussed.
Abstract: A new class of convolutional codes called turbo-codes, whose performances in terms of bit error rate (BER) are close to the Shannon limit, is discussed. The turbo-code encoder is built using a parallel concatenation of two recursive systematic convolutional codes, and the associated decoder, using a feedback decoding rule, is implemented as P pipelined identical elementary decoders.
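
The encoder structure described in this abstract, a parallel concatenation of two recursive systematic convolutional (RSC) codes, can be sketched in a few lines of Python. The generator polynomials (feedback 1 + D + D^2, feedforward 1 + D^2), the interleaver passed in as a plain index list, and the omission of trellis termination are assumptions chosen to keep the example small; none of them is specified in the abstract.

```python
def rsc_parity(bits):
    """Parity stream of a 4-state RSC encoder (assumed polynomials).

    Feedback polynomial:    1 + D + D^2
    Feedforward polynomial: 1 + D^2
    """
    s1 = s2 = 0  # shift-register contents: a[k-1] and a[k-2]
    parity = []
    for u in bits:
        a = u ^ s1 ^ s2        # recursive (feedback) bit
        parity.append(a ^ s2)  # feedforward taps at D^0 and D^2
        s1, s2 = a, s1
    return parity


def turbo_encode(bits, interleaver):
    """Return (systematic, parity1, parity2) for a rate-1/3 sketch.

    interleaver[i] is the index of the input bit fed to the second
    encoder at time i (a hypothetical permutation supplied by the caller).
    """
    if sorted(interleaver) != list(range(len(bits))):
        raise ValueError("interleaver must be a permutation of bit indices")
    interleaved = [bits[i] for i in interleaver]
    return list(bits), rsc_parity(bits), rsc_parity(interleaved)


systematic, parity1, parity2 = turbo_encode([1, 0, 0, 0], [3, 2, 1, 0])
```

Both constituent encoders see the same information bits, only in a different order, which is what lets the two elementary decoders exchange information about the same symbols.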

5,963 citations

Journal ArticleDOI
TL;DR: The general problem of estimating the a posteriori probabilities of the states and transitions of a Markov source observed through a discrete memoryless channel is considered and an optimal decoding algorithm is derived.
Abstract: The general problem of estimating the a posteriori probabilities of the states and transitions of a Markov source observed through a discrete memoryless channel is considered. The decoding of linear block and convolutional codes to minimize symbol error probability is shown to be a special case of this problem. An optimal decoding algorithm is derived.

4,830 citations


"A multi-standard flexible turbo/LDP..." refers methods in this paper

  • ...The BCJR algorithm [18] has been selected as the most appropriate solution since, besides the well-known use for turbo decoding, it can also be employed for LDPC decoding in the form of turbo-decoding message-passing (TDMP) [17]....

    [...]