A flexible high throughput multi-ASIP architecture for LDPC and turbo decoding

doi:10.1109/DATE.2011.5763047

Proceedings Article•DOI•

Purushotham Murugappa¹, Rachid Al-Khayat¹, Amer Baghdadi¹, Michel Jezequel¹•Institutions (1)

École nationale supérieure des télécommunications de Bretagne¹

14 Mar 2011-pp 1-6

TL;DR: Presents a multi-core architecture that supports convolutional codes, binary/duo-binary turbo codes, and LDPC codes. It is based on Application-Specific Instruction-set Processors (ASIPs) and avoids dedicated interleave/deinterleave address lookup memories.

Abstract: In order to address the large variety of channel coding options specified in existing and future digital communication standards, there is an increasing need for flexible solutions. This paper presents a multi-core architecture which supports convolutional codes, binary/duo-binary turbo codes, and LDPC codes. The proposed architecture is based on Application-Specific Instruction-set Processors (ASIPs) and avoids the use of dedicated interleave/deinterleave address lookup memories. Each ASIP consists of two datapaths, one optimized for turbo mode and the other for LDPC mode, while efficiently sharing memories and communication resources. Logic synthesis yields an overall area of 2.6 mm² in 90 nm technology. Payload throughputs of up to 312 Mbps in LDPC mode and 173 Mbps in turbo mode are possible at 520 MHz, faring better than existing solutions.
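The headline figures in the abstract can be turned into two rough comparison metrics: payload bits decoded per clock cycle, and throughput per unit area. The sketch below is only a back-of-the-envelope calculation on the numbers quoted above. The function names are hypothetical, and it assumes that the 2.6 mm² area and the 520 MHz clock apply to both decoding modes.

```python
def bits_per_cycle(throughput_mbps: float, clock_mhz: float) -> float:
    """Payload bits delivered per clock cycle (Mbps / MHz)."""
    return throughput_mbps / clock_mhz


def area_efficiency(throughput_mbps: float, area_mm2: float) -> float:
    """Throughput per unit silicon area, in Mbps per mm^2."""
    return throughput_mbps / area_mm2


# Figures quoted in the abstract (90 nm synthesis results).
AREA_MM2 = 2.6
CLOCK_MHZ = 520.0
PEAK_THROUGHPUT_MBPS = {"LDPC": 312.0, "Turbo": 173.0}

if __name__ == "__main__":
    for mode, mbps in PEAK_THROUGHPUT_MBPS.items():
        print(
            f"{mode}: {bits_per_cycle(mbps, CLOCK_MHZ):.2f} bits/cycle, "
            f"{area_efficiency(mbps, AREA_MM2):.1f} Mbps/mm^2"
        )
```

For the LDPC figures this works out to 0.6 bits per cycle and 120 Mbps/mm². Metrics like these are only meaningful when the designs being compared use the same technology node and comparable code parameters.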

Citations

Journal Article•DOI•

VLSI Implementation of a Multi-Mode Turbo/LDPC Decoder Architecture

[...]

Carlo Condo, Maurizio Martina, Guido Masera

01 Jun 2013-IEEE Transactions on Circuits and Systems I-regular Papers

TL;DR: This work concentrates on the design of a reconfigurable architecture for both turbo and LDPC codes decoding, tackling the reconfiguration issue and introducing a formal and systematic treatment that was not previously addressed.

Abstract: Flexible and reconfigurable architectures have gained wide popularity in the communications field. In particular, reconfigurable architectures for the physical layer are an attractive solution not only to switch among different coding modes but also to achieve interoperability. This work concentrates on the design of a reconfigurable architecture for both turbo and LDPC code decoding. The novel contributions of this paper are: i) tackling the reconfiguration issue by introducing a formal and systematic treatment that, to the best of our knowledge, was not previously addressed, and ii) proposing a reconfigurable NoC-based turbo/LDPC decoder architecture and showing that wide flexibility can be achieved with a small complexity overhead. The results show that dynamic switching between most of the considered communication standards is possible without pausing the decoding activity. Moreover, post-layout results show that tailoring the proposed architecture to the WiMAX standard leads to an area occupation of 2.75 mm² and a power consumption of 101.5 mW in the worst case.

57 citations

Cites background or methods from "A flexible high throughput multi-AS..."

...[11] propose an ASIP decoder architecture supporting WiMAX and WiFi LDPC codes, and WiMAX, 3GPP-LTE and DVB-RCS turbo codes....
[...]
...On the contrary, worst case throughput in [11] is not high enough for WiMAX....
[...]
...In [9]–[11], flexibility is achieved through the design of processing elements (PEs) based on application-specific-instruction-set-processor (ASIP) architectures, whereas in [12]–[14] PEs rely on application-specific-integrated-circuit (ASIC) solutions....
[...]

Proceedings Article•DOI•

A network-on-chip-based turbo/LDPC decoder architecture

[...]

Carlo Condo¹, Maurizio Martina¹, Guido Masera¹•Institutions (1)

Polytechnic University of Turin¹

12 Mar 2012

TL;DR: This contribution focuses on one of the most important baseband processing units in wireless receivers, the forward error correction unit, and proposes a Network-on-Chip (NoC) based approach to the design of multi-standard decoders.

Abstract: The current convergence process in wireless technologies demands strong efforts in conceiving highly flexible and interoperable equipment. This contribution focuses on one of the most important baseband processing units in wireless receivers, the forward error correction unit, and proposes a Network-on-Chip (NoC) based approach to the design of multi-standard decoders. High-level modeling is exploited to drive the NoC optimization for a given set of both turbo and Low-Density Parity-Check (LDPC) codes to be supported. Moreover, synthesis results show that the proposed approach can offer a fully compliant WiMAX decoder, supporting the whole set of turbo and LDPC codes with higher throughput and an occupied area comparable to or lower than previously reported flexible implementations. In particular, the mentioned design case achieves a worst-case throughput higher than 70 Mb/s at an area cost of 3.17 mm² in a 90 nm CMOS technology.

20 citations

Cites background from "A flexible high throughput multi-AS..."

...Few recent works [7], [9], [15] tried to exploit the intra-IP NoC approach to design flexible turbo/LDPC decoder architectures....
[...]
...Comparison with [9] shows similar core area occupation, whereas our NoC contributes for 0....
[...]
...High throughput PEs able to support both turbo and LDPC decoding [5]–[9] can be im-...
[...]
...It is worth noting that, in the turbo decoding mode the proposed architecture achieves the lowest power consumption as compared with [5]–[9]....
[...]
...5 code, are still above 70 Mb/s: in [9], according to the provided formula, for the same code throughput is below the standard threshold....
[...]

Book Chapter•DOI•

Chapter 13 – Hardware Design and Realization for Iteratively Decodable Codes

[...]

Emmanuel Boutillon¹, Guido Masera²•Institutions (2)

University of Southern Brittany¹, Polytechnic University of Turin²

01 Jun 2014

TL;DR: This chapter gives an overview of architectures for turbo and LDPC codes: the standard (low-complexity) implementation is presented first, and architectures for high speed, low power, and high flexibility are then derived.

Abstract: The transition from analog telecommunication equipment and terminals to digital systems and, more recently, the fast development of wireless communications were made possible by three factors: 1) key advances in integrated circuit technology, 2) large improvements in methodologies and tools for the design of highly complex digital circuits, and 3) progress in information theory, in particular, the belief propagation algorithm that allows error control codes to operate close to the Shannon limit. In this chapter, an overview of architectures for turbo and LDPC codes is presented. The standard implementation (i.e., low complexity) of those codes is presented first. Then architectures for high speed, low power, and high flexibility are derived. Finally, the chapter concludes with the presentation of exotic decoding architectures and a survey of relevant architectures.

13 citations

Cites background from "A flexible high throughput multi-AS..."

...multi-ASIP architecture is described in [116], where each core includes two instruction sets, one for LDPC and one for turbo codes....
[...]

Proceedings Article•DOI•

Low-latency software LDPC decoders for x86 multi-core devices

[...]

Bertrand Le Gal¹, Christophe Jego¹•Institutions (1)

University of Bordeaux¹

01 Oct 2017

TL;DR: A novel parallelization approach for LDPC decoding on multi-core processors is proposed; experiments on x86 multi-core devices show that it reduces processing latency to a few microseconds.

Abstract: LDPC codes are a family of error-correcting codes used in most modern digital communication standards, including the future 3GPP 5G standard. Thanks to their high processing power and their parallelization capabilities, prevailing multi-core and many-core devices facilitate real-time implementations of digital communication systems that were previously implemented on dedicated hardware targets. Through massive frame-decoding parallelization, current LDPC decoder throughputs range from hundreds of Mbps up to Gbps. However, inter-frame parallelization involves latency penalties, while in future 5G wireless communication systems the latency should be reduced as far as possible. To this end, a novel parallelization approach for LDPC decoding on a multi-core processor device is proposed in this article. It reduces the processing latency to a few microseconds, as shown by experiments on x86 multi-core devices.

12 citations

Cites background or methods from "A flexible high throughput multi-AS..."

...The elements VN[3,34] are loaded from memory and then the unsolicited values are masked....
[...]
...However, the multi-core processor data-path cannot be sized according to Z contrary to custom ASIP solutions [31], [32], [33], [34]....
[...]

Journal Article•DOI•

A Flexible NISC-Based LDPC Decoder

[...]

Bertrand Le Gal¹, Christophe Jego¹, Camille Leroux¹•Institutions (1)

University of Bordeaux¹

01 May 2014-IEEE Transactions on Signal Processing

TL;DR: This paper addresses the problem of designing generic and efficient LDPC decoders with a nonsymmetric NISC-based architecture that performs layered decoding; the generated decoders achieve higher hardware efficiency even for challenging-to-implement unstructured LDPC codes.

Abstract: Low-density parity-check (LDPC) codes are widely used for error correction in digital communication systems. Their inclusion in communication standards requires defining decoders that can efficiently support a set of codes with different code lengths, code rates, or code structures. In addition to this high flexibility, these decoders still have to achieve high throughputs in order to comply with the standards' requirements. In this paper, we propose to address the problem of designing generic and efficient LDPC decoders by using a nonsymmetric NISC-based architecture that performs layered decoding. NISC architectures provide flexibility with a limited loss in hardware efficiency. In addition, an automated design flow is used to efficiently assign computations to the processing units (PUs) and to map data to the memory units (MUs). Unlike previous works, the NISC decoder can include a number of PUs that differs from the number of MUs. This nonsymmetric characteristic provides a higher degree of freedom during the computation/data assignment phase of the design flow. The whole design framework automatically generates an LDPC decoder able to support any set of predetermined LDPC codes regardless of their parameters. The automated nature of the design framework makes it easy to explore the design space for a given set of codes. Compared to state-of-the-art LDPC decoders, the automatically generated decoders achieve higher hardware efficiency even for the challenging-to-implement unstructured LDPC codes.

12 citations

References

Proceedings Article•DOI•

FlexiChaP: A reconfigurable ASIP for convolutional, turbo, and LDPC code decoding

[...]

Matthias Alles¹, Timo Vogt¹, Norbert Wehn¹•Institutions (1)

Kaiserslautern University of Technology¹

24 Oct 2008

TL;DR: An application-specific instruction-set processor (ASIP) which supports convolutional codes, binary/duo-binary turbo codes, and LDPC codes is presented, outperforming existing ASIP solutions for LDPC decoding by an order of magnitude.

Abstract: Future mobile and wireless communication networks require flexible modem architectures to provide seamless services between different network standards. In this paper we focus on the outer modem which has to support various advanced channel coding techniques like convolutional codes, turbo codes, and low-density parity-check (LDPC) codes. We present an application-specific instruction-set processor (ASIP) which supports convolutional codes, binary/duo-binary turbo codes, and LDPC codes. Special emphasis is put on the support of LDPC codes. The ASIP consists of a special pipeline which is completely optimized for channel decoding. Logic synthesis yields an overall area of 0.62 mm² for this ASIP in a 65 nm low power technology. Payload throughputs of, e.g., up to 257 Mbps are possible at 400 MHz for the WiMAX and WiFi LDPC codes, outperforming existing ASIP solutions for LDPC decoding by an order of magnitude.

77 citations

"A flexible high throughput multi-AS..." refers background in this paper

...However, it occupies a large area of 0.9mm2 in 45nm technology ( 3.6mm2 in 90nm)....
[...]
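The parenthetical in the excerpt above (0.9 mm² at 45 nm, 3.6 mm² at 90 nm) is consistent with the common first-order rule that silicon area scales with the square of the feature size. The sketch below shows that normalization under the assumption of ideal quadratic scaling; real designs deviate from it because memories, I/O, and wiring do not shrink uniformly. `scale_area` is a hypothetical helper name, not something taken from the paper.

```python
def scale_area(area_mm2: float, from_nm: float, to_nm: float) -> float:
    """Normalize a reported area to another technology node.

    Assumes ideal quadratic scaling: area is proportional to the square
    of the feature size. This is a first-order approximation only.
    """
    if from_nm <= 0 or to_nm <= 0:
        raise ValueError("technology nodes must be positive")
    return area_mm2 * (to_nm / from_nm) ** 2


if __name__ == "__main__":
    # Figure quoted in the excerpt: 0.9 mm^2 at 45 nm, normalized to 90 nm.
    print(f"{scale_area(0.9, 45, 90):.2f} mm^2 at 90 nm")
```

Applying the same rule to the 0.62 mm² reported for FlexiChaP at 65 nm gives roughly 1.19 mm² at 90 nm, which puts it on the same footing as the 90 nm results quoted elsewhere on this page.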

Proceedings Article•DOI•

Binary de Bruijn on-chip network for a flexible multiprocessor LDPC decoder

[...]

Hazem Moussa¹, Amer Baghdadi¹, Michel Jezequel¹•Institutions (1)

École nationale supérieure des télécommunications de Bretagne¹

08 Jun 2008

TL;DR: A novel on-chip interconnection network based on the de Bruijn network is proposed for a flexible multiprocessor LDPC decoder; its characteristics allow it to efficiently support the communication-intensive nature of the application.

Abstract: This paper proposes a novel on-chip interconnection network adapted to a flexible multiprocessor LDPC decoder based on the de Bruijn network. The main characteristics of this network, including its logarithmic diameter, scalable aggregate bandwidth, and optimized routing technique, allow it to efficiently support the communication-intensive nature of the application. We present a detailed hardware implementation of the routers and the network interfaces as well as the packet format and the routing algorithm. The latter is a parallelized version of the shortest path with deflection routing algorithm. In order to evaluate the performance of the proposed network, a generic RTL VHDL description has been developed and synthesized with CMOS STMicroelectronics 0.18 µm technology. The flexibility and the scalability of this on-chip communication network enable it to be used for any kind of LDPC code.
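The logarithmic diameter mentioned in the abstract follows from the structure of a binary de Bruijn graph: with 2^n nodes labelled by n-bit addresses, each node links to the two nodes obtained by shifting its address left and appending a bit, so any destination is reachable in at most n hops. The sketch below illustrates shortest-path routing on the textbook unidirectional graph only. It does not model deflection, the parallelized router, or the packet format described in the paper, and the function names are hypothetical.

```python
from typing import List


def successors(node: int, n: int) -> List[int]:
    """Two out-neighbours of `node` in a binary de Bruijn graph with 2**n nodes."""
    mask = (1 << n) - 1
    shifted = (node << 1) & mask  # drop the MSB, make room for a new LSB
    return [shifted, shifted | 1]


def shortest_path(src: int, dst: int, n: int) -> List[int]:
    """Shortest directed path from src to dst (both n-bit addresses).

    Finds the longest suffix of src that is also a prefix of dst, then
    shifts in the remaining bits of dst one hop at a time.
    """
    mask = (1 << n) - 1
    overlap = 0
    for k in range(n, 0, -1):
        # low k bits of src versus high k bits of dst
        if (src & ((1 << k) - 1)) == (dst >> (n - k)):
            overlap = k
            break
    path = [src]
    node = src
    for i in range(n - overlap - 1, -1, -1):
        bit = (dst >> i) & 1
        node = ((node << 1) & mask) | bit
        path.append(node)
    return path


if __name__ == "__main__":
    n = 3  # 8-node network
    print(shortest_path(0b011, 0b110, n))  # addresses overlap on '11'
    print(shortest_path(0b000, 0b111, n))  # no overlap: n hops needed
```

With n = 3 the first call needs a single hop because the suffix `11` of the source is already a prefix of the destination, while the second needs the full three hops. A deflection-routing network would additionally send packets over a non-preferred link when the preferred one is busy, which this sketch leaves out.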

57 citations

Journal Article•DOI•

From Parallelism Levels to a Multi-ASIP Architecture for Turbo Decoding

[...]

Olivier Muller, Amer Baghdadi, Michel Jezequel

01 Jan 2009-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: In this article, the authors present a flexible multiprocessor platform for high throughput turbo decoding using configurable application-specific instruction set processors (ASIP) combined with an efficient memory and communication interconnect scheme.

Abstract: Emerging digital communication applications and the underlying architectures encounter drastically increasing performance and flexibility requirements. In this paper, we present a novel flexible multiprocessor platform for high throughput turbo decoding. The proposed platform enables exploiting all parallelism levels of turbo decoding applications to fulfill performance requirements. In order to fulfill flexibility requirements, the platform is structured around configurable application-specific instruction-set processors (ASIP) combined with an efficient memory and communication interconnect scheme. The designed ASIP has a single instruction multiple data (SIMD) architecture with a specialized and extensible instruction set and 6-stage pipeline control. The attached memories and communication interfaces enable its integration in multiprocessor architectures. These multiprocessor architectures benefit from the recent shuffled decoding technique introduced in the turbo-decoding field to achieve higher throughput. The major characteristics of the proposed platform are its flexibility and scalability, which make it reusable for all simple and double binary turbo codes of existing and emerging standards. Results obtained for double binary WiMAX turbo codes demonstrate around 250 Mb/s throughput using a 16-ASIP multiprocessor architecture.

57 citations

From Parallelism Levels to a Multi-ASIP Architecture

[...]

Olivier Muller, Amer Baghdadi, Michel Jezequel

01 Jan 2008

TL;DR: A novel flexible multiprocessor platform for high throughput turbo decoding that enables exploiting all parallelism levels of turbo decoding applications to fulfill performance requirements and is reusable for all simple and double binary turbo codes of existing and emerging standards.

Abstract: Emerging digital communication applications and the underlying architectures encounter drastically increasing performance and flexibility requirements. In this paper, we present a novel flexible multiprocessor platform for high throughput turbo decoding. The proposed platform enables exploiting all parallelism levels of turbo decoding applications to fulfill performance requirements. In order to fulfill flexibility requirements, the platform is structured around configurable application-specific instruction-set processors (ASIP) combined with an efficient memory and communication interconnect scheme. The designed ASIP has a single instruction multiple data (SIMD) architecture with a specialized and extensible instruction set and 6-stage pipeline control. The attached memories and communication interfaces enable its integration in multiprocessor architectures. These multiprocessor architectures benefit from the recent shuffled decoding technique introduced in the turbo-decoding field to achieve higher throughput. The major characteristics of the proposed platform are its flexibility and scalability, which make it reusable for all simple and double binary turbo codes of existing and emerging standards. Results obtained for double binary WiMAX turbo codes demonstrate around 250 Mb/s throughput using a 16-ASIP multiprocessor architecture.

55 citations

Journal Article•DOI•

A Flexible LDPC/Turbo Decoder Architecture

[...]

Yang Sun¹, Joseph R. Cavallaro¹•Institutions (1)

Rice University¹

01 Jul 2011

TL;DR: A unified message passing algorithm for LDPC and Turbo codes is proposed, together with a flexible soft-input soft-output (SISO) module and an area-efficient flexible SISO decoder architecture that support both LDPC and Turbo decoding.

Abstract: Low-density parity-check (LDPC) codes and convolutional Turbo codes are two of the most powerful error correcting codes that are widely used in modern communication systems. In a multi-mode baseband receiver, both LDPC and Turbo decoders may be required. However, the different decoding approaches for LDPC and Turbo codes usually lead to different hardware architectures. In this paper we propose a unified message passing algorithm for LDPC and Turbo codes and introduce a flexible soft-input soft-output (SISO) module to handle LDPC/Turbo decoding. We employ the trellis-based maximum a posteriori (MAP) algorithm as a bridge between LDPC and Turbo codes decoding. We view the LDPC code as a concatenation of n super-codes where each super-code has a simpler trellis structure so that the MAP algorithm can be easily applied to it. We propose a flexible functional unit (FFU) for MAP processing of LDPC and Turbo codes with a low hardware overhead (about 15% area and timing overhead). Based on the FFU, we propose an area-efficient flexible SISO decoder architecture to support LDPC/Turbo codes decoding. Multiple such SISO modules can be embedded into a parallel decoder for higher decoding throughput. As a case study, a flexible LDPC/Turbo decoder has been synthesized on a TSMC 90 nm CMOS technology with a core area of 3.2 mm². The decoder can support IEEE 802.16e LDPC codes, IEEE 802.11n LDPC codes, and 3GPP LTE Turbo codes. Running at 500 MHz clock frequency, the decoder can sustain up to 600 Mbps LDPC decoding or 450 Mbps Turbo decoding.

37 citations

"A flexible high throughput multi-AS..." refers methods in this paper

...A high throughput of 257Mbps is achieved for LDPC mode while a limited throughput of 37.2Mbps in DBTC and 18.6Mbps in SBTC modes are achieved at 400MHz....
[...]