Proceedings Article

A flexible high throughput multi-ASIP architecture for LDPC and turbo decoding

TL;DR: A multi-core architecture is presented which supports convolutional codes, binary/duo-binary turbo codes, and LDPC codes; it is based on Application Specific Instruction-set Processors (ASIP) and avoids the use of dedicated interleave/deinterleave address lookup memories.
Abstract: In order to address the large variety of channel coding options specified in existing and future digital communication standards, there is an increasing need for flexible solutions. This paper presents a multi-core architecture which supports convolutional codes, binary/duo-binary turbo codes, and LDPC codes. The proposed architecture is based on Application Specific Instruction-set Processors (ASIP) and avoids the use of dedicated interleave/deinterleave address lookup memories. Each ASIP consists of two datapaths, one optimized for turbo mode and the other for LDPC mode, which efficiently share memories and communication resources. Logic synthesis yields an overall area of 2.6 mm2 in a 90 nm technology. Payload throughputs of up to 312 Mbps in LDPC mode and 173 Mbps in turbo mode are possible at 520 MHz, faring better than existing solutions.
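The headline figures in the abstract can be normalized to compare the two decoding modes. The following is a minimal sketch and not part of the paper: the function names and the choice of metrics (payload bits per clock cycle, Mbps per mm2) are assumptions made for illustration. Only the values 312 Mbps, 173 Mbps, 520 MHz and 2.6 mm2 come from the abstract.

```python
def bits_per_cycle(throughput_mbps: float, clock_mhz: float) -> float:
    """Average payload bits delivered per clock cycle (Mbps / MHz)."""
    return throughput_mbps / clock_mhz


def area_efficiency(throughput_mbps: float, area_mm2: float) -> float:
    """Payload throughput per unit silicon area, in Mbps per mm^2."""
    return throughput_mbps / area_mm2


# Figures quoted in the abstract; the metric definitions above are illustrative.
CLOCK_MHZ = 520.0
AREA_MM2 = 2.6
PEAK_THROUGHPUT_MBPS = {"LDPC": 312.0, "Turbo": 173.0}

for mode, mbps in PEAK_THROUGHPUT_MBPS.items():
    print(
        f"{mode}: {bits_per_cycle(mbps, CLOCK_MHZ):.3f} bit/cycle, "
        f"{area_efficiency(mbps, AREA_MM2):.1f} Mbps/mm^2"
    )
```

Because both modes share the same area and clock, the ratio between them is the same under either metric. The normalization becomes more useful when comparing against designs reported at other clock frequencies or areas.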
Citations
Journal Article
TL;DR: This work concentrates on the design of a reconfigurable architecture for both turbo and LDPC codes decoding, tackling the reconfiguration issue and introducing a formal and systematic treatment that was not previously addressed.
Abstract: Flexible and reconfigurable architectures have gained wide popularity in the communications field. In particular, reconfigurable architectures for the physical layer are an attractive solution not only to switch among different coding modes but also to achieve interoperability. This work concentrates on the design of a reconfigurable architecture for both turbo and LDPC codes decoding. The novel contributions of this paper are: i) tackling the reconfiguration issue by introducing a formal and systematic treatment that, to the best of our knowledge, was not previously addressed and ii) proposing a reconfigurable NoC-based turbo/LDPC decoder architecture and showing that wide flexibility can be achieved with a small complexity overhead. Obtained results show that dynamic switching between most of the considered communication standards is possible without pausing the decoding activity. Moreover, post-layout results show that tailoring the proposed architecture to the WiMAX standard leads to an area occupation of 2.75 mm2 and a power consumption of 101.5 mW in the worst case.

57 citations


Cites background or methods from "A flexible high throughput multi-AS..."

  • ...[11] propose an ASIP decoder architecture supporting WiMAX and WiFi LDPC codes, and WiMAX, 3GPP-LTE and DVB-RCS turbo codes....

    [...]

  • ...On the contrary, worst case throughput in [11] is not high enough for WiMAX....

    [...]

  • ...In [9]–[11], flexibility is achieved through the design of processing elements (PEs) based on application-specific-instruction-set-processor (ASIP) architectures, whereas in [12]–[14] PEs rely on application-specific-integrated-circuit (ASIC) solutions....

    [...]

Proceedings Article
12 Mar 2012
TL;DR: This contribution focuses on one of the most important baseband processing units in wireless receivers, the forward error correction unit, and proposes a Network-on-Chip (NoC) based approach to the design of multi-standard decoders.
Abstract: The current convergence process in wireless technologies demands strong efforts in conceiving highly flexible and interoperable equipment. This contribution focuses on one of the most important baseband processing units in wireless receivers, the forward error correction unit, and proposes a Network-on-Chip (NoC) based approach to the design of multi-standard decoders. High level modeling is exploited to drive the NoC optimization for a given set of both turbo and Low-Density-Parity-Check (LDPC) codes to be supported. Moreover, synthesis results prove that the proposed approach can offer a fully compliant WiMAX decoder, supporting the whole set of turbo and LDPC codes with higher throughput and an occupied area comparable to or lower than previously reported flexible implementations. In particular, the mentioned design case achieves a worst-case throughput higher than 70 Mb/s at the area cost of 3.17 mm2 on a 90 nm CMOS technology.

20 citations


Cites background from "A flexible high throughput multi-AS..."

  • ...Few recent works [7], [9], [15] tried to exploit the intra-IP NoC approach to design flexible turbo/LDPC decoder architectures....

    [...]

  • ...Comparison with [9] shows similar core area occupation, whereas our NoC contributes for 0....

    [...]

  • ...High throughput PEs able to support both turbo and LDPC decoding [5]–[9] can be im-...

    [...]

  • ...It is worth noting that, in the turbo decoding mode the proposed architecture achieves the lowest power consumption as compared with [5]–[9]....

    [...]

  • ...5 code, are still above 70 Mb/s: in [9], according to the provided formula, for the same code throughput is below the standard threshold....

    [...]

Book Chapter
01 Jun 2014
TL;DR: This chapter presents an overview of architectures for turbo and LDPC codes: the standard implementation of those codes is presented first, and architectures for high speed, low power, and high flexibility are then derived.
Abstract: The transition from analog telecommunication equipment and terminals to digital systems and, more recently, the fast development of wireless communications were made possible by three factors: 1) key advances in integrated circuit technology, 2) large improvements in methodologies and tools for the design of highly complex digital circuits, and 3) progress in information theory, in particular, the belief propagation algorithm that allows error control codes operating close to the Shannon limit. In this chapter, an overview of architectures for turbo and LDPC codes is presented. The standard implementation (i.e., low complexity) of those codes is first presented. Then architectures for high speed, low power, and high flexibility are derived. Finally, the chapter concludes with the presentation of exotic decoding architectures and a survey of relevant architectures.

13 citations


Cites background from "A flexible high throughput multi-AS..."

  • ...multi-ASIP architecture is described in [116], where each core includes two instruction sets, one for LDPC and one for turbo codes....

    [...]

Proceedings Article
01 Oct 2017
TL;DR: A novel parallelization approach for LDPC decoding on a multi-core processor device is proposed, which reduces the processing latency down to some microseconds, as highlighted by x86 multi-core experiments.
Abstract: LDPC codes are a family of error correcting codes used in most modern digital communication standards, including the future 3GPP 5G standard. Thanks to their high processing power and their parallelization capabilities, prevailing multi-core and many-core devices facilitate real-time implementations of digital communication systems, which were previously implemented on dedicated hardware targets. Through massive frame decoding parallelization, current LDPC decoder throughputs range from hundreds of Mbps up to Gbps. However, inter-frame parallelization involves latency penalties, while in future 5G wireless communication systems, the latency should be reduced as far as possible. To this end, a novel parallelization approach for LDPC decoding on a multi-core processor device is proposed in this article. It reduces the processing latency down to some microseconds, as highlighted by x86 multi-core experiments.

12 citations


Cites background or methods from "A flexible high throughput multi-AS..."

  • ...The elements VN[3,34] are loaded from memory and then the unsolicited values are masked....

    [...]

  • ...However, the multi-core processor data-path cannot be sized according to Z contrary to custom ASIP solutions [31], [32], [33], [34]....

    [...]

Journal Article
TL;DR: This paper addresses the problem of designing generic and efficient LDPC decoders by using a nonsymmetric NISC-based architecture that performs layered decoding and achieves higher hardware efficiency, even for the challenging-to-implement unstructured LDPC codes.
Abstract: Low density parity-check (LDPC) codes are widely used for error correction in digital communication systems. Their inclusion in communication standards requires defining decoders able to efficiently support a set of codes with different code lengths, code rates or code structures. In addition to this high flexibility, these decoders still have to achieve high throughputs in order to comply with the requirements of the standards. In this paper, we propose to address the problem of designing generic and efficient LDPC decoders by using a nonsymmetric NISC-based architecture that performs layered decoding. NISC architectures provide flexibility with a limited loss in hardware efficiency. In addition, an automated design flow is used to efficiently assign computations to the processing units (PU) and to map data to the memory units (MU). Unlike previous works, the NISC decoder can include a number of PUs that differs from the number of MUs. This nonsymmetric characteristic provides a higher degree of freedom during the computation/data assignment phase of the design flow. This whole design framework automatically generates an LDPC decoder able to support any set of predetermined LDPC codes regardless of their parameters. The automated nature of the design framework makes it easy to explore the design space for a given set of codes. Compared to state of the art LDPC decoders, the automatically generated decoders achieve higher hardware efficiency even for the challenging-to-implement unstructured LDPC codes.

12 citations

References
Proceedings Article
24 Oct 2008
TL;DR: An application-specific instruction-set processor (ASIP) which supports convolutional codes, binary/duo-binary turbo codes, and LDPC codes is presented, outperforming existing ASIP solutions for LDPC decoding by an order of magnitude.
Abstract: Future mobile and wireless communication networks require flexible modem architectures to provide seamless services between different network standards. In this paper we focus on the outer modem which has to support various advanced channel coding techniques like convolutional codes, turbo codes, and low-density parity-check (LDPC) codes. We present an application-specific instruction-set processor (ASIP) which supports convolutional codes, binary/duo-binary turbo codes, and LDPC codes. Special emphasis is put on the support of LDPC codes. The ASIP consists of a special pipeline which is completely optimized for channel decoding. Logic synthesis yields an overall area of 0.62 mm2 for this ASIP in a 65 nm low power technology. Payload throughputs of, e.g., up to 257 Mbps are possible at 400 MHz for the WiMAX and WiFi LDPC codes, outperforming existing ASIP solutions for LDPC decoding by an order of magnitude.

77 citations


"A flexible high throughput multi-AS..." refers background in this paper

  • ...However, it occupies a large area of 0.9mm2 in 45nm technology ( 3.6mm2 in 90nm)....

    [...]
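The 45 nm to 90 nm conversion quoted in this excerpt is consistent with the common first-order assumption that silicon area scales with the square of the feature size. A minimal sketch under that assumption follows. The helper name `scale_area` is hypothetical, and real designs rarely scale this ideally.

```python
def scale_area(area_mm2: float, from_nm: float, to_nm: float) -> float:
    """Rescale a silicon area between technology nodes.

    Assumes ideal quadratic scaling with feature size, which is only a
    first-order approximation (memories, I/O and wiring scale differently).
    """
    return area_mm2 * (to_nm / from_nm) ** 2


# Figure from the excerpt: 0.9 mm^2 at 45 nm, normalized to 90 nm.
area_90nm = scale_area(0.9, from_nm=45, to_nm=90)
print(f"{area_90nm:.1f} mm^2 at 90 nm")
```

The same helper could be applied to the 0.62 mm2 reported at 65 nm in the abstract above to obtain a rough 90 nm equivalent, with the same caveat about ideal scaling.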

Proceedings Article
08 Jun 2008
TL;DR: A novel on-chip interconnection network adapted to a flexible multiprocessor LDPC decoder based on the de Bruijn network that allows it to efficiently support the communication intensive nature of the application.
Abstract: This paper proposes a novel on-chip interconnection network adapted to a flexible multiprocessor LDPC decoder based on the de Bruijn network. The main characteristics of this network, including its logarithmic diameter, scalable aggregate bandwidth, and optimized routing technique, allow it to efficiently support the communication intensive nature of the application. We present a detailed hardware implementation of the routers and the network interfaces as well as the packet format and the routing algorithm. The latter is a parallelized version of the shortest path with deflection routing algorithm. In order to evaluate the performance of the proposed network, a generic RTL VHDL description has been developed and synthesized with CMOS STMicroelectronics 0.18 µm technology. The flexibility and the scalability of this on-chip communication network enable it to be used for any kind of LDPC code.

57 citations

Journal Article
TL;DR: In this article, the authors present a flexible multiprocessor platform for high throughput turbo decoding using configurable application-specific instruction set processors (ASIP) combined with an efficient memory and communication interconnect scheme.
Abstract: Emerging digital communication applications and the underlying architectures encounter drastically increasing performance and flexibility requirements. In this paper, we present a novel flexible multiprocessor platform for high throughput turbo decoding. The proposed platform enables exploiting all parallelism levels of turbo decoding applications to fulfill performance requirements. In order to fulfill flexibility requirements, the platform is structured around configurable application-specific instruction-set processors (ASIP) combined with an efficient memory and communication interconnect scheme. The designed ASIP has a single instruction multiple data (SIMD) architecture with a specialized and extensible instruction-set and 6-stage pipeline control. The attached memories and communication interfaces enable its integration in multiprocessor architectures. These multiprocessor architectures benefit from the recent shuffled decoding technique introduced in the turbo-decoding field to achieve higher throughput. The major characteristics of the proposed platform are its flexibility and scalability which make it reusable for all simple and double binary turbo codes of existing and emerging standards. Results obtained for double binary WiMAX turbo codes demonstrate around 250 Mb/s throughput using 16-ASIP multiprocessor architecture.

57 citations

01 Jan 2008
TL;DR: A novel flexible multiprocessor platform for high throughput turbo decoding that enables exploiting all parallelism levels of turbo decoding applications to fulfill performance requirements and is reusable for all simple and double binary turbo codes of existing and emerging standards.
Abstract: Emerging digital communication applications and the underlying architectures encounter drastically increasing performance and flexibility requirements. In this paper, we present a novel flexible multiprocessor platform for high throughput turbo decoding. The proposed platform enables exploiting all parallelism levels of turbo decoding applications to fulfill performance requirements. In order to fulfill flexibility requirements, the platform is structured around configurable application-specific instruction-set processors (ASIP) combined with an efficient memory and communication interconnect scheme. The designed ASIP has a single instruction multiple data (SIMD) architecture with a specialized and extensible instruction-set and 6-stage pipeline control. The attached memories and communication interfaces enable its integration in multiprocessor architectures. These multiprocessor architectures benefit from the recent shuffled decoding technique introduced in the turbo-decoding field to achieve higher throughput. The major characteristics of the proposed platform are its flexibility and scalability which make it reusable for all simple and double binary turbo codes of existing and emerging standards. Results obtained for double binary WiMAX turbo codes demonstrate around 250 Mb/s throughput using 16-ASIP multiprocessor architecture.

55 citations

Journal Article
01 Jul 2011
TL;DR: A unified message passing algorithm for LDPC and Turbo codes is proposed, together with a flexible soft-input soft-output (SISO) module and an area-efficient flexible SISO decoder architecture that support LDPC/Turbo decoding.
Abstract: Low-density parity-check (LDPC) codes and convolutional Turbo codes are two of the most powerful error correcting codes that are widely used in modern communication systems. In a multi-mode baseband receiver, both LDPC and Turbo decoders may be required. However, the different decoding approaches for LDPC and Turbo codes usually lead to different hardware architectures. In this paper we propose a unified message passing algorithm for LDPC and Turbo codes and introduce a flexible soft-input soft-output (SISO) module to handle LDPC/Turbo decoding. We employ the trellis-based maximum a posteriori (MAP) algorithm as a bridge between LDPC and Turbo codes decoding. We view the LDPC code as a concatenation of n super-codes where each super-code has a simpler trellis structure so that the MAP algorithm can be easily applied to it. We propose a flexible functional unit (FFU) for MAP processing of LDPC and Turbo codes with a low hardware overhead (about 15% area and timing overhead). Based on the FFU, we propose an area-efficient flexible SISO decoder architecture to support LDPC/Turbo codes decoding. Multiple such SISO modules can be embedded into a parallel decoder for higher decoding throughput. As a case study, a flexible LDPC/Turbo decoder has been synthesized on a TSMC 90 nm CMOS technology with a core area of 3.2 mm2. The decoder can support IEEE 802.16e LDPC codes, IEEE 802.11n LDPC codes, and 3GPP LTE Turbo codes. Running at 500 MHz clock frequency, the decoder can sustain up to 600 Mbps LDPC decoding or 450 Mbps Turbo decoding.

37 citations


"A flexible high throughput multi-AS..." refers methods in this paper

  • ...A high throughput of 257Mbps is achieved for LDPC mode while a limited throughput of 37.2Mbps in DBTC and 18.6Mbps in SBTC modes are achieved at 400MHz....

    [...]