Author

Guohui Wang

Other affiliations: University of Tübingen, Qualcomm

Bio: Guohui Wang is an academic researcher from Rice University. The author has contributed to research in the topics Throughput (business) and Low-density parity-check code. The author has an h-index of 17 and has co-authored 39 publications receiving 1124 citations. Previous affiliations of Guohui Wang include University of Tübingen and Qualcomm.

Papers

Journal Article•DOI•

Large-Scale MIMO Detection for 3GPP LTE: Algorithms and FPGA Implementations

Michael Wu¹, Bei Yin¹, Guohui Wang¹, Christopher H. Dick², Joseph R. Cavallaro¹, Christoph Studer³

Rice University¹, Xilinx², Cornell University³

21 Mar 2014-IEEE Journal of Selected Topics in Signal Processing

TL;DR: This work proposes a new approximate matrix inversion algorithm relying on a Neumann series expansion, which substantially reduces the complexity of linear data detection in single-carrier frequency-division multiple access (SC-FDMA)-based large-scale MIMO systems.

Abstract: Large-scale (or massive) multiple-input multiple-output (MIMO) is expected to be one of the key technologies in next-generation multi-user cellular systems based on the upcoming 3GPP LTE Release 12 standard, for example. In this work, we propose, to the best of our knowledge, the first VLSI design enabling high-throughput data detection in single-carrier frequency-division multiple access (SC-FDMA)-based large-scale MIMO systems. We propose a new approximate matrix inversion algorithm relying on a Neumann series expansion, which substantially reduces the complexity of linear data detection. We analyze the associated error, and we compare its performance and complexity to those of an exact linear detector. We present corresponding VLSI architectures, which perform exact and approximate soft-output detection for large-scale MIMO systems with various antenna/user configurations. Reference implementation results for a Xilinx Virtex-7 XC7VX980T FPGA show that our designs are able to achieve more than 600 Mb/s for a 128-antenna, 8-user 3GPP LTE-based large-scale MIMO system. We finally provide a performance/complexity trade-off comparison using the presented FPGA designs, which reveals that the detector circuit of choice is determined by the ratio between BS antennas and users, as well as the desired error-rate performance.

363 citations
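
The approximate matrix inversion named in the abstract above can be illustrated with a generic truncated Neumann series. The following is a minimal sketch, not the paper's VLSI design: it assumes the common splitting A = D + E (D the diagonal of A, E the off-diagonal part) and sums the first few terms of A⁻¹ = Σₙ (−D⁻¹E)ⁿ D⁻¹, which converges when the spectral radius of D⁻¹E is below 1 (for example, when A is strongly diagonally dominant). The function names `matmul` and `neumann_inverse` and the 2×2 test matrix are hypothetical, chosen only for illustration.

```python
def matmul(X, Y):
    """Multiply two square matrices stored as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def neumann_inverse(A, terms):
    """Approximate the inverse of A with `terms` Neumann-series terms.

    Assumes A = D + E with D = diag(A); this splitting is an illustrative
    assumption, not taken from the listing above.
    """
    n = len(A)
    d_inv = [1.0 / A[i][i] for i in range(n)]
    # M = -D^{-1} E: scale row i by -1/A[i][i] and zero the diagonal.
    M = [[0.0 if i == j else -d_inv[i] * A[i][j] for j in range(n)]
         for i in range(n)]
    # The n = 0 term of the series is D^{-1}.
    term = [[d_inv[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    approx = [row[:] for row in term]
    for _ in range(terms - 1):
        term = matmul(M, term)  # next term: (-D^{-1} E)^n D^{-1}
        approx = [[approx[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return approx


if __name__ == "__main__":
    A = [[4.0, 1.0], [1.0, 3.0]]  # small diagonally dominant example
    for k in (1, 2, 4, 8):
        P = matmul(neumann_inverse(A, k), A)
        err = max(abs(P[i][j] - (1.0 if i == j else 0.0))
                  for i in range(2) for j in range(2))
        print(k, err)  # deviation of approx * A from the identity
```

Keeping only a few terms trades accuracy for fewer operations, which is the kind of performance/complexity trade-off the abstract describes; how many terms are acceptable depends on how dominant the diagonal of A is.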

Proceedings Article•DOI•

Accelerating computer vision algorithms using OpenCL framework on the mobile GPU - A case study

Guohui Wang¹, Yingen Xiong², Jay Yun², Joseph R. Cavallaro¹

Rice University¹, Qualcomm²

26 May 2013

TL;DR: This work proposes to accelerate an exemplar-based inpainting algorithm for object removal on a mobile GPU using OpenCL and is, to the best of the authors' knowledge, the first published implementation of general-purpose computing using OpenCL on mobile GPUs.

Abstract: Recently, general-purpose computing on graphics processing units (GPGPU) has been enabled on mobile devices thanks to the emerging heterogeneous programming models such as OpenCL. The capability of GPGPU on mobile devices opens a new era for mobile computing and can enable many computationally demanding computer vision algorithms on mobile devices. As a case study, this paper proposes to accelerate an exemplar-based inpainting algorithm for object removal on a mobile GPU using OpenCL. We discuss the methodology of exploring the parallelism in the algorithm as well as several optimization techniques. Experimental results demonstrate that our optimization strategies for mobile GPUs have significantly reduced the processing time and make computationally intensive computer vision algorithms feasible for a mobile device. To the best of the authors' knowledge, this work is the first published implementation of general-purpose computing using OpenCL on mobile GPUs.

78 citations

Proceedings Article•DOI•

A fast and efficient SIFT detector using the mobile GPU

Blaine Rister¹, Guohui Wang¹, Michael Wu¹, Joseph R. Cavallaro¹

Rice University¹

26 May 2013

TL;DR: This work presents an implementation of the popular Scale-Invariant Feature Transform (SIFT) feature detection algorithm that incorporates the powerful graphics processing unit (GPU) in mobile devices and proposes a heterogeneous dataflow scheme to achieve near-realtime detection.

Abstract: Emerging mobile applications, such as augmented reality, demand robust feature detection at high frame rates. We present an implementation of the popular Scale-Invariant Feature Transform (SIFT) feature detection algorithm that incorporates the powerful graphics processing unit (GPU) in mobile devices. Where the usual GPU methods are inefficient on mobile hardware, we propose a heterogeneous dataflow scheme. By methodically partitioning the computation, compressing the data for memory transfers, and taking into account the unique challenges that arise out of the mobile GPU, we are able to achieve a speedup of 4-7x over an optimized CPU version, and a 6.4x speedup over a published GPU implementation. Additionally, we reduce energy consumption by 87 percent per image. We achieve near-realtime detection without compromising the original algorithm.

73 citations

Proceedings Article•DOI•

High throughput low latency LDPC decoding on GPU for SDR systems

Guohui Wang¹, Michael Wu¹, Bei Yin¹, Joseph R. Cavallaro¹

Rice University¹

01 Dec 2013

TL;DR: This paper presents optimization techniques for a parallel LDPC decoder including algorithm optimization, fully coalesced memory access, asynchronous data transfer and multi-stream concurrent kernel execution for modern GPU architectures.

Abstract: In this paper, we present a high throughput and low latency LDPC (low-density parity-check) decoder implementation on GPUs (graphics processing units). The existing GPU-based LDPC decoder implementations suffer from low throughput and long latency, which prevent them from being used in practical SDR (software-defined radio) systems. To overcome this problem, we present optimization techniques for a parallel LDPC decoder including algorithm optimization, fully coalesced memory access, asynchronous data transfer and multi-stream concurrent kernel execution for modern GPU architectures. Experimental results demonstrate that the proposed LDPC decoder achieves 316 Mbps (at 10 iterations) peak throughput on a single GPU. The decoding latency, which is much lower than that of the state of the art, varies from 0.207 ms to 1.266 ms for different throughput requirements from 62.5 Mbps to 304.16 Mbps. When using four GPUs concurrently, we achieve an aggregate peak throughput of 1.25 Gbps (at 10 iterations).

68 citations

Proceedings Article•DOI•

A 3.8Gb/s large-scale MIMO detector for 3GPP LTE-Advanced

Bei Yin¹, Michael Wu¹, Guohui Wang¹, Christopher H. Dick², Joseph R. Cavallaro¹, Christoph Studer³

Rice University¹, Xilinx², Cornell University³

04 May 2014

TL;DR: This paper proposes what is, to the best of the authors' knowledge, the first ASIC design for high-throughput data detection in single-carrier frequency-division multiple access (SC-FDMA)-based large-scale MIMO systems, such as systems building on future 3GPP LTE-Advanced standards.

Abstract: This paper proposes, to the best of our knowledge, the first ASIC design for high-throughput data detection in single-carrier frequency-division multiple access (SC-FDMA)-based large-scale MIMO systems, such as systems building on future 3GPP LTE-Advanced standards. In order to substantially reduce the complexity of linear soft-output data detection in systems having hundreds of antennas at the base station (BS), the proposed detector builds upon a truncated Neumann series expansion to compute the necessary matrix inverse at low complexity. To achieve high throughput in the 3GPP LTE-A uplink, we develop a systolic VLSI architecture including all necessary processing blocks. We present a corresponding ASIC design that achieves 3.8 Gb/s for a 128-antenna, 8-user 3GPP LTE-A based large-scale MIMO system, while occupying 11.1 mm² in a TSMC 45 nm CMOS technology.

60 citations

Cited by

Book•

Massive MIMO Networks: Spectral, Energy, and Hardware Efficiency

Emil Björnson¹, Jakob Hoydis², Luca Sanguinetti³

Linköping University¹, Bell Labs², University of Pisa³

03 Jan 2018

TL;DR: This monograph summarizes many years of research insights in a clear and self-contained way and provides the reader with the necessary knowledge and mathematical tools to carry out independent research in this area.

Abstract: Massive multiple-input multiple-output (MIMO) is one of the most promising technologies for the next generation of wireless communication networks because it has the potential to provide game-changing improvements in spectral efficiency (SE) and energy efficiency (EE). This monograph summarizes many years of research insights in a clear and self-contained way and provides the reader with the necessary knowledge and mathematical tools to carry out independent research in this area. Starting from a rigorous definition of Massive MIMO, the monograph covers the important aspects of channel estimation, SE, EE, hardware efficiency (HE), and various practical deployment considerations. From the beginning, a very general, yet tractable, canonical system model with spatial channel correlation is introduced. This model is used to realistically assess the SE and EE, and is later extended to also include the impact of hardware impairments. Owing to this rigorous modeling approach, a lot of classic "wisdom" about Massive MIMO, based on too simplistic system models, is shown to be questionable.

1,352 citations

Journal Article•DOI•

Fifty Years of MIMO Detection: The Road to Large-Scale MIMOs

Shaoshi Yang¹, Lajos Hanzo¹

University of Southampton¹

07 Sep 2015-IEEE Communications Surveys and Tutorials

TL;DR: In this article, the authors provide a recital on the historic heritages and novel challenges facing massive/large-scale multiple-input multiple-output (LS-MIMO) systems from a detection perspective.

Abstract: The emerging massive/large-scale multiple-input multiple-output (LS-MIMO) systems that rely on very large antenna arrays have become a hot topic of wireless communications. Compared to multi-antenna aided systems being built at the time of this writing, such as the long-term evolution (LTE) based fourth generation (4G) mobile communication system which allows for up to eight antenna elements at the base station (BS), the LS-MIMO system entails an unprecedented number of antennas, say 100 or more, at the BS. The huge leap in the number of BS antennas opens the door to a new research field in communication theory, propagation and electronics, where random matrix theory begins to play a dominant role. Interestingly, LS-MIMOs also constitute a perfect example of one of the key philosophical principles of the Hegelian Dialectics, namely, that “quantitative change leads to qualitative change.” In this treatise, we provide a recital on the historic heritages and novel challenges facing LS-MIMOs from a detection perspective. First, we highlight the fundamentals of MIMO detection, including the nature of co-channel interference (CCI), the generality of the MIMO detection problem, the received signal models of both linear memoryless MIMO channels and dispersive MIMO channels exhibiting memory, as well as the complex-valued versus real-valued MIMO system models. Then, an extensive review of the representative MIMO detection methods conceived during the past 50 years (1965–2015) is presented, and relevant insights as well as lessons are inferred for the sake of designing complexity-scalable MIMO detection algorithms that are potentially applicable to LS-MIMO systems. Furthermore, we divide the LS-MIMO systems into two types, and elaborate on the distinct detection strategies suitable for each of them. 
The type-I LS-MIMO corresponds to the case where the number of active users is much smaller than the number of BS antennas, which is currently the mainstream definition of LS-MIMO. The type-II LS-MIMO corresponds to the case where the number of active users is comparable to the number of BS antennas. Finally, we discuss the applicability of existing MIMO detection algorithms in LS-MIMO systems, and review some of the recent advances in LS-MIMO detection.

626 citations

Journal Article•DOI•

Benefits and Impact of Cloud Computing on 5G Signal Processing: Flexible centralization through cloud-RAN

Dirk Wubben¹, Peter Rost, Jens Bartelt², Massinissa Lalam, Valentin Savin, Matteo Gorgoglione, Armin Dekorsy¹, Gerhard Fettweis²

University of Bremen¹, Dresden University of Technology²

14 Oct 2014-IEEE Signal Processing Magazine

TL;DR: The benefits that cloud computing offers for fifth-generation (5G) mobile networks are explored and the implications on the signal processing algorithms are investigated.

Abstract: Cloud computing draws significant attention in the information technology (IT) community as it provides ubiquitous on-demand access to a shared pool of configurable computing resources with minimum management effort. It gains also more impact on the communication technology (CT) community and is currently discussed as an enabler for flexible, cost-efficient and more powerful mobile network implementations. Although centralized baseband pools are already investigated for the radio access network (RAN) to allow for efficient resource usage and advanced multicell algorithms, these technologies still require dedicated hardware and do not offer the same characteristics as cloud-computing platforms, i.e., on-demand provisioning, virtualization, resource pooling, elasticity, service metering, and multitenancy. However, these properties of cloud computing are key enablers for future mobile communication systems characterized by an ultradense deployment of radio access points (RAPs) leading to severe multicell interference in combination with a significant increase of the number of access nodes and huge fluctuations of the rate requirements over time. In this article, we will explore the benefits that cloud computing offers for fifth-generation (5G) mobile networks and investigate the implications on the signal processing algorithms.

272 citations

Journal Article•DOI•

Massive MIMO Detection Techniques: A Survey

Mahmoud A. M. Albreem, Markku Juntti¹, Shahriar Shahabuddin¹

University of Oulu¹

16 Aug 2019-IEEE Communications Surveys and Tutorials

TL;DR: This paper discusses optimal and near-optimal detection principles specifically designed for the massive MIMO system such as detectors based on a local search, belief propagation and box detection, and presents recent advances of detection algorithms which are mostly based on machine learning or sparsity based algorithms.

Abstract: Massive multiple-input multiple-output (MIMO) is a key technology to meet the user demands in performance and quality of services (QoS) for next generation communication systems. Due to a large number of antennas and radio frequency (RF) chains, complexity of the symbol detectors increased rapidly in a massive MIMO uplink receiver. Thus, the research to find the perfect massive MIMO detection algorithm with optimal performance and low complexity has gained a lot of attention during the past decade. A plethora of massive MIMO detection algorithms has been proposed in the literature. The aim of this paper is to provide insights on such algorithms to a generalist of wireless communications. We garner the massive MIMO detection algorithms and classify them so that a reader can find a distinction between different algorithms from a wider range of solutions. We present optimal and near-optimal detection principles specifically designed for the massive MIMO system such as detectors based on a local search, belief propagation and box detection. In addition, we cover detectors based on approximate inversion, which has gained popularity among the VLSI signal processing community due to their deterministic dataflow and low complexity. We also briefly explore several nonlinear small-scale MIMO (2-4 antenna receivers) detectors and their applicability in the massive MIMO context. In addition, we present recent advances of detection algorithms which are mostly based on machine learning or sparsity based algorithms. In each section, we also mention the related implementations of the detectors. A discussion of the pros and cons of each detector is provided.

262 citations
