
Showing papers on "Codebook" published in 2022


Journal ArticleDOI
TL;DR: A new image compression scheme, the GenPSOWVQ method, uses a recurrent neural network with wavelet VQ and attains precise compression while maintaining image accuracy at lower computational cost when encoding clinical images.
Abstract: Medical diagnosis is a time-sensitive step toward proper medical treatment. Automation systems have been developed to address these issues. In the automation process, images are processed and sent to a remote processing center for analysis and decision making. Images are compressed to reduce processing and computational costs, since they require large storage and transmission resources. A good image compression strategy can help minimize these requirements. Trading compression against accuracy, however, is always a challenge; to optimize imaging, inconsistencies in medical imaging must be reduced. This paper therefore introduces a new image compression scheme called the GenPSOWVQ method, which uses a recurrent neural network with wavelet VQ. The codebook is built using a combination of particle swarm and genetic algorithms. The newly developed image compression model attains precise compression while maintaining image accuracy at lower computational cost when encoding clinical images. The proposed method was tested on real-time medical imaging using the PSNR, MSE, SSIM, NMSE, SNR, and CR indicators. Experimental results show that the proposed GenPSOWVQ method yields higher PSNR and SSIM values for a given compression ratio than the existing methods. In addition, the proposed GenPSOWVQ method yields lower MSE, RMSE, and SNR values for a given compression ratio than the existing methods.

65 citations
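For readers unfamiliar with the fidelity indicators cited above, here is a minimal sketch (not the authors' code) of how MSE and PSNR are computed for an 8-bit image; all names and parameters are illustrative.

```python
import numpy as np

def mse(original: np.ndarray, reconstructed: np.ndarray) -> float:
    """Mean squared error between two same-shape images."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    return float(np.mean(diff ** 2))

def psnr(original: np.ndarray, reconstructed: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB for images with the given peak value."""
    err = mse(original, reconstructed)
    return float("inf") if err == 0 else 10.0 * np.log10(peak ** 2 / err)

# Toy check on a noisy copy of a random 8-bit image.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, (64, 64))
noisy = np.clip(img + rng.normal(0, 5, img.shape), 0, 255)
print(f"MSE={mse(img, noisy):.2f}, PSNR={psnr(img, noisy):.2f} dB")
```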


Journal ArticleDOI
TL;DR: The IntOPMICM technique is introduced: a new image compression scheme that combines GenPSO and VQ and, according to experimental data, produces higher PSNR and SSIM values for a given compression ratio than existing methods.
Abstract: Due to the increasing number of medical images utilized for the diagnosis and treatment of diseases, lossy or improper image compression has become more prevalent in recent years. The compression ratio and image quality, commonly quantified by PSNR values, are used to evaluate the performance of a lossy compression algorithm. This article introduces the IntOPMICM technique, a new image compression scheme that combines GenPSO and VQ. A combination of particle swarm and genetic algorithms was used to create the codebook. PSNR, MSE, SSIM, NMSE, SNR, and CR indicators were used to test the suggested technique on real-time medical imaging. The suggested IntOPMICM approach produces higher PSNR and SSIM values for a given compression ratio than existing methods, according to experimental data. Furthermore, for a given compression ratio, the suggested IntOPMICM approach produces lower MSE, RMSE, and SNR values than existing methods.

65 citations


Journal ArticleDOI
TL;DR: An intelligent satin bowerbird optimizer based compression technique (ISBO-CT) for remote sensing images is presented in this paper.
Abstract: Due to the latest advancements in the field of remote sensing, it has become easier to acquire high-quality images through various satellites and sensing components. However, the massive quantity of data makes it challenging to store and effectively transmit remote sensing images. Therefore, image compression techniques can be utilized to process remote sensing images. In this respect, vector quantization (VQ) can be employed for image compression, and the widely applied VQ approach is Linde–Buzo–Gray (LBG), which creates a locally optimal codebook for image construction. The process of constructing the codebook can be treated as an optimization problem, and metaheuristic algorithms can be utilized to resolve it. With this motivation, this article presents an intelligent satin bowerbird optimizer based compression technique (ISBO-CT) for remote sensing images. The goal of the ISBO-CT technique is to proficiently compress remote sensing images through effective design of the codebook. To this end, the ISBO-CT technique employs the satin bowerbird optimizer (SBO) together with the LBG approach. The design of the SBO algorithm for remote sensing image compression illustrates the novelty of the work. To showcase the enhanced efficiency of the ISBO-CT approach, an extensive range of simulations was carried out, and the outcomes show the optimal performance of the ISBO-CT technique relative to recent state-of-the-art image compression approaches.

60 citations
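The LBG codebook construction that ISBO-CT builds on alternates nearest-codeword assignment with centroid updates. Below is a minimal k-means-style sketch, with the SBO metaheuristic refinement omitted and all names illustrative; the SBO step in the paper would refine this locally optimal codebook via metaheuristic search.

```python
import numpy as np

def lbg(vectors: np.ndarray, codebook_size: int, iters: int = 20, seed: int = 0) -> np.ndarray:
    """Basic Linde-Buzo-Gray codebook training (no SBO refinement)."""
    rng = np.random.default_rng(seed)
    codebook = vectors[rng.choice(len(vectors), codebook_size, replace=False)].copy()
    for _ in range(iters):
        # Assign each training vector to its nearest codeword.
        dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each codeword to the centroid of its assigned vectors.
        for k in range(codebook_size):
            members = vectors[labels == k]
            if len(members):
                codebook[k] = members.mean(axis=0)
    return codebook

# Toy usage: train a 16-word codebook on random 4x4 image blocks.
blocks = np.random.default_rng(1).random((1000, 16))
cb = lbg(blocks, 16)
print(cb.shape)  # (16, 16)
```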


Journal ArticleDOI
TL;DR: This paper derives a closed-form expression of the ambiguity function for the proposed SVC-ISAC waveform, and proves that it exhibits low sidelobes in the delay and Doppler domains, regardless of the distribution of the transmitted bit stream.
Abstract: Integrated sensing and communication (ISAC) can provide efficient usage for both spectrum and hardware resources. A critical challenge, however, is to design the dual-functional waveform for simultaneous radar sensing and communication. In this paper, we propose a sparse vector coding-based ISAC (SVC-ISAC) waveform to simultaneously provide low sidelobes for radar sensing and ultra reliability for communication transmission. The key idea of the proposed waveform is to embed the communication information into the support of one sparse vector and transmit a low-dimensional signal via the spreading codebook. We derive a closed-form expression of the ambiguity function for the proposed SVC-ISAC waveform, and prove that it exhibits low sidelobes in the delay and Doppler domains, regardless of the distribution of the transmitted bit stream. In addition, the information decoding at the communication receiver is solved through the support identification and sparse demapping. Simulation results demonstrate that the proposed waveform improves the reliability while consistently suppressing the sidelobe levels.

24 citations
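The core SVC idea above, embedding information bits in the support of a sparse vector and transmitting a low-dimensional signal through a spreading codebook, can be sketched as follows. This is an illustrative encoder only; the support-to-bits mapping and the random codebook are assumptions, not the paper's construction.

```python
import numpy as np
from itertools import combinations

def svc_encode(bits, codebook, sparsity=2):
    """Map a bit block to one of the C(M, K) sparse supports, then spread it
    through the (N x M) codebook. The support-indexing rule is illustrative."""
    m = codebook.shape[1]
    supports = list(combinations(range(m), sparsity))
    index = int("".join(map(str, bits)), 2) % len(supports)
    x = np.zeros(m)
    x[list(supports[index])] = 1.0
    return codebook @ x  # low-dimensional transmitted signal

rng = np.random.default_rng(2)
A = rng.standard_normal((16, 64)) / np.sqrt(16)  # random spreading codebook
tx = svc_encode([1, 0, 1, 1, 0, 1], A)
print(tx.shape)  # (16,)
```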


Journal ArticleDOI
TL;DR: This letter aims to leverage deep learning to jointly design the downlink SCMA encoder and decoder with the aid of an autoencoder, and introduces a novel end-to-end learning based SCMA (E2E-SCMA) design framework, under which improved sparse codebooks and a low-complexity decoder are obtained.
Abstract: Sparse code multiple access (SCMA) is a promising code-domain non-orthogonal multiple access (NOMA) scheme for enabling massive machine-type communication. In SCMA, the design of good sparse codebooks and efficient multiuser decoding have attracted tremendous research attention in the past few years. This letter aims to leverage deep learning to jointly design the downlink SCMA encoder and decoder with the aid of an autoencoder. We introduce a novel end-to-end learning based SCMA (E2E-SCMA) design framework, under which improved sparse codebooks and a low-complexity decoder are obtained. Compared to conventional SCMA schemes, our numerical results show that the proposed E2E-SCMA leads to significant improvements in terms of error rate and computational complexity.

20 citations


Journal ArticleDOI
TL;DR: In this paper, a multi-agent double deep Q network (DDQN)-based approach was proposed to jointly optimize the beamforming vectors and power splitting ratio in multi-user multiple-input single-output (MU-MISO) simultaneous wireless information and power transfer (SWIPT)-enabled heterogeneous networks (HetNets).
Abstract: This paper proposes a multi-agent double deep Q network (DDQN)-based approach to jointly optimize the beamforming vectors and power splitting (PS) ratio in multi-user multiple-input single-output (MU-MISO) simultaneous wireless information and power transfer (SWIPT)-enabled heterogeneous networks (HetNets), where a macro base station (MBS) and several femto base stations (FBSs) serve multiple macro user equipments (MUEs) and femto user equipments (FUEs). The PS receiver architecture is deployed at FUEs. An optimization problem is formulated to maximize the achievable sum information rate of FUEs under the constraints of the achievable information rate requirements of MUEs and FUEs and the energy harvesting (EH) requirements of FUEs. Since the optimization problem is challenging to handle due to the high dimension and time-varying environment, an efficient multi-agent DDQN-based algorithm is presented, which is trained in a centralized manner and runs in a distributed manner, where two sets of deep neural network parameters are jointly updated and trained to tackle the problem and avoid overestimation. To facilitate the presented multi-agent DDQN-based algorithm, the action space, the state space and the reward function are designed, where the codebook matrix is employed to deal with the complex transmit beamforming vectors. Simulation results validate the proposed algorithm. Notable performance gains are achieved by the proposed algorithm due to considering the beam directions in the action space and the adaptability to the Doppler frequency shifts. Besides, the proposed algorithm is shown to be superior to other benchmark ones numerically.

16 citations
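The overestimation avoidance mentioned above is the standard double-DQN mechanism: the online network selects the next action while the target network evaluates it. A minimal sketch of the bootstrap target (illustrative, not the paper's multi-agent implementation):

```python
import numpy as np

def ddqn_target(reward, q_online_next, q_target_next, gamma=0.99, done=False):
    """Double-DQN bootstrap target: pick the action with the online net,
    evaluate it with the target net to curb overestimation."""
    if done:
        return reward
    a_star = int(np.argmax(q_online_next))         # action selection (online net)
    return reward + gamma * q_target_next[a_star]  # action evaluation (target net)

# Toy example with 4 discrete actions (e.g., codebook beam indices).
q_online = np.array([1.0, 3.2, 0.5, 2.8])
q_target = np.array([0.9, 2.9, 0.7, 3.1])
print(ddqn_target(0.5, q_online, q_target))  # 0.5 + 0.99 * 2.9
```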


Journal ArticleDOI
TL;DR: In this paper, a deep reinforcement learning framework is proposed to optimize the codebook beam patterns relying only on the receive power measurements, which can adapt the beam patterns based on the surrounding environment, user distribution, hardware impairments and array geometry.
Abstract: Millimeter wave (mmWave) and terahertz MIMO systems rely on pre-defined beamforming codebooks for both initial access and data transmission. These pre-defined codebooks, however, are commonly not optimized for specific environments, user distributions, and/or possible hardware impairments. This leads to large codebook sizes with high beam training overhead which makes it hard for these systems to support highly mobile applications. To overcome these limitations, this paper develops a deep reinforcement learning framework that learns how to optimize the codebook beam patterns relying only on the receive power measurements. The developed model learns how to adapt the beam patterns based on the surrounding environment, user distribution, hardware impairments, and array geometry. Further, this approach does not require any knowledge about the channel, RF hardware, or user positions. To reduce the learning time, the proposed model designs a novel Wolpertinger-variant architecture that is capable of efficiently searching the large discrete action space. The proposed learning framework respects the RF hardware constraints such as the constant-modulus and quantized phase shifter constraints. Simulation results confirm the ability of the developed framework to learn near-optimal beam patterns for line-of-sight (LOS), non-LOS (NLOS), mixed LOS/NLOS scenarios and for arrays with hardware impairments without requiring any channel knowledge.

15 citations
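The RF hardware constraints named above restrict each learned beam to unit-modulus entries with quantized phases. A small sketch of projecting arbitrary per-antenna phases onto such a codebook-compatible beamforming vector (parameters are illustrative):

```python
import numpy as np

def quantize_beam(phases: np.ndarray, bits: int = 3) -> np.ndarray:
    """Project arbitrary per-antenna phases onto a constant-modulus,
    q-bit-quantized phase-shifter beamforming vector (unit-gain elements)."""
    levels = 2 ** bits
    step = 2 * np.pi / levels
    quantized = np.round(phases / step) * step
    n = len(phases)
    return np.exp(1j * quantized) / np.sqrt(n)  # constant modulus 1/sqrt(N)

w = quantize_beam(np.random.default_rng(3).uniform(0, 2 * np.pi, 32))
print(np.allclose(np.abs(w), 1 / np.sqrt(32)))  # True: constant-modulus
```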


Journal ArticleDOI
TL;DR: Zhang et al. propose a visual words learning module and a hybrid pooling approach, and incorporate them into the classification network to mitigate the problems of weakly-supervised semantic segmentation.
Abstract: Weakly-supervised semantic segmentation (WSSS) methods with image-level labels generally train a classification network to generate Class Activation Maps (CAMs) as the initial coarse segmentation labels. However, current WSSS methods still perform far from satisfactorily because their adopted CAMs (1) typically focus on partial discriminative object regions and (2) usually contain useless background regions. These two problems are attributed to the sole image-level supervision and the aggregation of global information when training the classification networks. In this work, we propose the visual words learning module and hybrid pooling approach, and incorporate them into the classification network to mitigate the above problems. In the visual words learning module, we counter the first problem by enforcing the classification network to learn fine-grained visual word labels so that more object extents can be discovered. Specifically, the visual words are learned with a codebook, which can be updated via two proposed strategies, i.e., a learning-based strategy and a memory-bank strategy. The second drawback of CAMs is alleviated with the proposed hybrid pooling, which incorporates the global average and local discriminative information to simultaneously ensure object completeness and reduce background regions. We evaluated our methods on the PASCAL VOC 2012 and MS COCO 2014 datasets. Without any extra saliency prior, our method achieved 70.6% and 70.7% mIoU on the val and test sets of the PASCAL VOC dataset, respectively, and 36.2% mIoU on the val set of the MS COCO dataset, significantly surpassing the performance of state-of-the-art WSSS methods.

14 citations
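As a rough illustration of the hybrid pooling idea, blending global average pooling (object completeness) with a local discriminative signal, here is a sketch using global max pooling as the local term; the mixing weight `alpha` is a hypothetical parameter, not the paper's exact formulation.

```python
import numpy as np

def hybrid_pool(feature_map: np.ndarray, alpha: float = 0.5) -> np.ndarray:
    """Blend global average pooling (favors object completeness) with global
    max pooling (local discriminative evidence) over a (C, H, W) map."""
    gap = feature_map.mean(axis=(1, 2))  # (C,)
    gmp = feature_map.max(axis=(1, 2))   # (C,)
    return alpha * gap + (1 - alpha) * gmp

scores = hybrid_pool(np.random.default_rng(4).standard_normal((256, 32, 32)))
print(scores.shape)  # (256,)
```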


Journal ArticleDOI
TL;DR: A new intelligent reflecting surface (IRS)-aided spectrum sensing scheme for CR is considered, by exploiting the large aperture and passive beamforming gains of IRS to boost the PU signal strength received at the SU to facilitate its spectrum sensing.
Abstract: Spectrum sensing is a key enabling technique for cognitive radio (CR), which provides essential information on the spectrum availability. However, due to severe wireless channel fading and path loss, the primary user (PU) signals received at the CR or secondary user (SU) can be practically too weak for reliable detection. To tackle this issue, we consider in this letter a new intelligent reflecting surface (IRS)-aided spectrum sensing scheme for CR, by exploiting the large aperture and passive beamforming gains of IRS to boost the PU signal strength received at the SU to facilitate its spectrum sensing. Specifically, by dynamically changing the IRS reflection over time according to a given codebook, its reflected signal power varies substantially at the SU, which is utilized for opportunistic signal detection. Furthermore, we propose a weighted energy detection method by combining the received signal power values over different IRS reflections, which significantly improves the detection performance. Simulation results validate the performance gain of the proposed IRS-aided spectrum sensing scheme, as compared to different benchmark schemes.

14 citations
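A minimal sketch of the weighted energy detection step described above: received-power values collected under different IRS codebook reflections are linearly combined and compared against a threshold. The weighting rule and threshold here are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

def weighted_energy_detect(powers, weights, threshold):
    """Combine received-power measurements taken under different IRS
    codebook reflections; declare the primary user present if the
    weighted energy exceeds a threshold."""
    stat = float(np.dot(weights, powers))
    return stat > threshold, stat

rng = np.random.default_rng(5)
p = rng.exponential(1.0, 8)  # power measured per IRS reflection pattern
w = p / p.sum()              # e.g., emphasize stronger reflections
present, stat = weighted_energy_detect(p, w, threshold=1.2)
print(present, round(stat, 3))
```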


Journal ArticleDOI
TL;DR: In this article, a unified 3D beam training and tracking procedure is proposed to effectively realize the beamforming in THz communications, by considering the line-of-sight (LoS) propagation.
Abstract: Terahertz (THz) communication is considered as an attractive way to overcome the bandwidth bottleneck and satisfy the ever-increasing capacity demand in the future. Due to the high directivity and propagation loss of THz waves, a massive MIMO system using beamforming is envisioned as a promising technology in THz communication to realize high-gain and directional transmission. However, pilots, which are the fundamentals for many beamforming schemes, are challenging to be accurately detected in the THz band owing to the severe propagation loss. In this paper, a unified 3D beam training and tracking procedure is proposed to effectively realize the beamforming in THz communications, by considering the line-of-sight (LoS) propagation. In particular, a novel quadruple-uniform planar array (QUPA) architecture is analyzed to enlarge the signal coverage, increase the beam gain, and reduce the beam squint loss. Then, a new 3D grid-based (GB) beam training is developed with low complexity, including the design of the 3D codebook and training protocol. Finally, a simple yet effective grid-based hybrid (GBH) beam tracking is investigated to support THz beamforming in an efficient manner. The communication framework based on this procedure can dynamically trigger beam training/tracking depending on the real-time quality of service. Numerical results are presented to demonstrate the superiority of our proposed beam training and tracking over the benchmark methods.

12 citations


Journal ArticleDOI
01 Mar 2022-Sensors
TL;DR: This paper proposes codebook-based phase shifters for mmWave TX and RIS to overcome the difficulty of estimating their mmWave channel state information (CSI) and leverages and implements two standard MAB algorithms, namely Thompson sampling (TS) and upper confidence bound (UCB).
Abstract: A reconfigurable intelligent surface (RIS) is a promising technology that can extend short-range millimeter wave (mmWave) communications coverage. However, phase shifts (PSs) of both mmWave transmitter (TX) and RIS antenna elements need to be optimally adjusted to effectively cover a mmWave user. This paper proposes codebook-based phase shifters for mmWave TX and RIS to overcome the difficulty of estimating their mmWave channel state information (CSI). Moreover, to adjust the PSs of both, an online learning approach in the form of a multiarmed bandit (MAB) game is suggested, where a nested two-stage stochastic MAB strategy is proposed. In the proposed strategy, the PS vector of the mmWave TX is adjusted in the first MAB stage. Based on it, the PS vector of the RIS is calibrated in the second stage and vice versa over the time horizon. Hence, we leverage and implement two standard MAB algorithms, namely Thompson sampling (TS) and upper confidence bound (UCB). Simulation results confirm the superior performance of the proposed nested two-stage MAB strategy; in particular, the nested two-stage TS nearly matches the optimal performance.
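As a toy illustration of one MAB stage described above, here is a standard UCB1 arm selection over a codebook of candidate phase-shift vectors; the reward model and all names are illustrative, not the paper's nested two-stage strategy.

```python
import numpy as np

def ucb_select(counts, means, t, c=2.0):
    """UCB1 arm choice over a codebook of phase-shift vectors: favor arms
    with high empirical reward or little exploration so far."""
    unexplored = np.where(counts == 0)[0]
    if len(unexplored):
        return int(unexplored[0])
    return int(np.argmax(means + np.sqrt(c * np.log(t) / counts)))

# Toy bandit: 8 candidate PS vectors with hidden mean rewards.
rng = np.random.default_rng(6)
true_means = rng.uniform(0, 1, 8)
counts, means = np.zeros(8), np.zeros(8)
for t in range(1, 501):
    arm = ucb_select(counts, means, t)
    reward = rng.normal(true_means[arm], 0.1)
    counts[arm] += 1
    means[arm] += (reward - means[arm]) / counts[arm]
print(int(np.argmax(true_means)), int(np.argmax(counts)))  # usually match
```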

Journal ArticleDOI
TL;DR: In this paper, the authors considered the unsourced random access problem on a Rayleigh block-fading AWGN channel with multiple receive antennas and proposed an approach based on splitting the user messages into two parts.
Abstract: We consider the unsourced random access problem on a Rayleigh block-fading AWGN channel with multiple receive antennas. Specifically, we treat the slow fading scenario where the coherence blocklength is large compared to the number of active users and a message can be transmitted in a single fading coherence block. Unsourced random access refers to a form of grant-free random access where users are constrained to use the same codebook and therefore are a priori indistinguishable. The receiver must recover the list of transmitted messages up to permutations. In this paper, we propose an approach based on splitting the user messages into two parts. First, a small block of bits selects a relatively short codeword from a common “pilot” codebook. Then the remaining message bits are encoded by a standard block code for the Gaussian channel. The receiver makes use of a multiple measurement vector approximate message passing (MMV-AMP) algorithm to estimate the active user channels from the “pilot” part, and then uses the estimated channels to perform coherent maximum ratio combining (MRC) to decode the second part. We provide an accurate closed-form approximated analysis of the proposed scheme. Furthermore, we analyze the MRC decoding when successive interference cancellation is performed over groups of users, striking an attractive tradeoff between complexity and performance. Finally, we investigate the impact of power control policies, taking into account the unique nature of massive random access. As a byproduct, we also present an extension of the MMV-AMP algorithm which allows pathloss coefficients to be treated as deterministic unknowns by performing maximum likelihood estimation in each step of the MMV-AMP algorithm.

Journal ArticleDOI
TL;DR: Based on the Star-QAM mother constellation structure and with the aid of a genetic algorithm, the minimum Euclidean distance (MED) and the minimum product distance (MPD) of the proposed codebooks are optimized, leading to significantly improved error rate performance over Gaussian channels and Rayleigh fading channels.
Abstract: Sparse code multiple access (SCMA) is a promising multiuser communication technique for the enabling of future massive machine-type networks. Unlike existing codebook design schemes assuming uniform power allocation, we present a novel class of SCMA codebooks which display power imbalance among different users for downlink transmission. Based on the Star-QAM mother constellation structure and with the aid of genetic algorithm, we optimize the minimum Euclidean distance (MED) and the minimum product distance (MPD) of the proposed codebooks. Numerical simulation results show that our proposed codebooks lead to significantly improved error rate performances over Gaussian channels and Rayleigh fading channels.
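The two distance metrics being optimized above can be computed directly from a codebook. A minimal sketch with an illustrative toy codebook, not the paper's Star-QAM construction:

```python
import numpy as np
from itertools import combinations

def med_mpd(points: np.ndarray):
    """Minimum Euclidean distance and minimum product distance over all
    pairs of multi-dimensional complex codewords (rows of `points`)."""
    med, mpd = np.inf, np.inf
    for a, b in combinations(points, 2):
        diff = a - b
        med = min(med, float(np.linalg.norm(diff)))
        nz = np.abs(diff)[np.abs(diff) > 1e-12]  # product over non-zero dims
        if len(nz):
            mpd = min(mpd, float(np.prod(nz)))
    return med, mpd

# Toy 2-D codebook of 4 codewords (stand-in for one user's SCMA codebook).
cb = np.array([[1+0j, 0.5j], [-1+0j, -0.5j], [0.5+0j, 1j], [-0.5+0j, -1j]])
print(med_mpd(cb))
```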

Journal ArticleDOI
Li Chai, Zilong Liu, Pei Xiao, Amine Maaref, Lin Bai 
TL;DR: This letter conceives the uplink transmissions of the low-density parity check (LDPC) coded SCMA system by applying the expectation propagation algorithm (EPA) to joint detection and decoding (JDD) involving an aggregated factor graph of the LDPC code and sparse codebooks.
Abstract: Sparse code multiple access (SCMA) is an emerging paradigm for efficient enabling of massive connectivity in future machine-type communications (MTC). In this letter, we conceive the uplink transmissions of the low-density parity check (LDPC) coded SCMA system. Traditional receiver design of LDPC-SCMA system, which is based on message passing algorithm (MPA) for multiuser detection followed by individual LDPC decoding, may suffer from the drawback of the high complexity and large decoding latency, especially when the system has large codebook size and/or high overloading factor. To address this problem, we introduce a novel receiver design by applying the expectation propagation algorithm (EPA) to the joint detection and decoding (JDD) involving an aggregated factor graph of LDPC code and sparse codebooks. Our numerical results demonstrate the superiority of the proposed EPA based JDD receiver over the conventional Turbo receiver in terms of both significantly lower complexity and faster convergence rate without noticeable error rate performance degradation.

Journal ArticleDOI
TL;DR: A novel codebook optimization method referred to as joint bare bones particle swarm optimization (JBBPSO) is proposed, which maximizes the AMI lower bound and jointly optimizes the mother codebook, including the basic constellation and other non-zero-dimensional constellations, as well as the rotation angles of multiple users.
Abstract: Sparse code multiple access (SCMA) is a promising non-orthogonal multiple access technique to support massive connectivity for future wireless Internet of Things (IoT) networks. As the main feature of SCMA, modulation and spread spectrum is embedded into codebook mapping, offering significant codebook shaping gains to mitigate inter-cell interference. Maximizing the constellation-constrained average mutual information (AMI) is an effective way for SCMA codebook optimization. However, deriving a closed-form expression of the AMI is analytically intractable, while it is computationally costly to estimate the AMI by numerical methods. To address this challenge, this paper first derives a lower bound of the AMI with a closed-form expression. On this basis, we propose a novel codebook optimization method referred to as joint bare bones particle swarm optimization (JBBPSO) through maximizing the AMI lower bound. The proposed low-complexity method jointly optimizes the mother codebook including basic constellation and other non-zero-dimensional constellations, and the rotation angles of multiple users. Numerical results show that our proposed optimized codebooks outperform the state-of-the-art SCMA codebooks in terms of both the lower bound and the error performance.

Journal ArticleDOI
TL;DR: In this article, the authors proposed a DL-based implicit feedback architecture to inherit the low-overhead characteristic, which uses neural networks (NNs) to replace the precoding matrix indicator (PMI) encoding and decoding modules.
Abstract: Massive multiple-input multiple-output can obtain more performance gain by exploiting the downlink channel state information (CSI) at the base station (BS). Therefore, studying CSI feedback with limited communication resources in frequency-division duplexing systems is of great importance. Recently, deep learning (DL)-based CSI feedback has shown considerable potential. However, the existing DL-based explicit feedback schemes are difficult to deploy because current fifth-generation mobile communication protocols and systems are designed based on an implicit feedback mechanism. In this paper, we propose a DL-based implicit feedback architecture to inherit the low-overhead characteristic, which uses neural networks (NNs) to replace the precoding matrix indicator (PMI) encoding and decoding modules. By using environment information, the NNs can achieve a more refined mapping between the precoding matrix and the PMI compared with codebooks. The correlation between subbands is also used to further improve the feedback performance. Simulation results show that, for a single resource block (RB), the proposed architecture can save 25.0% – 40.0% of overhead compared with the Type I codebook under different antenna configurations. For a wideband system with 52 RBs, overhead can be saved by 30.7% and 48.0% compared with the Type II codebook when ignoring and considering extracting subband correlation, respectively.

Journal ArticleDOI
TL;DR: In this article, two probabilistic codebook (PCB) techniques of prioritized beams are proposed to speed up the current 5G standard approach, targeting an efficient 6G design.

Journal ArticleDOI
TL;DR: In this article, a new iterative algorithm based on alternating maximization with exact penalty is proposed for the MED maximization problem, which achieves a set of codebooks of all users whose MED is larger than any previously reported results.
Abstract: Sparse code multiple access (SCMA), as a codebook-based non-orthogonal multiple access (NOMA) technique, has received research attention in recent years. The codebook design problem for SCMA has also been studied to some extent since codebook choices are highly related to the system's error rate performance. In this paper, we approach the SCMA codebook design problem by formulating an optimization problem to maximize the minimum Euclidean distance (MED) of superimposed codewords under power constraints. While SCMA codebooks with a larger MED are expected to obtain a better BER performance, no optimal SCMA codebook in terms of MED maximization, to the authors' best knowledge, has been reported in the SCMA literature yet. In this paper, a new iterative algorithm based on alternating maximization with exact penalty is proposed for the MED maximization problem. The proposed algorithm, when supplied with appropriate initial points and parameters, achieves a set of codebooks of all users whose MED is larger than any previously reported results. A Lagrange dual problem is derived which provides an upper bound of MED of any set of codebooks. Even though there is still a nonzero gap between the achieved MED and the upper bound given by the dual problem, simulation results demonstrate clear advantages in error rate performances of the proposed set of codebooks over all existing ones not only in AWGN channels but also in some downlink scenarios that fit in 5G/NR applications, making it a good codebook candidate thereof. The proposed set of SCMA codebooks, however, are not shown to outperform existing ones in uplink channels or in the case where non-consecutive OFDMA subcarriers are used. The correctness and accuracy of error curves in the simulation results are further confirmed by the coincidences with the theoretical upper bounds of error rates derived for any given set of codebooks.

Journal ArticleDOI
TL;DR: In this paper, the authors introduce a scheme for employing binarized symbol soft information within Guessing Random Additive Noise Decoding, a universal hard-detection decoder, which incorporates codebook-independent quantization of soft information to mark demodulated symbols as reliable or unreliable.
Abstract: The design and implementation of error correcting codes has long been informed by two fundamental results: Shannon’s 1948 capacity theorem, which established that long codes use noisy channels most efficiently; and Berlekamp, McEliece, and Van Tilborg’s 1978 theorem on the NP-completeness of decoding linear codes. These results shifted focus away from creating code-independent decoders, but recent low-latency communication applications necessitate relatively short codes, providing motivation to reconsider the development of universal decoders. We introduce a scheme for employing binarized symbol soft information within Guessing Random Additive Noise Decoding, a universal hard detection decoder. We incorporate codebook-independent quantization of soft information to indicate demodulated symbols to be reliable or unreliable. We introduce two decoding algorithms: one identifies a conditional Maximum Likelihood (ML) decoding; the other either reports a conditional ML decoding or an error. For random codebooks, we present error exponents and asymptotic complexity, and show benefits over hard detection. As empirical illustrations, we compare performance with majority logic decoding of Reed-Muller codes, with Berlekamp-Massey decoding of Bose-Chaudhuri-Hocquenghem codes, with CA-SCL decoding of CA-Polar codes, and establish the performance of Random Linear Codes, which require a universal decoder and offer a broader palette of code sizes and rates than traditional codes.
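As a rough sketch of the hard-detection GRAND principle underlying the scheme above: noise patterns are guessed in decreasing order of likelihood (increasing Hamming weight for a BSC) and applied until a codebook member is found. The toy single-parity-check codebook is an assumption for illustration only.

```python
import numpy as np
from itertools import combinations

def grand_decode(received, is_codeword, max_weight=3):
    """Hard-detection GRAND: test noise patterns in increasing Hamming
    weight (most likely first on a BSC) and return the first candidate
    that lies in the codebook; None signals a decoding failure."""
    n = len(received)
    for w in range(max_weight + 1):
        for flips in combinations(range(n), w):
            cand = received.copy()
            idx = np.array(flips, dtype=int)
            cand[idx] ^= 1
            if is_codeword(cand):
                return cand
    return None

# Toy codebook: length-6 even-weight words (a single parity-check code).
is_even_parity = lambda v: v.sum() % 2 == 0
rx = np.array([1, 0, 1, 1, 0, 0])  # odd parity, so one flip is needed
print(grand_decode(rx, is_even_parity))  # e.g. [0 0 1 1 0 0]
```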

Journal ArticleDOI
TL;DR: In this article, a Gaussian mixture model (GMM) based clustering codebook design is proposed, inspired by the strong classification and analytical abilities of clustering techniques.
Abstract: Codebook design is the essential core technique in limited-feedback massive multi-input multi-output (MIMO) communication systems. In traditional codebook designs, MIMO vectors have generally been isotropic or evenly distributed. In this paper, a Gaussian mixture model (GMM) based clustering codebook design is proposed, inspired by the strong classification and analytical abilities of clustering techniques. Large quantities of channel state information (CSI) are first collected as input data for the clustering process and then split into N clusters based on the shortest distance. The cluster centroids are used to construct a codebook from statistical channel information whose average distance to the true channel data is the shortest. The enhanced GMM based clustering codebook design outperforms traditional methods, particularly for non-uniformly distributed channels, as demonstrated via simulation results that match the theoretical analysis of the achievable rate. The proposed GMM based clustering codebook design is compared with DFT-based and k-means based clustering codebook designs.
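A minimal sketch of the GMM-based clustering codebook idea, assuming real-valued stand-in CSI data (real CSI would be complex-valued and normalized) and illustrative dimensions:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Fit a mixture to a corpus of CSI-like vectors and use the component
# means as codewords (hypothetical setup, not the paper's exact pipeline).
rng = np.random.default_rng(7)
csi = np.concatenate([rng.normal(m, 0.3, (500, 8)) for m in (-2.0, 0.0, 2.0)])

gmm = GaussianMixture(n_components=16, random_state=0).fit(csi)
codebook = gmm.means_         # (16, 8) codewords from cluster centroids
index = gmm.predict(csi[:1])  # quantize a new channel vector to a codeword
print(codebook.shape, index)
```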

Journal ArticleDOI
TL;DR: Extensive quantitative and qualitative evaluations demonstrate that the proposed Text2Human framework can generate more diverse and realistic human images compared to state-of-the-art methods.
Abstract: Generating high-quality and diverse human images is an important yet challenging task in vision and graphics. However, existing generative models often fall short under the high diversity of clothing shapes and textures. Furthermore, the generation process is even desired to be intuitively controllable for layman users. In this work, we present a text-driven controllable framework, Text2Human, for high-quality and diverse human generation. We synthesize full-body human images starting from a given human pose with two dedicated steps. 1) With some texts describing the shapes of clothes, the given human pose is first translated to a human parsing map. 2) The final human image is then generated by providing the system with more attributes about the textures of clothes. Specifically, to model the diversity of clothing textures, we build a hierarchical texture-aware codebook that stores multi-scale neural representations for each type of texture. The codebook at the coarse level includes the structural representations of textures, while the codebook at the fine level focuses on the details of textures. To make use of the learned hierarchical codebook to synthesize desired images, a diffusion-based transformer sampler with mixture of experts is first employed to sample indices from the coarsest level of the codebook, which are then used to predict the indices of the codebook at finer levels. The predicted indices at different levels are translated to human images by the decoder learned together with the hierarchical codebooks. The use of mixture-of-experts allows the generated image to be conditioned on the fine-grained text input. The prediction of finer-level indices refines the quality of clothing textures. Extensive quantitative and qualitative evaluations demonstrate that our proposed Text2Human framework can generate more diverse and realistic human images compared to state-of-the-art methods. Our project page is https://yumingj.github.io/projects/Text2Human.html. Code and pretrained models are available at https://github.com/yumingj/Text2Human.


Journal ArticleDOI
TL;DR: In this article, the authors propose efficient near-field beam training schemes by designing the near-field codebook to match the near-field channel model, where different levels of sub-codebooks are searched in turn with reduced codebook size.
Abstract: Reconfigurable intelligent surface (RIS) is likely to develop into extremely large-scale RIS (XL-RIS) to efficiently boost the system capacity for future 6G communications. Beam training is an effective way to acquire channel state information (CSI) for XL-RIS. Existing beam training schemes rely on the far-field codebook. However, due to the large aperture of XL-RIS, the scatterers are more likely to be in the near-field region of XL-RIS, and the far-field codebook mismatches the near-field channel model. Thus, existing far-field beam training schemes will cause severe performance loss in XL-RIS assisted near-field communications. To solve this problem, we propose efficient near-field beam training schemes by designing the near-field codebook to match the near-field channel model. Specifically, we first design the near-field codebook by considering the near-field cascaded array steering vector of XL-RIS. Then, the optimal codeword for XL-RIS is obtained by an exhaustive training procedure. To reduce the beam training overhead, we further design a hierarchical near-field codebook and propose the corresponding hierarchical near-field beam training scheme, where different levels of sub-codebooks are searched in turn with reduced codebook size. Simulation results show that the proposed near-field beam training schemes outperform the existing far-field beam training scheme.
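The near-field/far-field mismatch above comes from the codeword phase profile: a near-field codeword phases each element by its exact distance to a focal point rather than by a plane-wave approximation. A sketch for a uniform linear array (a 1-D simplification of the XL-RIS planar case; the geometry and parameters are illustrative):

```python
import numpy as np

def near_field_codeword(num_elems, spacing, wavelength, focus_x, focus_z):
    """Near-field codeword for a uniform linear array: phase each element
    by its exact distance to a focal point (x, z), instead of the
    far-field plane-wave approximation."""
    positions = (np.arange(num_elems) - (num_elems - 1) / 2) * spacing
    dists = np.sqrt((focus_x - positions) ** 2 + focus_z ** 2)
    return np.exp(-2j * np.pi * dists / wavelength) / np.sqrt(num_elems)

w = near_field_codeword(256, spacing=0.005, wavelength=0.01, focus_x=1.0, focus_z=5.0)
print(w.shape, round(float(np.linalg.norm(w)), 3))  # (256,) 1.0
```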

Journal ArticleDOI
TL;DR: In this paper, the problem of spatial signal design for multipath-assisted mmWave positioning under limited prior knowledge of the user's location and clock bias is considered, and an optimal robust design is proposed based on the low-dimensional precoder structure under perfect prior knowledge, with optimized beam power allocation.
Abstract: We consider the problem of spatial signal design for multipath-assisted mmWave positioning under limited prior knowledge on the user’s location and clock bias. We propose an optimal robust design and, based on the low-dimensional precoder structure under perfect prior knowledge, a codebook-based heuristic design with optimized beam power allocation. Through numerical results, we characterize different position-error-bound (PEB) regimes with respect to clock bias uncertainty and show that the proposed low-complexity codebook-based designs outperform the conventional directional beam codebook and achieve near-optimal PEB performance for both analog and digital architectures.

BookDOI
01 Jan 2022
TL;DR: In this book, the authors present a codebook for a revised version of the Institutional Grammar, the Institutional Grammar 2.0 (IG 2.0), a specification that aims at facilitating the encoding of policy to meet varying analytical objectives.
Abstract: The Grammar of Institutions, or Institutional Grammar, is an established approach to encode policy information in terms of institutional statements based on a set of pre-defined syntactic components. This codebook provides coding guidelines for a revised version of the Institutional Grammar, the Institutional Grammar 2.0 (IG 2.0). IG 2.0 is a specification that aims at facilitating the encoding of policy to meet varying analytical objectives. To this end, it revises the grammar with respect to comprehensiveness, flexibility, and specificity by offering multiple levels of expressiveness (IG Core, IG Extended, IG Logico). In addition to the encoding of regulative statements, it further introduces the encoding of constitutive institutional statements, as well as statements that exhibit both constitutive and regulative characteristics. Introducing those aspects, the codebook initially covers fundamental concepts of IG 2.0, before providing an overview of pre-coding steps relevant for document preparation. Detailed coding guidelines are provided for both regulative and constitutive statements across all levels of expressiveness, along with the encoding guidelines for statements of mixed form -- hybrid and polymorphic institutional statements. The document further provides an overview of taxonomies used in the encoding process and referred to throughout the codebook. The codebook concludes with a summary and discussion of relevant considerations to facilitate the coding process. An initial Reader's Guide helps the reader tailor the content to her interest. Note that this codebook specifically focuses on operational aspects of IG 2.0 in the context of policy coding. Links to additional resources such as the underlying scientific literature (that offers a comprehensive treatment of the underlying theoretical concepts) are referred to in the DOI and the concluding section of the codebook.

Proceedings ArticleDOI
26 Feb 2022
TL;DR: This work proposes Feature Matching SR (FeMaSR), which restores realistic HR images in a much more compact feature space by matching distorted LR image features to their distortion-free HR counterparts in the authors' pretrained HR priors, and decoding the matched features to obtain realisticHR images.
Abstract: A key challenge of real-world image super-resolution (SR) is to recover the missing details in low-resolution (LR) images with complex unknown degradations (e.g., downsampling, noise and compression). Most previous works restore such missing details in the image space. To cope with the high diversity of natural images, they either rely on the unstable GANs that are difficult to train and prone to artifacts, or resort to explicit references from high-resolution (HR) images that are usually unavailable. In this work, we propose Feature Matching SR (FeMaSR), which restores realistic HR images in a much more compact feature space. Unlike image-space methods, our FeMaSR restores HR images by matching distorted LR image features to their distortion-free HR counterparts in our pretrained HR priors, and decoding the matched features to obtain realistic HR images. Specifically, our HR priors contain a discrete feature codebook and its associated decoder, which are pretrained on HR images with a Vector Quantized Generative Adversarial Network (VQGAN). Notably, we incorporate a novel semantic regularization in VQGAN to improve the quality of reconstructed images. For the feature matching, we first extract LR features with an LR encoder consisting of several Swin Transformer blocks and then follow a simple nearest neighbour strategy to match them with the pretrained codebook. In particular, we equip the LR encoder with residual shortcut connections to the decoder, which is critical to the optimization of feature matching loss and also helps to complement the possible feature matching errors. Experimental results show that our approach produces more realistic HR images than previous methods. Code will be made publicly available.
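The nearest-neighbour matching step described above can be sketched as plain vector quantization of LR features against a pretrained codebook; the shapes and the plain L2 distance rule here are illustrative assumptions, not FeMaSR's actual implementation.

```python
import numpy as np

def match_to_codebook(features: np.ndarray, codebook: np.ndarray):
    """Nearest-neighbour quantization: replace each feature row by its
    closest codeword, returning the quantized features and indices."""
    # Squared distances via the expansion |f|^2 - 2 f.c + |c|^2 -> (N, K).
    d = ((features ** 2).sum(1, keepdims=True)
         - 2.0 * features @ codebook.T
         + (codebook ** 2).sum(1))
    idx = d.argmin(axis=1)
    return codebook[idx], idx

rng = np.random.default_rng(8)
feats = rng.standard_normal((1024, 64))  # flattened LR feature map (toy)
cb = rng.standard_normal((512, 64))      # stand-in for a pretrained codebook
quantized, indices = match_to_codebook(feats, cb)
print(quantized.shape, indices[:5])
```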

Journal ArticleDOI
TL;DR: In this paper, the authors consider a variant of the sequence reconstruction problem where the number of noisy reads is fixed, and design reconstruction codes for all values of $N$.
Abstract: The sequence reconstruction problem, introduced by Levenshtein in 2001, considers a communication scenario where the sender transmits a codeword from some codebook and the receiver obtains multiple noisy reads of the codeword. The common setup assumes the codebook to be the entire space, and the problem is to determine the minimum number of distinct reads required to reconstruct the transmitted codeword. Motivated by modern storage devices, we study a variant of the problem where the number of noisy reads $N$ is fixed. Specifically, we design reconstruction codes that reconstruct a codeword from $N$ distinct noisy reads. We focus on channels that introduce a single edit error (i.e., a single substitution, insertion, or deletion) and their variants, and design reconstruction codes for all values of $N$. In particular, for the case of a single edit, we show that as the number of noisy reads increases, the number of redundant symbols required can be gracefully reduced from $\log_q n + O(1)$ to $\log_q \log_q n + O(1)$, and then to $O(1)$, where $n$ denotes the length of a codeword. We also show that these reconstruction codes are asymptotically optimal. Finally, via computer simulations, we demonstrate that in certain cases, reconstruction codes can achieve similar performance as classical error-correcting codes with fewer redundant symbols.
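As a toy illustration of reconstruction from $N$ noisy reads: for a substitution-only channel, a coordinate-wise majority vote already recovers the codeword with high probability (insertions and deletions require the alignment-aware reconstruction codes designed in the paper). All parameters below are illustrative.

```python
import numpy as np

def reconstruct_majority(reads: np.ndarray) -> np.ndarray:
    """Coordinate-wise majority vote over N binary noisy reads: a toy rule
    for the substitution-only channel."""
    return (reads.sum(axis=0) * 2 > len(reads)).astype(int)

rng = np.random.default_rng(9)
codeword = rng.integers(0, 2, 32)
# Five reads, each bit flipped independently with probability 0.05.
reads = np.array([codeword ^ (rng.random(32) < 0.05).astype(int) for _ in range(5)])
print(np.array_equal(reconstruct_majority(reads), codeword))  # very likely True
```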

Journal ArticleDOI
TL;DR: To alleviate the resource allocation (RA) challenges of such a system at the transmitter, dual parameter ranking (DPR) and alternate search method (ASM) based RA schemes are proposed, and the results show significant capacity gains with DPR-RA in comparison with conventional schemes.
Abstract: Hybrid multiple access schemes are considered potential technologies for achieving optimal spectrum sharing in future heterogeneous networks. Multiple users are multiplexed on a single resource unit in the code domain for sparse code multiple access (SCMA), in the power domain for power domain non-orthogonal multiple access (PD-NOMA), and in both domains for the hybrid power domain sparse code non-orthogonal multiple access (PD-SCMA). This allows for effective spectrum usage but comes at the cost of increased detection complexity, resulting in a loss of performance in terms of outages. It is therefore imperative to determine the user multiplexing capacity for effective performance. This work investigates codebook capacity bounds in small cells, along with pairing and power capacity bounds for the number of small cell user equipments (SUEs) and macro cell user equipments (MUEs) that can be multiplexed on a codebook for the developed PD-SCMA technology. Closed-form solutions for codebook, pairing, and power multiplexing capacity bounds are derived. The system exhibits low outage when its point of operation is within the multiplexing bounds. To alleviate the resource allocation (RA) challenges of such a system at the transmitter, dual parameter ranking (DPR) and alternate search method (ASM) based RA schemes are proposed. The results show significant capacity gains with DPR-RA in comparison with conventional schemes.

Journal ArticleDOI
TL;DR: In this paper, the tradeoff between local privacy and system utility is formalized as a multiobjective optimization problem; the optimal solution is theoretically derived, and an optimal task allocation scheme is obtained.
Abstract: The location information of tasks may expose sensitive information, which impedes the practical use of mobile crowdsensing in the industrial Internet. In this article, to our knowledge, we are the first to discuss the privacy protection of task locations, and we propose a codebook-based task allocation mechanism to protect them. Considering the cost in system utility caused by privacy protection technology, the tradeoff between local privacy and system utility is formalized as a multiobjective optimization problem. The optimal solution is theoretically derived, and the optimal task allocation scheme is obtained. In addition, the selected allocation codebook (SAC) method is introduced to solve the problem of high computational resource consumption in the task allocation process and to protect task location privacy to some extent. The experimental results show that the SAC method sacrifices some system utility but improves the privacy protection of task locations by 60% on average.

Journal ArticleDOI
TL;DR: In this article, an end-to-end neural network (NN) architecture is designed to jointly learn the probing codebook and the beam predictor, which can capture particular characteristics of the propagation environment.
Abstract: Beam alignment – the process of finding an optimal directional beam pair – is a challenging procedure crucial to millimeter wave (mmWave) communication systems. We propose a novel beam alignment method that learns a site-specific probing codebook and uses the probing codebook measurements to predict the optimal narrow beam. An end-to-end neural network (NN) architecture is designed to jointly learn the probing codebook and the beam predictor. The learned codebook consists of site-specific probing beams that can capture particular characteristics of the propagation environment. The proposed method relies on beam sweeping of the learned probing codebook, does not require additional context information, and is compatible with the beam sweeping-based beam alignment framework in 5G. Using realistic ray-tracing datasets, we demonstrate that the proposed method can achieve high beam alignment accuracy and signal-to-noise ratio (SNR) while significantly – by roughly a factor of 3 in our setting – reducing the beam sweeping complexity and latency.