Showing papers by "Rong Zhang" published in 2023


Journal ArticleDOI
TL;DR: Lau et al., as mentioned in this paper, reported the rapid spread of SARS-CoV-2 in the Macao Special Administrative Region, China, and conducted an online survey during December 27–30, 2022.

5 citations


Journal ArticleDOI
01 Jul 2023
TL;DR: In this article, a top-down convolutional network is proposed to incorporate priors about the structure of pose components and body configuration during training, which can improve the robustness under complex field conditions in the wild.
Abstract: Recent studies estimate human anatomical key points from a single monocular image, in which multichannel heatmaps are the key factor determining the quality of human pose estimation. Multichannel heatmaps can efficiently handle the image-to-coordinate mapping task and the processing of semantic features. However, most methods ignore the physical constraints and internal relationships of human body parts, and thus easily misclassify left and right symmetric parts, which share similar features. Some studies add RNNs on top of the network to incorporate priors about the structure of pose components and body configuration. In this work, a novel top-down convolutional network is instead proposed to incorporate these priors during training, which improves robustness under complex field conditions in the wild. To learn prior knowledge of human pose configuration, a hierarchy of fully convolutional networks (the discriminator) is used to distinguish real poses from fake ones. Consequently, the pose network is driven toward estimates that the discriminator judges as real, i.e., poses that remain plausible in complex situations. The performance of the method is experimentally validated on the MS COCO human key point detection task. The proposed approach outperforms the original method and generates robust pose predictions, demonstrating the effectiveness of adversarial learning.

1 citation
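
To make the adversarial scheme concrete, here is a minimal sketch of heatmap-based pose estimation with a fully convolutional discriminator, in the spirit of the abstract above. The network shapes, the number of keypoint channels, and the 0.01 adversarial loss weight are illustrative assumptions, not the paper's actual architecture.

```python
# Minimal sketch of adversarial heatmap-based pose estimation.
# Shapes, depths, and the 0.01 loss weight are illustrative assumptions.
import torch
import torch.nn as nn

K = 17  # number of keypoint channels (COCO-style); assumed

pose_net = nn.Sequential(              # stand-in for the top-down pose network
    nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
    nn.Conv2d(64, K, 3, padding=1),
)
disc = nn.Sequential(                  # fully convolutional pose discriminator
    nn.Conv2d(K, 64, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(64, 1, 3, stride=2, padding=1),
)

mse, bce = nn.MSELoss(), nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(pose_net.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-4)

def train_step(img, gt_heatmaps):
    # 1) Discriminator learns to separate real pose heatmaps from predictions.
    pred = pose_net(img)
    d_real, d_fake = disc(gt_heatmaps), disc(pred.detach())
    loss_d = bce(d_real, torch.ones_like(d_real)) + \
             bce(d_fake, torch.zeros_like(d_fake))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # 2) Pose network fits the ground truth and also tries to fool the
    #    discriminator, pushing it toward structurally plausible poses.
    d_fake = disc(pred)
    loss_g = mse(pred, gt_heatmaps) + \
             0.01 * bce(d_fake, torch.ones_like(d_fake))  # assumed weighting
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```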


Journal ArticleDOI
TL;DR: In this paper, a multi-modal interactive attention network (MIA-Net) is proposed, which takes the modality that contributes the most to emotion as the main modality and the others as auxiliary modalities.
Abstract: When a multi-modal affective analysis model generalizes from a bimodal task to a trimodal or multi-modal task, it is usually transformed into a hierarchical fusion model built from pairwise combinations of modalities, similar to a binary tree structure. This easily leads to large growth in model parameters and computation as the number of modalities increases, which limits the model's generalization. Moreover, many multi-modal fusion methods ignore the fact that different modalities contribute differently to affective analysis. To tackle these challenges, this paper proposes a general multi-modal fusion model that supports trimodal or multi-modal affective analysis tasks, called the Multi-modal Interactive Attention Network (MIA-Net). Instead of treating different modalities equally, MIA-Net takes the modality that contributes most to emotion as the main modality and the others as auxiliary modalities. MIA-Net introduces multi-modal interactive attention modules that adaptively select the important information of each auxiliary modality, one by one, to improve the main-modal representation. Moreover, MIA-Net generalizes quickly to trimodal or multi-modal tasks by stacking multiple MIA modules, which maintains efficient training and requires only linear computation and a stable parameter count. Results of transfer, generalization, and efficiency experiments on widely used datasets demonstrate the effectiveness and generalizability of the proposed method.

1 citation
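
The abstract's main/auxiliary design can be illustrated with a small cross-attention sketch: the main modality queries one auxiliary modality at a time, and modules are stacked so that cost grows linearly with the number of modalities. The dimensions, module layout, and modality assignments below are assumptions, not the published MIA-Net.

```python
# Sketch of a main/auxiliary interactive-attention module (assumed layout).
import torch
import torch.nn as nn

class InteractiveAttention(nn.Module):
    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, main, aux):
        # The main modality queries one auxiliary modality and absorbs only
        # the information that improves the main-modal representation.
        upd, _ = self.attn(query=main, key=aux, value=aux)
        return self.norm(main + upd)  # residual keeps the main modality dominant

dim = 128
mia_stack = nn.ModuleList(InteractiveAttention(dim) for _ in range(2))
main = torch.randn(8, 20, dim)              # e.g. text as the main modality
auxiliaries = [torch.randn(8, 50, dim),     # e.g. audio
               torch.randn(8, 30, dim)]     # e.g. video
for mia, aux in zip(mia_stack, auxiliaries):
    main = mia(main, aux)  # one module per auxiliary modality: linear growth
```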


Journal ArticleDOI
01 Jul 2023
TL;DR: In this paper, a near-realistic microstructural model is proposed to simulate the cement hydration system using deep learning and cellular automata, in which behavior is controlled by deep neural networks distilled from microstructural images.
Abstract: Cement is widely used in civil engineering and plays a critical role in cement-based materials, e.g., concrete. Because the microstructural evolution of cement hydration predominates the final physical properties, an accurate simulation of hydration is required to enable scientists to evaluate performance and to help design new cementitious materials. However, despite significant effort and progress, a satisfactory model that realistically and accurately simulates the evolution of the three-dimensional (3-D) microstructure has yet to be constructed, mainly because cement hydration is one of the most complex phenomena in materials science. In this work, a novel near-realistic microstructural model is proposed to simulate the cement hydration system using deep learning and cellular automata; it is designed to break through the bottleneck of fidelity to real microstructural evolution. The dynamical system is constructed on a 3-D cellular automaton whose behavior is controlled by deep neural networks distilled from microstructural images. In addition, a dynamic stratified sampling method with variable capacity is proposed to ensure the representativeness of training samples and reduce the computational cost of training. Experiments show that the simulated hydration agrees with actual development in several respects, producing near-realistic microstructures and closely approximating the real process. Furthermore, the constructed system demonstrates promising generalization capability under various conditions.
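
A hedged sketch of the core idea, a 3-D cellular automaton whose update rule is a neural network, is given below. The phase encoding, neighborhood size, and network are placeholders; in the paper the rule is distilled from real microstructural images.

```python
# Sketch: a 3-D cellular automaton whose update rule is a neural network.
# Phase set, neighborhood, and network size are placeholder assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

N_PHASES = 4  # e.g. water, unhydrated cement, hydrates, pores (assumed)

# One 3x3x3 convolution per step: each voxel's next phase is predicted from
# its 26-neighborhood. Here the weights are random, not distilled.
rule = nn.Sequential(
    nn.Conv3d(N_PHASES, 32, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv3d(32, N_PHASES, kernel_size=1),
)

def ca_step(state):
    # state: (1, N_PHASES, D, H, W) one-hot phase volume
    logits = rule(state)
    nxt = F.one_hot(logits.argmax(dim=1), N_PHASES)
    return nxt.permute(0, 4, 1, 2, 3).float()

volume = F.one_hot(torch.randint(0, N_PHASES, (1, 16, 16, 16)), N_PHASES)
volume = volume.permute(0, 4, 1, 2, 3).float()
for _ in range(10):  # ten simulated hydration time steps
    volume = ca_step(volume)
```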

Journal ArticleDOI
TL;DR: SGML-Net, as mentioned in this paper, incorporates auxiliary information via saliency detection to guide discriminative representation learning, achieving high performance and low model complexity for few-shot fine-grained visual recognition.
Abstract: Recognizing novel sub-categories from scarce samples is an essential and challenging research topic in computer vision. The existing literature addresses this challenge through global-based or local-based representation approaches. The former employs global feature representations for recognition, which may lack fine-grained information. The latter captures local relationships with complex structures, possibly leading to high model complexity. To address these challenges, this article proposes a novel framework called SGML-Net for few-shot fine-grained visual recognition. SGML-Net incorporates auxiliary information via saliency detection to guide discriminative representation learning, achieving high performance and low model complexity. Specifically, SGML-Net utilizes a saliency detection model to emphasize the key regions of each sub-category, providing a strong prior for representation learning. SGML-Net transfers this prior between two independent branches in a mutual learning paradigm. To achieve effective transfer, SGML-Net leverages the relationships among different regions, making the representation more informative and thus providing better guidance. The auxiliary branch is removed once the transfer is complete, ensuring low model complexity at deployment. The proposed approach is empirically evaluated on three widely used benchmarks, demonstrating its superior performance.
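
The mutual learning transfer described above can be sketched as two branches that fit the labels while mimicking each other's softened predictions; the auxiliary (saliency) branch is dropped after training. The loss form and temperature below are standard deep-mutual-learning assumptions, not necessarily the exact SGML-Net objective.

```python
# Sketch of a deep-mutual-learning objective between two branches.
# Temperature and loss form are assumptions, not the exact paper loss.
import torch
import torch.nn.functional as F

def mutual_learning_loss(logits_a, logits_b, labels, tau=1.0):
    # Each branch fits the labels and mimics the other's softened output.
    ce = F.cross_entropy(logits_a, labels) + F.cross_entropy(logits_b, labels)
    kl_ab = F.kl_div(F.log_softmax(logits_a / tau, dim=1),
                     F.softmax(logits_b.detach() / tau, dim=1),
                     reduction="batchmean")
    kl_ba = F.kl_div(F.log_softmax(logits_b / tau, dim=1),
                     F.softmax(logits_a.detach() / tau, dim=1),
                     reduction="batchmean")
    return ce + kl_ab + kl_ba

logits_a, logits_b = torch.randn(8, 10), torch.randn(8, 10)  # two branches
labels = torch.randint(0, 10, (8,))
loss = mutual_learning_loss(logits_a, logits_b, labels)
# At deployment the auxiliary (saliency) branch is dropped, so inference
# cost is that of a single branch.
```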

Journal ArticleDOI
TL;DR: Wang et al., as discussed by the authors, proposed a convolutional broad learning system (ConvBLS) based on the spherical K-means (SKM) algorithm and two-stage multi-scale (TSMS) feature fusion.
Abstract: Deep learning generally suffers from enormous computational resource demands and time-consuming training processes. The Broad Learning System (BLS) and its convolutional variants have been proposed to mitigate these issues and have achieved superb performance in image classification. However, existing convolutional-based broad learning systems (C-BLS) either lack an efficient training method and incremental learning capability or suffer from poor performance. To this end, we propose a convolutional broad learning system (ConvBLS) based on the spherical K-means (SKM) algorithm and two-stage multi-scale (TSMS) feature fusion, which consists of the convolutional feature (CF) layer, the convolutional enhancement (CE) layer, the TSMS feature fusion layer, and the output layer. First, unlike current C-BLS, the simple yet efficient SKM algorithm is utilized to learn the weights of the CF layers. Compared with random filters, the SKM algorithm lets the CF layer learn more comprehensive spatial features. Second, similar to the vanilla BLS, CE layers are established to expand the feature space. Third, the TSMS feature fusion layer is proposed to extract more effective multi-scale features through the integration of CF layers and CE layers. Thanks to the above design and the pseudo-inverse calculation of the output layer weights, our proposed ConvBLS method is unprecedentedly efficient and effective. Finally, corresponding incremental learning algorithms are presented for rapid remodeling when the model needs to be expanded. Experiments and comparisons demonstrate the superiority of our method.
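
The efficiency claim rests on the BLS-style output layer, whose weights are obtained by a regularized pseudo-inverse rather than backpropagation. Below is a minimal sketch of that closed-form step; the feature matrix stands in for the fused CF/CE/TSMS features, and the regularization value is an assumption.

```python
# Sketch of the BLS-style closed-form output layer (ridge regression).
# The feature matrix A stands in for the fused CF/CE/TSMS features.
import numpy as np

def ridge_output_weights(A, Y, lam=1e-3):
    # W = (A^T A + lam * I)^{-1} A^T Y  -- solved directly, no backprop.
    d = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ Y)

A = np.random.randn(1000, 256)                   # stand-in fused features
Y = np.eye(10)[np.random.randint(0, 10, 1000)]   # one-hot labels, 10 classes
W = ridge_output_weights(A, Y)
pred = (A @ W).argmax(axis=1)                    # class predictions
```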

Journal ArticleDOI
TL;DR: Zhang et al., as discussed by the authors, proposed semantic embedding for image transformers (SEiT) to explore semantic features of facial morphology in the AU detection task; the SEiT can learn morphological features intrinsically from the face image.
Abstract: This article proposes semantic embedding for image transformers (SEiT) to explore semantic features of facial morphology in the action unit (AU) detection task. Conventional approaches typically rely on external information (e.g., facial landmarks) to obtain the locations of facial components, whereas the SEiT learns morphological features intrinsically from the face image. The pre-training task, semantic masked facial image modeling (SMFIM), is designed to actively acquire facial morphological information: pixels of the input facial image are randomly erased with semantic masks (e.g., nose, eyes, eyebrows, mouth, and lip), and the embedding model predicts the presence of facial components in the input image, learning semantic representations of the face at the same time. The learned semantic embeddings are fed to transformer blocks, which enable global interaction between semantic elements. The SEiT thus integrates facial morphological information with global interaction characteristics, making it well suited for AU detection. Experiments are conducted on the Binghamton-Pittsburgh 4D (BP4D) dataset and the Denver intensity of spontaneous facial action (DISFA) dataset, and the results demonstrate the effectiveness of the proposed SEiT.
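
A rough sketch of the SMFIM pre-training objective as described: erase one semantic component region and train the model to predict which components remain visible. The region boxes, backbone, and image size below are hypothetical stand-ins, not the paper's setup.

```python
# Sketch of the SMFIM objective: mask one semantic component region and
# predict which components remain visible. Boxes/backbone/size are assumed.
import torch
import torch.nn as nn

COMPONENTS = ["nose", "eyes", "eyebrows", "mouth", "lip"]

def semantic_mask(img, region_boxes):
    # Zero out one randomly chosen component; the multi-label target marks
    # which components are still present in the masked image.
    present = torch.ones(len(COMPONENTS))
    i = torch.randint(len(COMPONENTS), (1,)).item()
    y0, y1, x0, x1 = region_boxes[COMPONENTS[i]]
    img = img.clone()
    img[:, y0:y1, x0:x1] = 0.0
    present[i] = 0.0
    return img, present

encoder = nn.Sequential(  # placeholder backbone; the paper uses transformers
    nn.Flatten(), nn.Linear(3 * 64 * 64, 128), nn.ReLU(),
    nn.Linear(128, len(COMPONENTS)))
boxes = {"nose": (28, 40, 26, 38), "eyes": (18, 26, 12, 52),  # assumed boxes
         "eyebrows": (12, 18, 12, 52), "mouth": (44, 56, 20, 44),
         "lip": (48, 54, 22, 42)}
img = torch.rand(3, 64, 64)
masked, target = semantic_mask(img, boxes)
loss = nn.BCEWithLogitsLoss()(encoder(masked.unsqueeze(0)), target.unsqueeze(0))
```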

Journal ArticleDOI
TL;DR: Zhang et al., as mentioned in this paper, proposed a broad generative network (BG-Net) for two-stage image outpainting, in which the reconstruction network is trained by ridge regression optimization and a seam line discriminator is designed for transition smoothing.
Abstract: Image outpainting is a challenge for image processing since it must produce a large scene image from a few image patches. In general, two-stage frameworks are used to unpack complex tasks and complete them step by step. However, the time cost of training two networks hinders such methods from adequately optimizing the network parameters within a limited number of iterations. In this article, a broad generative network (BG-Net) for two-stage image outpainting is proposed. As the reconstruction network in the first stage, it can be trained quickly using ridge regression optimization. In the second stage, a seam line discriminator (SLD) is designed for transition smoothing, which greatly improves image quality. Compared with state-of-the-art image outpainting methods, experimental results on the Wiki-Art and Places365 datasets show that the proposed method achieves the best results under the evaluation metrics: the Fréchet inception distance (FID) and the kernel inception distance (KID). The proposed BG-Net has good reconstructive ability and trains faster than deep learning-based networks, reducing the overall training duration of the two-stage framework to the same level as a one-stage framework. Furthermore, the proposed method is adapted to recurrent image outpainting, demonstrating the model's powerful associative drawing capability.
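
The seam line discriminator idea can be sketched as a discriminator that scores only a strip straddling the boundary between the given patch and the outpainted region. The strip width and network below are assumptions, not the published SLD.

```python
# Sketch of a seam-line discriminator: score only a strip straddling the
# boundary between the given and outpainted regions. Widths/depths assumed.
import torch
import torch.nn as nn

class SeamLineDiscriminator(nn.Module):
    def __init__(self, strip_w: int = 16):
        super().__init__()
        self.strip_w = strip_w
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(32, 1, 4, stride=2, padding=1),
        )

    def forward(self, img, seam_x: int):
        # Crop a vertical strip centered on the seam column and judge only
        # that transition region, not the whole image.
        strip = img[:, :, :, seam_x - self.strip_w : seam_x + self.strip_w]
        return self.net(strip)

sld = SeamLineDiscriminator()
fake = torch.rand(2, 3, 128, 256)  # left half given, right half outpainted
score = sld(fake, seam_x=128)      # realism of the transition only
```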