Author

Haoyu Ma

Other affiliations: Southeast University, Tencent
Bio: Haoyu Ma is an academic researcher from the University of California, Irvine. The author has contributed to research in the topics of Computer Science and Pose, has an h-index of 6, and has co-authored 20 publications receiving 83 citations. Previous affiliations of Haoyu Ma include Southeast University and Tencent.

Papers
Proceedings ArticleDOI
Fan Xu, Haoyu Ma, Junxiao Sun, Rui Wu, Xu Liu, Youyong Kong
05 Jul 2019
TL;DR: This work proposes LSTM multi-modal UNet, an architecture for brain tumor segmentation in multi-modal magnetic resonance images (MRI), and shows that it outperforms state-of-the-art biomedical segmentation approaches.
Abstract: Deep learning models such as convolutional neural networks have been widely used in 3D biomedical image segmentation. However, most of them neither consider the correlations between different modalities nor fully exploit depth information. To better leverage multi-modality and depth information, we propose an architecture for brain tumor segmentation in multi-modal magnetic resonance images (MRI), named LSTM multi-modal UNet. Experimental results on BRATS-2015 show that our method outperforms state-of-the-art biomedical segmentation approaches.
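The abstract describes extracting convolutional features per modality and propagating information along the depth (slice) axis with an LSTM. Below is a minimal, hedged sketch of that general idea in PyTorch; the module names, feature sizes, and the simple concatenation-based fusion are illustrative assumptions, not the paper's actual architecture.

```python
# Sketch only: per-modality conv features fused slice-by-slice with a ConvLSTM.
import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    """Convolutional LSTM cell operating on 2D feature maps."""
    def __init__(self, in_ch, hid_ch):
        super().__init__()
        self.hid_ch = hid_ch
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, kernel_size=3, padding=1)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        i, f, o, g = torch.sigmoid(i), torch.sigmoid(f), torch.sigmoid(o), torch.tanh(g)
        c = f * c + i * g
        h = o * torch.tanh(c)
        return h, c

class MultiModalSliceSegmenter(nn.Module):
    """Encode each modality separately, concatenate, run a ConvLSTM across slices."""
    def __init__(self, n_modalities=4, feat=16, n_classes=2):
        super().__init__()
        self.encoders = nn.ModuleList(
            [nn.Sequential(nn.Conv2d(1, feat, 3, padding=1), nn.ReLU())
             for _ in range(n_modalities)]
        )
        self.lstm = ConvLSTMCell(feat * n_modalities, feat)
        self.head = nn.Conv2d(feat, n_classes, 1)

    def forward(self, volume):                  # volume: (B, M, D, H, W)
        b, m, d, h, w = volume.shape
        hid = volume.new_zeros(b, self.lstm.hid_ch, h, w)
        cell = torch.zeros_like(hid)
        outputs = []
        for z in range(d):                      # iterate over the depth axis
            feats = [enc(volume[:, i, z:z + 1]) for i, enc in enumerate(self.encoders)]
            hid, cell = self.lstm(torch.cat(feats, dim=1), (hid, cell))
            outputs.append(self.head(hid))
        return torch.stack(outputs, dim=2)      # (B, n_classes, D, H, W)

# Example: four MRI modalities (e.g. T1, T1c, T2, FLAIR), 8 slices of 64x64.
seg = MultiModalSliceSegmenter()(torch.randn(1, 4, 8, 64, 64))
```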

47 citations

Proceedings ArticleDOI
01 Mar 2020
TL;DR: In this paper, a nonparametric structure regularization machine (NSRM) is proposed to learn hand structure and keypoint representations jointly, guided by synthetic hand mask representations, and further strengthened by a novel probabilistic representation of hand limbs and an anatomically inspired composition strategy of mask synthesis.
Abstract: Hand pose estimation is more challenging than body pose estimation due to severe articulation, self-occlusion and high dexterity of the hand. Current approaches often rely on a popular body pose algorithm, such as the Convolutional Pose Machine (CPM), to learn 2D keypoint features. These algorithms cannot adequately address the unique challenges of hand pose estimation, because they are trained solely on keypoint positions without seeking to explicitly model the structural relationships between them. We propose a novel Nonparametric Structure Regularization Machine (NSRM) for 2D hand pose estimation, adopting a cascade multi-task architecture to learn hand structure and keypoint representations jointly. The structure learning is guided by synthetic hand mask representations, which are computed directly from keypoint positions, and is further strengthened by a novel probabilistic representation of hand limbs and an anatomically inspired composition strategy for mask synthesis. We conduct extensive studies on two public datasets, OneHand10K and CMU Panoptic Hand. Experimental results demonstrate that explicitly enforcing structure learning consistently improves the pose estimation accuracy of CPM baseline models, by 1.17% on the first dataset and 4.01% on the second. The implementation and experiment code is freely available online. Our proposal of incorporating structural learning into hand pose estimation requires no additional training information, and can be a generic add-on module to other pose estimation models.
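The key supervision signal described above is a hand mask synthesized directly from keypoint positions. As a hedged illustration (my own minimal sketch, assuming a Gaussian fall-off around each bone rather than the paper's exact formulation), here is how a soft "limb mask" between two keypoints could be rendered:

```python
# Sketch only: a probabilistic limb map computed from two keypoint positions.
import numpy as np

def limb_mask(p1, p2, height, width, sigma=3.0):
    """Soft mask whose value decays with distance from the segment p1-p2."""
    ys, xs = np.mgrid[0:height, 0:width]
    pts = np.stack([xs, ys], axis=-1).astype(np.float32)       # (H, W, 2), (x, y) per pixel
    p1, p2 = np.asarray(p1, np.float32), np.asarray(p2, np.float32)
    seg = p2 - p1
    seg_len2 = max(float(seg @ seg), 1e-6)
    # Project every pixel onto the segment and clamp to its endpoints.
    t = np.clip(((pts - p1) @ seg) / seg_len2, 0.0, 1.0)
    nearest = p1 + t[..., None] * seg
    dist2 = np.sum((pts - nearest) ** 2, axis=-1)
    return np.exp(-dist2 / (2.0 * sigma ** 2))                  # values in (0, 1]

# Example: a mask for the limb between two (x, y) keypoints on a 64x64 canvas.
mask = limb_mask((10, 12), (40, 50), 64, 64)
```

Masks like this can be composed per limb into a full hand-structure target, which is what allows the structure branch to be supervised without any annotation beyond the keypoints themselves.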

34 citations

Posted Content
TL;DR: In this article, Axial Fusion Transformer UNet (AFTer-UNet) is proposed, which combines the advantage of convolutional layers in extracting detailed features with the strength of transformers in long-sequence modeling.
Abstract: Recent advances in transformer-based models have drawn attention to exploring these techniques in medical image segmentation, especially in conjunction with the U-Net model (or its variants), which has shown great success in medical image segmentation under both 2D and 3D settings. Current 2D-based methods either directly replace convolutional layers with pure transformers or treat a transformer as an additional intermediate encoder between the encoder and decoder of U-Net. However, these approaches only consider attention encoding within a single slice and do not utilize the axial-axis information naturally provided by a 3D volume. In the 3D setting, both convolution on volumetric data and transformers consume large amounts of GPU memory. One has to either downsample the image or use cropped local patches to reduce GPU memory usage, which limits performance. In this paper, we propose Axial Fusion Transformer UNet (AFTer-UNet), which combines the advantage of convolutional layers in extracting detailed features with the strength of transformers in long-sequence modeling. It considers both intra-slice and inter-slice long-range cues to guide the segmentation. Meanwhile, it has fewer parameters and takes less GPU memory to train than previous transformer-based models. Extensive experiments on three multi-organ segmentation datasets demonstrate that our method outperforms current state-of-the-art methods.
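The central mechanism is attending within each slice (intra-slice) and then along the slice axis (inter-slice). A minimal, hedged PyTorch sketch of such axial fusion over encoder features follows; the tensor layout, dimensions, and plain residual attention blocks are my assumptions, not the released AFTer-UNet implementation.

```python
# Sketch only: intra-slice attention followed by inter-slice (axial) attention.
import torch
import torch.nn as nn

class AxialFusion(nn.Module):
    def __init__(self, dim=64, heads=4):
        super().__init__()
        self.intra = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.inter = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, feats):                   # feats: (B, D, N, C), N = H*W tokens per slice
        b, d, n, c = feats.shape
        # Intra-slice attention: every slice attends over its own spatial tokens.
        x = feats.reshape(b * d, n, c)
        x = x + self.intra(x, x, x, need_weights=False)[0]
        # Inter-slice (axial) attention: each spatial location attends across slices.
        x = x.reshape(b, d, n, c).permute(0, 2, 1, 3).reshape(b * n, d, c)
        x = x + self.inter(x, x, x, need_weights=False)[0]
        return x.reshape(b, n, d, c).permute(0, 2, 1, 3)        # back to (B, D, N, C)

# Example: 8 slices, 16x16 spatial tokens per slice, 64 channels.
fused = AxialFusion()(torch.randn(2, 8, 256, 64))
```

Because the attention is factorized per axis, the sequence lengths stay short (N within a slice, D across slices), which is consistent with the abstract's claim of lower GPU memory than attending over a full 3D volume.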

30 citations

Proceedings Article
20 Feb 2022
TL;DR: Two alternatives for sparse adversarial training are introduced: static sparsity, which identifies critical sparse subnetworks arising from early training, and dynamic sparsity, which allows the sparse subnetwork to adaptively adjust its connectivity pattern (while keeping the same sparsity ratio) throughout training.
Abstract: Recent studies demonstrate that deep networks, even robustified by state-of-the-art adversarial training (AT), still suffer from large robust generalization gaps, in addition to much higher training costs than standard training. In this paper, we investigate this intriguing problem from a new perspective, i.e., injecting appropriate forms of sparsity during adversarial training. We introduce two alternatives for sparse adversarial training: (i) static sparsity, which leverages recent results from the lottery ticket hypothesis to identify critical sparse subnetworks arising from early training; (ii) dynamic sparsity, which allows the sparse subnetwork to adaptively adjust its connectivity pattern (while keeping the same sparsity ratio) throughout training. We find that both static and dynamic sparse methods yield a win-win: substantially shrinking the robust generalization gap and alleviating robust overfitting, while significantly saving training and inference FLOPs. Extensive experiments validate our proposals with multiple network architectures on diverse datasets, including CIFAR-10/100 and Tiny-ImageNet. For example, our methods reduce the robust generalization gap and overfitting by 34.44% and 4.02%, with comparable robust/standard accuracy boosts and 87.83%/87.82% training/inference FLOPs savings on CIFAR-100 with ResNet-18. Besides, our approaches can be organically combined with existing regularizers, establishing new state-of-the-art results in AT. Code is available at https://github.com/VITA-Group/Sparsity-Win-Robust-Generalization.
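To make the "dynamic sparsity" alternative concrete, here is a minimal, hedged sketch of the general dynamic sparse training recipe: keep a fixed fraction of weights active, and periodically drop the smallest-magnitude active weights while regrowing the same number elsewhere. The random regrowth rule and all thresholds below are assumptions for illustration, not the repository's exact procedure.

```python
# Sketch only: mask construction and a prune-and-regrow connectivity update.
import torch

def magnitude_mask(weight, sparsity):
    """Boolean mask keeping the (1 - sparsity) fraction of largest-magnitude weights."""
    k = int(weight.numel() * (1.0 - sparsity))
    threshold = weight.abs().flatten().kthvalue(weight.numel() - k + 1).values
    return weight.abs() >= threshold

def prune_and_regrow(weight, mask, drop_fraction=0.3):
    """Drop a fraction of active weights, regrow the same number at new positions."""
    active = int(mask.sum())
    n_drop = int(active * drop_fraction)
    # Drop: deactivate the n_drop smallest-magnitude weights among the active ones.
    magnitudes = torch.where(mask, weight.abs(), torch.full_like(weight, float("inf")))
    drop_idx = magnitudes.flatten().argsort()[:n_drop]
    new_mask = mask.clone().flatten()
    new_mask[drop_idx] = False
    # Regrow: re-activate n_drop inactive positions (same overall sparsity ratio).
    inactive_idx = (~new_mask).nonzero(as_tuple=True)[0]
    grow_idx = inactive_idx[torch.randperm(inactive_idx.numel())[:n_drop]]
    new_mask[grow_idx] = True
    return new_mask.reshape(mask.shape)

# Example: a 90%-sparse layer whose connectivity is updated every few epochs.
w = torch.randn(256, 256)
mask = magnitude_mask(w, sparsity=0.9)
mask = prune_and_regrow(w * mask, mask)
```

The static alternative would simply freeze the initial mask for the whole adversarial training run instead of calling the update step.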

26 citations

Proceedings Article
01 Jan 2019
TL;DR: A new architecture called Adaptive Graphical Model Network (AGMN) is proposed to tackle the task of 2D hand pose estimation from a monocular RGB image; it outperforms the state-of-the-art method for 2D hand keypoint estimation by a notable margin on two public datasets.
Abstract: In this paper, we propose a new architecture called Adaptive Graphical Model Network (AGMN) to tackle the task of 2D hand pose estimation from a monocular RGB image. The AGMN consists of two branches of deep convolutional neural networks for calculating unary and pairwise potential functions, followed by a graphical model inference module for integrating unary and pairwise potentials. Unlike existing architectures proposed to combine DCNNs with graphical models, our AGMN is novel in that the parameters of its graphical model are conditioned on and fully adaptive to individual input images. Experiments show that our approach outperforms the state-of-the-art method for 2D hand keypoint estimation by a notable margin on two public datasets.
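To illustrate what the graphical model inference module does with the two branches' outputs, here is a hedged toy sketch of sum-product message passing along a chain of keypoints, where unary scores and pairwise compatibilities are taken as given. The flattened-heatmap representation and the chain topology are simplifying assumptions, not the AGMN paper's exact inference scheme.

```python
# Sketch only: belief propagation combining unary and pairwise potentials on a chain.
import torch

def chain_sum_product(unary, pairwise):
    """
    unary:    (K, N)      per-keypoint scores over N discretized positions
    pairwise: (K-1, N, N) per-image compatibility between neighboring keypoints
    Returns normalized marginals of shape (K, N).
    """
    K, N = unary.shape
    # Forward messages: parent -> child along the chain.
    fwd = [torch.ones(N)]
    for k in range(1, K):
        fwd.append(pairwise[k - 1].T @ (unary[k - 1] * fwd[k - 1]))
    # Backward messages: child -> parent.
    bwd = [torch.ones(N) for _ in range(K)]
    for k in range(K - 2, -1, -1):
        bwd[k] = pairwise[k] @ (unary[k + 1] * bwd[k + 1])
    marginals = torch.stack([unary[k] * fwd[k] * bwd[k] for k in range(K)])
    return marginals / marginals.sum(dim=1, keepdim=True)

# Example: 5 keypoints on a chain, 16x16 heatmaps flattened to 256 positions.
beliefs = chain_sum_product(torch.rand(5, 256), torch.rand(4, 256, 256))
```

The point of the paper's "adaptive" design is that the pairwise tensor above is predicted per input image by a second CNN branch rather than being a fixed learned table.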

16 citations


Cited by
Journal ArticleDOI
TL;DR: A novel model based on a 3D fully convolutional network is proposed that applies a multi-pathway architecture to effectively extract features from multi-modal MRI images.

70 citations

Posted Content
TL;DR: This work proposes to complete existing databases by generating new database entries, synthesizing data in the skeleton space (instead of the depth-map space), which enables an easy and intuitive way of manipulating data entries.
Abstract: Crucial to the success of training a depth-based 3D hand pose estimator (HPE) is the availability of comprehensive datasets covering diverse camera perspectives, shapes, and pose variations. However, collecting such annotated datasets is challenging. We propose to complete existing databases by generating new database entries. The key idea is to synthesize data in the skeleton space (instead of doing so in the depth-map space), which enables an easy and intuitive way of manipulating data entries. Since the skeleton entries generated in this way do not have corresponding depth map entries, we exploit them by training a separate hand pose generator (HPG) which synthesizes the depth map from the skeleton entries. By training the HPG and HPE in a single unified optimization framework enforcing that 1) the HPE agrees with the paired depth and skeleton entries; and 2) the HPG-HPE combination satisfies the cyclic consistency (both the input and the output of HPG-HPE are skeletons) observed via the newly generated unpaired skeletons, our algorithm constructs an HPE which is robust to variations that go beyond the coverage of the existing database. Our training algorithm adopts the generative adversarial networks (GAN) training process. As a by-product, we obtain a hand pose discriminator (HPD) that is capable of picking out realistic hand poses. Our algorithm exploits this capability to refine the initial skeleton estimates in testing, further improving the accuracy. We test our algorithm on four challenging benchmark datasets (ICVL, MSRA, NYU and BigHand2.2M) and demonstrate that our approach outperforms or is on par with state-of-the-art methods quantitatively and qualitatively.
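The core training constraint described above is the skeleton -> depth -> skeleton cycle on newly synthesized skeletons, combined with ordinary supervision on paired data. Below is a minimal, hedged sketch of just that loss structure; the tiny MLP stand-ins for HPG/HPE, the MSE losses, and the omission of the adversarial (GAN) term are deliberate simplifications, not the authors' implementation.

```python
# Sketch only: paired supervision plus cyclic consistency on unpaired skeletons.
import torch
import torch.nn as nn

skeleton_dim, depth_pixels = 21 * 3, 64 * 64
hpg = nn.Sequential(nn.Linear(skeleton_dim, 256), nn.ReLU(), nn.Linear(256, depth_pixels))
hpe = nn.Sequential(nn.Linear(depth_pixels, 256), nn.ReLU(), nn.Linear(256, skeleton_dim))
opt = torch.optim.Adam(list(hpg.parameters()) + list(hpe.parameters()), lr=1e-4)

paired_depth, paired_skel = torch.randn(8, depth_pixels), torch.randn(8, skeleton_dim)
synth_skel = torch.randn(8, skeleton_dim)        # new entries generated in skeleton space

# 1) The HPE must agree with paired depth/skeleton entries.
loss_pair = nn.functional.mse_loss(hpe(paired_depth), paired_skel)
# 2) The HPG-HPE composition must be cyclically consistent on unpaired synthetic skeletons.
loss_cycle = nn.functional.mse_loss(hpe(hpg(synth_skel)), synth_skel)
(loss_pair + loss_cycle).backward()
opt.step()
```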

56 citations

Journal ArticleDOI
TL;DR: In this article, the authors focus on linking risk-of-bias (RoB) to different AI-based architectures in the DL framework, and present a set of three primary and six secondary recommendations for lowering the RoB.

38 citations

Journal ArticleDOI
TL;DR: This paper provides a comprehensive survey of three major brain tumor segmentation and classification techniques, namely region growing, shallow machine learning, and deep learning.
Abstract: A brain magnetic resonance imaging (MRI) scan of a single individual consists of several slices across the 3D anatomical view. Therefore, manual segmentation of brain tumors from magnetic resonance (MR) images is a challenging and time-consuming task. In addition, automated brain tumor classification from an MRI scan is non-invasive, so it avoids biopsy and makes the diagnosis process safer. Since the late nineties and the beginning of this millennium, the research community's effort to come up with automatic brain tumor segmentation and classification methods has been tremendous. As a result, there is ample literature in the area focusing on segmentation using region growing, traditional machine learning, and deep learning methods. Similarly, a number of works have addressed brain tumor classification into the respective histological types, and impressive performance results have been obtained. Considering state-of-the-art methods and their performance, the purpose of this paper is to provide a comprehensive survey of three recently proposed, major brain tumor segmentation and classification techniques, namely region growing, shallow machine learning, and deep learning. The works included in this survey also cover technical aspects such as the strengths and weaknesses of different approaches, pre- and post-processing techniques, feature extraction, datasets, and models’ performance evaluation metrics.

37 citations

Proceedings Article
28 Feb 2022
TL;DR: This work proposes a principled approach for robust training of over-parameterized deep networks in classification tasks where a proportion of training labels are corrupted, and demonstrates state-of-the-art test accuracy against label noise on a variety of real datasets.
Abstract: Recently, over-parameterized deep networks, with increasingly more network parameters than training samples, have dominated the performances of modern machine learning. However, when the training data is corrupted, it is well known that over-parameterized networks tend to overfit and do not generalize. In this work, we propose a principled approach for robust training of over-parameterized deep networks in classification tasks where a proportion of training labels are corrupted. The main idea is simple: label noise is sparse and incoherent with the network learned from clean data, so we model the noise and learn to separate it from the data. Specifically, we model the label noise via another sparse over-parameterization term, and exploit implicit algorithmic regularizations to recover and separate the underlying corruptions. Remarkably, when trained using such a simple method in practice, we demonstrate state-of-the-art test accuracy against label noise on a variety of real datasets. Furthermore, our experimental results are corroborated by theory on simplified linear models, showing that exact separation between sparse noise and low-rank data can be achieved under incoherent conditions. The work opens many interesting directions for improving over-parameterized models by using sparse over-parameterization and implicit regularization.
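As a hedged reconstruction of the "sparse over-parameterization term" idea from the abstract alone (not the paper's exact code), one way to realize it is to give each training sample extra parameters whose elementwise difference of squares acts as a per-sample noise estimate added to the network output; the multiplicative parameterization implicitly biases that estimate toward sparsity. All names, dimensions, and the MSE loss below are assumptions for illustration.

```python
# Sketch only: per-sample sparse noise parameters trained jointly with the network.
import torch
import torch.nn as nn

n_samples, n_classes, feat_dim = 1000, 10, 32
net = nn.Linear(feat_dim, n_classes)                 # stand-in for a deep network
u = nn.Parameter(1e-3 * torch.randn(n_samples, n_classes))
v = nn.Parameter(1e-3 * torch.randn(n_samples, n_classes))
opt = torch.optim.SGD(list(net.parameters()) + [u, v], lr=0.1)

def training_step(x, one_hot_labels, idx):
    noise = u[idx] * u[idx] - v[idx] * v[idx]        # per-sample sparse noise estimate
    pred = torch.softmax(net(x), dim=1) + noise      # noise term explains the corrupted label
    loss = nn.functional.mse_loss(pred, one_hot_labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Example step on a random mini-batch of 16 samples with one-hot (possibly noisy) labels.
idx = torch.randint(0, n_samples, (16,))
labels = torch.eye(n_classes)[torch.randint(0, n_classes, (16,))]
training_step(torch.randn(16, feat_dim), labels, idx)
```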

34 citations