
Showing papers by "Sheng Tang published in 2022"


Journal ArticleDOI
TL;DR: In this paper, the authors propose a novel early exiting strategy for unified visual language models, named MuE, which dynamically skips layers in both the encoder and decoder based on layer-wise input similarities, allowing multiple early exits.
Abstract: Large-scale Transformer models bring significant improvements to various downstream vision-language tasks with a unified architecture. The performance improvements come with increasing model size, resulting in slow inference speed and increased serving cost. While certain predictions benefit from the full complexity of the large-scale model, not all inputs need the same amount of computation, potentially wasting computational resources. To handle this challenge, early exiting adaptively allocates computational power according to input complexity to improve inference efficiency. Existing early exiting strategies usually adopt output confidence at intermediate layers as a proxy for input complexity when deciding to skip the following layers. However, such strategies cannot be applied to the encoder in the widely used unified architecture with both encoder and decoder, because output confidence is difficult to estimate in the encoder. Ignoring early exiting in the encoder component is suboptimal in terms of saving computational power. To handle this challenge, we propose a novel early exiting strategy for unified visual language models, named MuE, which dynamically skips layers in the encoder and decoder simultaneously based on layer-wise input similarities, with multiple opportunities to exit early. By decomposing the image and text modalities in the encoder, MuE is flexible and can skip different layers per modality, improving inference efficiency while minimizing the performance drop. Experiments on the SNLI-VE and MS COCO datasets show that the proposed approach MuE can reduce expected inference time by up to 50% and 40% while maintaining 99% and 96% of performance, respectively.
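The abstract describes exiting once consecutive layer outputs stop changing much. As a rough illustration only (the paper's actual decision rule, pooling, and threshold are not specified here, so the cosine-similarity criterion and names below are assumptions), a saturation-based early exit over a stack of Transformer layers might look like this in PyTorch:

    import torch.nn.functional as F

    def forward_with_early_exit(layers, hidden, similarity_threshold=0.95):
        # Run a stack of layers, exiting once consecutive layer outputs
        # become nearly identical (layer-wise similarity as the proxy
        # for input complexity); the threshold value is illustrative.
        prev = hidden
        for i, layer in enumerate(layers):
            out = layer(prev)
            # Cosine similarity between pooled token representations of
            # consecutive layers serves as the saturation signal.
            sim = F.cosine_similarity(out.mean(dim=1), prev.mean(dim=1), dim=-1).mean()
            if sim > similarity_threshold:
                return out, i + 1  # exit here, skipping the remaining layers
            prev = out
        return prev, len(layers)

The same rule can be applied separately to the image and text streams of the encoder, which is what lets the two modalities exit at different depths.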

3 citations


Journal ArticleDOI
TL;DR: Novel, more sample-efficient methods that combine global and local exploration, exploring the state-action space more efficiently with much lower computational complexity and memory demand.
Abstract: Learning dynamical systems with data-driven methods usually requires a large amount of training data, which may be time consuming and expensive. Active learning, which aims to choose the most informative samples to make learning more efficient, is a promising way to address this issue. However, actively learning dynamical systems is difficult, since the state-action space cannot be sampled arbitrarily under the constraint of the system dynamics. The state-of-the-art methods for actively learning dynamical systems either iteratively search for an informative state-action pair by maximizing the differential entropy of the predictive distribution, or iteratively search for a long informative trajectory by maximizing the sum of predictive variances along the trajectory. These methods suffer from low efficiency or from high computational complexity and memory demand. To solve these problems, this paper proposes novel, more sample-efficient methods which combine global and local exploration. In the global exploration, the agent searches for a relatively short informative trajectory in the whole state-action space of the dynamical system. Then, in the local exploration, an action sequence is optimized to drive the system's state towards the initial state of the informative trajectory found by the global exploration, and the agent explores this local informative trajectory. Compared to the state-of-the-art methods, the proposed methods explore the state-action space more efficiently and have much lower computational complexity and memory demand. With the state-of-the-art methods as baselines, the advantages of the proposed methods are verified via various numerical examples.
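As a minimal sketch of the kind of variance-driven search the abstract refers to (the concrete acquisition rule and GP interface are assumptions here, not the authors' implementation), one greedy global-exploration step could rank candidate actions by the total predictive variance of the learned dynamics model:

    import numpy as np

    def select_informative_action(gp_model, state, candidate_actions):
        # Greedy step: pick the action whose predicted next state has the
        # largest total predictive variance under the current GP model
        # (gp_model.predict follows the scikit-learn GPR interface).
        best_action, best_var = None, -np.inf
        for action in candidate_actions:
            x = np.concatenate([state, action])[None, :]
            _, std = gp_model.predict(x, return_std=True)
            total_var = float(np.sum(std ** 2))
            if total_var > best_var:
                best_action, best_var = action, total_var
        return best_action

Chaining such steps yields a short informative trajectory; the paper's local exploration then steers the system toward that trajectory's initial state before exploring it.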

1 citation


Journal ArticleDOI
TL;DR: In this paper, the authors used the Back-n white neutron source at the China Spallation Neutron Source (CSNS) to demonstrate the NIR technique for the identification and imaging of nuclides with cross-section resonances ranging from eV to several MeV.
Abstract: As an advanced neutron imaging technique, nuclide identification radiography (NIR) using neutron resonances or cross-section differences is a promising technique for investigating the spatial distribution of nuclides inside samples. Since its proposal in the 1980s, this technique has advanced very slowly due to limitations in neutron sources, detector efficiency and resolution. At the Back-n white neutron source at the China Spallation Neutron Source, the NIR technique based on a gated CMOS camera has been developed by taking advantage of the very high neutron flux and suitable energy spectrum of the Back-n neutron beam. Heavy, medium-mass and light nuclides have been tested to validate the effectiveness of the technique, covering resonances from the eV to the MeV region. Typical heavy elements such as Ag, In, Au and W, medium-mass elements such as Al, Fe and Cu, and light elements such as O have been used in the experiments. The study shows that nuclides with resonance peaks in the eV region can be easily identified, while nuclides with resonance peaks above keV are difficult to identify individually because their narrow resonance peaks degrade the signal-to-noise ratio in the camera measurements. However, these nuclides can be differentiated for imaging by using multiple adjacent resonance peaks or distinct cross-section differences over wider energy ranges. The experiments are perhaps the first to demonstrate the NIR technique for the identification and imaging of nuclides with cross-section resonances ranging from eV to several MeV.
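To make the underlying idea concrete (this is an illustrative transmission-ratio calculation, not the authors' processing pipeline; the gate choices and names are assumptions), a resonant nuclide can be flagged by comparing neutron transmission in an energy gate centred on a resonance with a nearby off-resonance gate, since transmission follows T = exp(-n·σ(E)·d):

    import numpy as np

    def resonance_contrast(img_on, img_off, open_beam_on, open_beam_off):
        # Transmission images in two time-of-flight (energy) gates:
        # one centred on a resonance, one just off it. Pixels containing
        # the resonant nuclide show a strong extra dip at the resonance.
        t_on = img_on / open_beam_on
        t_off = img_off / open_beam_off
        return t_off / np.maximum(t_on, 1e-6)  # ratio > 1 where the nuclide is present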

Proceedings ArticleDOI
11 Mar 2022
TL;DR: A novel multiple-output Gaussian process model for learning dynamical systems, in which ordinary differential equations (ODEs) with unknown physical parameters are available as prior information.
Abstract: The Gaussian process (GP) has been widely used to learn dynamical systems from training data. Current methods ignore the dependencies among the multiple dimensions of the system function and model each dimension with an independent single-output GP. This paper proposes a novel multiple-output Gaussian process model to learn the dynamical system. We assume that ordinary differential equations (ODEs) with unknown physical parameters are available as prior information. The system function is modeled by a GP whose mean function is given by a one-step integration of the ODEs. In this way, all dimensions of the system function are correlated because they share the same unknown physical parameters. This correlation allows information to flow between the dimensions of the system function, which in turn leads to an improved model. With existing models as baselines, we show the benefits of the proposed model via simulations.
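A minimal sketch of the shared-parameter mean function described above (the explicit Euler integrator and the example dynamics are assumptions made for illustration, not the paper's exact choices):

    import numpy as np

    def ode_mean(x, theta, dt, f):
        # One-step integration of dx/dt = f(x, theta) used as the GP mean;
        # every output dimension shares the same physical parameters theta,
        # which is what couples the dimensions of the system function.
        return x + dt * f(x, theta)  # explicit Euler step (illustrative)

    # Example dynamics with two shared parameters (a, b).
    def f(x, theta):
        a, b = theta
        return np.array([a * x[0] - b * x[0] * x[1],
                         b * x[0] * x[1] - a * x[1]])

In effect, the GP kernel then only has to model what this prior ODE mean misses in the observed one-step transitions.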


Proceedings ArticleDOI
10 Oct 2022
TL;DR: This paper investigates an interesting problem: finding the host organ of a lesion without actually labeling the organ, and boosts tumor segmentation accuracy by a considerable margin, 3%, even surpassing a model trained with ground truth for both organ and tumor.
Abstract: Voxel-level annotation has always been a burden when training medical image segmentation models. This paper investigates an interesting problem: finding the host organ of a lesion without actually labeling the organ. To remedy the missing annotation, we construct a graph using an off-the-shelf registration algorithm, on which lesion labels over the training set are accumulated to obtain a pseudo organ label for each case. These pseudo labels are used to train a deep network, whose predictions determine the affinity of each lesion on the registration graph. We iteratively update the pseudo labels with the affinity until training converges. Our method is evaluated on the MSD Liver and KiTS datasets. Without seeing any organ annotation, we achieve test Dice scores of 93% for liver and 92% for kidney, and boost the accuracy of tumor segmentation by a considerable margin, 3%, which even surpasses the model trained with ground truth for both organ and tumor.
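A high-level sketch of the iterative loop the abstract describes (all helper names here are hypothetical placeholders standing in for the registration graph, the segmentation network and the affinity update; the real pipeline is not reproduced):

    def train_with_pseudo_organs(cases, registration_graph, train_segmenter,
                                 update_affinity, num_rounds=5):
        # Lesion labels propagated over the registration graph give the
        # initial pseudo organ masks; each round the network's predictions
        # refine the affinities and hence the pseudo labels.
        pseudo_organs = registration_graph.accumulate_lesion_labels(cases)
        model = None
        for _ in range(num_rounds):
            model = train_segmenter(cases, pseudo_organs)
            affinity = update_affinity(model, registration_graph, cases)
            pseudo_organs = registration_graph.propagate(affinity)
        return model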