
Showing papers by "Sheng Tang published in 2022"


Journal ArticleDOI
TL;DR: In this paper, the authors propose a novel early exiting strategy for unified visual language models, named MuE, which dynamically skips layers in both the encoder and decoder based on layer-wise input similarities, allowing multiple early exits.
Abstract: Large-scale Transformer models bring significant improvements to various downstream vision-language tasks with a unified architecture. The performance improvements come with increasing model size, resulting in slow inference speed and increased serving cost. While certain predictions benefit from the full complexity of the large-scale model, not all inputs need the same amount of computation, potentially wasting computational resources. To handle this challenge, early exiting adaptively allocates computational power according to input complexity to improve inference efficiency. Existing early exiting strategies usually adopt output confidence at intermediate layers as a proxy for input complexity when deciding to skip the following layers. However, such strategies cannot be applied to the encoder in the widely used unified architecture with both encoder and decoder, because output confidence is difficult to estimate in the encoder. Ignoring early exiting in the encoder component is suboptimal in terms of saving computational power. To handle this challenge, we propose a novel early exiting strategy for unified visual language models, named MuE, which dynamically skips layers in the encoder and decoder simultaneously based on layer-wise input similarities, with multiple opportunities to exit early. By decomposing the image and text modalities in the encoder, MuE is flexible and can skip different layers per modality, improving inference efficiency while minimizing the performance drop. Experiments on the SNLI-VE and MS COCO datasets show that the proposed approach MuE can reduce expected inference time by up to 50% and 40% while maintaining 99% and 96% of performance, respectively.
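The abstract describes exiting once consecutive layer outputs stop changing much. As a rough illustration only (the paper's actual decision rule, pooling, and threshold are not specified here, so the cosine-similarity criterion and names below are assumptions), a saturation-based early exit over a stack of Transformer layers might look like this in PyTorch:

    import torch.nn.functional as F

    def forward_with_early_exit(layers, hidden, similarity_threshold=0.95):
        # Run a stack of layers, exiting once consecutive layer outputs
        # become nearly identical (layer-wise similarity as the proxy
        # for input complexity); the threshold value is illustrative.
        prev = hidden
        for i, layer in enumerate(layers):
            out = layer(prev)
            # Cosine similarity between pooled token representations of
            # consecutive layers serves as the saturation signal.
            sim = F.cosine_similarity(out.mean(dim=1), prev.mean(dim=1), dim=-1).mean()
            if sim > similarity_threshold:
                return out, i + 1  # exit here, skipping the remaining layers
            prev = out
        return prev, len(layers)

The same rule can be applied separately to the image and text streams of the encoder, which is what lets the two modalities exit at different depths.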

3 citations


Journal ArticleDOI
TL;DR: Novel, more sample-efficient methods that combine global and local exploration, exploring the state-action space more efficiently with much lower computational complexity and memory demand.
Abstract: Learning dynamical systems with data-driven methods usually requires a large amount of training data, which may be time consuming and expensive. Active learning, which aims to choose the most informative samples to make learning more efficient, is a promising way to address this issue. However, actively learning dynamical systems is difficult, since the state-action space cannot be sampled arbitrarily under the constraint of the system dynamics. The state-of-the-art methods for actively learning dynamical systems either iteratively search for an informative state-action pair by maximizing the differential entropy of the predictive distribution, or iteratively search for a long informative trajectory by maximizing the sum of predictive variances along the trajectory. These methods suffer from low efficiency or from high computational complexity and memory demand. To solve these problems, this paper proposes novel, more sample-efficient methods which combine global and local exploration. In the global exploration, the agent searches for a relatively short informative trajectory in the whole state-action space of the dynamical system. Then, in the local exploration, an action sequence is optimized to drive the system's state towards the initial state of the informative trajectory found by the global exploration, and the agent explores this local informative trajectory. Compared to the state-of-the-art methods, the proposed methods explore the state-action space more efficiently and have much lower computational complexity and memory demand. With the state-of-the-art methods as baselines, the advantages of the proposed methods are verified via various numerical examples.
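As a minimal sketch of the kind of variance-driven search the abstract refers to (the concrete acquisition rule and GP interface are assumptions here, not the authors' implementation), one greedy global-exploration step could rank candidate actions by the total predictive variance of the learned dynamics model:

    import numpy as np

    def select_informative_action(gp_model, state, candidate_actions):
        # Greedy step: pick the action whose predicted next state has the
        # largest total predictive variance under the current GP model
        # (gp_model.predict follows the scikit-learn GPR interface).
        best_action, best_var = None, -np.inf
        for action in candidate_actions:
            x = np.concatenate([state, action])[None, :]
            _, std = gp_model.predict(x, return_std=True)
            total_var = float(np.sum(std ** 2))
            if total_var > best_var:
                best_action, best_var = action, total_var
        return best_action

Chaining such steps yields a short informative trajectory; the paper's local exploration then steers the system toward that trajectory's initial state before exploring it.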

1 citation


Journal ArticleDOI
TL;DR: In this paper, the authors used the Back-n white neutron source at the China Spallation Neutron Source (CSNS) to demonstrate the NIR technique for the identification and imaging of nuclides with cross-section resonances ranging from eV to several MeV.
Abstract: As an advanced neutron imaging technique, nuclide identification radiography (NIR) using neutron resonances or cross-section differences is a promising technique for investigating the spatial distribution of nuclides inside samples. Since its proposal in the 1980s, this technique has advanced very slowly due to limitations in neutron sources, detector efficiency and resolution. At the Back-n white neutron source at the China Spallation Neutron Source, the NIR technique based on a gated CMOS camera has been developed by taking advantage of the very high neutron flux and suitable energy spectrum of the Back-n neutron beam. Heavy, medium-mass and light nuclides have been tested to validate the effectiveness of the technique, covering resonances from the eV to the MeV region. Typical heavy elements such as Ag, In, Au and W, medium-mass elements such as Al, Fe and Cu, and light elements such as O have been used in the experiments. The study shows that nuclides with resonance peaks in the eV region can be easily identified, while nuclides with resonance peaks above keV are difficult to identify individually because their narrow resonance peaks degrade the signal-to-noise ratio in the camera measurements. However, these nuclides can be differentiated for imaging by using multiple adjacent resonance peaks or distinct cross-section differences over wider energy ranges. The experiments are perhaps the first to demonstrate the NIR technique for the identification and imaging of nuclides with cross-section resonances ranging from eV to several MeV.
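To make the underlying idea concrete (this is an illustrative transmission-ratio calculation, not the authors' processing pipeline; the gate choices and names are assumptions), a resonant nuclide can be flagged by comparing neutron transmission in an energy gate centred on a resonance with a nearby off-resonance gate, since transmission follows T = exp(-n·σ(E)·d):

    import numpy as np

    def resonance_contrast(img_on, img_off, open_beam_on, open_beam_off):
        # Transmission images in two time-of-flight (energy) gates:
        # one centred on a resonance, one just off it. Pixels containing
        # the resonant nuclide show a strong extra dip at the resonance.
        t_on = img_on / open_beam_on
        t_off = img_off / open_beam_off
        return t_off / np.maximum(t_on, 1e-6)  # ratio > 1 where the nuclide is present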

Proceedings ArticleDOI
11 Mar 2022
TL;DR: A novel multiple-output Gaussian process model for learning dynamical systems, in which ordinary differential equations (ODEs) with unknown physical parameters are available as prior information.
Abstract: The Gaussian process (GP) has been widely used to learn dynamical systems from training data. Current methods ignore the dependencies among the multiple dimensions of the system function and model each dimension with an independent single-output GP. This paper proposes a novel multiple-output Gaussian process model to learn the dynamical system. We assume that ordinary differential equations (ODEs) with unknown physical parameters are available as prior information. The system function is modeled by a GP whose mean function is given by a one-step integration of the ODEs. In this way, all dimensions of the system function are correlated because they share the same unknown physical parameters. This correlation allows information to flow between the dimensions of the system function, which in turn leads to an improved model. With existing models as baselines, we show the benefits of the proposed model via simulations.
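A minimal sketch of the shared-parameter mean function described above (the explicit Euler integrator and the example dynamics are assumptions made for illustration, not the paper's exact choices):

    import numpy as np

    def ode_mean(x, theta, dt, f):
        # One-step integration of dx/dt = f(x, theta) used as the GP mean;
        # every output dimension shares the same physical parameters theta,
        # which is what couples the dimensions of the system function.
        return x + dt * f(x, theta)  # explicit Euler step (illustrative)

    # Example dynamics with two shared parameters (a, b).
    def f(x, theta):
        a, b = theta
        return np.array([a * x[0] - b * x[0] * x[1],
                         b * x[0] * x[1] - a * x[1]])

In effect, the GP kernel then only has to model what this prior ODE mean misses in the observed one-step transitions.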


Proceedings ArticleDOI
10 Oct 2022
TL;DR: This paper investigates an interesting problem: finding the host organ of a lesion without actually labeling the organ, and boosts tumor segmentation accuracy by a considerable margin, 3%, even surpassing a model trained with ground truth for both organ and tumor.
Abstract: Voxel-level annotation has always been a burden when training medical image segmentation models. This paper investigates an interesting problem: finding the host organ of a lesion without actually labeling the organ. To remedy the missing annotation, we construct a graph using an off-the-shelf registration algorithm, on which lesion labels over the training set are accumulated to obtain a pseudo organ label for each case. These pseudo labels are used to train a deep network, whose predictions determine the affinity of each lesion on the registration graph. We iteratively update the pseudo labels with the affinity until training converges. Our method is evaluated on the MSD Liver and KiTS datasets. Without seeing any organ annotation, we achieve test Dice scores of 93% for liver and 92% for kidney, and boost the accuracy of tumor segmentation by a considerable margin, 3%, which even surpasses the model trained with ground truth for both organ and tumor.
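A high-level sketch of the iterative loop the abstract describes (all helper names here are hypothetical placeholders standing in for the registration graph, the segmentation network and the affinity update; the real pipeline is not reproduced):

    def train_with_pseudo_organs(cases, registration_graph, train_segmenter,
                                 update_affinity, num_rounds=5):
        # Lesion labels propagated over the registration graph give the
        # initial pseudo organ masks; each round the network's predictions
        # refine the affinities and hence the pseudo labels.
        pseudo_organs = registration_graph.accumulate_lesion_labels(cases)
        model = None
        for _ in range(num_rounds):
            model = train_segmenter(cases, pseudo_organs)
            affinity = update_affinity(model, registration_graph, cases)
            pseudo_organs = registration_graph.propagate(affinity)
        return model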