
Showing papers by "Jinli Suo published in 2022"


Journal ArticleDOI
15 Mar 2022-Optica
TL;DR: In this article, point-spread-function (PSF) engineering via deep optics was introduced to encode fine spatial details, and temporal compressive sensing was used to encode fast motions into a low-resolution snapshot; an end-to-end deep neural network was trained on simulation data and fine-tuned to fit the system settings.
Abstract: Capturing both fine structure and high dynamics of the scene is demanded in many applications. However, such high throughput recording requires significant transmission bandwidth and large storage. Off-the-shelf super-resolution and temporal compressive sensing can partially address this challenge, but directly concatenating the two techniques fails to boost throughput because of artifact accumulation and magnification in sequential reconstruction. In this Letter, we propose an encoded capturing approach to simultaneously increase spatial and temporal resolvability with a low-bandwidth camera sensor. Specifically, we introduce point-spread-function (PSF) engineering via deep optics to encode fine spatial details and temporal compressive sensing to encode fast motions into a low-resolution snapshot. Furthermore, we develop an end-to-end deep neural network to optimize PSF and retrieve high-throughput videos from a compactly compressed measurement. Trained on simulation data and fine-tuned to fit system settings, our end-to-end system offers a 128 × data throughput compared to conventional imaging.
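The coded capture described in the abstract can be sketched as a toy forward model, under assumed shapes and random binary masks (illustrative only, not the paper's optics): each high-speed frame is modulated by its own mask, and the modulated frames sum into a single low-bandwidth snapshot.

```python
import numpy as np

# Toy temporal compressive sensing forward model (hypothetical sizes):
# T fast frames are masked per-frame and summed into one snapshot.
rng = np.random.default_rng(0)
T, H, W = 8, 32, 32                     # frames, height, width (assumptions)
video = rng.random((T, H, W))           # fast-changing scene, values in [0, 1)
masks = rng.integers(0, 2, (T, H, W))   # per-frame binary coding masks

snapshot = (masks * video).sum(axis=0)  # single compressed measurement
```

A reconstruction network then inverts this many-to-one mapping; the PSF-engineering half of the paper additionally encodes sub-pixel detail before this summation.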

8 citations


Journal ArticleDOI
06 Sep 2022-PhotoniX
TL;DR: In this paper, a joint framework for snapshot compressive imaging (SCI) and semantic computer vision (SCV) is proposed to bridge the gap between SCI and SCV and take full advantage of both approaches.
Abstract: High-throughput imaging is highly desirable in the intelligent analysis of computer vision tasks. In conventional designs, throughput is limited by the separation between physical image capture and digital post-processing. Computational imaging increases throughput by mixing analog and digital processing through the image-capture pipeline. Yet recent advances in computational imaging focus on “compressive sampling,” which precludes wide application in practical tasks. This paper presents a systematic analysis of the next step for computational imaging, built on snapshot compressive imaging (SCI) and semantic computer vision (SCV), which have independently emerged over the past decade as basic computational imaging platforms. SCI is a physical-layer process that maximizes information capacity per sample while minimizing system size, power, and cost. SCV is an abstraction-layer process that analyzes image data as objects and features rather than simple pixel maps. In current practice, SCI and SCV are independent and sequential. This concatenated pipeline results in the following problems: i) a large amount of resources is spent on task-irrelevant computation and transmission, ii) the sampling and design efficiency of SCI is attenuated, and iii) the final performance of SCV is limited by the reconstruction errors of SCI. Bearing these concerns in mind, this paper takes one step further, aiming to bridge the gap between SCI and SCV to take full advantage of both approaches. After reviewing the current status of SCI, we propose a novel joint framework that conducts SCV on raw measurements captured by SCI to select the region of interest, and then performs reconstruction on these regions to speed up processing. We use our recently built SCI prototype to verify the framework. Preliminary results are presented and the prospects for a joint SCI and SCV regime are discussed.
By conducting computer vision tasks in the compressed domain, we envision that a new era of snapshot compressive imaging with limited end-to-end bandwidth is coming.

7 citations


Journal ArticleDOI
TL;DR: Zhang et al. as discussed by the authors proposed a high-fidelity PSR phase retrieval method with plug-and-play optimization, which decomposes PSR reconstruction into independent sub-problems based on the generalized alternating projection framework.
Abstract: To increase the signal-to-noise ratio in optical imaging, most detectors sacrifice resolution by enlarging pixel size within a confined area, which impedes further development of high-throughput holographic imaging. Although the pixel super-resolution (PSR) technique enables resolution enhancement, it suffers from a trade-off between reconstruction quality and super-resolution ratio. In this work, we report a high-fidelity PSR phase retrieval method with plug-and-play optimization, termed PNP-PSR. It decomposes PSR reconstruction into independent sub-problems based on the generalized alternating projection framework. An alternating projection operator and an enhancing neural network are employed to tackle measurement fidelity and statistical prior regularization, respectively. PNP-PSR incorporates the advantages of the individual operators, achieving both high efficiency and noise robustness. Extensive experiments show that PNP-PSR outperforms existing techniques in both resolution enhancement and noise suppression.
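The alternation the abstract describes, a fidelity projection step interleaved with a plug-in denoiser prior, can be sketched generically. This is an illustrative generalized-alternating-projection loop with a toy identity operator and mean-filter "denoiser", not the paper's PNP-PSR code; `A`/`At` stand in for the forward operator and its adjoint.

```python
import numpy as np

def gap_pnp(y, A, At, denoise, iters=20):
    """Plug-and-play reconstruction in a generalized-alternating-projection
    style (illustrative sketch). A/At: forward operator and adjoint;
    denoise: any off-the-shelf denoiser serving as the prior step."""
    v = y.copy()
    x = At(v)
    for _ in range(iters):
        v = v + (y - A(x))    # projection step: restore measurement fidelity
        x = denoise(At(v))    # prior step: plug in the denoiser
    return x

# toy demo: identity forward model, 3-tap mean filter as the "denoiser"
rng = np.random.default_rng(1)
clean = np.sin(np.linspace(0, np.pi, 64))
y = clean + 0.1 * rng.standard_normal(64)
box = lambda x: (np.roll(x, 1) + x + np.roll(x, -1)) / 3.0
rec = gap_pnp(y, lambda x: x, lambda v: v, box, iters=5)
```

In PNP-PSR the forward operator would additionally include the pixel-binning (downsampling) model, and the denoiser is the enhancing neural network mentioned in the abstract.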

7 citations




Journal ArticleDOI
30 Sep 2022
TL;DR: An adaptive compression approach, SCI, is proposed, which adaptively partitions complex biomedical data into blocks matching INR's concentrated spectrum envelope and designs a funnel-shaped neural network capable of representing each block with a small number of parameters.
Abstract: The massive collection and explosive growth of biomedical data demand effective compression for efficient storage, transmission, and sharing. Readily available visual-data compression techniques have been studied extensively but are tailored for natural images/videos, and thus show limited performance on biomedical data, which have different features and larger diversity. Emerging implicit neural representation (INR) is gaining momentum and shows high promise for fitting diverse visual data in a target-data-specific manner, but a general compression scheme covering diverse biomedical data is so far absent. To address this issue, we first derive a mathematical explanation for INR's spectrum-concentration property and an analytical insight into the design of INR-based compressors. Further, we propose Spectrum Concentrated Implicit neural compression (SCI), which adaptively partitions complex biomedical data into blocks matching INR's concentrated spectrum envelope, and design a funnel-shaped neural network capable of representing each block with a small number of parameters. Based on this design, we conduct compression via optimization under a given budget and allocate the available parameters for high representation accuracy. Experiments show SCI's superior performance over state-of-the-art methods, including commercial compressors, data-driven ones, and INR-based counterparts, on diverse biomedical data. The source code can be found at https://github.com/RichealYoung/ImplicitNeuralCompression.git.
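The "funnel-shaped" idea is easy to picture as a coordinate network whose layer widths shrink toward the output, so each data block costs few parameters. Below is a minimal forward-pass sketch; the widths, SIREN-style sine activation, and initialization are assumptions for illustration, not the paper's architecture.

```python
import numpy as np

# Funnel-shaped coordinate MLP: 3D coordinate in, scalar intensity out,
# with monotonically shrinking hidden widths (all sizes are assumptions).
rng = np.random.default_rng(0)
widths = [3, 64, 32, 16, 1]
params = [(rng.standard_normal((a, b)) / np.sqrt(a), np.zeros(b))
          for a, b in zip(widths[:-1], widths[1:])]

def inr(coords):
    """Map (N, 3) coordinates to (N, 1) intensities."""
    h = coords
    for i, (W, b) in enumerate(params):
        h = h @ W + b
        if i < len(params) - 1:
            h = np.sin(h)          # SIREN-style periodic activation (assumption)
    return h

n_params = sum(W.size + b.size for W, b in params)   # parameter budget per block
out = inr(rng.random((5, 3)))
```

Compression then amounts to fitting one such tiny network per block and storing its weights; the paper's contribution is choosing the block partition and per-block budget to match INR's concentrated spectrum.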

2 citations


Journal ArticleDOI
TL;DR: This paper proposes to integrate snapshot compressive imaging (SCI), a recently proposed computational imaging approach, into DIC for high-speed, high-resolution deformation measurement, and proposes three techniques under the SCI reconstruction framework to secure high-precision reconstruction.
Abstract: The limited throughput of a digital image correlation (DIC) system hampers measuring deformations at both high spatial resolution and high temporal resolution. To address this dilemma, in this paper we propose to integrate snapshot compressive imaging (SCI), a recently proposed computational imaging approach, into DIC for high-speed, high-resolution deformation measurement. Specifically, an SCI-DIC system is established to encode a sequence of fast-changing speckle patterns into a snapshot, and a high-accuracy speckle decompress SCI (Sp-DeSCI) algorithm is proposed for computational reconstruction of the speckle sequence. To adapt SCI reconstruction to the unique characteristics of speckle patterns, we propose three techniques under the SCI reconstruction framework to secure high-precision reconstruction: the normalized sum-squared-difference criterion, a speckle-adaptive patch-search strategy, and adaptive group aggregation. To validate the efficacy of the proposed Sp-DeSCI, we conducted extensive simulated experiments and a four-point bending SCI-DIC experiment on real data. Both simulation and real experiments verify that Sp-DeSCI successfully removes the deviations of reconstructed speckles in DeSCI and provides the highest displacement accuracy among existing algorithms. The SCI-DIC system together with the Sp-DeSCI algorithm can offer temporally super-resolved deformation measurement at full spatial resolution, and can potentially replace conventional high-speed DIC in real measurements.
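The first of the three techniques, the normalized sum-squared-difference criterion, is a patch-matching score. A generic sketch follows (the paper's exact normalization may differ): zero-meaning and unit-norm scaling make the score invariant to the brightness and contrast fluctuations typical of reconstructed speckle patterns.

```python
import numpy as np

def nssd(p, q, eps=1e-8):
    """Normalized sum of squared differences between two patches
    (generic sketch of the criterion, not the paper's exact form)."""
    p = p - p.mean()
    p = p / (np.linalg.norm(p) + eps)   # unit-norm, zero-mean patch
    q = q - q.mean()
    q = q / (np.linalg.norm(q) + eps)
    return float(np.sum((p - q) ** 2))

patch = np.random.default_rng(0).random((8, 8))
```

Because the score ignores affine intensity changes, `nssd(patch, 2 * patch + 3)` is essentially zero while structurally different patches score high, which is what makes the criterion suitable for matching speckles whose reconstructed brightness drifts.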

2 citations


Journal ArticleDOI
TL;DR: In this article, the authors focus on deep learning for high-dimensional sensing and propose new sensing and processing techniques with high performance to capture high-dimensional data by leveraging recent advances in deep learning (DL).
Abstract: The papers in this special section focus on deep learning for high-dimensional sensing. People live in a high-dimensional world and sensing is the first step to perceive and understand the environment for both human beings and machines. Therefore, high-dimensional sensing (HDS) plays a pivotal role in many fields such as robotics, signal processing, computer vision and surveillance. The recent explosive growth of artificial intelligence has provided new opportunities and tools for HDS, especially for machine vision. In many emerging real applications such as advanced driver assistance systems/autonomous driving systems, large-scale, high-dimensional and diverse types of data need to be captured and processed with high accuracy and in a real-time manner. Bearing this in mind, now is the time to develop new sensing and processing techniques with high performance to capture high-dimensional data by leveraging recent advances in deep learning (DL).

1 citation


Journal ArticleDOI
TL;DR: In this article, a Tree-structured Implicit Neural Compression (TINC) is proposed to conduct compact representation for local regions and extract the shared features of these local representations in a hierarchical manner.
Abstract: Implicit neural representation (INR) can describe target scenes with high fidelity using a small number of parameters, and is emerging as a promising data compression technique. However, limited spectrum coverage is intrinsic to INR, and it is non-trivial to effectively remove redundancy in diverse complex data. Preliminary studies can exploit only global or local correlation in the target data and are thus of limited performance. In this paper, we propose a Tree-structured Implicit Neural Compression (TINC) to conduct compact representation for local regions and extract the shared features of these local representations in a hierarchical manner. Specifically, we use multi-layer perceptrons (MLPs) to fit the partitioned local regions, and these MLPs are organized in a tree structure to share parameters according to spatial distance. The parameter-sharing scheme not only ensures continuity between adjacent regions but also jointly removes local and non-local redundancy. Extensive experiments show that TINC improves the compression fidelity of INR and achieves impressive compression capabilities over commercial tools and other deep-learning-based methods. Besides, the approach is highly flexible and can be tailored for different data and parameter settings. The source code can be found at https://github.com/RichealYoung/TINC .

Journal ArticleDOI
TL;DR: A novel non-blind deblurring method dubbed image and feature space Wiener deconvolution network (INFWIDE) that can recover details while suppressing unpleasant artifacts during deblurring, and is optimized for applicability in real low-light conditions.
Abstract: In low-light environments, handheld photography suffers from severe camera shake under long exposure settings. Although existing deblurring algorithms have shown promising performance on well-exposed blurry images, they still cannot cope with low-light snapshots. Sophisticated noise and saturated regions are the two dominating challenges in practical low-light deblurring: the former violates the Gaussian or Poisson assumption widely used in most existing algorithms and thus degrades their performance badly, while the latter introduces non-linearity into the classical convolution-based blurring model and makes the deblurring task even more challenging. In this work, we propose a novel non-blind deblurring method dubbed image and feature space Wiener deconvolution network (INFWIDE) to tackle these problems systematically. In terms of algorithm design, INFWIDE proposes a two-branch architecture that explicitly removes noise and hallucinates saturated regions in the image space, suppresses ringing artifacts in the feature space, and integrates the two complementary outputs with a subtle multi-scale fusion network for high-quality night-photograph deblurring. For effective network training, we design a set of loss functions integrating a forward imaging model and backward reconstruction to form a closed-loop regularization, securing good convergence of the deep neural network. Further, to optimize INFWIDE's applicability in real low-light conditions, a physical-process-based low-light noise model is employed to synthesize realistic noisy night photographs for model training. Taking advantage of the traditional Wiener deconvolution algorithm's physically driven characteristics and the deep neural network's representation ability, INFWIDE can recover fine details while suppressing unpleasant artifacts during deblurring. Extensive experiments on synthetic and real data demonstrate the superior performance of the proposed approach.
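The physically driven building block INFWIDE starts from is classical Wiener deconvolution. Below is the textbook frequency-domain filter (not the learned network); `snr` is an assumed signal-to-noise ratio acting as the regularizer, and the demo uses circular convolution with a box kernel for simplicity.

```python
import numpy as np

def wiener_deconv(blurred, kernel, snr=100.0):
    """Classical frequency-domain Wiener deconvolution (textbook form).
    snr: assumed signal-to-noise ratio; larger -> weaker regularization."""
    K = np.fft.fft2(kernel, s=blurred.shape)
    W = np.conj(K) / (np.abs(K) ** 2 + 1.0 / snr)   # Wiener filter
    return np.real(np.fft.ifft2(np.fft.fft2(blurred) * W))

# toy demo: blur with a 3x3 box kernel (circular convolution), then invert
rng = np.random.default_rng(0)
img = rng.random((32, 32))
kernel = np.ones((3, 3)) / 9.0
blurred = np.real(np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(kernel, s=img.shape)))
restored = wiener_deconv(blurred, kernel, snr=1e8)
```

The abstract's point is that this filter breaks down under the non-Gaussian noise and saturation of night photographs, which is why INFWIDE wraps it in learned image-space and feature-space branches.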

Posted ContentDOI
05 Dec 2022-bioRxiv
TL;DR: Wang et al. as mentioned in this paper proposed to conduct biomedical data compRession with Implicit nEural Function (BRIEF) by representing the original data with compact deep neural networks, which are data specific and thus have no generalization issues.
Abstract: Efficient storage and sharing of massive biomedical data would open up their wide accessibility to different institutions and disciplines. However, compressors tailored for natural photos/videos quickly reach their limits on biomedical data, while emerging deep-learning-based methods demand huge training data and are difficult to generalize. Here we propose to conduct biomedical data compRession with Implicit nEural Function (BRIEF) by representing the original data with compact deep neural networks, which are data-specific and thus have no generalization issues. Benefiting from the strong representation capability of implicit neural functions, BRIEF achieves significantly higher fidelity on diverse biomedical data than existing techniques. Besides, BRIEF offers consistent performance across the whole data volume and supports customized spatially varying fidelity. BRIEF's multi-fold advantageous features also serve reliable downstream tasks at low bandwidth. Our approach will facilitate biomedical data sharing at low bandwidth and maintenance costs, and promote collaboration and progress in the biomedical field.

Journal ArticleDOI
TL;DR: Li et al. as discussed by the authors proposed a dual-channel multi-frame attention network (DCMAN) utilizing spatial-temporal-spectral priors to mutually boost the quality of low-light RGB and NIR videos.

Book ChapterDOI
TL;DR: In this paper, a geometry and texture joint representation based on implicit semantic template mapping is proposed to recover the 3D shape and texture of vehicles from uncalibrated monocular inputs in real-world street environments.
Abstract: We introduce VERTEX, an effective solution to recovering the 3D shape and texture of vehicles from uncalibrated monocular inputs under real-world street environments. To fully utilize the semantic prior of vehicles, we propose a novel geometry and texture joint representation based on implicit semantic template mapping. Compared to existing representations which infer 3D texture fields, our method explicitly constrains the texture distribution on the 2D surface of the template and avoids the limitation of fixed topology. Moreover, we propose a joint training strategy that leverages the texture distribution to learn a semantic-preserving mapping from vehicle instances to the canonical template. We also contribute a new synthetic dataset containing 830 elaborately textured car models labeled with key points and rendered using Physically Based Rendering (PBRT) system with measured HDRI skymaps to obtain highly realistic images. Experiments demonstrate the superior performance of our approach on both testing dataset and in-the-wild images. Furthermore, the presented technique enables additional applications such as 3D vehicle texture transfer and material identification, and can be generalized to other shape categories.

Journal ArticleDOI
TL;DR: A weighted optimization technique for sampling-adaptive single-pixel sensing, which only needs to train the network once for any dynamic sampling rate, and innovatively introduces a weighting scheme in the encoding process to characterize different patterns' modulation efficiencies.
Abstract: The novel single-pixel sensing technique that uses an end-to-end neural network for joint optimization achieves high-level semantic sensing, which is effective but computationally expensive when varied sampling rates are needed. In this Letter, we report a weighted optimization technique for sampling-adaptive single-pixel sensing, which needs to train the network only once for any dynamic sampling rate. Specifically, we innovatively introduce a weighting scheme in the encoding process to characterize different patterns' modulation efficiencies, in which the modulation patterns and their corresponding weights are updated iteratively. The optimal pattern series with the highest weights is employed for light modulation in the experimental implementation, thus achieving highly efficient sensing. Experiments validated that once the network is trained at a sampling rate of 1, single-target classification accuracy reaches up to 95.00% at a sampling rate of 0.03 on the MNIST dataset and 90.20% at a sampling rate of 0.07 on the CCPD dataset for multi-target sensing.
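The sampling-adaptive idea can be sketched in a few lines. Everything here is illustrative (the shapes, the random patterns, and the random stand-in for the learned weights are assumptions, not the paper's code): each modulation pattern carries a weight scoring its modulation efficiency, and at any desired sampling rate the top-weighted patterns are kept without retraining.

```python
import numpy as np

# Train-once, sample-anywhere selection: keep the top-weighted patterns
# for whatever sampling rate is requested at measurement time.
rng = np.random.default_rng(0)
n_patterns, n_pixels = 64, 28 * 28
patterns = rng.standard_normal((n_patterns, n_pixels))
weights = rng.random(n_patterns)              # stand-in for learned weights
scene = rng.random(n_pixels)                  # flattened target scene

rate = 0.25                                   # desired sampling rate
k = int(round(rate * n_patterns))             # number of patterns to keep
keep = np.argsort(weights)[::-1][:k]          # highest-weight patterns first
measurements = patterns[keep] @ scene         # k single-pixel readings
```

At a lower rate, only `keep` changes; the network and patterns stay fixed, which is what removes the need to retrain per sampling rate.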