
Showing papers on "Image sensor published in 2021"


Journal ArticleDOI
TL;DR: This work proposes a novel recurrent network to reconstruct videos from a stream of events, and trains it on a large amount of simulated event data, and shows that off-the-shelf computer vision algorithms can be applied to the reconstructions and that this strategy consistently outperforms algorithms that were specifically designed for event data.
Abstract: Event cameras are novel sensors that report brightness changes in the form of a stream of asynchronous “events” instead of intensity frames. They offer significant advantages with respect to conventional cameras: high temporal resolution, high dynamic range, and no motion blur. While the stream of events encodes in principle the complete visual signal, the reconstruction of an intensity image from a stream of events is an ill-posed problem in practice. Existing reconstruction approaches are based on hand-crafted priors and strong assumptions about the imaging process as well as the statistics of natural images. In this work we propose to learn to reconstruct intensity images from event streams directly from data instead of relying on any hand-crafted priors. We propose a novel recurrent network to reconstruct videos from a stream of events, and train it on a large amount of simulated event data. During training we propose to use a perceptual loss to encourage reconstructions to follow natural image statistics. We further extend our approach to synthesize color images from color event streams. Our quantitative experiments show that our network surpasses state-of-the-art reconstruction methods by a large margin in terms of image quality (>20%), while comfortably running in real-time. We show that the network is able to synthesize high frame rate videos (>5,000 frames per second) of high-speed phenomena (e.g., a bullet hitting an object) and is able to provide high dynamic range reconstructions in challenging lighting conditions. As an additional contribution, we demonstrate the effectiveness of our reconstructions as an intermediate representation for event data. We show that off-the-shelf computer vision algorithms can be applied to our reconstructions for tasks such as object classification and visual-inertial odometry and that this strategy consistently outperforms algorithms that were specifically designed for event data. We release the reconstruction code, a pre-trained model and the datasets to enable further research.
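A common way to make an asynchronous event stream digestible by such a recurrent network is to accumulate it into a spatio-temporal voxel grid before each network step. The sketch below shows one standard construction with bilinear weighting along the time axis; the function name and number of temporal bins are illustrative assumptions, and this is only the input representation, not the authors' reconstruction network itself.

```python
import numpy as np

def events_to_voxel_grid(events, num_bins, height, width):
    """Accumulate an event stream into a (num_bins, height, width) voxel grid.

    events: float array of shape (N, 4) with columns (t, x, y, p),
            sorted by timestamp t, polarity p in {-1, +1}.
    """
    grid = np.zeros((num_bins, height, width), dtype=np.float32)
    if len(events) == 0:
        return grid
    t = events[:, 0]
    x = events[:, 1].astype(int)
    y = events[:, 2].astype(int)
    p = events[:, 3]
    # Normalise timestamps to [0, num_bins - 1]
    t_norm = (num_bins - 1) * (t - t[0]) / max(t[-1] - t[0], 1e-9)
    lower = np.floor(t_norm).astype(int)
    upper = np.minimum(lower + 1, num_bins - 1)
    w_upper = t_norm - lower
    # Each event spreads its polarity over the two nearest temporal bins
    np.add.at(grid, (lower, y, x), p * (1.0 - w_upper))
    np.add.at(grid, (upper, y, x), p * w_upper)
    return grid
```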

164 citations


Journal ArticleDOI
10 Feb 2021-Nature
TL;DR: This result paves the way for the development and proliferation of low-cost, compact and high-performance 3D imaging cameras that could be used in applications from robotics and autonomous navigation to augmented reality and healthcare.
Abstract: Accurate three-dimensional (3D) imaging is essential for machines to map and interact with the physical world1,2. Although numerous 3D imaging technologies exist, each addressing niche applications with varying degrees of success, none has achieved the breadth of applicability and impact that digital image sensors have in the two-dimensional imaging world3–10. A large-scale two-dimensional array of coherent detector pixels operating as a light detection and ranging system could serve as a universal 3D imaging platform. Such a system would offer high depth accuracy and immunity to interference from sunlight, as well as the ability to measure the velocity of moving objects directly11. Owing to difficulties in providing electrical and photonic connections to every pixel, previous systems have been restricted to fewer than 20 pixels12–15. Here we demonstrate the operation of a large-scale coherent detector array, consisting of 512 pixels, in a 3D imaging system. Leveraging recent advances in the monolithic integration of photonic and electronic circuits, a dense array of optical heterodyne detectors is combined with an integrated electronic readout architecture, enabling straightforward scaling to arbitrarily large arrays. Two-axis solid-state beam steering eliminates any trade-off between field of view and range. Operating at the quantum noise limit16,17, our system achieves an accuracy of 3.1 millimetres at a distance of 75 metres when using only 4 milliwatts of light, an order of magnitude more accurate than existing solid-state systems at such ranges. Future reductions of pixel size using state-of-the-art components could yield resolutions in excess of 20 megapixels for arrays the size of a consumer camera sensor. This result paves the way for the development and proliferation of low-cost, compact and high-performance 3D imaging cameras that could be used in applications from robotics and autonomous navigation to augmented reality and healthcare. A compact, high-performance silicon photonics-based light detection and ranging system for three-dimensional imaging is developed that should be amenable to low-cost mass manufacturing.

118 citations


Journal ArticleDOI
TL;DR: Flexible image sensors have attracted increasing attention as new imaging devices owing to their lightness, softness, and bendability as discussed by the authors, and they are expected to gain wide application to wearable devices, as well as home medical care.
Abstract: Flexible image sensors have attracted increasing attention as new imaging devices owing to their lightness, softness, and bendability. Since light can probe information inside the body from the outside, optical-imaging-based approaches, such as X-rays, are widely used for disease diagnosis in hospitals. Unlike conventional sensors, flexible image sensors are soft and can be directly attached to a curved surface, such as the skin, for continuous measurement of biometric information with high accuracy. Therefore, they are expected to gain wide application in wearable devices, as well as home medical care. Herein, the application of such sensors to the biomedical field is introduced. First, their individual components, photosensors and switching elements, are explained. Then, the basic parameters used to evaluate the performance of each of these elements and the image sensors are described. Finally, examples of measuring dynamic and static biometric information using flexible image sensors, together with relevant real-world measurement cases, are presented. Furthermore, recent applications of flexible image sensors in the biomedical field are introduced.

76 citations


Proceedings ArticleDOI
01 Jan 2021
TL;DR: This paper proposes a middle-fusion approach to exploit both radar and camera data for 3D object detection and solves the key data association problem using a novel frustum-based method.
Abstract: The perception system in autonomous vehicles is responsible for detecting and tracking the surrounding objects. This is usually done by taking advantage of several sensing modalities to increase robustness and accuracy, which makes sensor fusion a crucial part of the perception system. In this paper, we focus on the problem of radar and camera sensor fusion and propose a middle-fusion approach to exploit both radar and camera data for 3D object detection. Our approach, called CenterFusion, first uses a center point detection network to detect objects by identifying their center points on the image. It then solves the key data association problem using a novel frustum-based method to associate the radar detections to their corresponding object’s center point. The associated radar detections are used to generate radar-based feature maps to complement the image features, and regress to object properties such as depth, rotation and velocity. We evaluate CenterFusion on the challenging nuScenes dataset, where it improves the overall nuScenes Detection Score (NDS) of the state-of-the-art camera-based algorithm by more than 12%. We further show that CenterFusion significantly improves the velocity estimation accuracy without using any additional temporal information. The code is available at https://github.com/mrnabati/CenterFusion.
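The frustum-based association step can be pictured as a simple geometric filter: project the radar returns into the image, keep the ones that fall inside a detection's 2D box and near its preliminary depth, and pick the closest survivor. The NumPy sketch below is a toy version of that idea under assumed inputs (radar points already in camera coordinates, a known intrinsic matrix K, a fixed depth tolerance); CenterFusion's pillar expansion and learned refinement are not reproduced here.

```python
import numpy as np

def associate_radar_to_detection(radar_points, box_2d, depth_est, K, depth_tol=1.5):
    """Return the index of the radar detection best matching an image detection.

    radar_points: (N, 3) radar detections in camera coordinates (x, y, z).
    box_2d: (x1, y1, x2, y2) detection box in pixels.
    depth_est: preliminary depth of the object centre in metres.
    K: 3x3 camera intrinsic matrix.
    """
    x1, y1, x2, y2 = box_2d
    z = radar_points[:, 2]
    uv = (K @ radar_points.T).T                       # project onto the image plane
    uv = uv[:, :2] / np.clip(uv[:, 2:3], 1e-6, None)
    in_box = (uv[:, 0] >= x1) & (uv[:, 0] <= x2) & (uv[:, 1] >= y1) & (uv[:, 1] <= y2)
    in_depth = np.abs(z - depth_est) <= depth_tol     # frustum bounded in depth
    candidates = np.where(in_box & in_depth & (z > 0))[0]
    if candidates.size == 0:
        return None
    return candidates[np.argmin(np.abs(z[candidates] - depth_est))]
```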

69 citations


Journal ArticleDOI
TL;DR: In this article, an active pixel image sensor array is fabricated from a large-area bilayer MoS2 film, with each pixel composed of a two-dimensional MoS2 switching transistor and a MoS2 phototransistor for image sensing applications.
Abstract: Various large-area growth methods for two-dimensional transition metal dichalcogenides have been developed recently for future electronic and photonic applications. However, they have not yet been employed for synthesizing active pixel image sensors. Here, we report on an active pixel image sensor array with a bilayer MoS2 film prepared via a two-step large-area growth method. The active pixel of image sensor is composed of 2D MoS2 switching transistors and 2D MoS2 phototransistors. The maximum photoresponsivity (Rph) of the bilayer MoS2 phototransistors in an 8 × 8 active pixel image sensor array is statistically measured as high as 119.16 A W−1. With the aid of computational modeling, we find that the main mechanism for the high Rph of the bilayer MoS2 phototransistor is a photo-gating effect by the holes trapped at subgap states. The image-sensing characteristics of the bilayer MoS2 active pixel image sensor array are successfully investigated using light stencil projection. Here, the authors report the realization of an active pixel image sensor array composed by 64 pairs of switching transistors and phototransistors, based on wafer-scale bilayer MoS2. The device exhibits sensitive photoresponse under RGB light illumination, showing the potential of 2D MoS2 for image sensing applications.

62 citations


Journal ArticleDOI
12 Jun 2021-PhotoniX
TL;DR: Four smart computational light microscopes (SCLMs) developed by the SCILab of Nanjing University of Science and Technology, China, are presented; empowered by advanced computational microscopy techniques, they not only enable multi-modal contrast-enhanced observations of unstained specimens but can also recover their three-dimensional profiles quantitatively.
Abstract: Computational microscopy, as a subfield of computational imaging, combines optical manipulation and image algorithmic reconstruction to recover multi-dimensional microscopic images or information of micro-objects. In recent years, the revolution in light-emitting diodes (LEDs), low-cost consumer image sensors, modern digital computers, and smartphones provide fertile opportunities for the rapid development of computational microscopy. Consequently, diverse forms of computational microscopy have been invented, including digital holographic microscopy (DHM), transport of intensity equation (TIE), differential phase contrast (DPC) microscopy, lens-free on-chip holography, and Fourier ptychographic microscopy (FPM). These computational microscopy techniques not only provide high-resolution, label-free, quantitative phase imaging capability but also decipher new and advanced biomedical research and industrial applications. Nevertheless, most computational microscopy techniques are still at an early stage of “proof of concept” or “proof of prototype” (based on commercially available microscope platforms). Translating those concepts to stand-alone optical instruments for practical use is an essential step for the promotion and adoption of computational microscopy by the wider bio-medicine, industry, and education community. In this paper, we present four smart computational light microscopes (SCLMs) developed by our laboratory, i.e., smart computational imaging laboratory (SCILab) of Nanjing University of Science and Technology (NJUST), China. These microscopes are empowered by advanced computational microscopy techniques, including digital holography, TIE, DPC, lensless holography, and FPM, which not only enables multi-modal contrast-enhanced observations for unstained specimens, but also can recover their three-dimensional profiles quantitatively. We introduce their basic principles, hardware configurations, reconstruction algorithms, and software design, quantify their imaging performance, and illustrate their typical applications for cell analysis, medical diagnosis, and microlens characterization.
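One of the techniques listed above, the transport of intensity equation (TIE), has a compact FFT-based solution that conveys the flavour of these computational methods. The sketch below assumes the common simplification of an approximately uniform in-focus intensity; the function name, regularization constant, and the idea of estimating the axial derivative from a defocused image pair are illustrative assumptions, not the SCLM implementation.

```python
import numpy as np

def tie_phase_uniform(dI_dz, intensity, wavelength, pixel_size, reg=1e-9):
    """Solve laplacian(phi) = -(k / I0) * dI/dz in the Fourier domain (k = 2*pi/lambda).

    dI_dz: axial intensity derivative, e.g. estimated from two defocused images.
    intensity: in-focus intensity image (assumed roughly uniform).
    """
    k = 2.0 * np.pi / wavelength
    ny, nx = dI_dz.shape
    fx = np.fft.fftfreq(nx, d=pixel_size)
    fy = np.fft.fftfreq(ny, d=pixel_size)
    FX, FY = np.meshgrid(fx, fy)
    lap = -4.0 * np.pi ** 2 * (FX ** 2 + FY ** 2)   # Fourier symbol of the Laplacian
    rhs = -k * dI_dz / np.mean(intensity)
    phi = np.fft.ifft2(np.fft.fft2(rhs) / (lap - reg)).real
    return phi - phi.mean()
```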

51 citations


Proceedings ArticleDOI
06 May 2021
TL;DR: In this article, a convolutional neural network model was proposed to recognize four directional swipes and an omni-swipe using a miniaturized 60 GHz radar sensor.
Abstract: Gestures are a promising candidate as an input modality for ambient computing where conventional input modalities such as touchscreens are not available. Existing works have focused on gesture recognition using image sensors. However, their cost, high battery consumption, and privacy concerns made cameras challenging as an always-on solution. This paper introduces an efficient gesture recognition technique using a miniaturized 60 GHz radar sensor. The technique recognizes four directional swipes and an omni-swipe using a radar chip (6.5 × 5.0 mm) integrated into a mobile phone. We developed a convolutional neural network model efficient enough for battery-powered and computationally constrained processors. Its model size and inference time are less than 1/5000 of those of an existing gesture recognition technique using radar. Our evaluations with large-scale datasets consisting of 558,000 gesture samples and 3,920,000 negative samples demonstrated our algorithm’s efficiency, robustness, and readiness to be deployed outside of research laboratories.

50 citations


Journal ArticleDOI
TL;DR: In this paper, a post-annealed metal-semiconductor-metal (MSM) a-Ga2O3 solar-blind photodetector (SBPD) array with superior photoelectric properties is presented.
Abstract: The growing demand for scalable solar-blind image sensors with remarkable photosensitive properties has stimulated the research on more advanced solar-blind photodetector (SBPD) arrays. In this work, the authors demonstrate ultrahigh-performance metal-semiconductor-metal (MSM) SBPDs based on amorphous (a-)Ga2O3 via a post-annealing process. The post-annealed MSM a-Ga2O3 SBPDs exhibit superhigh sensitivity of 733 A/W and high response speed of 18 ms, giving a high gain-bandwidth product over 10^4 at 5 V. The SBPDs also show an ultrahigh photo-to-dark current ratio of 3.9 × 10^7. Additionally, the PDs demonstrate super-high specific detectivity of 3.9 × 10^16 Jones owing to the extremely low noise down to 3.5 fW Hz^-1/2, suggesting a high signal-to-noise ratio. The underlying mechanism for such superior photoelectric properties is revealed by Kelvin probe force microscopy and first-principles calculation. Furthermore, for the first time, a large-scale, high-uniformity 32 × 32 image sensor array based on the post-annealed a-Ga2O3 SBPDs is fabricated. A clear image of the target object with high contrast can be obtained thanks to the high sensitivity and uniformity of the array. These results demonstrate the feasibility and practicality of the Ga2O3 PDs for applications in solar-blind imaging, environmental monitoring, artificial intelligence and machine vision.

47 citations



Journal ArticleDOI
TL;DR: In this article, strong anisotropy of 1D layered bismuth sulfide (Bi2S3) is demonstrated experimentally and theoretically, which enables high photoresponsivity (32 A W^-1), Ion/Ioff ratio (1.08 × 10^4), robust linearly dichroic ratio (1.9), and Hooge parameter (2.0 × 10^-5 at 1 Hz) which refer to lower noise than most reported low-dimensional materials-based devices.
Abstract: With the increasing demand for detection accuracy and sensitivity, dual-band polarimetric image sensor has attracted considerable attention due to better object recognition by processing signals from diverse wavebands. However, the widespread use of polarimetric sensors is still limited by high noise, narrow photoresponse range, and low linearly dichroic ratio. Recently, the low-dimensional materials with intrinsic in-plane anisotropy structure exhibit the great potential to realize direct polarized photodetection. Here, strong anisotropy of 1D layered bismuth sulfide (Bi2S3) is demonstrated experimentally and theoretically. The Bi2S3 photodetector exhibits excellent device performance, which enables high photoresponsivity (32 A W^-1), Ion/Ioff ratio (1.08 × 10^4), robust linearly dichroic ratio (1.9), and Hooge parameter (2.0 × 10^-5 at 1 Hz) which refer to lower noise than most reported low-dimensional materials-based devices. Impressively, such Bi2S3 nanowire exhibits a good broadband photoresponse, ranging from ultraviolet (360 nm) to short-wave infrared (1064 nm). Direct polarimetric imaging is implemented at the wavelengths of 532 and 808 nm. With these remarkable features, the 1D Bi2S3 nanowires show great potential for direct dual-band polarimetric image sensors without using any external optical polarizer.

34 citations


Posted Content
TL;DR: In this article, a multispectral infrared image sensor based on an array of black phosphorus programmable phototransistors (bP-PPT) is presented.
Abstract: Image sensors with internal computing capability enable in-sensor computing that can significantly reduce the communication latency and power consumption for machine vision in distributed systems and robotics. Two-dimensional semiconductors are uniquely advantageous in realizing such intelligent visionary sensors because of their tunable electrical and optical properties and amenability for heterogeneous integration. Here, we report a multifunctional infrared image sensor based on an array of black phosphorus programmable phototransistors (bP-PPT). By controlling the stored charges in the gate dielectric layers electrically and optically, the bP-PPT's electrical conductance and photoresponsivity can be locally or remotely programmed with high precision to implement an in-sensor convolutional neural network (CNN). The sensor array can receive optical images transmitted over a broad spectral range in the infrared and perform inference computation to process and recognize the images with 92% accuracy. The demonstrated multispectral infrared imaging and in-sensor computing with the black phosphorus optoelectronic sensor array can be scaled up to build a more complex visionary neural network, which will find many promising applications for distributed and remote multispectral sensing.

Journal ArticleDOI
TL;DR: The C2IS prototype sensor is used as a real-time edge feature detection front-end camera and is accompanied by a simplified convolutional neural network (CNN) architecture to demonstrate hand gesture recognition.
Abstract: With the growing demand for artificial intelligence (AI) Internet-of-Things (IoT) devices, smart vision sensors with energy-efficient computing capability are required. This article presents a low-power and low-voltage dual-mode 0.5-V computational CMOS image sensor (C2IS) with array-parallel computing capability for feature extraction using convolution. In the feature extraction mode, by applying the pulsewidth modulation (PWM) pixel and switch-current integration (SCI) circuit, the in-sensor eight-directional matrix-parallel multiply–accumulate (MAC) operation is realized. Furthermore, the analog-domain convolution-on-readout (COR) operation, the programmable 3×3 kernel with ±3-bit weights, and the tunable-resolution column-parallel analog-to-digital converter (ADC) (1–8 bit) are implemented to achieve real-time feature extraction without using additional memory or sacrificing frame rate. In the image capturing mode, the sensor provides linear-response 8-bit raw image data. The C2IS prototype has been fabricated in the TSMC 0.18-µm standard process technology and verified to demonstrate the raw and feature images at 480 frames/s with a power consumption of 77/117 µW and a resultant FoM of 9.8/14.8 pJ/pixel/frame, respectively. The prototype sensor is used as a real-time edge feature detection front-end camera and is accompanied by a simplified convolutional neural network (CNN) architecture to demonstrate hand gesture recognition. The prototype system achieves more than 95% validation accuracy.

Journal ArticleDOI
TL;DR: In this paper, the authors present a systematic study of three thermal imaging sensors with different resolutions, with a focus on sensor characterization, estimation algorithms, and comparative analysis of occupancy estimation performance.
Abstract: Occupancy estimation has a broad range of applications in security, surveillance, traffic and resource management in smart building environments. Low-resolution thermal imaging sensors can be used for real-time non-intrusive occupancy estimation. Such sensors have a resolution that is too low to identify occupants, but they may provide sufficient data for real-time occupancy estimation. In this paper, we present a systematic study of three thermal imaging sensors with different resolutions, with a focus on sensor characterization, estimation algorithms, and comparative analysis of occupancy estimation performance. A unified processing pipeline for occupancy estimation is presented, and the performance of the three sensors is compared side by side. A number of specific algorithms are proposed for pre-processing of sensor data, feature extraction, and fine-tuning of the occupancy estimation algorithms. Our results show that it is possible to achieve about 99% accuracy for occupancy estimation with our proposed approach, which might be sufficient for many practical smart building applications.
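A low-resolution thermal occupancy estimator of the kind compared above can be prototyped in a few lines: subtract a background frame, threshold warm regions, and count connected blobs. This is only a minimal sketch of the general idea; the function name, temperature threshold and minimum blob size are assumed values, not the paper's tuned pipeline.

```python
import numpy as np
from scipy import ndimage

def estimate_occupancy(frame, background, delta=1.5, min_pixels=3):
    """Count warm blobs in a low-resolution thermal frame (e.g. 8x8 to 32x32 pixels).

    frame, background: 2D temperature arrays in degrees Celsius.
    delta: how much warmer than background a pixel must be to count as foreground.
    min_pixels: blobs smaller than this are discarded as noise.
    """
    foreground = frame - background
    mask = foreground > delta
    labels, num = ndimage.label(mask)                       # connected components
    sizes = ndimage.sum(mask, labels, range(1, num + 1))    # pixels per blob
    return int(np.sum(sizes >= min_pixels))
```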

Journal ArticleDOI
TL;DR: A novel unsupervised multispectral denoising method for satellite imagery using a wavelet directional cycle-consistent adversarial network (WavCycleGAN) is proposed; in contrast to the standard image-domain cycleGAN, this method introduces a wavelet directional learning scheme for effective denoising without sacrificing high-frequency components such as edges and detailed information.
Abstract: Multispectral satellite imaging sensors acquire various spectral band images and have a unique spectroscopic property in each band. Unfortunately, image artifacts from imaging sensor noise often affect the quality of scenes and have a negative impact on applications for satellite imagery. Recently, deep learning approaches have been extensively explored to remove noise in satellite imagery. Most deep learning denoising methods, however, follow a supervised learning scheme, which requires matched noisy image and clean image pairs that are difficult to collect in real situations. In this article, we propose a novel unsupervised multispectral denoising method for satellite imagery using a wavelet directional cycle-consistent adversarial network (WavCycleGAN). The proposed method is based on an unsupervised learning scheme using adversarial loss and cycle-consistency loss to overcome the lack of paired data. Moreover, in contrast to the standard image-domain cycleGAN, we introduce a wavelet directional learning scheme for effective denoising without sacrificing high-frequency components such as edges and detailed information. Experimental results for the removal of vertical stripes and wave noise in satellite imaging sensors demonstrate that the proposed method effectively removes noise and preserves important high-frequency features of satellite images.

Journal ArticleDOI
TL;DR: A deep network built to take advantage of the multiple features that can be extracted from a camera's histogram data is developed, providing significant image resolution enhancement and image denoising across a wide range of signal-to-noise ratios and photon levels.
Abstract: The number of applications that use depth imaging is increasing rapidly, e.g. self-driving autonomous vehicles and auto-focus assist on smartphone cameras. Light detection and ranging (LIDAR) via single-photon sensitive detector (SPAD) arrays is an emerging technology that enables the acquisition of depth images at high frame rates. However, the spatial resolution of this technology is typically low in comparison to the intensity images recorded by conventional cameras. To increase the native resolution of depth images from a SPAD camera, we develop a deep network built to take advantage of the multiple features that can be extracted from a camera’s histogram data. The network is designed for a SPAD camera operating in a dual-mode such that it captures alternate low resolution depth and high resolution intensity images at high frame rates, thus the system does not require any additional sensor to provide intensity images. The network then uses the intensity images and multiple features extracted from down-sampled histograms to guide the up-sampling of the depth. Our network provides significant image resolution enhancement and image denoising across a wide range of signal-to-noise ratios and photon levels. Additionally, we show that the network can be applied to other types of SPAD data, demonstrating the generality of the algorithm.
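For context, the classical baseline that such a network improves on is to locate the return peak in each pixel's timing histogram and convert the time of flight to range. The sketch below shows only that baseline; the function name, the 3-bin centre-of-mass refinement and the bin-width parameter are illustrative assumptions, not part of the paper's network.

```python
import numpy as np

def depth_from_histograms(hist, bin_width_ps):
    """Per-pixel depth from SPAD timing histograms via peak picking.

    hist: (H, W, T) photon-count histograms.
    bin_width_ps: width of one timing bin in picoseconds.
    """
    c = 2.998e8                                   # speed of light, m/s
    peak = np.argmax(hist, axis=-1)
    # 3-bin centre-of-mass refinement around the peak (clamped at the borders)
    idx = np.clip(peak[..., None] + np.array([-1, 0, 1]), 0, hist.shape[-1] - 1)
    neigh = np.take_along_axis(hist, idx, axis=-1)
    offset = (neigh * np.array([-1.0, 0.0, 1.0])).sum(-1) / np.maximum(neigh.sum(-1), 1e-9)
    tof = (peak + offset) * bin_width_ps * 1e-12
    return 0.5 * c * tof                          # halve: light travels out and back
```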

Journal ArticleDOI
TL;DR: In this paper, a 16.7 Mpixel, 3D-stacked backside-illuminated Quanta Image Sensor (QIS) with 1.1 µm-pitch pixels is reported, which achieves 0.19 e- rms array read noise and 0.12 e- rms best single-pixel read noise under room temperature operation.
Abstract: This letter reports a 16.7 Mpixel, 3D-stacked backside illuminated Quanta Image Sensor (QIS) with 1.1 µm-pitch pixels which achieves 0.19 e- rms array read noise and 0.12 e- rms best single-pixel read noise under room temperature operation. The accurate photon-counting capability enables superior imaging performance under ultra-low-light conditions. The sensor supports programmable analog-to-digital converter (ADC) resolution from 1–14 bits and video frame rates up to 40 fps with 4096 × 4096 resolution and 600 mW power consumption.
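The reason deep sub-electron read noise matters can be quantified with the standard quanta-image-sensor analysis: if photoelectron counts pass through Gaussian read noise and are thresholded midway between charge levels, the miscount probability depends only on the noise expressed in electrons. The snippet below reproduces that textbook calculation as a general illustration; it is not a result taken from this letter.

```python
from math import erfc, sqrt

def photon_counting_error(read_noise_e):
    """Probability of miscounting a photoelectron with a mid-level threshold,
    assuming Gaussian read noise of read_noise_e electrons rms."""
    return 0.5 * erfc(0.5 / (sqrt(2.0) * read_noise_e))
```

At 0.19 e- rms this error is roughly 0.4%, while at 0.5 e- rms it would rise to about 16%, which illustrates why deep sub-electron read noise is required for reliable photon counting.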

Journal ArticleDOI
TL;DR: In this paper, an interferometer based on two frequency combs of slightly different repetition frequencies and a lensless camera sensor is used to record time-varying spatial interference patterns that generate spectral hypercubes of complex holograms, revealing the amplitudes and phases of scattered wave fields for each comb line frequency.
Abstract: Holography1 has always held special appeal as it is able to record and display spatial information in three dimensions2–10. Here we show how to augment the capabilities of digital holography11,12 by using a large number of narrow laser lines at precisely defined optical frequencies simultaneously. Using an interferometer based on two frequency combs13–15 of slightly different repetition frequencies and a lensless camera sensor, we record time-varying spatial interference patterns that generate spectral hypercubes of complex holograms, revealing the amplitudes and phases of scattered wave-fields for each comb line frequency. Advancing beyond multicolour holography and low-coherence holography (including with a frequency comb16), the synergy of broad spectral bandwidth and high temporal coherence in dual-comb holography opens up novel optical diagnostics, such as precise dimensional metrology over large distances without interferometric phase ambiguity, or hyperspectral three-dimensional imaging with high spectral resolving power, as we demonstrate with molecule-selective imaging of an absorbing gas. Dual-comb digital holography based on an interferometer composed of two frequency combs of slightly different repetition frequencies and a lensless camera sensor allows highly frequency-multiplexed holography with high temporal coherence.
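The frequency multiplexing behind dual-comb holography can be summarised in one relation: the n-th pair of comb lines beats at roughly n times the repetition-rate difference, so each radio-frequency beat detected on the camera labels one optical comb line. The toy function below states that mapping; the parameter names are illustrative, and carrier-envelope offsets and aliasing are deliberately ignored.

```python
def comb_line_from_beat(beat_hz, delta_frep_hz, frep_hz, f0_hz):
    """Map a detected RF beat frequency back to its optical comb line.

    Simplified dual-comb picture: beat ~= n * delta_frep for comb-line index n,
    and the optical frequency is f_n = f0 + n * frep (offsets ignored).
    """
    n = round(beat_hz / delta_frep_hz)
    return f0_hz + n * frep_hz
```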

Journal ArticleDOI
28 Jul 2021-ACS Nano
TL;DR: In this paper, an electrically switchable color-selective organic photodetector (OPD) comprising a double organic bulk heterojunction structure has been developed for full-color imaging.
Abstract: The present full-color imaging techniques rely on the use of broadband inorganic photodetectors with dedicated color filters, which is one of the practical challenges for large-area, flexible, and high-resolution imaging applications. The development of high-performance color-selective photodetectors is one of the key solutions to overcome this challenge. In this work, an electrically switchable color-selective organic photodetector (OPD) comprising a double organic bulk heterojunction structure has been developed for full-color imaging. The color-selective sensing capability over the visible spectrum ranges can be realized by controlling the bias across the OPD, achieving a high responsivity of ∼200 mA/W, a large linear dynamic range of 122 dB, a viewing angle of 120°, and a -3 dB cutoff frequency of >50 kHz. A full-color imaging function has been demonstrated using electrically switchable red-, green-, and blue-color selective OPD sensors with an excellent operational stability. The results of this work provide a practical solution for applications in high-resolution full-color imaging and artificial vision.

Journal ArticleDOI
TL;DR: A light emitting diode (LED) light panel and rolling shutter image sensor based OCC system using frame-averaging background removal (FABR) technique, Z-score normalization, and neural network (NN) is proposed and demonstrated.
Abstract: Optical wireless communication (OWC) has emerged as a complementary or alternative technology to the radio-frequency (RF) communication. OWC based on image sensor, which is also called optical camera communication (OCC) has attracted much attention from industrial and academic societies. Here, we discuss several recent OCC technologies. We propose and demonstrate a light emitting diode (LED) light panel and rolling shutter image sensor based OCC system using frame-averaging background removal (FABR) technique, Z-score normalization, and neural network (NN). Here, a driver circuit for the LED display panel based on a bipolar-junction-transistor (BJT) and a metal-oxide-semiconductor field-effect-transistor (MOSFET) is also discussed. It can provide low enough driving frequency from a few Hz to kHz and high bias current for the LED light panel. Experimental results show that the proposed scheme can mitigate the inter-symbol interference (ISI) observed in the rolling shutter pattern produced by the high noise-ratio (NR) of the display contents.

Journal ArticleDOI
TL;DR: The proposed method identifies displacements in the frequency domain directly on the camera sensor, resulting in orders-of-magnitude smaller data sizes and post-processing times compared with conventional multiview image-based methods.

Journal ArticleDOI
TL;DR: The structure and interface of the image acquisition unit of a solid-state image sensor are designed, and a wavelet neural network reflection model is used to reconstruct the single-frame feature image and improve the resolution of the image.
Abstract: The traditional single-frame character image super-resolution reconstruction method suffers from problems such as incomplete noise removal and poor anti-interference performance. A new method for the super-resolution reconstruction of single-frame character images based on a wavelet neural network is proposed. The structure and interface of the image acquisition unit of a solid-state image sensor are designed. Combined with a pinhole imaging model and camera self-calibration, image acquisition for the Internet of Things is completed. An image degradation model is established to simulate the degradation process from an ideal high-resolution image to a low-resolution image. A wavelet threshold denoising method is used to remove the noise in a single-frame character image and improve the anti-interference performance of the method. The wavelet neural network reflection model is used to reconstruct the single-frame feature image and improve the resolution of the image. The experimental results show that the blur degree of the reconstructed image is always less than 5%. In the whole experiment, the accuracy of this method can be maintained at 80%–90%. The image detail retention rate of the research method is relatively stable. As the number of experimental images increases, the retention rate of image details remains between 80% and 95%, indicating that the method is effective in practical application.
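The wavelet threshold denoising step mentioned above is standard enough to sketch concretely. The snippet below uses PyWavelets with soft thresholding and the universal threshold; the wavelet family, decomposition level and threshold rule are generic assumptions rather than the parameters used in the paper.

```python
import numpy as np
import pywt

def wavelet_denoise(image, wavelet="db4", level=2):
    """Soft-threshold wavelet denoising of a single-channel image."""
    coeffs = pywt.wavedec2(image, wavelet, level=level)
    # Estimate the noise sigma from the finest diagonal detail band
    sigma = np.median(np.abs(coeffs[-1][-1])) / 0.6745
    thresh = sigma * np.sqrt(2.0 * np.log(image.size))      # universal threshold
    denoised = [coeffs[0]] + [
        tuple(pywt.threshold(band, thresh, mode="soft") for band in detail)
        for detail in coeffs[1:]
    ]
    return pywt.waverec2(denoised, wavelet)
```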

Proceedings ArticleDOI
19 Jun 2021
TL;DR: In this article, a joint demosaicing and denoising (JDD) method was proposed to perform essential image signal processing (ISP) tasks with non-Bayer color filter array (CFA) patterns.
Abstract: Pixel binning is considered one of the most prominent solutions to tackle the hardware limitation of smartphone cameras. Despite numerous advantages, such an image sensor has to appropriate an artefact-prone non-Bayer colour filter array (CFA) to enable the binning capability. Contrarily, performing essential image signal processing (ISP) tasks like demosaicking and denoising, explicitly with such CFA patterns, makes the reconstruction process notably complicated. In this paper, we tackle the challenges of joint demosaicing and denoising (JDD) on such an image sensor by introducing a novel learning-based method. The proposed method leverages the depth and spatial attention in a deep network. The proposed network is guided by a multi-term objective function, including two novel perceptual losses to produce visually plausible images. On top of that, we stretch the proposed image processing pipeline to comprehensively reconstruct and enhance the images captured with a smartphone camera, which uses pixel binning techniques. The experimental results illustrate that the proposed method can outperform the existing methods by a noticeable margin in qualitative and quantitative comparisons. Code available: https://github.com/sharif-apu/BJDD_CVPR21.

Journal ArticleDOI
TL;DR: In this article, a small pixel pitch image sensor optimized for high external quantum efficiency in short-wavelength infrared (SWIR) is presented, where thin-film photodiodes based on a PbS colloidal quantum dot (CQD) absorber allow the sensor to exceed the spectral limitations of silicon's absorption while maintaining the benefits of CMOS technology.
Abstract: In this letter, we present a small pixel pitch image sensor optimized for high external quantum efficiency in short-wavelength infrared (SWIR). Thin-film photodiodes based on a PbS colloidal quantum dot (CQD) absorber allow us to exceed the spectral limitations of silicon’s absorption while maintaining the benefits of CMOS technology. By monolithically integrating PbS CQD thin films with CMOS readout arrays, high-pixel-density SWIR image sensors can be achieved. To overcome the remaining disadvantages of the CQD-based image sensors over their bulk III-V semiconductor counterparts (lower sensitivity and reduced linearity), the thin-film photodiode stack is adapted to the readout circuit used. A prototype image sensor with a 768 × 512 resolution of 5-µm pitch pixels is fabricated by using a modified 130 nm CMOS process for the readout IC, together with the new CQD thin-film photodiode on top. Thanks to the optimized photodiode stack and co-integration process, the prototype image sensor shows less than 5% linearity error while having 40% external quantum efficiency in SWIR, which enables acquisition of high-quality images.
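External quantum efficiency and responsivity are related by R = EQE · q · λ / (h · c), which makes it easy to sanity-check figures such as the 40% EQE quoted above. The helper below performs the conversion; the example wavelength of 1400 nm is an assumed SWIR value, not one stated in the abstract.

```python
def responsivity_from_eqe(eqe, wavelength_nm):
    """Convert external quantum efficiency to responsivity in A/W."""
    q = 1.602e-19   # electron charge, C
    h = 6.626e-34   # Planck constant, J*s
    c = 2.998e8     # speed of light, m/s
    return eqe * q * (wavelength_nm * 1e-9) / (h * c)

# responsivity_from_eqe(0.40, 1400) is roughly 0.45 A/W
```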

Journal ArticleDOI
TL;DR: In this paper, a learning-based joint demosaicing and denoising algorithm for low-light color imaging is proposed, which combines the classical theory of color filter arrays and modern deep learning.
Abstract: Low-light imaging is a challenging task because of the excessive photon shot noise. Color imaging in low-light is even more difficult because one needs to demosaick and denoise simultaneously. Existing demosaicking algorithms are mostly designed for well-illuminated scenarios, which fail to work with low-light. Recognizing the recent development of small pixels and low read noise image sensors, we propose a learning-based joint demosaicking and denoising algorithm for low-light color imaging. Our method combines the classical theory of color filter arrays and modern deep learning. We use an explicit carrier to demodulate the color from the input Bayer pattern image. We integrate trainable filters into the demodulation scheme to improve flexibility. We introduce a guided filtering module to transfer knowledge from the luma channel to the chroma channels, thus offering substantially more reliable denoising. Extensive experiments are performed to evaluate the performance of the proposed method, using both synthetic datasets and real data. Results indicate that the proposed method offers consistently better performance over the current state-of-the-art, across several standard evaluation metrics.
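The guided filtering module described above builds on the classic guided filter of He et al., which fits a local linear model between a guidance image and a target image. The sketch below is the textbook box-filter formulation, with the luma channel as the guide and a noisy chroma channel as the target; the radius and eps values are generic defaults, not the paper's trainable variant.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def guided_filter(guide, src, radius=4, eps=1e-3):
    """Smooth `src` while transferring edge structure from `guide`."""
    size = 2 * radius + 1
    mean_g = uniform_filter(guide, size)
    mean_s = uniform_filter(src, size)
    corr_gg = uniform_filter(guide * guide, size)
    corr_gs = uniform_filter(guide * src, size)
    var_g = corr_gg - mean_g * mean_g
    cov_gs = corr_gs - mean_g * mean_s
    a = cov_gs / (var_g + eps)                  # local linear coefficients
    b = mean_s - a * mean_g
    return uniform_filter(a, size) * guide + uniform_filter(b, size)
```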

Posted Content
TL;DR: In this article, the extrinsic parameters of any pair of sensors involving LiDARs, monocular or stereo cameras, of the same or different modalities are calibrated.
Abstract: Most sensor setups for onboard autonomous perception are composed of LiDARs and vision systems, as they provide complementary information that improves the reliability of the different algorithms necessary to obtain a robust scene understanding. However, the effective use of information from different sources requires an accurate calibration between the sensors involved, which usually implies a tedious and burdensome process. We present a method to calibrate the extrinsic parameters of any pair of sensors involving LiDARs, monocular or stereo cameras, of the same or different modalities. The procedure is composed of two stages: first, reference points belonging to a custom calibration target are extracted from the data provided by the sensors to be calibrated, and second, the optimal rigid transformation is found through the registration of both point sets. The proposed approach can handle devices with very different resolutions and poses, as usually found in vehicle setups. In order to assess the performance of the proposed method, a novel evaluation suite built on top of a popular simulation framework is introduced. Experiments on the synthetic environment show that our calibration algorithm significantly outperforms existing methods, whereas real data tests corroborate the results obtained in the evaluation suite. Open-source code is available at this https URL
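The second stage of this calibration procedure, finding the optimal rigid transformation between the two extracted point sets, has a well-known closed-form solution via the SVD (Kabsch/Umeyama alignment without scale). The sketch below shows that generic solution under the assumption that the reference points are already matched one-to-one; it is not the authors' exact implementation.

```python
import numpy as np

def rigid_registration(p_src, p_dst):
    """Return R (3x3) and t (3,) such that R @ p_src[i] + t ~= p_dst[i]."""
    c_src, c_dst = p_src.mean(axis=0), p_dst.mean(axis=0)
    H = (p_src - c_src).T @ (p_dst - c_dst)      # cross-covariance of centred points
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                     # guard against a reflection
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = c_dst - R @ c_src
    return R, t
```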

Journal ArticleDOI
TL;DR: In this article, a lensfree on-chip microscopy approach for wide-field quantitative phase imaging (QPI) based on wavelength scanning was proposed, where a relatively large-range wavelength diversity not only provides information to overcome spatial aliasing of the image sensor but also creates sufficient diffraction variations that can be used to achieve motion-free, pixel-super-resolved phase recovery.
Abstract: We propose a lensfree on-chip microscopy approach for wide-field quantitative phase imaging (QPI) based on wavelength scanning. Unlike previous methods, we found that a relatively large-range wavelength diversity not only provides information to overcome spatial aliasing of the image sensor but also creates sufficient diffraction variations that can be used to achieve motion-free, pixel-super-resolved phase recovery. Based on an iterative phase retrieval and pixel-super-resolution technique, the proposed wavelength-scanning approach uses only eight undersampled holograms to achieve a half-pitch lateral resolution of 691 nm across a large field-of-view of 29.85 mm^2, surpassing the theoretical Nyquist–Shannon sampling resolution limit imposed by the pixel size of the sensor (1.67 µm) by a factor of 2.41. We confirmed the effectiveness of this technique in QPI and resolution enhancement by measuring the benchmark quantitative phase microscopy target. We also showed that this method can track HeLa cell growth within an incubator, revealing cellular morphologies and subcellular dynamics of a large cell population over an extended period of time.

Proceedings ArticleDOI
01 Jan 2021
TL;DR: In this paper, Li et al. proposed a framework to estimate the scene depth directly from a single thermal image that can still observe the scene in the low lighting condition, which also mitigates the training condition due to the easy availability of RGB cameras.
Abstract: Most existing autonomous driving vehicles and robots rely on active LiDAR sensors to detect the depth of the surrounding environment, which usually has limited resolution, and the emitted laser can be harmful to people and the environment. Current passive image-based depth estimation algorithms focus on color images from RGB sensors, which is not suitable for dark and night environment with limited lighting resource. In this paper, we propose a framework to estimate the scene depth directly from a single thermal image that can still observe the scene in the low lighting condition. We learn the thermal image depth estimation frame-work together with RGB cameras, which also mitigates the training condition due to the easy availability of RGB cameras. With the translated thermal images from color images from our generative adversarial network, our depth estimation method can explore the unique characteristics in thermal images through our novel contour and edge-aware constraints to obtain a stable and anti-artifact disparity. We apply the commonly available color cameras to navigate the learning process of thermal image depth estimation frame-work. With our approach, an accurate depth map can be predicted without any prior knowledge under various illumination conditions.

Journal ArticleDOI
01 Mar 2021
TL;DR: In this paper, a color router that achieves perfect color routing for sub-wavelength pixels is proposed: instead of being filtered, all incident light is routed to the appropriate pixel based on its color.
Abstract: We demonstrate a conceptually novel approach for color functionality in image sensors by designing a color router that achieves perfect routing for sub-wavelength pixels. Instead of filtering, all incident light is routed based on color.

Proceedings ArticleDOI
01 Jun 2021
TL;DR: In this article, a neural network for exposure selection is proposed for automotive object detection, which is trained jointly with an object detector and an image signal processing (ISP) pipeline.
Abstract: Real-world scenes have a dynamic range of up to 280 dB that today's imaging sensors cannot directly capture. Existing live vision pipelines tackle this fundamental challenge by relying on high dynamic range (HDR) sensors that try to recover HDR images from multiple captures with different exposures. While HDR sensors substantially increase the dynamic range, they are not without disadvantages, including severe artifacts for dynamic scenes, reduced fill-factor, lower resolution, and high sensor cost. At the same time, traditional auto-exposure methods for low-dynamic range sensors have advanced as proprietary methods relying on image statistics separated from downstream vision algorithms. In this work, we revisit auto-exposure control as an alternative to HDR sensors. We propose a neural network for exposure selection that is trained jointly, end-to-end, with an object detector and an image signal processing (ISP) pipeline. To this end, we use an HDR dataset for automotive object detection and an HDR training procedure. We validate that the proposed neural auto-exposure control, which is tailored to object detection, outperforms conventional auto-exposure methods by more than 6 points in mean average precision (mAP).

Proceedings ArticleDOI
13 Feb 2021
TL;DR: In this paper, the authors observe that conventional CMOS Image Sensors (CIS) only output the raw data of the captured image, so building a smart camera with AI processing capabilities requires additional ICs and results in a system solution that is larger, higher power and more costly.
Abstract: Within the Internet of Things (IoT) market, retail, smart city, and so on, the need for camera products which have Artificial Intelligence (AI) processing capabilities is growing. AI processing capability on such edge devices solves some issues of cloud-only computing systems, such as latency, cloud communication, processing cost, and privacy concerns. The market demand for smart cameras with AI processing capabilities includes small size, low cost, low power and ease of installation. However, conventional CMOS Image Sensors (CIS) only output the raw data of the captured image. Therefore, when developing a smart camera that has AI processing capabilities, it is necessary to utilize ICs that include an image signal processor (ISP), CNN processing, DRAM and so on. Unfortunately, this results in a system solution that is larger, higher power and more costly.