
Showing papers on "Image sensor published in 2020"


Journal ArticleDOI
04 Mar 2020-Nature
TL;DR: It is demonstrated that an image sensor can itself constitute an artificial neural network (ANN) that simultaneously senses and processes optical images without latency, and the sensor is trained to classify and encode images with high throughput.
Abstract: Machine vision technology has taken huge leaps in recent years, and is now becoming an integral part of various intelligent systems, including autonomous vehicles and robotics. Usually, visual information is captured by a frame-based camera, converted into a digital format and processed afterwards using a machine-learning algorithm such as an artificial neural network (ANN)1. The large amount of (mostly redundant) data passed through the entire signal chain, however, results in low frame rates and high power consumption. Various visual data preprocessing techniques have thus been developed2-7 to increase the efficiency of the subsequent signal processing in an ANN. Here we demonstrate that an image sensor can itself constitute an ANN that can simultaneously sense and process optical images without latency. Our device is based on a reconfigurable two-dimensional (2D) semiconductor8,9 photodiode10-12 array, and the synaptic weights of the network are stored in a continuously tunable photoresponsivity matrix. We demonstrate both supervised and unsupervised learning and train the sensor to classify and encode images that are optically projected onto the chip with a throughput of 20 million bins per second.
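The computation here is, at heart, an analog matrix-vector product: each pixel's programmable photoresponsivity plays the role of a synaptic weight, and the summed photocurrents along the output lines are the network activations. A minimal numerical sketch of that idea (dimensions, values, and names are illustrative assumptions, not taken from the paper):

```python
import numpy as np

# Illustrative photocurrent-domain classifier: each of N pixels has one
# programmable responsivity per output neuron (the "synaptic weight").
rng = np.random.default_rng(0)
n_pixels, n_classes = 27, 3                               # assumed sizes, for illustration only
R = rng.normal(scale=1e-3, size=(n_classes, n_pixels))    # responsivity matrix [A/W]

def sensor_forward(optical_power):
    """Summed photocurrent per output line = R @ P (done by the device itself in hardware)."""
    return R @ optical_power                               # currents [A]

P = rng.uniform(0.0, 1e-3, size=n_pixels)                  # incident optical power per pixel [W]
currents = sensor_forward(P)
predicted_class = int(np.argmax(currents))
print(currents, predicted_class)
```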

436 citations


Journal ArticleDOI
TL;DR: Flat optics for direct image differentiation are demonstrated, allowing the required optical system size to be shrunk significantly and reducing the complexity of conventional optical systems.
Abstract: Image processing has become a critical technology in a variety of science and engineering disciplines. Although most image processing is performed digitally, optical analog processing has the advantages of being low-power and high-speed, but it requires a large volume. Here, we demonstrate flat optics for direct image differentiation, allowing us to significantly shrink the required optical system size. We first demonstrate how the differentiator can be combined with traditional imaging systems such as a commercial optical microscope and camera sensor for edge detection with a numerical aperture up to 0.32. We next demonstrate how the entire processing system can be realized as a monolithic compound flat optic by integrating the differentiator with a metalens. The compound nanophotonic system manifests the advantage of a thin form factor as well as the ability to implement complex transfer functions, and could open new opportunities in applications such as biological imaging and computer vision. Vertical integration with a metalens thus realizes compound nanophotonic systems for optical analog image processing, significantly reducing the size and complexity of conventional optical systems.
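In digital terms, a second-order optical differentiator applies a transfer function proportional to the squared spatial frequency. The sketch below reproduces the equivalent Fourier-domain Laplacian filtering purely to illustrate the operation the flat optic performs analogically; it is not the authors' design code, and the test image is made up:

```python
import numpy as np

def laplacian_edge(image):
    """Digital analogue of a 2D second-order optical differentiator:
    multiply the spectrum by -(fx^2 + fy^2) and transform back."""
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    H = -(fx**2 + fy**2)                       # isotropic Laplacian transfer function
    return np.real(np.fft.ifft2(np.fft.fft2(image) * H))

img = np.zeros((128, 128)); img[32:96, 32:96] = 1.0   # a bright square
edges = laplacian_edge(img)                            # response concentrated at the boundary
```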

256 citations


Book ChapterDOI
27 Apr 2020
TL;DR: This paper proposes 3D-CVF, which combines camera and LiDAR features using a cross-view spatial feature fusion strategy and achieves state-of-the-art performance on the KITTI benchmark.
Abstract: In this paper, we propose a new deep architecture for fusing camera and LiDAR sensors for 3D object detection. Because the camera and LiDAR sensor signals have different characteristics and distributions, fusing these two modalities is expected to improve both the accuracy and robustness of 3D object detection. One of the challenges presented by the fusion of cameras and LiDAR is that the spatial feature maps obtained from each modality are represented by significantly different views in the camera and world coordinates; hence, it is not an easy task to combine two heterogeneous feature maps without loss of information. To address this problem, we propose a method called 3D-CVF that combines the camera and LiDAR features using the cross-view spatial feature fusion strategy. First, the method employs auto-calibrated projection, to transform the 2D camera features to a smooth spatial feature map with the highest correspondence to the LiDAR features in the bird’s eye view (BEV) domain. Then, a gated feature fusion network is applied to use the spatial attention maps to mix the camera and LiDAR features appropriately according to the region. Next, camera-LiDAR feature fusion is also achieved in the subsequent proposal refinement stage. The low-level LiDAR features and camera features are separately pooled using region of interest (RoI)-based feature pooling and fused with the joint camera-LiDAR features for enhanced proposal refinement. Our evaluation, conducted on the KITTI and nuScenes 3D object detection datasets, demonstrates that the camera-LiDAR fusion offers significant performance gain over the LiDAR-only baseline and that the proposed 3D-CVF achieves state-of-the-art performance in the KITTI benchmark.

231 citations


Proceedings ArticleDOI
14 Jun 2020
TL;DR: The HDR-to-LDR image formation pipeline is modeled as dynamic range clipping, non-linear mapping from a camera response function, and quantization, and three specialized CNNs are learned to reverse these steps.
Abstract: Recovering a high dynamic range (HDR) image from a single low dynamic range (LDR) input image is challenging due to missing details in under-/over-exposed regions caused by quantization and saturation of camera sensors. In contrast to existing learning-based methods, our core idea is to incorporate the domain knowledge of the LDR image formation pipeline into our model. We model the HDR-to-LDR image formation pipeline as the (1) dynamic range clipping, (2) non-linear mapping from a camera response function, and (3) quantization. We then propose to learn three specialized CNNs to reverse these steps. By decomposing the problem into specific sub-tasks, we impose effective physical constraints to facilitate the training of individual sub-networks. Finally, we jointly fine-tune the entire model end-to-end to reduce error accumulation. With extensive quantitative and qualitative experiments on diverse image datasets, we demonstrate that the proposed method performs favorably against state-of-the-art single-image HDR reconstruction algorithms.
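The three modeled stages are easy to state concretely. Below is a hedged sketch of the forward HDR-to-LDR simulator implied by the abstract; the gamma-style response curve and 8-bit quantization are assumptions for illustration, while the paper's contribution is learning CNNs that invert each stage:

```python
import numpy as np

def hdr_to_ldr(hdr, gamma=1/2.2, bits=8):
    """Toy forward model: (1) dynamic range clipping, (2) camera response
    function (here an assumed gamma curve), (3) quantization."""
    clipped = np.clip(hdr, 0.0, 1.0)                 # (1) sensor saturation
    mapped = clipped ** gamma                        # (2) assumed non-linear CRF
    levels = 2 ** bits - 1
    return np.round(mapped * levels) / levels        # (3) quantization to 8 bits

hdr = np.linspace(0, 4, 5)        # radiance values, some above the sensor's range
print(hdr_to_ldr(hdr))
```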

167 citations


Proceedings ArticleDOI
13 Feb 2020
TL;DR: PyNET is presented, a novel pyramidal CNN architecture designed for fine-grained image restoration that implicitly learns to perform all ISP steps such as image demosaicing, denoising, white balancing, color and contrast correction, etc.
Abstract: As the popularity of mobile photography is growing constantly, lots of efforts are being invested now into building complex hand-crafted camera ISP solutions. In this work, we demonstrate that even the most sophisticated ISP pipelines can be replaced with a single end-to-end deep learning model trained without any prior knowledge about the sensor and optics used in a particular device. For this, we present PyNET, a novel pyramidal CNN architecture designed for fine-grained image restoration that implicitly learns to perform all ISP steps such as image demosaicing, denoising, white balancing, color and contrast correction, demoireing, etc. The model is trained to convert RAW Bayer data obtained directly from mobile camera sensor into photos captured with a professional high-end DSLR camera, making the solution independent of any particular mobile ISP implementation. To validate the proposed approach on the real data, we collected a large-scale dataset consisting of 10 thousand full-resolution RAW-RGB image pairs captured in the wild with the Huawei P20 cameraphone (12.3 MP Sony Exmor IMX380 sensor) and Canon 5D Mark IV DSLR. The experiments demonstrate that the proposed solution can easily get to the level of the embedded P20's ISP pipeline that, unlike our approach, is combining the data from two (RGB + B/W) camera sensors. The dataset, pretrained models and codes used in this paper are available on the project website: https://people.ee.ethz.ch/~ihnatova/pynet.html.

156 citations


Journal ArticleDOI
20 Apr 2020
TL;DR: In this paper, a 1 Mpixel single-photon avalanche diode camera with 3.8 ns time gating and 24 kfps frame rate is presented, fabricated in 180 nm CMOS image sensor technology.
Abstract: We present a 1 Mpixel single-photon avalanche diode camera featuring 3.8 ns time gating and 24 kfps frame rate, fabricated in 180 nm CMOS image sensor technology. We designed two pixels with a pitch of 9.4 µm in 7T and 5.75T configurations respectively, achieving a maximum fill factor of 13.4%. The maximum photon detection probability is 27%, median dark count rate is 2.0 cps, variation in gating length is 120 ps, position skew is 410 ps, and rise/fall time is <550 ps, all FWHM at 3.3 V excess bias. The sensor was used to capture 2D/3D scenes over 2 m with resolution (least significant bit) of 5.4 mm and precision better than 7.8 mm (rms). We demonstrate extended dynamic range in dual exposure operation mode and show spatially overlapped multi-object detection in single-photon time-gated time-of-flight experiments.
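The quoted 5.4 mm least-significant bit maps directly to a time-bin width through the round-trip relation d = c*t/2. A quick arithmetic check (only the LSB value is taken from the abstract):

```python
c = 3.0e8                       # speed of light [m/s]
lsb_depth = 5.4e-3              # quoted depth LSB [m]
bin_width = 2 * lsb_depth / c   # time-bin width implied by the LSB
print(f"{bin_width*1e12:.0f} ps per bin")   # ~36 ps
```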

156 citations


Journal ArticleDOI
TL;DR: The curved neuromorphic image sensor array integrated with a plano-convex lens derives a pre-processed image from a set of noisy optical inputs without redundant data storage, processing, and communications as well as without complex optics.
Abstract: Conventional imaging and recognition systems require an extensive amount of data storage, pre-processing, and chip-to-chip communications as well as aberration-proof light focusing with multiple lenses for recognizing an object from massive optical inputs. This is because separate chips (i.e., flat image sensor array, memory device, and CPU) in conjunction with complicated optics should capture, store, and process massive image information independently. In contrast, human vision employs a highly efficient imaging and recognition process. Here, inspired by the human visual recognition system, we present a novel imaging device for efficient image acquisition and data pre-processing by conferring the neuromorphic data processing function on a curved image sensor array. The curved neuromorphic image sensor array is based on a heterostructure of MoS2 and poly(1,3,5-trimethyl-1,3,5-trivinyl cyclotrisiloxane). The curved neuromorphic image sensor array features photon-triggered synaptic plasticity owing to its quasi-linear time-dependent photocurrent generation and prolonged photocurrent decay, originating from charge trapping in the MoS2-organic vertical stack. The curved neuromorphic image sensor array integrated with a plano-convex lens derives a pre-processed image from a set of noisy optical inputs without redundant data storage, processing, and communications as well as without complex optics. The proposed imaging device can substantially improve the efficiency of the image acquisition and recognition process, a step toward next-generation machine vision. Designing an efficient bio-inspired visual recognition system remains a challenge. Here the authors present a curved neuromorphic image sensor array based on a heterostructure of MoS2 and pV3D3 integrated with a plano-convex lens for efficient image acquisition and data pre-processing.

118 citations


Proceedings ArticleDOI
TL;DR: CenterFusion first uses a center point detection network to detect objects by identifying their center points on the image, and then solves the key data association problem using a novel frustum-based method that associates radar detections with their corresponding object's center point.
Abstract: The perception system in autonomous vehicles is responsible for detecting and tracking the surrounding objects. This is usually done by taking advantage of several sensing modalities to increase robustness and accuracy, which makes sensor fusion a crucial part of the perception system. In this paper, we focus on the problem of radar and camera sensor fusion and propose a middle-fusion approach to exploit both radar and camera data for 3D object detection. Our approach, called CenterFusion, first uses a center point detection network to detect objects by identifying their center points on the image. It then solves the key data association problem using a novel frustum-based method to associate the radar detections to their corresponding object's center point. The associated radar detections are used to generate radar-based feature maps to complement the image features, and regress to object properties such as depth, rotation and velocity. We evaluate CenterFusion on the challenging nuScenes dataset, where it improves the overall nuScenes Detection Score (NDS) of the state-of-the-art camera-based algorithm by more than 12%. We further show that CenterFusion significantly improves the velocity estimation accuracy without using any additional temporal information. The code is available at this https URL .
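The frustum association step can be pictured as a geometric test: a radar detection becomes a candidate for an object if, projected through the camera intrinsics, it lands inside the object's 2D bounding box. A hedged numpy sketch of such a test (intrinsics and coordinates are invented; this simplified 2D-box check is not the authors' implementation):

```python
import numpy as np

def in_frustum(radar_xyz, box_2d, K):
    """Return True if a radar point (camera coordinates, metres) projects
    inside the 2D box [x1, y1, x2, y2] of an image-space detection."""
    x, y, z = radar_xyz
    if z <= 0:                      # behind the camera
        return False
    u, v, w = K @ np.array([x, y, z])
    u, v = u / w, v / w             # perspective projection to pixel coordinates
    x1, y1, x2, y2 = box_2d
    return x1 <= u <= x2 and y1 <= v <= y2

K = np.array([[1000.0, 0, 640], [0, 1000.0, 360], [0, 0, 1]])  # assumed intrinsics
print(in_frustum(np.array([1.0, 0.2, 20.0]), (600, 300, 700, 420), K))
```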

111 citations


Proceedings ArticleDOI
01 Mar 2020
TL;DR: A novel neural network architecture for video reconstruction from events is proposed that is smaller (38k vs. 10M parameters) and faster than the state of the art, with minimal impact on performance.
Abstract: Event cameras are powerful new sensors able to capture high dynamic range with microsecond temporal resolution and no motion blur. Their strength is detecting brightness changes (called events) rather than capturing direct brightness images; however, algorithms can be used to convert events into usable image representations for applications such as classification. Previous works rely on hand-crafted spatial and temporal smoothing techniques to reconstruct images from events. State-of-the-art video reconstruction has recently been achieved using neural networks that are large (10M parameters) and computationally expensive, requiring 30ms for a forward-pass at 640 × 480 resolution on a modern GPU. We propose a novel neural network architecture for video reconstruction from events that is smaller (38k vs. 10M parameters) and faster (10ms vs. 30ms) than state-of-the-art with minimal impact to performance.
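Before any such network can run, the asynchronous event stream has to be binned into a dense tensor. A minimal sketch of the common voxel-grid encoding (a standard representation assumed here for illustration, not necessarily the paper's exact input format):

```python
import numpy as np

def events_to_voxel_grid(xs, ys, ts, ps, n_bins, height, width):
    """Accumulate polarity-signed events into n_bins temporal slices."""
    grid = np.zeros((n_bins, height, width), dtype=np.float32)
    t_norm = (ts - ts.min()) / max(ts.max() - ts.min(), 1e-9)   # map timestamps to [0, 1]
    bins = np.minimum((t_norm * n_bins).astype(int), n_bins - 1)
    np.add.at(grid, (bins, ys, xs), np.where(ps > 0, 1.0, -1.0))
    return grid

xs = np.array([10, 11, 12]); ys = np.array([5, 5, 6])
ts = np.array([0.0, 0.5, 1.0]); ps = np.array([1, -1, 1])
print(events_to_voxel_grid(xs, ys, ts, ps, n_bins=5, height=32, width=32).shape)
```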

111 citations


Journal ArticleDOI
TL;DR: In this article, the authors derive transfer function models that account for the main physical effects limiting a lens-free on-chip digital holographic microscopy (LFOCDHM) system and analyze how these effects jointly constrain its imaging resolution.
Abstract: Lens-free on-chip digital holographic microscopy (LFOCDHM) is a modern imaging technique whereby the sample is placed directly onto or very close to the digital sensor, and illuminated by a partially coherent source located far above it. The scattered object wave interferes with the reference (unscattered) wave at the plane where the digital sensor is situated, producing a digital hologram that can be processed in several ways to extract and numerically reconstruct an in-focus image using the back-propagation algorithm. Without requiring any lenses or other intermediate optical components, LFOCDHM has the unique advantage of offering a large effective numerical aperture (NA) close to unity across the native wide field-of-view (FOV) of the imaging sensor in a cost-effective and compact design. However, unlike conventional coherent diffraction-limited imaging systems, where the limiting aperture is used to define the system performance, typical lens-free microscopes only produce a compromised imaging resolution that falls far below the ideal coherent diffraction limit. At least five major factors may contribute to this limitation, namely, the sample-to-sensor distance, the spatial and temporal coherence of the illumination, the finite size of the equally spaced sensor pixels, and the finite extent of the image sub-FOV used for the reconstruction, which have not been systematically and rigorously explored until now. In this article, we derive five transfer function models that account for all these physical effects and analyze how they jointly limit the imaging resolution of LFOCDHM. We also examine how our theoretical models can be utilized to optimize the optical design or predict the theoretical resolution limit of a given LFOCDHM system. We present a series of simulations and experiments to confirm the validity of our theoretical models.
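The back-propagation step mentioned above is typically implemented with the angular-spectrum method: the recorded hologram's spectrum is multiplied by the free-space transfer function for the sample-to-sensor distance and transformed back. A hedged sketch with placeholder wavelength, pixel pitch, and propagation distance:

```python
import numpy as np

def angular_spectrum_backprop(hologram, wavelength, pixel_pitch, z):
    """Numerically propagate a recorded hologram back by distance z (metres)."""
    ny, nx = hologram.shape
    fx = np.fft.fftfreq(nx, d=pixel_pitch)[None, :]
    fy = np.fft.fftfreq(ny, d=pixel_pitch)[:, None]
    arg = 1.0 - (wavelength * fx) ** 2 - (wavelength * fy) ** 2
    kz = 2 * np.pi / wavelength * np.sqrt(np.maximum(arg, 0.0))   # evanescent cut-off
    H = np.exp(-1j * kz * z)                                      # back-propagation kernel
    return np.fft.ifft2(np.fft.fft2(hologram) * H)

holo = np.random.rand(256, 256)                     # stand-in for a recorded hologram
field = angular_spectrum_backprop(holo, 532e-9, 1.67e-6, 600e-6)
```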

103 citations


Journal ArticleDOI
TL;DR: In this article, the authors demonstrate the first large-scale coherent detector array, consisting of 512 pixels, and its operation in a 3D imaging system, achieving an accuracy of 3.1 mm at a distance of 75 m using only 4 mW of light.
Abstract: Accurate 3D imaging is essential for machines to map and interact with the physical world. While numerous 3D imaging technologies exist, each addressing niche applications with varying degrees of success, none have achieved the breadth of applicability and impact that digital image sensors have achieved in the 2D imaging world. A large-scale two-dimensional array of coherent detector pixels operating as a light detection and ranging (LiDAR) system could serve as a universal 3D imaging platform. Such a system would offer high depth accuracy and immunity to interference from sunlight, as well as the ability to directly measure the velocity of moving objects. However, due to difficulties in providing electrical and photonic connections to every pixel, previous systems have been restricted to fewer than 20 pixels. Here, we demonstrate the first large-scale coherent detector array consisting of 512 (32 × 16) pixels, and its operation in a 3D imaging system. Leveraging recent advances in the monolithic integration of photonic and electronic circuits, a dense array of optical heterodyne detectors is combined with an integrated electronic readout architecture, enabling straightforward scaling to arbitrarily large arrays. Meanwhile, two-axis solid-state beam steering eliminates any tradeoff between field of view and range. Operating at the quantum noise limit, our system achieves an accuracy of 3.1 mm at a distance of 75 metres using only 4 mW of light, an order of magnitude more accurate than existing solid-state systems at such ranges. Future reductions of pixel size using state-of-the-art components could yield resolutions in excess of 20 megapixels for arrays the size of a consumer camera sensor. This result paves the way for the development and proliferation of low cost, compact, and high performance 3D imaging cameras.
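If the coherent pixels perform frequency-modulated continuous-wave (FMCW) ranging — a common choice for heterodyne LiDAR, though the abstract does not specify the modulation — the range follows from the measured beat frequency. A worked-arithmetic sketch with assumed chirp parameters:

```python
c = 3.0e8          # speed of light [m/s]
B = 1.0e9          # assumed chirp bandwidth [Hz]
T = 10e-6          # assumed chirp duration [s]
f_beat = 50e6      # example measured beat frequency [Hz]

range_resolution = c / (2 * B)            # ~0.15 m for a 1 GHz chirp
target_range = c * f_beat * T / (2 * B)   # ~75 m for this beat frequency
print(range_resolution, target_range)
```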

Journal ArticleDOI
TL;DR: A decisive advance in 2D integrated circuits is reported, where the device integration scale is increased by tenfold and the functional complexity of 2D electronics is propelled to an unprecedented level.
Abstract: 2D semiconductors, especially transition metal dichalcogenide (TMD) monolayers, are extensively studied for electronic and optoelectronic applications. Beyond intensive studies on single transistors and photodetectors, the recent advent of large-area synthesis of these atomically thin layers has paved the way for 2D integrated circuits, such as digital logic circuits and image sensors, achieving an integration level of ≈100 devices thus far. Here, a decisive advance in 2D integrated circuits is reported, where the device integration scale is increased by tenfold and the functional complexity of 2D electronics is propelled to an unprecedented level. Concretely, an analog optoelectronic processor inspired by biological vision is developed, where 32 × 32 = 1024 MoS2 photosensitive field-effect transistors manifesting persistent photoconductivity (PPC) effects are arranged in a crossbar array. This optoelectronic processor with PPC memory mimics two core functions of human vision: it captures and stores an optical image into electrical data, like the eye and optic nerve chain, and then recognizes this electrical form of the captured image, like the brain, by executing analog in-memory neural net computing. In the highlight demonstration, the MoS2 FET crossbar array optically images 1000 handwritten digits and electrically recognizes these imaged data with 94% accuracy.
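Analog in-memory inference on a crossbar reduces to Kirchhoff summation: each column current is the dot product of the row inputs with that column's stored states. A schematic numpy illustration of this principle (the array size matches the 32 × 32 = 1024 photo-FETs, while the 10 digit classes, conductance range, and voltages are assumptions rather than a device model from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)
n_rows, n_cols = 1024, 10                             # 32x32 crosspoints, assumed 10 digit classes
G = rng.uniform(1e-6, 1e-4, size=(n_rows, n_cols))    # stored conductance states [S]
V = rng.uniform(0.0, 0.5, size=n_rows)                # read voltages encoding the captured image

I_col = V @ G                 # Kirchhoff summation: one analog multiply-accumulate per crosspoint
predicted_digit = int(np.argmax(I_col))
print(predicted_digit)
```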

20 Mar 2020
TL;DR: Spatially overlapped multi-object detection is experimentally demonstrated in single-photon time-gated ToF for the first time and extended dynamic range is demonstrated in dual exposure operation mode.
Abstract: We present the first 1 Mpixel SPAD camera ever reported. The camera features 3.8 ns time gating and 24 kfps frame rate; it was fabricated in 180 nm CIS technology. Two pixels have been designed with a pitch of 9.4 µm in 7T and 5.75T configurations, respectively, achieving a maximum fill factor of 13.4%. The maximum PDP is 27%, median DCR 2.0 cps, variation in gating length 120 ps, position skew 410 ps, and rise/fall time <550 ps, all FWHM at 3.3 V of excess bias. The sensor was used to capture 2D/3D scenes over 2 m with an LSB of 5.4 mm and a precision better than 7.8 mm. Extended dynamic range is demonstrated in dual exposure operation mode. Spatially overlapped multi-object detection is experimentally demonstrated in single-photon time-gated ToF for the first time.

Journal ArticleDOI
TL;DR: This paper proposes a novel, compact, and inexpensive computational camera for snapshot hyperspectral imaging that consists of a repeated spectral filter array placed directly on the image sensor and a diffuser placed close to the sensor.
Abstract: Hyperspectral imaging is useful for applications ranging from medical diagnostics to agricultural crop monitoring; however, traditional scanning hyperspectral imagers are prohibitively slow and expensive for widespread adoption. Snapshot techniques exist but are often confined to bulky benchtop setups or have low spatio-spectral resolution. In this paper, we propose a novel, compact, and inexpensive computational camera for snapshot hyperspectral imaging. Our system consists of a tiled spectral filter array placed directly on the image sensor and a diffuser placed close to the sensor. Each point in the world maps to a unique pseudorandom pattern on the spectral filter array, which encodes multiplexed spatio-spectral information. By solving a sparsity-constrained inverse problem, we recover the hyperspectral volume with sub-super-pixel resolution. Our hyperspectral imaging framework is flexible and can be designed with contiguous or non-contiguous spectral filters that can be chosen for a given application. We provide theory for system design, demonstrate a prototype device, and present experimental results with high spatio-spectral resolution.
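The sparsity-constrained inverse problem is the usual l1-regularized least squares; a few iterations of ISTA against a known forward model already convey the idea. The sketch below uses a random stand-in matrix and a synthetic sparse signal, not the paper's calibrated diffuser/filter model:

```python
import numpy as np

def ista(A, y, lam=0.05, n_iter=200):
    """Minimise 0.5*||A x - y||^2 + lam*||x||_1 by iterative soft-thresholding."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        z = x - grad / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)   # soft threshold
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(64, 256))             # stand-in for the multiplexed forward model
x_true = np.zeros(256); x_true[[3, 80, 200]] = [1.0, -0.5, 0.8]
x_rec = ista(A, A @ x_true)
```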

Journal ArticleDOI
01 Sep 2020
TL;DR: In this article, an aquatic-vision-inspired camera that consists of a single monocentric lens and a hemispherical silicon nanorod photodiode array is presented.
Abstract: Conventional wide-field-of-view cameras consist of multi-lens optics and flat image sensor arrays, which makes them bulky and heavy. As a result, they are poorly suited to advanced mobile applications such as drones and autonomous vehicles. In nature, the eyes of aquatic animals consist of a single spherical lens and a highly sensitive hemispherical retina, an approach that could be beneficial in the development of synthetic wide-field-of-view imaging systems. Here, we report an aquatic-vision-inspired camera that consists of a single monocentric lens and a hemispherical silicon nanorod photodiode array. The imaging system features a wide field of view, miniaturized design, low optical aberration, deep depth of field and simple visual accommodation. Furthermore, under vignetting, the photodiode array enables high-quality panoramic imaging due to the enhanced photodetection properties of the silicon nanorod photodiodes. By integrating a single monocentric lens with a hemispherical silicon nanorod photodiode array, a wide-field-of-view camera is created that offers low optical aberration, deep depth of field and simple visual accommodation.

Journal ArticleDOI
20 Apr 2020
TL;DR: In this article, 3D dielectric elements are designed to be placed on top of the pixels of image sensors, that sort and focus light based on its color and polarization with efficiency significantly surpassing 2D absorptive and diffractive filters.
Abstract: Three-dimensional elements, with refractive index distribution structured at subwavelength scale, provide an expansive optical design space that can be harnessed for demonstrating multifunctional free-space optical devices. Here we present 3D dielectric elements, designed to be placed on top of the pixels of image sensors, that sort and focus light based on its color and polarization with efficiency significantly surpassing 2D absorptive and diffractive filters. The devices are designed via iterative gradient-based optimization to account for multiple target functions while ensuring compatibility with existing nanofabrication processes, and they are experimentally validated using a scaled device that operates at microwave frequencies. This approach combines arbitrary functions into a single compact element, even where there is no known equivalent in bulk optics, enabling novel integrated photonic applications.

Journal ArticleDOI
TL;DR: This work introduces neural sensors as a methodology to optimize per-pixel shutter functions jointly with a differentiable image processing method, such as a neural network, in an end-to-end fashion and demonstrates how to leverage emerging programmable and re-configurable sensor–processors to implement the optimized exposure functions directly on the sensor.
Abstract: Camera sensors rely on global or rolling shutter functions to expose an image. This fixed function approach severely limits the sensors’ ability to capture high-dynamic-range (HDR) scenes and resolve high-speed dynamics. Spatially varying pixel exposures have been introduced as a powerful computational photography approach to optically encode irradiance on a sensor and computationally recover additional information of a scene, but existing approaches rely on heuristic coding schemes and bulky spatial light modulators to optically implement these exposure functions. Here, we introduce neural sensors as a methodology to optimize per-pixel shutter functions jointly with a differentiable image processing method, such as a neural network, in an end-to-end fashion. Moreover, we demonstrate how to leverage emerging programmable and re-configurable sensor–processors to implement the optimized exposure functions directly on the sensor. Our system takes specific limitations of the sensor into account to optimize physically feasible optical codes and we evaluate its performance for snapshot HDR and high-speed compressive imaging both in simulation and experimentally with real scenes.
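The optical encoding being optimized can be emulated as a per-pixel shutter mask applied across the sub-frames of a fast scene and summed into one coded snapshot; a decoder is then trained through this differentiable forward model. A hedged sketch of that forward model only (the learned codes and the reconstruction network are the paper's contribution and are not reproduced here):

```python
import numpy as np

def coded_snapshot(video, shutter_mask):
    """video: (T, H, W) irradiance sub-frames; shutter_mask: (T, H, W) in [0, 1].
    Returns the single coded exposure the sensor would read out."""
    return (video * shutter_mask).sum(axis=0)

T, H, W = 8, 64, 64
rng = np.random.default_rng(0)
video = rng.uniform(size=(T, H, W))
mask = (rng.uniform(size=(T, H, W)) > 0.5).astype(float)   # illustrative binary per-pixel code
measurement = coded_snapshot(video, mask)
```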

Journal ArticleDOI
TL;DR: This article introduces a highly reliable and low-complexity image compression scheme using neighborhood correlation sequence (NCS) algorithm that increases the compression performance and decreases the energy utilization of the sensor nodes with high fidelity.
Abstract: Recently, advancements in the field of wireless technologies and micro-electro-mechanical systems have led to the development of potential applications in wireless sensor networks (WSNs). The visual sensors in a WSN have a significant impact on computer vision based applications such as pattern recognition and image restoration, and they generate a massive quantity of multimedia data. Since transmission of images consumes considerable computational resources, various image compression techniques have been proposed. However, most existing image compression techniques are not applicable to sensor nodes due to their limitations on energy, bandwidth, memory, and processing capabilities. In this article, we introduce a highly reliable and low-complexity image compression scheme using the neighborhood correlation sequence (NCS) algorithm. The NCS algorithm performs a bit reduction operation, and the result is then encoded by a codec (such as PPM, Deflate, or the Lempel-Ziv-Markov chain algorithm) to further compress the image. The proposed NCS algorithm increases the compression performance and decreases the energy utilization of the sensor nodes with high fidelity. Moreover, it achieved a minimum end-to-end delay of 1074.46 ms at an average bit rate of 4.40 bpp and a peak signal-to-noise ratio of 48.06 on the applied test images. Compared with state-of-the-art methods, the proposed method maintains a better tradeoff between compression efficiency and reconstructed image quality.

Journal ArticleDOI
20 Oct 2020
TL;DR: A high-speed 3D imaging system enabled by a state-of-the-art SPAD sensor used in a hybrid imaging mode that can perform multi-event histogramming and guided upscaling of depth data from a native resolution of 64×32 to 256×128 is reported.
Abstract: Imaging systems with temporal resolution play a vital role in a diverse range of scientific, industrial, and consumer applications, e.g., fluorescent lifetime imaging in microscopy and time-of-flight (ToF) depth sensing in autonomous vehicles. In recent years, single-photon avalanche diode (SPAD) arrays with picosecond timing capabilities have emerged as a key technology driving these systems forward. Here we report a high-speed 3D imaging system enabled by a state-of-the-art SPAD sensor used in a hybrid imaging mode that can perform multi-event histogramming. The hybrid imaging modality alternates between photon counting and timing frames at rates exceeding 1000 frames per second, enabling guided upscaling of depth data from a native resolution of 64×32 to 256×128. The combination of hardware and processing allows us to demonstrate high-speed ToF 3D imaging in outdoor conditions and with low latency. The results indicate potential in a range of applications where real-time, high throughput data are necessary. One such example is improving the accuracy and speed of situational awareness in autonomous systems and robotics.

Journal ArticleDOI
TL;DR: A novel fusion framework for multimodal neurological images, which is able to capture small-scale details of input images with original structural details and is superior to several other approaches as it produces better visually fused images with improved computational measures.
Abstract: Multimodal medical image sensor fusion (MMISF) plays a significant role in better visualization of the diagnostic statistics computed by integrating the vital information taken from input source images acquired using multimodal imaging sensors. MMISF also helps medical professionals in the precise diagnosis of several critical diseases and their treatment. Often, images taken from different imaging sensors are degraded by noise interference during acquisition or data transmission, which leads to the false perception of noise as a useful feature of the image. This paper presents a novel fusion framework for multimodal neurological images, which is able to capture small-scale details of input images together with their original structural details. In its procedural steps, the source images are first decomposed by the nonsubsampled shearlet transform (NSST) into a low-frequency (lf) component and several high-frequency (hf) components to separate out the two basic characteristics of the source image, i.e., principal information and edge details. The lf layers are fused with a sparse representation-based model, and the hf components are merged by a guided filtering-based approach. Finally, fused images are reconstructed by employing the inverse NSST. The superiority of the proposed MMISF approach is confirmed by extensive analytical experimentation on different real magnetic resonance-single-photon emission computed tomography, magnetic resonance-positron emission tomography, and computed tomography-magnetic resonance neurological image datasets. Based on all these experimental results, it is stated that the proposed MMISF approach is superior to several other approaches, as it produces better visually fused images with improved computational measures.

Proceedings ArticleDOI
01 May 2020
TL;DR: This work presents an approach for estimating the pose of an external camera with respect to a robot using a single RGB image of the robot, capable of computing the camera extrinsics from a single frame, thus opening the possibility of on-line calibration.
Abstract: We present an approach for estimating the pose of an external camera with respect to a robot using a single RGB image of the robot. The image is processed by a deep neural network to detect 2D projections of keypoints (such as joints) associated with the robot. The network is trained entirely on simulated data using domain randomization to bridge the reality gap. Perspective-n-point (PnP) is then used to recover the camera extrinsics, assuming that the camera intrinsics and joint configuration of the robot manipulator are known. Unlike classic hand-eye calibration systems, our method does not require an off-line calibration step. Rather, it is capable of computing the camera extrinsics from a single frame, thus opening the possibility of on-line calibration. We show experimental results for three different robots and camera sensors, demonstrating that our approach is able to achieve accuracy with a single frame that is comparable to that of classic off-line hand-eye calibration using multiple frames. With additional frames from a static pose, accuracy improves even further. Code, datasets, and pretrained models for three widely-used robot manipulators are made available.
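Given the network's 2D keypoint detections and the corresponding 3D keypoint positions from the robot's forward kinematics, the extrinsics recovery is a standard Perspective-n-Point solve. A minimal OpenCV sketch with placeholder correspondences and intrinsics (all numeric values are illustrative, not from the paper):

```python
import numpy as np
import cv2

# 3D keypoints in the robot base frame (e.g. from forward kinematics) and their
# detected 2D projections in the image; the values below are placeholders.
object_points = np.array([[0.0, 0.0, 0.0], [0.3, 0.0, 0.1], [0.3, 0.2, 0.4],
                          [0.0, 0.2, 0.6], [0.1, 0.1, 0.8], [0.2, 0.0, 0.9]], dtype=np.float64)
image_points = np.array([[320, 400], [410, 390], [430, 300],
                         [330, 250], [360, 200], [395, 180]], dtype=np.float64)
K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])   # assumed known intrinsics
dist = np.zeros(5)                                            # assume no lens distortion

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, dist)
R, _ = cv2.Rodrigues(rvec)        # rotation (camera from robot base); tvec is the translation
```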

Journal ArticleDOI
TL;DR: This ultrathin arrayed camera provides a novel and practical direction for diverse mobile, surveillance or medical applications and demonstrates that the multilayered pinhole of the MOE allows high-contrast imaging by eliminating the optical crosstalk between microlenses.
Abstract: Compound eyes found in insects provide intriguing sources of biological inspiration for miniaturised imaging systems. Here, we report an ultrathin arrayed camera inspired by insect eye structures for high-contrast and super-resolution imaging. The ultrathin camera features micro-optical elements (MOEs), i.e., inverted microlenses, multilayered pinhole arrays, and gap spacers on an image sensor. The MOE was fabricated by using repeated photolithography and thermal reflow. The fully packaged camera shows a total track length of 740 μm and a field-of-view (FOV) of 73°. The experimental results demonstrate that the multilayered pinhole of the MOE allows high-contrast imaging by eliminating the optical crosstalk between microlenses. The integral image reconstructed from array images clearly increases the modulation transfer function (MTF) by ~1.57 times compared to that of a single channel image in the ultrathin camera. This ultrathin arrayed camera provides a novel and practical direction for diverse mobile, surveillance or medical applications.

Journal ArticleDOI
TL;DR: This paper proposes a controller that makes use of the image information from an un-calibrated perspective camera mounted on the follower robot, without relative position measurement or any communication among the robots.
Abstract: Generally, vision-based controls use various camera sensors and require camera calibration, while the control performance degrades due to inaccurate calibration. Therefore, in this paper, the proposed controller only makes use of the image information from an un-calibrated perspective camera mounted on the follower robot, without relative position measurement or any communication among the robots. First, the nominal visual formation kinematic model is developed using the camera models. Then it is re-described as a quadratic programming (QP) problem with the specified constraints. A neurodynamic optimization based on a primal-dual neural network is utilized to ensure that the QP converges to the exact optimal values. Through two-time-scale neurodynamic optimization, the gain scheduling of the ancillary state feedback can be realized so that the state variables are constrained within an invariant designed tube. The experimental results verify the effectiveness of the proposed approach.

Journal ArticleDOI
TL;DR: The underlying technology, hardware, and algorithms of the SR300, as well as its calibration procedure, are described, and some use cases are outlined, which will provide a full case study of a mass-produced depth sensing product and technology.
Abstract: Intel® RealSense™ SR300 is a depth camera capable of providing a VGA-size depth map at 60 fps and 0.125mm depth resolution. In addition, it outputs an infrared VGA-resolution image and a 1080p color texture image at 30 fps. SR300 form-factor enables it to be integrated into small consumer products and as a front facing camera in laptops and Ultrabooks™. The SR300 depth camera is based on a coded-light technology where triangulation between projected patterns and images captured by a dedicated sensor is used to produce the depth map. Each projected line is coded by a special temporal optical code, that enables a dense depth map reconstruction from its reflection. The solid mechanical assembly of the camera allows it to stay calibrated throughout temperature and pressure changes, drops, and hits. In addition, active dynamic control maintains a calibrated depth output. An extended API LibRS released with the camera allows developers to integrate the camera in various applications. Algorithms for 3D scanning, facial analysis, hand gesture recognition, and tracking are within reach for applications using the SR300. In this paper, we describe the underlying technology, hardware, and algorithms of the SR300, as well as its calibration procedure, and outline some use cases. We believe that this paper will provide a full case study of a mass-produced depth sensing product and technology.

PatentDOI
TL;DR: This study suggests that the design principle of WISH, which combines optical modulators and computational algorithms to sense high-resolution optical fields, enables improved capabilities in many existing applications while revealing entirely new, hitherto unexplored application areas.
Abstract: A system for a wavefront imaging sensor with high resolution (WISH) comprises a spatial light modulator (SLM), a plurality of image sensors and a processor. The system further includes the SLM and a computational post-processing algorithm for recovering an incident wavefront with a high spatial resolution and a fine phase estimation. In addition, the image sensors work both in a visible electromagnetic (EM) spectrum and outside the visible EM spectrum.

Journal ArticleDOI
TL;DR: An on-chip, widefield fluorescence microscope is presented, which consists of a diffuser placed a few millimeters away from a traditional image sensor, enabling refocusability in post-processing and three-dimensional imaging of sparse samples from a single acquisition.
Abstract: We present an on-chip, widefield fluorescence microscope, which consists of a diffuser placed a few millimeters away from a traditional image sensor. The diffuser replaces the optics of a microscope, resulting in a compact and easy-to-assemble system with a practical working distance of over 1.5 mm. Furthermore, the diffuser encodes volumetric information, enabling refocusability in post-processing and three-dimensional (3D) imaging of sparse samples from a single acquisition. Reconstruction of images from the raw data requires a precise model of the system, so we introduce a practical calibration scheme and a physics-based forward model to efficiently account for the spatially-varying point spread function (PSF). To improve performance in low-light, we propose a random microlens diffuser, which consists of many small lenslets randomly placed on the mask surface and yields PSFs that are robust to noise. We build an experimental prototype and demonstrate our system on both planar and 3D samples.

Journal ArticleDOI
12 Sep 2020-Sensors
TL;DR: A Monte Carlo simulator developed in MATLAB® for the analysis of a Single Photon Avalanche Diode (SPAD)-based Complementary Metal-Oxide Semiconductor (CMOS) flash Light Detection and Ranging (LIDAR) system is presented.
Abstract: We present a Monte Carlo simulator developed in MATLAB® for the analysis of a Single Photon Avalanche Diode (SPAD)-based Complementary Metal-Oxide Semiconductor (CMOS) flash Light Detection and Ranging (LIDAR) system. The simulation environment has been developed to accurately model the components of a flash LIDAR system, such as the illumination source, the optics, and the architecture of the designated SPAD-based CMOS image sensor. Together with the modeling of the background noise and target topology, all of the fundamental factors involved in a typical LIDAR acquisition system have been included in order to predict the achievable system performance, and the predictions have been verified with an existing sensor.
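The heart of such a simulator is drawing photon detections per time bin from Poisson statistics — laser return plus background and dark counts — and histogramming them over many laser cycles. A stripped-down Python sketch of that idea with assumed parameters (the actual simulator is in MATLAB and models much more, e.g., the optics and the sensor architecture):

```python
import numpy as np

rng = np.random.default_rng(0)
c = 3.0e8
n_bins, bin_width = 200, 500e-12           # 500 ps bins -> 15 m unambiguous window
target_range = 12.0                        # assumed target distance [m]
signal_bin = int(2 * target_range / c / bin_width)

signal_rate = np.zeros(n_bins); signal_rate[signal_bin] = 0.5    # mean signal photons/bin/cycle
background_rate = np.full(n_bins, 0.02)                          # ambient + dark counts/bin/cycle

n_cycles = 1000
histogram = rng.poisson((signal_rate + background_rate) * n_cycles)   # accumulated TCSPC histogram
estimated_range = np.argmax(histogram) * bin_width * c / 2
print(estimated_range)
```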

Journal ArticleDOI
TL;DR: A method to eliminate background activity is proposed, along with performance indices for evaluating filter performance: noise in real (NIR) and real in noise (RIN).
Abstract: The dynamic vision sensor (DVS) is a new type of image sensor with application prospects in the fields of automobiles and robotics. Dynamic vision sensors are very different from traditional image sensors in terms of pixel principle and output data. Background activity (BA) in the data degrades image quality, but there is currently no unified indicator to evaluate the image quality of event streams. This paper proposes a method to eliminate background activity, along with an evaluation procedure and performance indices for filters: noise in real (NIR) and real in noise (RIN). The lower these values, the better the filter. This evaluation method does not require fixed-pattern generation equipment and can also evaluate filter performance on natural scenes. Comparative experiments with three filters show that the proposed method achieves the best overall performance. The method reduces the bandwidth required for DVS data transmission, lowers the computational cost of target extraction, and opens up the application of DVS in more fields.
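A common baseline for background-activity removal — and the kind of filter the NIR/RIN indices are meant to score — is the spatiotemporal nearest-neighbour test: keep an event only if a neighbouring pixel fired within a recent time window. A hedged sketch of that classic filter (not necessarily the filter proposed in the paper):

```python
import numpy as np

def nn_background_filter(events, height, width, dt=5000):
    """events: list of (t_us, x, y, polarity). Keep an event only if a pixel in its
    3x3 neighbourhood produced an event within the last dt microseconds."""
    last_ts = np.full((height, width), -np.inf)
    kept = []
    for t, x, y, p in events:
        y0, y1 = max(0, y - 1), min(height, y + 2)
        x0, x1 = max(0, x - 1), min(width, x + 2)
        if (t - last_ts[y0:y1, x0:x1]).min() <= dt:   # a recent neighbour exists
            kept.append((t, x, y, p))
        last_ts[y, x] = t
    return kept

events = [(0, 10, 10, 1), (1000, 11, 10, 1), (50000, 200, 5, -1)]
print(nn_background_filter(events, 260, 346))   # isolated events are dropped
```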

Journal ArticleDOI
TL;DR: This work investigates a simple, low-cost, and compact optical coding camera design that supports high-resolution image reconstructions from raw measurements with low pixel counts, and uses an end-to-end framework to simultaneously optimize the optical design and a reconstruction network for obtaining super-resolved images from raw measures.
Abstract: Single Photon Avalanche Photodiodes (SPADs) have recently received a lot of attention in imaging and vision applications due to their excellent performance in low-light conditions, as well as their ultra-high temporal resolution. Unfortunately, like many evolving sensor technologies, image sensors built around SPAD technology currently suffer from a low pixel count. In this work, we investigate a simple, low-cost, and compact optical coding camera design that supports high-resolution image reconstructions from raw measurements with low pixel counts. We demonstrate this approach for regular intensity imaging, depth imaging, as well as transient imaging. Our method uses an end-to-end framework to simultaneously optimize the optical design and a reconstruction network for obtaining super-resolved images from raw measurements. The optical design space is that of an engineered point spread function (implemented with diffractive optics), which can be considered an optimized anti-aliasing filter to preserve as much high-resolution information as possible despite imaging with a low pixel count, low fill-factor SPAD array. We further investigate a deep network for reconstruction. The effectiveness of this joint design and reconstruction approach is demonstrated for a range of different applications, including high-speed imaging, time-of-flight depth imaging, and transient imaging. While our work specifically focuses on low-resolution SPAD sensors, similar approaches should prove effective for other emerging image sensor technologies with low pixel counts and low fill-factors.

Journal ArticleDOI
Leonard C. Kogos, Yunzhe Li, Jianing Liu, Yuyu Li, Lei Tian, Roberto Paiella
TL;DR: A flat, lensless plasmonic image sensor array is developed that enables high-quality wide-angle imaging without curved geometry.
Abstract: The vision system of arthropods such as insects and crustaceans is based on the compound-eye architecture, consisting of a dense array of individual imaging elements (ommatidia) pointing along different directions. This arrangement is particularly attractive for imaging applications requiring extreme size miniaturization, wide-angle fields of view, and high sensitivity to motion. However, the implementation of cameras directly mimicking the eyes of common arthropods is complicated by their curved geometry. Here, we describe a lensless planar architecture, where each pixel of a standard image-sensor array is coated with an ensemble of metallic plasmonic nanostructures that only transmits light incident along a small geometrically-tunable distribution of angles. A set of near-infrared devices providing directional photodetection peaked at different angles is designed, fabricated, and tested. Computational imaging techniques are then employed to demonstrate the ability of these devices to reconstruct high-quality images of relatively complex objects. The compound eyes of arthropods provide a visual advantage by seeing a wide range of angles all at once, but cameras that mimic them are usually curved and bulky. Here, the authors develop a flat, plasmonic image sensor array that enables high-quality wide-angle vision without lenses or curvature.