Journal ArticleDOI

Resolution Enhancement in Multi-Image Stereo

TL;DR: This paper proposes an integrated approach to estimate the HR depth and the SR image from multiple LR stereo observations and demonstrates the efficacy of the proposed method in not only being able to bring out image details but also in enhancing the HR depth over its LR counterpart.
Abstract: Under stereo settings, the twin problems of image superresolution (SR) and high-resolution (HR) depth estimation are intertwined. The subpixel registration information required for image superresolution is tightly coupled to the 3D structure. The effects of parallax and pixel averaging (inherent in the downsampling process) preclude a priori estimation of pixel motion for superresolution. These factors also compound the correspondence problem at low resolution (LR), which in turn affects the quality of the LR depth estimates. In this paper, we propose an integrated approach to estimate the HR depth and the SR image from multiple LR stereo observations. Our results demonstrate the efficacy of the proposed method in not only being able to bring out image details but also in enhancing the HR depth over its LR counterpart.
Citations
Proceedings ArticleDOI
01 Dec 2012
TL;DR: The multiframe Super Resolution algorithm applied here is MForWarD, a fast two step algorithm that combines multiple noisy, blurry, low resolution images into a high quality, high resolution image.
Abstract: The low resolution images taken from a scene may contain crucial information that is barely visible to the eye. Super Resolution is the process of combining multiple noisy, blurry, low resolution images into a high quality, high resolution image. By registration, we fuse images of the same scene taken at different times and from different angles. Restoration and denoising of the fused images play a key role in Super Resolution. The multiframe Super Resolution algorithm applied here is MForWarD. It is a fast two step algorithm. First, Fourier-based Wiener filtering produces a sharp but noisy image. The next step uses Wavelet based denoising to remove noise artifacts. The algorithm is applied on several test images including remote sensing images and the results are presented.
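The two steps named in the abstract above (Fourier-domain Wiener filtering followed by wavelet denoising) can be sketched on a short 1D signal. This is a minimal illustration under stated assumptions, not the MForWarD implementation: the function names, the regularization constant `k`, the circular-blur model, the one-level Haar wavelet, and the soft-threshold rule are all hypothetical choices made for the example.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform; adequate for short demo signals."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * math.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def wiener_deconvolve(y, h, k=1e-2):
    """Step 1 (assumed form): X = conj(H) * Y / (|H|^2 + k), circular blur model."""
    n = len(y)
    h_padded = list(h) + [0.0] * (n - len(h))  # zero-pad the blur kernel to length n
    Y, H = dft(y), dft(h_padded)
    X = [Hk.conjugate() * Yk / (abs(Hk) ** 2 + k) for Hk, Yk in zip(H, Y)]
    return [v.real for v in dft(X, inverse=True)]


def haar_denoise(x, threshold):
    """Step 2 (assumed form): one-level Haar transform, soft-threshold details, invert.

    Expects an even-length signal.
    """
    r = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / r for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / r for i in range(0, len(x), 2)]
    # Soft thresholding shrinks each detail coefficient toward zero.
    detail = [math.copysign(max(abs(d) - threshold, 0.0), d) for d in detail]
    out = []
    for a, d in zip(approx, detail):
        out += [(a + d) / r, (a - d) / r]
    return out


def two_step_restore(y, h, k=1e-2, threshold=0.1):
    """Deconvolve first (sharp but noisy), then suppress noise in the wavelet domain."""
    return haar_denoise(wiener_deconvolve(y, h, k), threshold)
```

A larger `k` damps frequencies where the blur response is weak, trading sharpness for noise suppression; the wavelet step then removes part of the noise that the inverse filter amplified.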

4 citations


Cites background from "Resolution Enhancement in Multi-Ima..."

  • ...Super Resolution is the process of combining multiple noisy, blurry, low resolution images into a high quality, high resolution image....

    [...]

Journal ArticleDOI
21 Apr 2022
TL;DR: This paper proposes a novel Transformer-based model for SVSR, namely Trans-SVSR, which comprises two key novel components: a spatio-temporal convolutional self-attention layer and an optical flow-based feed-forward layer that discovers the correlation across different video frames and aligns the features.
Abstract: Stereo video super-resolution (SVSR) aims to enhance the spatial resolution of the low-resolution video by reconstructing the high-resolution video. The key challenges in SVSR are preserving the stereo-consistency and temporal-consistency, without which viewers may experience 3D fatigue. There are several notable works on stereoscopic image super-resolution, but there is little research on stereo video super-resolution. In this paper, we propose a novel Transformer-based model for SVSR, namely Trans-SVSR. Trans-SVSR comprises two key novel components: a spatio-temporal convolutional self-attention layer and an optical flow-based feed-forward layer that discovers the correlation across different video frames and aligns the features. The parallax attention mechanism (PAM) that uses the cross-view information to consider the significant disparities is used to fuse the stereo views. Due to the lack of a benchmark dataset suitable for the SVSR task, we collected a new stereoscopic video dataset, SVSR-Set, containing 71 full high-definition (HD) stereo videos captured using a professional stereo camera. Extensive experiments on the collected dataset, along with two other datasets, demonstrate that the Trans-SVSR can achieve competitive performance compared to the state-of-the-art methods. Project code and additional results are available at https://github.com/H-deep/Trans-SVSR/.

4 citations

Journal ArticleDOI
TL;DR: An integrated error concealment system for lost color frames and lost depth frames in multiview videos with depths is considered, and a pixel-based color error-concealment method with the use of depth information is proposed.
Abstract: In this paper, we consider an integrated error concealment system for lost color frames and lost depth frames in multiview videos with depths. We first propose a pixel-based color error-concealment method with the use of depth information. Instead of assuming that the same moving object in consecutive frames has minimal depth difference, as is done in a state-of-the-art method, a more realistic situation in which the same moving object in consecutive frames can be in different depths is considered. In the derived motion vector candidate set, we consider all the candidate motion vectors in the set, and weight the reference pixels by the depth differences to obtain the final recovered pixel. Compared with the two state-of-the-art methods, the proposed method has average peak signal-to-noise ratio gains of up to 8.73 and 3.98 dB, respectively. Second, we propose an iterative depth frame error-concealment method. The initial recovered depth frame is obtained by depth-image-based rendering from another available view. The holes in the recovered depth frame are then filled in the proposed priority order. Preprocessing methods (depth difference compensation and inconsistent pixel removal) are performed to improve the performance. Compared with a method that uses the available motion vector in a color frame to recover the lost depth pixels, the hybrid motion vector extrapolation method, and the inpainting method, the proposed method has gains of up to 4.31, 10.29, and 6.04 dB, respectively. Finally, for the situation in which the color and the depth frames are lost at the same time, our two methods jointly perform better with a gain of up to 7.79 dB.

4 citations


Cites background from "Resolution Enhancement in Multi-Ima..."

  • ...Note that the works on depth pixel reconstruction can be found in [20]–[22]....

    [...]

Proceedings ArticleDOI
24 Mar 2023
TL;DR: This paper proposes the Parallax Fusion Transformer (PFT), which employs a cross-view fusion transformer to utilize cross-view information and an intra-view refinement transformer for intra-view feature refinement, and adopts the Swin Transformer as the backbone for feature extraction and SR reconstruction to form a pure Transformer architecture.
Abstract: Stereo image super-resolution aims to boost the performance of image super-resolution by exploiting the supplementary information provided by binocular systems. Although previous methods have achieved promising results, they did not fully utilize the information of cross-view and intra-view. To further unleash the potential of binocular images, in this letter, we propose a novel Transformer-based parallax fusion module called Parallax Fusion Transformer (PFT). PFT employs a Cross-view Fusion Transformer (CVFT) to utilize cross-view information and an Intra-view Refinement Transformer (IVRT) for intra-view feature refinement. Meanwhile, we adopted the Swin Transformer as the backbone for feature extraction and SR reconstruction to form a pure Transformer architecture called PFT-SSR. Extensive experiments and ablation studies show that PFT-SSR achieves competitive results and outperforms most SOTA methods. Source code is available at https://github.com/MIVRC/PFT-PyTorch.

3 citations

Posted Content
TL;DR: A large-scale stereo dataset named Flickr1024 is proposed; training two state-of-the-art stereo SR methods (StereoSR and PASSRnet) on the KITTI2015, Middlebury, and Flickr1024 datasets shows that it can improve the performance of stereo SR algorithms.
Abstract: With the popularity of dual cameras in recently released smart phones, a growing number of super-resolution (SR) methods have been proposed to enhance the resolution of stereo image pairs. However, the lack of high-quality stereo datasets has limited the research in this area. To facilitate the training and evaluation of novel stereo SR algorithms, in this paper, we propose a large-scale stereo dataset named Flickr1024. Compared to the existing stereo datasets, the proposed dataset contains much more high-quality images and covers diverse scenarios. We train two state-of-the-art stereo SR methods (i.e., StereoSR and PASSRnet) on the KITTI2015, Middlebury, and Flickr1024 datasets. Experimental results demonstrate that our dataset can improve the performance of stereo SR algorithms. The Flickr1024 dataset is available online at: this https URL.

3 citations

References
Journal ArticleDOI
TL;DR: This paper has designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can easily be extended to include new algorithms.
Abstract: Stereo matching is one of the most active research areas in computer vision. While a large number of algorithms for stereo correspondence have been developed, relatively little work has been done on characterizing their performance. In this paper, we present a taxonomy of dense, two-frame stereo methods designed to assess the different components and design decisions made in individual stereo algorithms. Using this taxonomy, we compare existing stereo methods and present experiments evaluating the performance of many different variants. In order to establish a common software platform and a collection of data sets for easy evaluation, we have designed a stand-alone, flexible C++ implementation that enables the evaluation of individual components and that can be easily extended to include new algorithms. We have also produced several new multiframe stereo data sets with ground truth, and are making both the code and data sets available on the Web.

7,458 citations

Journal ArticleDOI
TL;DR: This work presents two algorithms based on graph cuts that efficiently find a local minimum with respect to two types of large moves, namely expansion moves and swap moves; both algorithms allow important cases of discontinuity-preserving energies.
Abstract: Many tasks in computer vision involve assigning a label (such as disparity) to every pixel. A common constraint is that the labels should vary smoothly almost everywhere while preserving sharp discontinuities that may exist, e.g., at object boundaries. These tasks are naturally stated in terms of energy minimization. The authors consider a wide class of energies with various smoothness constraints. Global minimization of these energy functions is NP-hard even in the simplest discontinuity-preserving case. Therefore, our focus is on efficient approximation algorithms. We present two algorithms based on graph cuts that efficiently find a local minimum with respect to two types of large moves, namely expansion moves and swap moves. These moves can simultaneously change the labels of arbitrarily large sets of pixels. In contrast, many standard algorithms (including simulated annealing) use small moves where only one pixel changes its label at a time. Our expansion algorithm finds a labeling within a known factor of the global minimum, while our swap algorithm handles more general energy functions. Both of these algorithms allow important cases of discontinuity preserving energies. We experimentally demonstrate the effectiveness of our approach for image restoration, stereo and motion. On real data with ground truth, we achieve 98 percent accuracy.

7,413 citations


"Resolution Enhancement in Multi-Ima..." refers background or methods in this paper

  • ...If the scene mainly consists of discontinuous depth planes, we choose a Potts model [2] (equivalent to choosing T as the minimum depth label)....

    [...]

  • ...The minimization for depth is carried out by α-expansion graph cuts [2] and that for the image is carried out using iterated conditional modes (ICM) [6], [33]....

    [...]

  • ...Graph cuts is one of the better performing contemporary algorithms for stereo disparity estimation and we too employ it. Its advantage is its efficiency and the guarantee of reaching a strong local minimum [2]....

    [...]
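The excerpts above mention a truncated smoothness cost that reduces to a Potts model, and iterated conditional modes (ICM) as one of the minimizers. The sketch below illustrates both ideas on a 1D chain of integer labels. It is a toy example under stated assumptions, not the paper's formulation: the quadratic data term, the weight `lam`, the function names, and the reading of T as a truncation threshold equal to one label step are all hypothetical.

```python
def truncated_linear(a, b, T):
    """Pairwise smoothness cost min(|a - b|, T).

    On integer labels, T = 1 gives the Potts model: 0 if equal, 1 otherwise.
    """
    return min(abs(a - b), T)


def icm_1d(observed, labels, lam, T, max_sweeps=20):
    """Iterated conditional modes on a 1D chain (assumed toy energy).

    Energy: sum_i (x_i - observed_i)^2 + lam * sum_i min(|x_i - x_{i+1}|, T).
    Each site is set in turn to the label that minimizes its local energy
    with its neighbours held fixed, until a full sweep changes nothing.
    """
    labels = list(labels)
    # Initialize every site with the label closest to its observation.
    x = [min(labels, key=lambda l, o=o: (l - o) ** 2) for o in observed]
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(x)):
            def local_energy(l):
                e = (l - observed[i]) ** 2
                if i > 0:
                    e += lam * truncated_linear(l, x[i - 1], T)
                if i + 1 < len(x):
                    e += lam * truncated_linear(l, x[i + 1], T)
                return e

            best = min(labels, key=local_energy)
            # Accept only strict improvements so the energy never increases.
            if local_energy(best) < local_energy(x[i]):
                x[i] = best
                changed = True
        if not changed:
            break
    return x
```

For instance, `icm_1d([0, 0, 3, 0, 0], range(4), lam=5, T=1)` smooths away the isolated outlier, while `lam=0` leaves the nearest-label initialization unchanged. ICM only reaches a local minimum that depends on the initialization, which is one reason move-making methods such as α-expansion are often preferred for the label field.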

Proceedings ArticleDOI
17 Jun 2006
TL;DR: This paper first surveys multi-view stereo algorithms and compares them qualitatively using a taxonomy that differentiates their key properties, then describes the process for acquiring and calibrating multiview image datasets with high-accuracy ground truth and introduces the evaluation methodology.
Abstract: This paper presents a quantitative comparison of several multi-view stereo reconstruction algorithms. Until now, the lack of suitable calibrated multi-view image datasets with known ground truth (3D shape models) has prevented such direct comparisons. In this paper, we first survey multi-view stereo algorithms and compare them qualitatively using a taxonomy that differentiates their key properties. We then describe our process for acquiring and calibrating multiview image datasets with high-accuracy ground truth and introduce our evaluation methodology. Finally, we present the results of our quantitative comparison of state-of-the-art multi-view stereo reconstruction algorithms on six benchmark datasets. The datasets, evaluation details, and instructions for submitting new models are available online at http://vision.middlebury.edu/mview.

2,556 citations


"Resolution Enhancement in Multi-Ima..." refers methods in this paper

  • ...For the multiview temple scene, where we compute the depth map (instead of disparity), the depth step was computed using the resolution information provided on the Middlebury multiview Web page [27]....

    [...]

  • ...Finally, we provide results for SR by 2 on the Middlebury multiview temple data set [27]....

    [...]

  • ...We carried out experiments on the Middlebury [26], [27], [35] and CMU data sets [34], as well as images captured in our lab....

    [...]

Proceedings ArticleDOI
07 Mar 2001
TL;DR: This paper presents a new method which properly addresses occlusions, while preserving the advantages of graph cut algorithms, and gives experimental results for stereo as well as motion, which demonstrate that the method performs well both at detecting occlusion and computing disparities.
Abstract: Several new algorithms for visual correspondence based on graph cuts have recently been developed. While these methods give very strong results in practice, they do not handle occlusions properly. Specifically, they treat the two input images asymmetrically, and they do not ensure that a pixel corresponds to at most one pixel in the other image. In this paper, we present a new method which properly addresses occlusions, while preserving the advantages of graph cut algorithms. We give experimental results for stereo as well as motion, which demonstrate that our method performs well both at detecting occlusions and computing disparities.

1,334 citations


"Resolution Enhancement in Multi-Ima..." refers methods in this paper

  • ...Many methods exist for handling visibility [4], [10], [14], [29], [31]....

    [...]

Book
01 Aug 1995
TL;DR: This book presents a comprehensive study on the use of MRFs for solving computer vision problems, and covers the following parts essential to the subject: introduction to fundamental theories, formulations of MRF vision models, MRF parameter estimation, and optimization algorithms.
Abstract: From the Publisher: Markov random field (MRF) theory provides a basis for modeling contextual constraints in visual processing and interpretation. It enables us to develop optimal vision algorithms systematically when used with optimization principles. This book presents a comprehensive study on the use of MRFs for solving computer vision problems. The book covers the following parts essential to the subject: introduction to fundamental theories, formulations of MRF vision models, MRF parameter estimation, and optimization algorithms. Various vision models are presented in a unified framework, including image restoration and reconstruction, edge and region segmentation, texture, stereo and motion, object matching and recognition, and pose estimation. This book is an excellent reference for researchers working in computer vision, image processing, statistical pattern recognition, and applications of MRFs. It is also suitable as a text for advanced courses in these areas.

1,333 citations


"Resolution Enhancement in Multi-Ima..." refers methods in this paper

  • ...The terms E_z(z) and E_x(x) correspond to the weighted MRF priors applied on the depth and the image, respectively....

    [...]
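For orientation, a joint energy over depth z and image x with weighted MRF priors is commonly written in the generic form below. This is a sketch under assumptions, not the paper's exact objective: the data term E_d, the weights λ_z and λ_x, the neighbourhood system N, the truncation threshold T, and the penalty ρ are all hypothetical notation introduced for illustration.

```latex
% Generic joint MAP-MRF energy (assumed form, for illustration only).
% y_1, ..., y_N denote the LR observations; E_d measures how well (z, x) explains them.
\begin{align}
  E(z, x) &= E_d\bigl(z, x;\, y_1, \dots, y_N\bigr)
             + \lambda_z\, E_z(z) + \lambda_x\, E_x(x), \\
  E_z(z)  &= \sum_{(p,q) \in \mathcal{N}} \min\bigl(\lvert z_p - z_q \rvert,\; T\bigr), \\
  E_x(x)  &= \sum_{(p,q) \in \mathcal{N}} \rho\bigl(x_p - x_q\bigr).
\end{align}
```

In a form like this, the depth and the image can be updated alternately, each with the other held fixed, which matches the excerpts that describe separate minimizers for the two unknowns.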