scispace - formally typeset
Topic

Channel (digital image)

About: Channel (digital image) is a research topic. Over its lifetime, 7,211 publications have been published within this topic, receiving 69,974 citations.


Papers
Posted Content
TL;DR: The proposed method permits the joint detection and estimation of poses without knowing a priori the number of persons present in the scene, and demonstrates the benefits of using the additional depth channel for pose refinement beyond its use in generating improved features.
Abstract: Many approaches have been proposed for human pose estimation in single and multi-view RGB images. However, some environments, such as the operating room, are still very challenging for state-of-the-art RGB methods. In this paper, we propose an approach for multi-view 3D human pose estimation from RGB-D images and demonstrate the benefits of using the additional depth channel for pose refinement beyond its use for the generation of improved features. The proposed method permits the joint detection and estimation of the poses without knowing a priori the number of persons present in the scene. We evaluate this approach on a novel multi-view RGB-D dataset acquired during live surgeries and annotated with ground truth 3D poses.
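The core use of the depth channel can be illustrated with a minimal sketch: given a joint detected at pixel (u, v) in the RGB image, the depth value lets us lift it to a 3D point via the standard pinhole camera model. The intrinsics below are hypothetical placeholders, not values from the paper.

```python
# Illustrative sketch: back-projecting a detected 2D joint into 3D using the
# depth channel of an RGB-D camera (pinhole model).

def backproject(u, v, z, fx, fy, cx, cy):
    """Map a pixel (u, v) with depth z (metres) to a camera-space 3D point."""
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return (x, y, z)

# Example: a joint detected at the principal point (320, 240) with 2.0 m
# depth, using a hypothetical 640x480 camera with fx = fy = 525.
print(backproject(320, 240, 2.0, 525.0, 525.0, 320.0, 240.0))  # → (0.0, 0.0, 2.0)
```

In a multi-view setup, back-projected joint candidates from several RGB-D cameras can then be checked for 3D consistency during pose refinement.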

34 citations

Journal ArticleDOI
TL;DR: A spectral CT reconstruction method, the spatial-spectral cube matching frame (SSCMF), is proposed, inspired by three observations about spectral images; experiments show that SSCMF outperforms state-of-the-art algorithms, including the simultaneous algebraic reconstruction technique, total variation minimization, total variation plus low rank, and tensor dictionary learning.
Abstract: Spectral computed tomography (CT) reconstructs the same scanned object from projections of multiple narrow energy windows, and it can be used for material identification and decomposition. However, the multi-energy projection dataset has a low signal-to-noise ratio (SNR), resulting in poor reconstructed image quality. To address this problem, we develop a spectral CT reconstruction method, namely the spatial-spectral cube matching frame (SSCMF). This method is inspired by the following three facts: i) the human body usually consists of two or three basic materials, implying that the reconstructed spectral images have a strong sparsity; ii) the same basic material component in a single-channel image has similar intensity and structure in local regions, and different material components within the same energy channel share similar structural information; iii) multi-energy projection datasets are collected from the subject using different narrow energy windows, which means that images reconstructed from different energy channels share similar structures. To exploit this information, we first establish a tensor cube matching frame (CMF) for a BM4D denoising procedure. Then, as a new regularizer, the CMF is introduced into a basic spectral CT reconstruction model, generating the SSCMF method. Because the SSCMF model contains an L0-norm minimization of 4D transform coefficients, an effective strategy is employed for optimization. Both numerical simulations and realistic preclinical mouse studies are performed. The results show that the SSCMF method outperforms the state-of-the-art algorithms, including the simultaneous algebraic reconstruction technique, total variation minimization, total variation plus low rank, and tensor dictionary learning.
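The L0-norm minimization over transform coefficients mentioned in the abstract is typically optimised by hard thresholding: coefficients whose magnitude falls below a threshold are zeroed, which enforces sparsity. A minimal one-dimensional sketch, with hypothetical coefficients and threshold (the paper operates on 4D transform coefficients of spatial-spectral cubes):

```python
# Hard thresholding: the proximal step for an L0-norm sparsity penalty.
# Coefficients below the threshold tau are set to zero; large ones are kept.

def hard_threshold(coeffs, tau):
    return [c if abs(c) >= tau else 0.0 for c in coeffs]

coeffs = [0.9, -0.05, 0.3, -1.2, 0.02]  # hypothetical transform coefficients
print(hard_threshold(coeffs, 0.1))  # → [0.9, 0.0, 0.3, -1.2, 0.0]
```

In a full reconstruction, such a thresholding step would alternate with a data-fidelity update against the multi-energy projections.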

34 citations

Posted Content
TL;DR: The Gated Fusion Double SSD (GFD-SSD) outperforms the stacked fusion and achieves the lowest miss rate in the benchmark, at an inference speed that is two times faster than Faster-RCNN based fusion networks.
Abstract: Pedestrian detection is an essential task in autonomous driving research. In addition to typical color images, thermal images benefit detection in dark environments. Hence, it is worthwhile to explore an integrated approach that takes advantage of both color and thermal images simultaneously. In this paper, we propose a novel approach to fuse color and thermal sensors using deep neural networks (DNNs). Current state-of-the-art DNN object detectors vary from two-stage to one-stage mechanisms. Two-stage detectors, like Faster-RCNN, achieve higher accuracy, while one-stage detectors such as the Single Shot Detector (SSD) are faster. To balance this trade-off, especially in consideration of autonomous driving applications, we investigate a fusion strategy that combines two SSDs on color and thermal inputs. Traditional fusion methods stack selected features from each channel and adjust their weights. In this paper, we propose two variations of a novel Gated Fusion Unit (GFU) that learn the combination of feature maps generated by the middle layers of the two SSDs. Leveraging GFUs for the entire feature pyramid structure, we propose several mixed versions of both stacked fusion and gated fusion. Experiments are conducted on the KAIST multispectral pedestrian detection dataset. Our Gated Fusion Double SSD (GFD-SSD) outperforms stacked fusion and achieves the lowest miss rate in the benchmark, at an inference speed two times faster than Faster-RCNN-based fusion networks.
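The idea behind gated fusion can be sketched in a few lines: a learned sigmoid gate, computed from both modalities, weights the color and thermal contributions at each position. The scalar weights below are hypothetical stand-ins; the paper's GFUs compute gates with convolutions over SSD feature maps.

```python
import math

# Minimal sketch of gated fusion of color and thermal feature activations:
# a sigmoid gate g decides, per element, how much of each modality to keep.

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gated_fuse(f_color, f_thermal, w_c, w_t, b):
    fused = []
    for c, t in zip(f_color, f_thermal):
        g = sigmoid(w_c * c + w_t * t + b)   # gate computed from both inputs
        fused.append(g * c + (1.0 - g) * t)  # convex combination of modalities
    return fused

print(gated_fuse([1.0, 0.0], [0.0, 1.0], 1.0, 1.0, 0.0))
```

Unlike plain stacking, the gate lets the network suppress the modality that is less informative (e.g., color features at night) on a per-location basis.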

34 citations

Patent
03 Nov 2010
TL;DR: In this article, the authors proposed a method for removing noise in multiview video captured by a multiple-camera setup, comprising several steps: normalizing the colour and intensity of the images; choosing a reference image; reducing temporal noise for each channel independently by motion compensation or frame averaging; mapping each pixel in the reference camera to the other camera views; determining the visibility of each corresponding pixel after mapping by comparing depth values; checking the RGB range of the candidates against the corresponding pixels within the visible observations; and then, among the stored RGB values from the visible regions, assigning the median value to the reference pixel.
Abstract: This invention relates to a method for removing noise in multiview video captured by a multiple-camera setup. The method comprises several steps: normalizing the colour and intensity of the images; choosing a reference image; reducing temporal noise for each channel independently by motion compensation or frame averaging; mapping each pixel in the reference camera to the other camera views; determining the visibility of each corresponding pixel, after it is mapped to the other images, by comparing depth values; checking the RGB range of the candidates against the corresponding pixels within the visible observations; then, among the stored RGB values from the visible regions of a pixel in the reference view, getting the median value and assigning it to the reference pixel and all other pixels matched to the reference pixel after mapping through the depth map; and repeating these steps until all pixels in each view have been visited.
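The central denoising step, taking the per-channel median over the RGB values observed in the views where a pixel is visible, can be sketched as follows. The observations below are hypothetical sample values, not data from the patent.

```python
from statistics import median

# Per-channel median over the RGB observations of one pixel across the
# camera views in which it was determined to be visible. The median is
# robust to a noisy outlier view, unlike a plain average.

def median_rgb(observations):
    """observations: list of (r, g, b) tuples from visible views."""
    return tuple(median(ch) for ch in zip(*observations))

obs = [(120, 64, 30), (118, 66, 29), (200, 70, 31)]  # third view is noisy
print(median_rgb(obs))  # → (120, 66, 30)
```

The resulting value would then be written back to the reference pixel and to all pixels matched to it through the depth map.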

34 citations

Journal ArticleDOI
Xin Wu, Danfeng Hong, Pedram Ghamisi, Wei Li, Ran Tao 
TL;DR: A novel approach, entitled multi-scale and rotation-insensitive convolutional channel features (MsRi-CCF), is proposed for geospatial object detection, integrating robust low-level feature generation, classifier generation with outlier removal, and detection with a power law; it yields better detection results than state-of-the-art baselines.
Abstract: Geospatial object detection is a fundamental but challenging problem in the remote sensing community. Although deep learning has shown its power in extracting discriminative features, there is still room for improvement in detection performance, particularly for objects with large variations in scale and direction. To this end, a novel approach, entitled multi-scale and rotation-insensitive convolutional channel features (MsRi-CCF), is proposed for geospatial object detection by integrating robust low-level feature generation, classifier generation with outlier removal, and detection with a power law. The low-level feature generation step produces rotation-insensitive and multi-scale convolutional channel features, obtained by learning a regularized convolutional neural network (CNN) with fine-tuned high-level connections and integrating multi-scale convolutional feature maps. These features are then fed into AdaBoost (chosen for its lower computation and storage costs) with outlier removal to construct an object detection framework that facilitates robust classifier training. In the test phase, we adopt a log-space sampling approach instead of fine-scale sampling, using the fast feature pyramid strategy based on a computable power law. Extensive experimental results demonstrate that, compared with several state-of-the-art baselines, the proposed MsRi-CCF approach yields better detection results, with 90.19% precision on the satellite dataset and 81.44% average precision on the NWPU VHR-10 dataset. Importantly, MsRi-CCF incurs little additional computational cost, requiring only 0.92 s and 0.7 s per test image on the two datasets. Furthermore, we determined that most previous methods fail to achieve acceptable detection performance when facing obstacles such as deformations in objects (e.g., rotation, illumination, and scaling), whereas these factors are effectively addressed by MsRi-CCF, yielding a robust geospatial object detection method.
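The log-space sampling and power-law idea can be sketched briefly: scales are spaced evenly in log space (powers of two between octaves), and feature responses at intermediate scales are approximated from a computed scale via a power law rather than recomputed. The exponent and base response below are hypothetical, in the spirit of fast feature pyramids rather than the paper's exact constants.

```python
# Sketch of log-space scale sampling with a power-law approximation.

def pyramid_scales(n_octaves, per_octave):
    """Scales spaced evenly in log space: 2**(-o / per_octave)."""
    return [2.0 ** (-o / per_octave) for o in range(n_octaves * per_octave)]

def approx_response(base_response, base_scale, scale, lam=1.0):
    """Power-law estimate C(s) ~ C(s0) * (s / s0) ** (-lam)."""
    return base_response * (scale / base_scale) ** (-lam)

scales = pyramid_scales(2, 4)
print(scales[:4])  # first octave: 1.0, 2**-0.25, 2**-0.5, 2**-0.75
print(approx_response(10.0, 1.0, 0.5))  # halving the scale doubles the response
```

This is what lets the test phase avoid dense fine-scale sampling while keeping the per-image cost low.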

34 citations


Network Information
Related Topics (5)
- Feature extraction: 111.8K papers, 2.1M citations (86% related)
- Image processing: 229.9K papers, 3.5M citations (85% related)
- Feature (computer vision): 128.2K papers, 1.7M citations (85% related)
- Image segmentation: 79.6K papers, 1.8M citations (85% related)
- Convolutional neural network: 74.7K papers, 2M citations (84% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2022    16
2021    559
2020    643
2019    696
2018    613
2017    496