Showing papers in "Computer Vision and Image Understanding in 2022"
••
TL;DR: This article provides a comprehensive overview of state-of-the-art tracking frameworks, including both deep and non-deep trackers, and presents quantitative and qualitative tracking results of various trackers on five benchmark datasets.
39 citations
••
TL;DR: Borji et al. as discussed by the authors describe new dimensions that are becoming important in assessing models (e.g. bias and fairness) and discuss the connection between GAN evaluation and deepfakes.
34 citations
••
TL;DR: Wang et al. as mentioned in this paper proposed a novel method for visible and infrared image fusion by decomposing feature information, which adopts two pairs of encoder-decoder networks to implement feature map extraction and decomposition, respectively.
28 citations
••
TL;DR: Tan et al. as discussed by the authors proposed a temporal contrastive learning framework consisting of two novel losses to improve upon existing contrastive self-supervised video representation learning methods, namely the local-local and global-local contrastive losses.
26 citations
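The local-local and global-local losses in such frameworks typically build on the standard InfoNCE contrastive objective. Below is a minimal numpy sketch of InfoNCE with hypothetical embeddings — an illustration of the general objective, not the paper's implementation:

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """Minimal InfoNCE: the positive for anchor i is row i of `positives`;
    every other row in the batch serves as a negative."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # (N, N) cosine similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))            # positives on the diagonal

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 16))
aligned = info_nce(x, x)                           # perfect positives
mismatched = info_nce(x, rng.normal(size=(8, 16))) # unrelated "positives"
assert aligned < mismatched
```

The loss is low when each anchor is closest to its own positive and high when the pairing carries no signal; local-local and global-local variants differ in which clips or crops form the anchor/positive pairs.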
••
TL;DR: In this article, a survey of algorithms used to create deepfakes and methods proposed to detect them in the literature to date is presented, along with extensive discussions on challenges, research trends and directions related to deepfake technologies.
19 citations
••
TL;DR: Despite its simplicity, the proposed weakly supervised object detection method shows competitive results on a range of publicly available datasets, including paintings, watercolors, cliparts and comics, and enables quick learning of unseen visual categories.
16 citations
••
TL;DR: Zhang et al. as mentioned in this paper proposed an adaptive temporal modelling block (ATB), which is able to flexibly capture temporal structure for skeleton-based action recognition, and fused the adaptive feature map into the graph convolutional layer to improve its capability of learning better representations.
15 citations
••
TL;DR: The performance of face recognition systems can be negatively impacted by masks and other facial coverings that have become prevalent due to the COVID-19 pandemic, making the periocular region of the human face an important biometric cue.
14 citations
••
TL;DR: Zhang et al. as mentioned in this paper proposed an uncertainty-aware consistency regularization method for cross-domain semantic segmentation, which introduces an uncertainty guided consistency loss with a dynamic weighting scheme by exploiting the latent uncertainty information of the target samples.
12 citations
••
TL;DR: This paper decomposes the TAD pipeline into several essential components: data sampling, backbone design, neck construction, and detection head, and yields an astounding RGB-only baseline very close to the state-of-the-art methods with two-stream inputs.
9 citations
••
TL;DR: In this paper, the authors present MC-Calib, a toolbox dedicated to the calibration of complex synchronized multi-camera systems using an arbitrary number of fiducial marker-based patterns.
••
TL;DR: Zhang et al. as discussed by the authors proposed a frame-level refinement network to adaptively learn specific topology in different frames and capture long-range dependencies between frames through transformer self-attention.
••
TL;DR: This work revisits the self-supervised multi-task learning framework for video anomaly detection, proposing several updates to the original method, and modernizes the 3D convolutional backbone by introducing multi-head self-attention modules, inspired by the recent success of vision transformers.
••
TL;DR: In this paper, the authors present a checklist to spot different types of bias during visual dataset collection and discuss existing attempts to collect visual datasets in a bias-aware manner. They conclude that the problem of bias discovery and quantification in visual datasets is still open, and that there is room for improvement in terms of both methods and the range of biases that can be addressed.
••
TL;DR: Wang et al. as discussed by the authors proposed a 3D-2D structural information fusion (SIF) for 3D object detection on a LiDAR-camera system, which is based on hand-crafted 3D and 2D descriptors, generates primary structure features, and has stable performance in outdoor scenes.
••
TL;DR: In this paper, the authors review and categorize image datasets that include depth information, grouping them into three categories: scene/objects, body, and medical. They provide an overview of the different types of sensors and depth applications, and examine trends and future directions in the usage and creation of datasets containing depth data.
••
TL;DR: Wang et al. as discussed by the authors proposed a two-stage region-based convolutional neural network for thighbone fracture detection, which achieved an AP of 88.9% and outperformed all existing methods.
••
TL;DR: Widafeng et al. as mentioned in this paper proposed a cross-modality dual attention fusion module named CMDA to explicitly exchange spatial-temporal information between two pathways in two-stream SlowFast networks.
••
TL;DR: In this article, the authors summarize recent work on animal pose estimation from a computer vision perspective, highlight the challenges faced in this field, and provide an in-depth analysis of the persisting obstacles.
••
TL;DR: Zhang et al. as mentioned in this paper proposed a saliency guided backlit image enhancement network, namely BacklitNet, for robust and natural restoration of backlit images; it combines a nested U-structure with bilateral grids, enabling full extraction of multi-scale saliency information and rapid enhancement of images of arbitrary resolution.
••
TL;DR: In this paper, a global filter importance based adaptive pruning (GFI-AP) method is proposed that assigns each filter an importance score based on how the network learns the input-output mapping of a dataset, so that importance can be compared across all convolutional filters in the network.
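GFI-AP derives its scores from how the network learns the dataset's input-output mapping; as a simpler, commonly used stand-in, the sketch below ranks hypothetical convolutional filters by L1 norm and prunes the globally smallest — illustrative of global filter ranking only, not the paper's scoring rule:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical conv weights: {layer: array of shape (out_ch, in_ch, kH, kW)}
layers = {
    "conv1": rng.normal(size=(16, 3, 3, 3)),
    "conv2": rng.normal(size=(32, 16, 3, 3)),
}

# Score each filter by its L1 norm, then pool scores globally so filters
# from different layers compete for the same pruning budget.
scores = [
    (name, i, float(np.abs(w[i]).sum()))
    for name, w in layers.items()
    for i in range(w.shape[0])
]
scores.sort(key=lambda t: t[2])

prune_ratio = 0.25
n_prune = int(len(scores) * prune_ratio)
pruned = {(name, i) for name, i, _ in scores[:n_prune]}  # least important
print(f"pruning {len(pruned)} of {len(scores)} filters")
```

The key point global methods share is that scores are comparable across layers, so the pruning budget is spent wherever filters matter least, rather than a fixed fraction per layer.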
••
TL;DR: Zhang et al. as mentioned in this paper proposed a light-weight network for real-time shadow detection that uses graph convolutional networks to provide extra training pairs, obtaining a complete shadow mask from only a few annotation scribbles.
••
TL;DR: In this article, a robust encoder-decoder structured deep learning network is proposed to detect local changes in video by combining a feature pooling module (FPM) with a ResNet-50 encoder-decoder network.
••
TL;DR: Zhang et al. as discussed by the authors proposed a root regression module to estimate absolute root locations in the camera coordinate system, and a fisheye re-projection module that connects the two branches without using ground-truth camera parameters.
••
TL;DR: In this article, an anti-jamming network is proposed to improve robustness in less-constrained scenarios, and a new spatial-temporal map generation mechanism is designed to enhance spatial and temporal feature representation by equivalent padding for low-quality video frame fragments.
••
TL;DR: In this paper, a split-inpaint-fuse network (SIFNet) is proposed that separates the corrupted luma and chroma images using two decoupled branches in the coarse stage, and fuses the inpainted luma and chroma images into a refined image with a fusion sub-network in the refinement stage.
••
TL;DR: In this article, the authors proposed a cross-modal attention mechanism where the gating signal from one modality can dynamically activate the most discriminant CNN filters of the other modality.
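A minimal numpy sketch of cross-modal channel gating of this general kind, with hypothetical feature maps and gating weights (not the authors' architecture): one modality is pooled into a vector that produces a sigmoid gate over the other modality's channels.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cross_modal_gate(feat_a, feat_b, w_gate):
    """Gate modality B's channels with a signal computed from modality A.
    Channels of B that modality A deems uninformative are suppressed."""
    pooled = feat_a.mean(axis=(1, 2))     # global-average-pool modality A: (C_a,)
    gate = sigmoid(w_gate @ pooled)       # per-channel gate for B: (C_b,)
    return feat_b * gate[:, None, None]   # broadcast over spatial dims

rng = np.random.default_rng(2)
rgb = rng.normal(size=(8, 4, 4))          # hypothetical RGB feature map
depth = rng.normal(size=(6, 4, 4))        # hypothetical depth feature map
w = rng.normal(size=(6, 8))               # hypothetical gating weights
gated = cross_modal_gate(rgb, depth, w)
assert gated.shape == depth.shape
```

Because the gate lies in (0, 1), each output channel of modality B is attenuated, never amplified — the mechanism selects which filters of the other modality to keep active.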
••
TL;DR: HSGAN as discussed by the authors alleviates the mode collapse problem by maintaining a certain distance between the latent code of the generated data and the real data, where the objective function is designed to minimize the f-divergence between the distributions of generated and real data.
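The f-divergence family underlying such GAN objectives includes the KL divergence as the special case f(t) = t log t. A minimal numpy check of that identity for discrete distributions (an illustration of the divergence family, not HSGAN itself):

```python
import numpy as np

def f_divergence(p, q, f):
    """D_f(P||Q) = sum_x q(x) * f(p(x)/q(x)) for discrete distributions."""
    return float(np.sum(q * f(p / q)))

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])

# KL divergence is the f-divergence with f(t) = t * log(t)
kl_via_f = f_divergence(p, q, lambda t: t * np.log(t))
kl_direct = float(np.sum(p * np.log(p / q)))
assert abs(kl_via_f - kl_direct) < 1e-9
assert f_divergence(p, p, lambda t: t * np.log(t)) == 0.0  # D_f(P||P) = 0
```

Different convex choices of f (total variation, Jensen-Shannon, Pearson chi-squared, ...) give different divergences, which is what lets f-GAN-style objectives swap the distance being minimized between generated and real distributions.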
••
TL;DR: A benchmark aimed at studying human behavior in the considered industrial-like scenario is proposed, which demonstrates that the investigated tasks and the considered scenario are challenging for state-of-the-art algorithms.