OSCAR-Net: Object-centric Scene Graph Attention for Image Attribution.

Open Access | Posted Content

Eric Nguyen, +3 more

07 Aug 2021

arXiv: Computer Vision and Pattern Recog...

TLDR

OSCAR-Net constructs a scene graph representation that attends to fine-grained changes in each object's visual appearance and in the spatial relationships between objects, so that images can be matched back to trusted sources.

Abstract:

Images tell powerful stories but cannot always be trusted. Matching images back to trusted sources (attribution) enables users to make a more informed judgment of the images they encounter online. We propose a robust image hashing algorithm to perform such matching. Our hash is sensitive to manipulation of subtle, salient visual details that can substantially change the story told by an image, yet it is invariant to the benign transformations (changes in quality, codecs, sizes, shapes, etc.) that images undergo during online redistribution. Our key contribution is OSCAR-Net (Object-centric Scene Graph Attention for Image Attribution Network), a robust image hashing model inspired by recent successes of Transformers in the visual domain. OSCAR-Net constructs a scene graph representation that attends to fine-grained changes in every object's visual appearance and in the spatial relationships between objects. The network is trained via contrastive learning on a dataset of original and manipulated images, yielding a state-of-the-art image hash for content fingerprinting that scales to millions of images.
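The matching step described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that a trained network has already produced a real-valued embedding per image, that the hash is obtained by taking the sign of each dimension, and that attribution is a nearest-neighbour lookup under Hamming distance with a fixed threshold. None of these details are confirmed by the abstract; the function names (`binarize`, `hamming`, `attribute`), the threshold, and the toy 8-dimensional embeddings are all hypothetical.

```python
from typing import Dict, Optional, Sequence, Tuple


def binarize(embedding: Sequence[float]) -> int:
    """Pack the sign of each embedding dimension into an integer, one bit per dimension.

    Assumption: sign thresholding stands in for whatever binarization the real model uses.
    """
    bits = 0
    for value in embedding:
        bits = (bits << 1) | (1 if value > 0 else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def attribute(
    query: int, trusted: Dict[str, int], max_distance: int
) -> Optional[Tuple[str, int]]:
    """Return (source_id, distance) for the nearest trusted hash,
    or None when no trusted hash lies within max_distance bits."""
    if not trusted:
        return None
    best_id = min(trusted, key=lambda source_id: hamming(query, trusted[source_id]))
    distance = hamming(query, trusted[best_id])
    return (best_id, distance) if distance <= max_distance else None


# Toy embeddings (made up): two trusted originals, one benign copy, one manipulated copy.
trusted_hashes = {
    "news_photo": binarize([0.9, -0.4, 0.2, -0.7, 0.5, 0.1, -0.3, 0.8]),
    "archive_photo": binarize([-0.6, 0.3, -0.8, 0.4, -0.2, -0.9, 0.7, -0.1]),
}
# A recompressed copy: values drift slightly and a single bit flips.
recompressed = binarize([0.8, -0.5, 0.1, -0.6, 0.4, -0.05, -0.2, 0.9])
# A manipulated copy: several dimensions change sign.
manipulated = binarize([0.9, 0.4, -0.2, 0.7, 0.5, 0.1, -0.3, 0.8])

benign_match = attribute(recompressed, trusted_hashes, max_distance=1)
manipulated_match = attribute(manipulated, trusted_hashes, max_distance=1)
```

In this toy setup the recompressed copy stays within one bit of its original and is attributed to it, while the manipulated copy falls outside the threshold and returns no match. A linear scan is used only for clarity; at the scale of millions of images an approximate or indexed binary search would be the more likely choice.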
