Open Access, Posted Content
OSCAR-Net: Object-centric Scene Graph Attention for Image Attribution.
TLDR
OSCAR-Net constructs a scene graph representation that attends to fine-grained changes in every object's visual appearance and their spatial relationships, in order to match images back to trusted sources.
Abstract:
Images tell powerful stories but cannot always be trusted. Matching images back to trusted sources (attribution) enables users to make a more informed judgment of the images they encounter online. We propose a robust image hashing algorithm to perform such matching. Our hash is sensitive to manipulation of subtle, salient visual details that can substantially change the story told by an image. Yet the hash is invariant to benign transformations (changes in quality, codecs, sizes, shapes, etc.) experienced by images during online redistribution. Our key contribution is OSCAR-Net (Object-centric Scene Graph Attention for Image Attribution Network), a robust image hashing model inspired by recent successes of Transformers in the visual domain. OSCAR-Net constructs a scene graph representation that attends to fine-grained changes of every object's visual appearance and their spatial relationships. The network is trained via contrastive learning on a dataset of original and manipulated images, yielding a state-of-the-art image hash for content fingerprinting that scales to millions of images.
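To make the matching step concrete, here is a minimal sketch of how a compact binary hash could be compared against a database of trusted hashes. The abstract does not say how OSCAR-Net binarizes its embedding or which distance it uses, so sign binarization, Hamming distance, the helper names (`binarize`, `hamming`, `attribute`), the threshold and the toy numbers are all assumptions made for illustration. It is not the authors' implementation.

```python
from typing import Dict, List, Optional, Tuple


def binarize(embedding: List[float]) -> int:
    """Pack the signs of a real-valued embedding into an integer bit string (assumed scheme)."""
    bits = 0
    for value in embedding:
        bits = (bits << 1) | (1 if value > 0 else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions at which two hashes differ."""
    return bin(a ^ b).count("1")


def attribute(
    query: int, trusted: Dict[str, int], max_distance: int
) -> Optional[Tuple[str, int]]:
    """Return (source id, distance) for the closest trusted hash,
    or None when nothing lies within max_distance."""
    if not trusted:
        return None
    best_id = min(trusted, key=lambda key: hamming(query, trusted[key]))
    best = hamming(query, trusted[best_id])
    return (best_id, best) if best <= max_distance else None


# Toy 8-dimensional embeddings: made-up numbers, not model outputs.
original = [0.9, -0.4, 0.3, -0.8, 0.5, 0.7, -0.2, -0.6]
recompressed = [0.8, -0.5, 0.2, -0.7, 0.6, 0.6, -0.1, -0.5]  # benign change, same signs
manipulated = [-0.9, 0.4, -0.3, -0.8, 0.5, -0.7, 0.2, -0.6]  # several signs flip

database = {"trusted-001": binarize(original)}
print(attribute(binarize(recompressed), database, max_distance=1))
print(attribute(binarize(manipulated), database, max_distance=1))
```

The linear scan keeps the sketch short. A collection of millions of images would normally use an approximate nearest-neighbour index instead.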
Citations
Proceedings Article
A Self-Supervised Descriptor for Image Copy Detection
TL;DR: SSCD adapts a self-supervised contrastive training objective to the copy detection task by changing the architecture and training objective, including a pooling operator from the instance matching literature, and by adapting contrastive learning to augmentations that combine images.
Journal Article
A survey of visual neural networks: current trends, challenges and opportunities
Ping Feng, Zhenjun Tang +1 more
Proceedings Article
ARIA: Adversarially Robust Image Attribution for Content Provenance
TL;DR: In this article, robust contrastive learning is proposed to defend deep visual fingerprinting models against imperceptible adversarial attacks, and the resulting model is shown to be robust to adversarial examples.
Book Chapter
RepMix: Representation Mixing for Robust Attribution of Synthesized Images
TL;DR: Bui et al. propose RepMix, a GAN fingerprinting technique based on representation mixing and a novel loss, which can trace the provenance of GAN-generated images, is invariant to the semantic content of the image, and is robust to perturbations.
References
Proceedings Article
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won 1st place on the ILSVRC 2015 classification task.
Proceedings Article
Attention is All you Need
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
TL;DR: This paper proposed a simple network architecture based solely on an attention mechanism, dispensing with recurrence and convolutions entirely, and achieved state-of-the-art performance on English-to-French translation.
Proceedings Article
Mask R-CNN
TL;DR: This work presents a conceptually simple, flexible, and general framework for object instance segmentation, which extends Faster R-CNN by adding a branch for predicting an object mask in parallel with the existing branch for bounding box recognition.
Proceedings Article
FaceNet: A unified embedding for face recognition and clustering
TL;DR: This work presents a system that directly learns a mapping from face images to a compact Euclidean space where distances directly correspond to a measure of face similarity, and that achieves state-of-the-art face recognition performance using only 128 bytes per face.
Journal Article
Visual pattern recognition by moment invariants
TL;DR: It is shown that recognition of geometrical patterns and alphabetical characters independently of position, size and orientation can be accomplished and it is indicated that generalization is possible to include invariance with parallel projection.