DeepPrimitive: Image decomposition by layered primitive detection

doi:10.1007/S41095-018-0128-6

Open Access Journal Article

Jiahui Huang, +6 more

- 23 Dec 2018 -

Computational Visual Media

- Vol. 4, Iss: 4, pp 385-397

TLDR

This paper builds a framework to detect primitives from images in a layered manner by modifying the YOLO network, and uses an RNN with a novel loss function to equip this network with the capability to predict primitives with a variable number of parameters.

Abstract:

The perception of the visual world through basic building blocks, such as cubes, spheres, and cones, gives human beings a parsimonious understanding of the visual world. Thus, efforts to find primitive-based geometric interpretations of visual data date back to 1970s studies of visual media. However, due to the difficulty of primitive fitting in the pre-deep learning age, this research approach faded from the main stage, and the vision community turned primarily to semantic image understanding. In this paper, we revisit the classical problem of building geometric interpretations of images, using supervised deep learning tools. We build a framework to detect primitives from images in a layered manner by modifying the YOLO network; an RNN with a novel loss function is then used to equip this network with the capability to predict primitives with a variable number of parameters. We compare our pipeline to traditional and other baseline learning methods, demonstrating that our layered detection model has higher accuracy and performs better reconstruction.

Citations

The PASCAL Visual Object Classes Challenge

Jianguo Zhang

Posted Content

ParSeNet: A Parametric Surface Fitting Network for 3D Point Clouds

Gopal Sharma, +5 more

- 26 Mar 2020 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: A novel, end-to-end trainable, deep network called ParSeNet is proposed that decomposes a 3D point cloud into parametric surface patches, including B-spline patches as well as basic geometric primitives, and allows us to represent surfaces with higher fidelity.

Book Chapter

ParSeNet: A Parametric Surface Fitting Network for 3D Point Clouds

Gopal Sharma, +5 more

TL;DR: ParSeNet decomposes a 3D point cloud into parametric surface patches, including B-spline patches as well as basic geometric primitives, to represent surfaces with higher fidelity.

Proceedings Article

UCSG-Net -- Unsupervised Discovering of Constructive Solid Geometry Tree

Kacper Kania, +2 more

TL;DR: UCSG-Net, a model that extracts a CSG parse tree without any supervision, is proposed; it predicts the parameters of primitives and binarizes their SDF representation through a differentiable indicator function, and the predicted parse tree is shown to be interpretable and usable in CAD software.

Book Chapter

Beyond Fixed Grid: Learning Geometric Image Representation with a Deformable Grid

Jun Gao, +3 more

TL;DR: DefGrid predicts location offsets for the vertices of a 2-dimensional triangular grid such that the edges of the deformed grid align with image boundaries; the resulting grid can be used for unsupervised image partitioning.

References

Proceedings Article

You Only Look Once: Unified, Real-Time Object Detection

Joseph Redmon, +3 more

TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.

Journal Article

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Shaoqing Ren, +3 more

- 01 Jun 2017 -

IEEE Transactions on Pattern Analysis an...

TL;DR: This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.

Posted Content

Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks

Shaoqing Ren, +3 more

- 04 Jun 2015 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: Faster R-CNN introduces a Region Proposal Network (RPN) to generate high-quality region proposals, which are used by Fast R-CNN for detection.

Proceedings Article

Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation

Kyunghyun Cho, +8 more

TL;DR: The encoder and decoder of the proposed RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.

Book Chapter

SSD: Single Shot MultiBox Detector

Wei Liu, +6 more

TL;DR: The approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, which makes SSD easy to train and straightforward to integrate into systems that require a detection component.
