DeepPrimitive: Image decomposition by layered primitive detection
Jiahui Huang, Jun Gao, Vignesh Ganapathi-Subramanian, Hao Su, Yin Liu, Chengcheng Tang, Leonidas J. Guibas
TLDR
This paper builds a framework to detect primitives from images in a layered manner by modifying the YOLO network, and uses an RNN with a novel loss function to equip this network with the capability to predict primitives with a variable number of parameters.
Abstract:
The perception of the visual world through basic building blocks, such as cubes, spheres, and cones, gives human beings a parsimonious understanding of the visual world. Thus, efforts to find primitive-based geometric interpretations of visual data date back to 1970s studies of visual media. However, due to the difficulty of primitive fitting in the pre-deep learning age, this research approach faded from the main stage, and the vision community turned primarily to semantic image understanding. In this paper, we revisit the classical problem of building geometric interpretations of images, using supervised deep learning tools. We build a framework to detect primitives from images in a layered manner by modifying the YOLO network; an RNN with a novel loss function is then used to equip this network with the capability to predict primitives with a variable number of parameters. We compare our pipeline to traditional and other baseline learning methods, demonstrating that our layered detection model has higher accuracy and performs better reconstruction.
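The idea of reading an image "in a layered manner" can be illustrated with a toy sketch. It is not the paper's pipeline: the YOLO-based detector is replaced here by a simple visibility check on a known scene, and the names `Rect`, `render`, and `peel_layers` are hypothetical. The sketch assumes each primitive has a unique non-zero label and only shows the order in which a layered approach would report primitives: unoccluded ones first, then whatever becomes visible once they are removed.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class Rect:
    """Hypothetical rectangle primitive covering x0 <= x < x1, y0 <= y < y1."""
    label: int  # unique, non-zero (0 is reserved for background)
    x0: int
    y0: int
    x1: int
    y1: int


def pixels(r: Rect, width: int, height: int) -> Iterator[Tuple[int, int]]:
    """Yield the (x, y) pixels of r that fall inside the canvas."""
    for y in range(max(r.y0, 0), min(r.y1, height)):
        for x in range(max(r.x0, 0), min(r.x1, width)):
            yield x, y


def render(scene: List[Rect], width: int, height: int) -> List[List[int]]:
    """Paint primitives bottom to top, so later ones occlude earlier ones."""
    canvas = [[0] * width for _ in range(height)]
    for r in scene:
        for x, y in pixels(r, width, height):
            canvas[y][x] = r.label
    return canvas


def peel_layers(scene: List[Rect], width: int, height: int) -> List[List[Rect]]:
    """Group primitives into layers, top layer first.

    A primitive belongs to the current top layer if none of its pixels
    is hidden by another primitive that is still in the scene.
    """
    remaining = list(scene)  # bottom-to-top order
    layers: List[List[Rect]] = []
    while remaining:
        canvas = render(remaining, width, height)
        top = [r for r in remaining
               if all(canvas[y][x] == r.label for x, y in pixels(r, width, height))]
        layers.append(top)
        remaining = [r for r in remaining if r not in top]
    return layers


scene = [Rect(1, 0, 0, 4, 4), Rect(2, 2, 2, 6, 6), Rect(3, 6, 0, 8, 2)]
layer_labels = [[r.label for r in layer] for layer in peel_layers(scene, 8, 8)]
```

In this toy scene, rectangle 2 overlaps rectangle 1, so rectangle 1 is only reported in the second layer. The loop always terminates because the primitive painted last is never occluded.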
Citations
Posted Content
ParSeNet: A Parametric Surface Fitting Network for 3D Point Clouds
Gopal Sharma, Difan Liu, Subhransu Maji, Evangelos Kalogerakis, Siddhartha Chaudhuri, Radomír Měch
TL;DR: A novel, end-to-end trainable, deep network called ParSeNet is proposed that decomposes a 3D point cloud into parametric surface patches, including B-spline patches as well as basic geometric primitives, and allows us to represent surfaces with higher fidelity.
Book ChapterDOI
ParSeNet: A Parametric Surface Fitting Network for 3D Point Clouds
Gopal Sharma, Difan Liu, Subhransu Maji, Evangelos Kalogerakis, Siddhartha Chaudhuri, Radomír Měch
TL;DR: ParSeNet decomposes a 3D point cloud into parametric surface patches, including B-spline patches as well as basic geometric primitives, to represent surfaces with higher fidelity.
Proceedings Article
UCSG-Net -- Unsupervised Discovering of Constructive Solid Geometry Tree
TL;DR: UCSG-Net, a model that extracts a CSG parse tree without any supervision, is proposed. It predicts the parameters of primitives and binarizes their SDF representation through a differentiable indicator function; the predicted parse tree is interpretable and can be used in CAD software.
Book ChapterDOI
Beyond Fixed Grid: Learning Geometric Image Representation with a Deformable Grid
TL;DR: DefGrid predicts location offsets for the vertices of a two-dimensional triangular grid so that the edges of the deformed grid align with image boundaries; the deformed grid can be used for unsupervised image partitioning.
References
Proceedings ArticleDOI
You Only Look Once: Unified, Real-Time Object Detection
TL;DR: Compared to state-of-the-art detection systems, YOLO makes more localization errors but is less likely to predict false positives on background, and outperforms other detection methods, including DPM and R-CNN, when generalizing from natural images to other domains like artwork.
Journal ArticleDOI
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
TL;DR: This work introduces a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals, and further merges RPN and Fast R-CNN into a single network by sharing their convolutional features.
Posted Content
Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
TL;DR: Faster R-CNN proposes a Region Proposal Network (RPN) to generate high-quality region proposals, which are used by Fast R-CNN for detection.
Proceedings ArticleDOI
Learning Phrase Representations using RNN Encoder--Decoder for Statistical Machine Translation
Kyunghyun Cho, Bart van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, Yoshua Bengio
TL;DR: In this paper, the encoder and decoder of the RNN Encoder-Decoder model are jointly trained to maximize the conditional probability of a target sequence given a source sequence.
Book ChapterDOI
SSD: Single Shot MultiBox Detector
Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, Alexander C. Berg
TL;DR: The approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map location, which makes SSD easy to train and straightforward to integrate into systems that require a detection component.
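As a rough illustration of what "default boxes over different aspect ratios and scales per feature map location" means, the sketch below generates one box per aspect ratio at the center of every cell of a square feature map. The function name `default_boxes` and its parameters are hypothetical, the coordinates are assumed to be normalized to [0, 1], and the sizing rule (width = scale · √ratio, height = scale / √ratio) is the usual convention for such boxes rather than a detail stated in the summary above.

```python
import math
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (center_x, center_y, width, height)


def default_boxes(fmap_size: int, scale: float,
                  aspect_ratios: Sequence[float]) -> List[Box]:
    """Return one default box per aspect ratio for every feature-map cell.

    Coordinates are normalized to [0, 1]. Each box keeps the area scale**2
    while its width-to-height ratio equals the requested aspect ratio.
    """
    boxes: List[Box] = []
    for row in range(fmap_size):
        for col in range(fmap_size):
            # Center of the cell in normalized image coordinates.
            cx = (col + 0.5) / fmap_size
            cy = (row + 0.5) / fmap_size
            for ratio in aspect_ratios:
                w = scale * math.sqrt(ratio)
                h = scale / math.sqrt(ratio)
                boxes.append((cx, cy, w, h))
    return boxes


# A 2x2 feature map with two aspect ratios yields 2 * 2 * 2 = 8 boxes.
boxes = default_boxes(fmap_size=2, scale=0.5, aspect_ratios=[1.0, 2.0])
```

Calling the function once per feature map, with a larger `scale` for coarser maps, gives boxes at several scales; a detector would then predict class scores and offsets relative to each box.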