
Posted Content

ShapeNet: An Information-Rich 3D Model Repository

09 Dec 2015 - arXiv: Graphics

TL;DR: ShapeNet contains 3D models from a multitude of semantic categories organized under the WordNet taxonomy; it is a collection of datasets providing many semantic annotations for each 3D model, such as consistent rigid alignments, parts and bilateral symmetry planes, physical sizes, and keywords, along with other planned annotations.
Abstract: We present ShapeNet: a richly-annotated, large-scale repository of shapes represented by 3D CAD models of objects. ShapeNet contains 3D models from a multitude of semantic categories and organizes them under the WordNet taxonomy. It is a collection of datasets providing many semantic annotations for each 3D model such as consistent rigid alignments, parts and bilateral symmetry planes, physical sizes, keywords, as well as other planned annotations. Annotations are made available through a public web-based interface to enable data visualization of object attributes, promote data-driven geometric analysis, and provide a large-scale quantitative benchmark for research in computer graphics and vision. At the time of this technical report, ShapeNet has indexed more than 3,000,000 models, 220,000 models out of which are classified into 3,135 categories (WordNet synsets). In this report we describe the ShapeNet effort as a whole, provide details for all currently available datasets, and summarize future plans.
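The annotation types the abstract lists map naturally onto a per-model record. A minimal sketch of such a record follows; the field names and types are hypothetical illustrations, not ShapeNet's actual metadata schema.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ShapeNetModel:
    """Hypothetical per-model record mirroring the annotations listed above."""
    model_id: str                                  # unique id of the CAD model
    synset_id: str                                 # WordNet synset offset, e.g. "03001627"
    keywords: list[str] = field(default_factory=list)
    alignment: Optional[list[list[float]]] = None  # 4x4 rigid transform to canonical pose
    physical_size_m: Optional[tuple[float, float, float]] = None   # extents in meters
    symmetry_plane: Optional[tuple[float, float, float, float]] = None  # ax+by+cz+d=0
    parts: dict[str, list[int]] = field(default_factory=dict)      # part name -> face ids
```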
Topics: WordNet (58%), Data visualization (52%), Computer graphics (50%), Object (computer science) (50%)
Citations

Posted Content
PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space
Charles R. Qi, Li Yi, Hao Su, Leonidas J. Guibas
TL;DR: A hierarchical neural network that applies PointNet recursively on a nested partitioning of the input point set, with novel set learning layers that adaptively combine features from multiple scales to learn deep point set features efficiently and robustly.
Abstract: Few prior works study deep learning on point sets. PointNet by Qi et al. is a pioneer in this direction. However, by design PointNet does not capture local structures induced by the metric space points live in, limiting its ability to recognize fine-grained patterns and generalizability to complex scenes. In this work, we introduce a hierarchical neural network that applies PointNet recursively on a nested partitioning of the input point set. By exploiting metric space distances, our network is able to learn local features with increasing contextual scales. With further observation that point sets are usually sampled with varying densities, which results in greatly decreased performance for networks trained on uniform densities, we propose novel set learning layers to adaptively combine features from multiple scales. Experiments show that our network called PointNet++ is able to learn deep point set features efficiently and robustly. In particular, results significantly better than state-of-the-art have been obtained on challenging benchmarks of 3D point clouds.
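To make the hierarchy concrete, here is a minimal NumPy sketch of one "set abstraction" step in the spirit of PointNet++: sample well-spread centroids by farthest point sampling, group the k nearest neighbors of each centroid, and max-pool a shared per-point feature over each group. The random linear map stands in for the learned shared MLP, and all sizes are illustrative, not the paper's configuration.

```python
import numpy as np

def farthest_point_sampling(points, m):
    """Greedily pick m well-spread points from an (n, 3) array."""
    n = points.shape[0]
    chosen = [0]
    dist = np.full(n, np.inf)
    for _ in range(m - 1):
        dist = np.minimum(dist, np.linalg.norm(points - points[chosen[-1]], axis=1))
        chosen.append(int(np.argmax(dist)))
    return points[chosen]

def set_abstraction(points, m=128, k=16, rng=np.random.default_rng(0)):
    """One hierarchy level: (n, 3) points -> m centroids with pooled local features."""
    w = rng.standard_normal((3, 64))        # stand-in for the learned shared MLP
    centroids = farthest_point_sampling(points, m)
    features = []
    for c in centroids:
        idx = np.argsort(np.linalg.norm(points - c, axis=1))[:k]  # k-NN group
        local = points[idx] - c             # express the group in a local frame
        features.append(np.tanh(local @ w).max(axis=0))  # symmetric max-pool
    return centroids, np.stack(features)

pts = np.random.default_rng(1).random((1024, 3))
c, f = set_abstraction(pts)                 # f: (128, 64) per-centroid features
```

Applying set_abstraction again to the returned centroids (with features concatenated to coordinates in a full implementation) yields the nested partitioning the abstract describes.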

2,215 citations


Journal ArticleDOI
Dynamic Graph CNN for Learning on Point Clouds
Yue Wang, Yongbin Sun, Ziwei Liu, Sanjay E. Sarma, et al.
TL;DR: This work proposes EdgeConv, a new neural network module for CNN-based high-level tasks on point clouds such as classification and segmentation; EdgeConv acts on graphs dynamically computed in each layer of the network.
Abstract: Point clouds provide a flexible geometric representation suitable for countless applications in computer graphics; they also comprise the raw output of most 3D data acquisition devices. While hand-designed features on point clouds have long been proposed in graphics and vision, the recent overwhelming success of convolutional neural networks (CNNs) for image analysis suggests the value of adapting insight from CNNs to the point cloud world. Point clouds inherently lack topological information, so designing a model to recover topology can enrich the representation power of point clouds. To this end, we propose a new neural network module dubbed EdgeConv suitable for CNN-based high-level tasks on point clouds, including classification and segmentation. EdgeConv acts on graphs dynamically computed in each layer of the network. It is differentiable and can be plugged into existing architectures. Compared to existing modules operating in extrinsic space or treating each point independently, EdgeConv has several appealing properties: it incorporates local neighborhood information; it can be stacked to learn global shape properties; and in multi-layer systems affinity in feature space captures semantic characteristics over potentially long distances in the original embedding. We show the performance of our model on standard benchmarks, including ModelNet40, ShapeNetPart, and S3DIS.
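A minimal NumPy sketch of the idea as the abstract states it: build a k-NN graph in the current feature space (hence "dynamic"), form edge features from each point and its offsets to neighbors, apply a shared map, and max-aggregate per vertex. The random linear map is a stand-in for the learned MLP.

```python
import numpy as np

def edge_conv(x, k=8, out_dim=32, rng=np.random.default_rng(0)):
    """x: (n, d) point features -> (n, out_dim) max-aggregated edge features."""
    n, d = x.shape
    w = rng.standard_normal((2 * d, out_dim))    # stand-in for the learned edge MLP
    dists = np.linalg.norm(x[:, None] - x[None, :], axis=-1)
    out = np.empty((n, out_dim))
    for i in range(n):
        nbrs = np.argsort(dists[i])[1:k + 1]     # k nearest neighbors of point i
        edges = np.concatenate(
            [np.repeat(x[i:i + 1], k, axis=0), x[nbrs] - x[i]], axis=1)
        out[i] = np.tanh(edges @ w).max(axis=0)  # max over the neighborhood
    return out

x = np.random.default_rng(1).random((256, 3))
h1 = edge_conv(x)        # graph built on input coordinates
h2 = edge_conv(h1)       # graph recomputed in feature space: the dynamic graph
```

Recomputing the graph on h1 rather than on the input coordinates is what lets affinity in feature space span long distances in the original embedding.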

1,664 citations


Proceedings ArticleDOI
Volumetric and Multi-View CNNs for Object Classification on 3D Data
Charles R. Qi, Hao Su, Matthias Nießner, Angela Dai, et al.
27 Jun 2016
Abstract: 3D shape models are becoming widely available and easier to capture, making available 3D information crucial for progress in object classification. Current state-of-the-art methods rely on CNNs to address this problem. Recently, we witness two types of CNNs being developed: CNNs based upon volumetric representations versus CNNs based upon multi-view representations. Empirical results from these two types of CNNs exhibit a large gap, indicating that existing volumetric CNN architectures and approaches are unable to fully exploit the power of 3D representations. In this paper, we aim to improve both volumetric CNNs and multi-view CNNs according to extensive analysis of existing approaches. To this end, we introduce two distinct network architectures of volumetric CNNs. In addition, we examine multi-view CNNs, where we introduce multi-resolution filtering in 3D. Overall, we are able to outperform current state-of-the-art methods for both volumetric CNNs and multi-view CNNs. We provide extensive experiments designed to evaluate underlying design choices, thus providing a better understanding of the space of methods available for object classification on 3D data.
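The volumetric CNNs compared above consume occupancy grids like the one produced by this small sketch (the 30^3 resolution is used here purely for illustration, not as the paper's setting).

```python
import numpy as np

def voxelize(points, res=30):
    """(n, 3) point array -> (res, res, res) binary occupancy grid."""
    mins, maxs = points.min(axis=0), points.max(axis=0)
    scaled = (points - mins) / (maxs - mins + 1e-9)        # normalize to [0, 1)
    idx = np.minimum((scaled * res).astype(int), res - 1)  # voxel index per point
    grid = np.zeros((res, res, res), dtype=np.uint8)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1
    return grid

grid = voxelize(np.random.default_rng(0).random((2048, 3)))
```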

1,201 citations


Book ChapterDOI
3D-R2N2: A Unified Approach for Single and Multi-view 3D Object Reconstruction
Christopher Choy, Danfei Xu, JunYoung Gwak, Kevin Chen, et al.
08 Oct 2016
Abstract: Inspired by the recent success of methods that employ shape priors to achieve robust 3D reconstructions, we propose a novel recurrent neural network architecture that we call the 3D Recurrent Reconstruction Neural Network (3D-R2N2). The network learns a mapping from images of objects to their underlying 3D shapes from a large collection of synthetic data [13]. Our network takes in one or more images of an object instance from arbitrary viewpoints and outputs a reconstruction of the object in the form of a 3D occupancy grid. Unlike most of the previous works, our network does not require any image annotations or object class labels for training or testing. Our extensive experimental analysis shows that our reconstruction framework (i) outperforms the state-of-the-art methods for single view reconstruction, and (ii) enables the 3D reconstruction of objects in situations when traditional SFM/SLAM methods fail (because of lack of texture and/or wide baseline).
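The data flow the abstract describes can be sketched schematically: encode each view, fold the encodings into a recurrent state (so any number of views, in any order, can be consumed), and decode the final state into an occupancy grid. Everything below is stubbed with random linear maps; only the pipeline shape mirrors 3D-R2N2, not its actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
ENC = rng.standard_normal((64 * 64, 128)) * 0.01  # stand-in single-view encoder
W_Z = rng.standard_normal((256, 128))             # GRU-style update-gate weights
W_H = rng.standard_normal((256, 128))             # GRU-style candidate weights
DEC = rng.standard_normal((128, 32 ** 3)) * 0.01  # stand-in voxel decoder

def reconstruct(views):
    """List of (64, 64) images -> (32, 32, 32) occupancy probabilities."""
    h = np.zeros(128)
    for img in views:                       # arbitrary number/order of views
        e = np.tanh(img.ravel() @ ENC)      # encode the current view
        zh = np.concatenate([h, e])
        z = 1 / (1 + np.exp(-(zh @ W_Z)))   # how much of the state to update
        h = (1 - z) * h + z * np.tanh(zh @ W_H)
    probs = 1 / (1 + np.exp(-(h @ DEC)))    # decode state to voxel occupancies
    return probs.reshape(32, 32, 32)

grid = reconstruct([rng.random((64, 64)) for _ in range(5)])
```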

1,193 citations


Proceedings ArticleDOI
OctNet: Learning Deep 3D Representations at High Resolutions
21 Jul 2017
TL;DR: The utility of the OctNet representation is demonstrated by analyzing the impact of resolution on several 3D tasks including 3D object classification, orientation estimation and point cloud labeling.
Abstract: We present OctNet, a representation for deep learning with sparse 3D data. In contrast to existing models, our representation enables 3D convolutional networks that are both deep and high resolution. Towards this goal, we exploit the sparsity in the input data to hierarchically partition the space using a set of unbalanced octrees where each leaf node stores a pooled feature representation. This allows memory allocation and computation to be focused on the relevant dense regions, and enables deeper networks without compromising resolution. We demonstrate the utility of our OctNet representation by analyzing the impact of resolution on several 3D tasks including 3D object classification, orientation estimation and point cloud labeling.
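A toy version of the partitioning OctNet exploits, as a sketch: recursively split a cube into octants only where points actually fall, so empty space costs no memory. Leaf payloads here are plain point counts standing in for OctNet's pooled features, and the single deep octree simplifies the paper's grid of shallow unbalanced octrees.

```python
import numpy as np

def build_octree(points, center, half, depth=0, max_depth=4):
    """Nested-dict octree over the points inside a cube of half-width `half`."""
    if len(points) == 0:
        return None                          # empty space: nothing allocated
    if depth == max_depth:
        return {"leaf": len(points)}         # OctNet stores a pooled feature here
    octant = (points >= center).astype(int)  # (n, 3) octant bit per axis
    children = {}
    for code in range(8):
        bits = np.array([(code >> 2) & 1, (code >> 1) & 1, code & 1])
        mask = np.all(octant == bits, axis=1)
        children[code] = build_octree(points[mask],
                                      center + half / 2 * (2 * bits - 1),
                                      half / 2, depth + 1, max_depth)
    return {"children": children}

pts = np.random.default_rng(0).random((5000, 3))
tree = build_octree(pts, center=np.full(3, 0.5), half=0.5)
```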

1,094 citations


Cites background from "ShapeNet: An Information-Rich 3D Mo..."

  • ...At the same time, large 3D repositories such as ModelNet [48], ShapeNet [6] or 3D Warehouse as well as databases of 3D object scans [7] are becoming increasingly available....


References

Proceedings ArticleDOI
ImageNet: A Large-Scale Hierarchical Image Database
Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, et al.
20 Jun 2009
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Abstract: The explosion of image data on the Internet has the potential to foster more sophisticated and robust models and algorithms to index, retrieve, organize and interact with images and multimedia data. But exactly how such data can be harnessed and organized remains a critical problem. We introduce here a new database called “ImageNet”, a large-scale ontology of images built upon the backbone of the WordNet structure. ImageNet aims to populate the majority of the 80,000 synsets of WordNet with an average of 500-1000 clean and full resolution images. This will result in tens of millions of annotated images organized by the semantic hierarchy of WordNet. This paper offers a detailed analysis of ImageNet in its current state: 12 subtrees with 5247 synsets and 3.2 million images in total. We show that ImageNet is much larger in scale and diversity and much more accurate than the current image datasets. Constructing such a large-scale database is a challenging task. We describe the data collection scheme with Amazon Mechanical Turk. Lastly, we illustrate the usefulness of ImageNet through three simple applications in object recognition, image classification and automatic object clustering. We hope that the scale, accuracy, diversity and hierarchical structure of ImageNet can offer unparalleled opportunities to researchers in the computer vision community and beyond.

31,274 citations


Journal ArticleDOI
The Protein Data Bank
TL;DR: Describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and plans for the future development of the resource.
Abstract: The Protein Data Bank (PDB; http://www.rcsb.org/pdb/ ) is the single worldwide archive of structural data of biological macromolecules. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.

30,190 citations


"ShapeNet: An Information-Rich 3D Mo..." refers methods in this paper

  • ...For example, the Protein Data Bank [1] provides a database with 100K protein 3D structures, each labeled with its source and links to structural and functional annotations [15]....



Journal ArticleDOI
WordNet: A Lexical Database for English
George A. Miller
TL;DR: WordNet provides a more effective combination of traditional lexicographic information and modern computing, and is an online lexical database designed for use under program control.
Abstract: Because meaningful sentences are composed of meaningful words, any system that hopes to process natural languages as people do must have information about words and their meanings. This information is traditionally provided through dictionaries, and machine-readable dictionaries are now widely available. But dictionary entries evolved for the convenience of human readers, not for machines. WordNet provides a more effective combination of traditional lexicographic information and modern computing. WordNet is an online lexical database designed for use under program control. English nouns, verbs, adjectives, and adverbs are organized into sets of synonyms, each representing a lexicalized concept. Semantic relations link the synonym sets [4].
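Since ShapeNet hangs its categories on these noun synsets, it helps to see the taxonomy programmatically. A short example using NLTK's WordNet interface (assumes `pip install nltk` plus a one-time `nltk.download('wordnet')`):

```python
from nltk.corpus import wordnet as wn  # requires nltk.download('wordnet')

chair = wn.synsets("chair", pos=wn.NOUN)[0]      # synset chair.n.01
print(chair.definition())                        # gloss of the concept
print([s.name() for s in chair.hypernyms()])     # more general concepts
print([s.name() for s in chair.hyponyms()][:5])  # more specific concepts
print(f"{chair.offset():08d}")                   # numeric synset id, 03001627
```

The zero-padded offset printed in the last line is the same style of identifier ShapeNet uses to name its category synsets.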

13,247 citations


"ShapeNet: An Information-Rich 3D Mo..." refers background or methods in this paper

  • ...tions: Naming objects by their basic category is useful for indexing, grouping, and linking to related sources of data. As described in the previous section, we organize ShapeNet based on the WordNet [21] taxonomy. Synsets are interlinked with various relations, such as hyper and hyponym, and part-whole relations. Due to the popularity of WordNet, we can leverage other resources linked to WordNet such...


  • ... structures, or other domain-specific objects. However, we will include scenes (e.g., office), objects (e.g., laptop computer), and parts of objects (e.g., keyboard). Models are organized under WordNet [21] noun “synsets” (synonym sets). WordNet provides a broad and deep taxonomy with over 80K distinct synsets representing distinct noun concepts arranged as a DAG network of hyponym relationships (e.g., ...


  • ... content is the lack of large-scale, curated datasets of 3D models that are available to the community. Motivated by the far-reaching impact of dataset efforts such as the Penn Treebank [20], WordNet [21] and ImageNet [4], which collectively have tens of thousands of citations, we propose establishing ShapeNet: a large-scale 3D model dataset. Making a comprehensive, semantically enriched shape dataset...


  • ...organizing, and labeling large datasets in computer vision and related fields. For example, ImageNet [4] provides a set of 14M images organized into 20K categories associated with “synsets” of WordNet [21]. LabelMe provides segmentations and label annotations of hundreds of thousands of objects in tens of thousands of images [24]. The SUN dataset provides 3M annotations of objects in 4K categories appe...



ReportDOI
Building a Large Annotated Corpus of English: The Penn Treebank
TL;DR: As a result of this grant, the researchers have now published on CD-ROM a corpus of over 4 million words of running text annotated with part-of-speech (POS) tags, which includes a fully hand-parsed version of the classic Brown corpus.
Abstract: As a result of this grant, the researchers have now published on CD-ROM a corpus of over 4 million words of running text annotated with part-of-speech (POS) tags, with over 3 million words of that material assigned skeletal grammatical structure. This material now includes a fully hand-parsed version of the classic Brown corpus. About one half of the papers at the ACL Workshop on Using Large Text Corpora this past summer were based on the materials generated by this grant.

7,927 citations


"ShapeNet: An Information-Rich 3D Mo..." refers methods in this paper

  • ...Motivated by the far-reaching impact of dataset efforts such as the Penn Treebank [20], WordNet [21] and ImageNet [4], which collectively have tens of thousands of citations, we propose establishing ShapeNet: a large-scale 3D model dataset....



Proceedings ArticleDOI
3D ShapeNets: A Deep Representation for Volumetric Shapes
Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, et al.
07 Jun 2015
TL;DR: This work proposes to represent a geometric 3D shape as a probability distribution of binary variables on a 3D voxel grid, using a Convolutional Deep Belief Network, and shows that this 3D deep representation enables significant performance improvement over the state of the art in a variety of tasks.
Abstract: 3D shape is a crucial but heavily underutilized cue in today's computer vision systems, mostly due to the lack of a good generic shape representation. With the recent availability of inexpensive 2.5D depth sensors (e.g. Microsoft Kinect), it is becoming increasingly important to have a powerful 3D shape representation in the loop. Apart from category recognition, recovering full 3D shapes from view-based 2.5D depth maps is also a critical part of visual understanding. To this end, we propose to represent a geometric 3D shape as a probability distribution of binary variables on a 3D voxel grid, using a Convolutional Deep Belief Network. Our model, 3D ShapeNets, learns the distribution of complex 3D shapes across different object categories and arbitrary poses from raw CAD data, and discovers hierarchical compositional part representation automatically. It naturally supports joint object recognition and shape completion from 2.5D depth maps, and it enables active object recognition through view planning. To train our 3D deep learning model, we construct ModelNet, a large-scale 3D CAD model dataset. Extensive experiments show that our 3D deep representation enables significant performance improvement over the state of the art in a variety of tasks.
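The core representation in this abstract is simple to state in code: a shape is a field of Bernoulli variables on a voxel grid, so sampling a shape and scoring an observed grid are both one-liners. A minimal sketch with arbitrary probabilities:

```python
import numpy as np

rng = np.random.default_rng(0)
probs = rng.random((30, 30, 30))     # per-voxel occupancy probability P(v = 1)

# draw one binary shape from the distribution
sample = (rng.random(probs.shape) < probs).astype(np.uint8)

# log-likelihood of the observed binary grid under the Bernoulli model
eps = 1e-9
loglik = np.sum(sample * np.log(probs + eps)
                + (1 - sample) * np.log(1 - probs + eps))
```

The Convolutional Deep Belief Network in the paper is what makes the probabilities depend on learned shape structure rather than being arbitrary as here.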

3,142 citations


"ShapeNet: An Information-Rich 3D Mo..." refers background in this paper

  • ...Scene understanding from 2D images is a grand challenge in vision that has recently benefited tremendously from 3D CAD models [28, 34]....


  • ...Recent work demonstrated the benefit of a large dataset of 120K 3D CAD models in training a convolutional neural network for object recognition and next-best view prediction in RGB-D data [34]....


  • ..., upright and front) for every model is important for various tasks such as visualizing shapes [13], shape classification [8] and shape recognition [34]....



Network Information
Related Papers (5)

3D ShapeNets: A Deep Representation for Volumetric Shapes
Zhirong Wu, Shuran Song, et al. (07 Jun 2015)

Deep Residual Learning for Image Recognition
Kaiming He, Xiangyu Zhang, et al. (27 Jun 2016)

Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba (01 Jan 2015)

Performance
Metrics
Citations received by the paper in previous years:

Year  Citations
2022      3
2021    617
2020    756
2019    527
2018    367
2017    192