Journal ArticleDOI

TriangleConv: A Deep Point Convolutional Network for Recognizing Building Shapes in Map Space

13 Oct 2021-ISPRS international journal of geo-information (Multidisciplinary Digital Publishing Institute)-Vol. 10, Iss: 10, pp 687
TL;DR: A deep point convolutional network is proposed to recognize building shapes; it performs convolution directly on the points of the buildings, without constructing graphs or extracting geometric features for the points.
Abstract: The classification and recognition of the shapes of buildings in map space play an important role in spatial cognition, cartographic generalization, and map updating. As buildings in map space are often represented as vector data, research was conducted to learn the feature representations of the buildings and recognize their shapes based on graph neural networks. Due to the principles of graph neural networks, it is necessary to construct a graph to represent the adjacency relationships between the points (i.e., the vertices of the polygons shaping the buildings), and to extract a list of geometric features for each point. This paper proposes a deep point convolutional network to recognize building shapes, which performs convolution directly on the points of the buildings without constructing graphs or extracting geometric features for the points. A new convolution operator named TriangleConv was designed to learn the feature representation of each point by aggregating the features of the point and the local triangle constructed by the point and its two adjacent points. The proposed method was evaluated and compared with related methods on a dataset consisting of 5010 vector buildings. In terms of accuracy, macro-precision, macro-recall, and macro-F1, the results show that the proposed method has comparable performance with typical graph neural networks (GCN, GAT, and GraphSAGE) and point cloud neural networks (PointNet, PointNet++, and DGCNN) in the task of recognizing and classifying building shapes in map space.
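The TriangleConv idea described above can be sketched in a few lines. The encoding below (each vertex's coordinates plus the two edge vectors to its ring neighbours, followed by a shared linear map and ReLU) is an illustrative assumption about how the local triangle might be aggregated, not the paper's exact formulation; all weights and names are hypothetical.

```python
import numpy as np

def triangle_conv(points, weight):
    """One TriangleConv-style layer (sketch): for each polygon vertex,
    aggregate the vertex with the local triangle formed by its two
    neighbours, then apply a shared linear map with ReLU.

    points: (N, 2) polygon vertices in ring order (closed ring implied).
    weight: (6, F) shared weights; 6 = vertex coords (2) + two edge vectors (4).
    """
    prev_pts = np.roll(points, 1, axis=0)   # p_{i-1}, wrapping around the ring
    next_pts = np.roll(points, -1, axis=0)  # p_{i+1}
    # Encode the local triangle as the two edge vectors leaving the vertex
    feats = np.concatenate([points, prev_pts - points, next_pts - points], axis=1)
    return np.maximum(feats @ weight, 0.0)  # shared linear map + ReLU

# Toy usage: a unit square with random (untrained) weights
square = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
rng = np.random.default_rng(0)
out = triangle_conv(square, rng.standard_normal((6, 8)))  # (4, 8) per-point features
```

Stacking such layers, then pooling over all points, would yield a fixed-size shape descriptor for classification.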
Citations
Journal ArticleDOI
TL;DR: This paper proposes a relation network based method for the recognition of building footprint shapes with few labeled samples, and adopts TriangleConv as the embedding module of the relation network.
Abstract: Buildings are important entity objects of cities, and the classification of building shapes plays an indispensable role in the cognition and planning of the urban structure. In recent years, some deep learning methods have been proposed for recognizing the shapes of building footprints in modern electronic maps. However, their performance depends on having enough labeled samples for each class of building footprints, and it is impractical to label enough samples for every type of building footprint shape. Deep learning methods that work with few labeled samples are therefore preferable for recognizing and classifying building footprint shapes. In this paper, we propose a relation network based method for the recognition of building footprint shapes with few labeled samples. The relation network, composed of an embedding module and a relation module, is a metric-based few-shot method which aims to learn a generalized metric function and predict the types of new samples according to their relation with the prototypes of the few labeled samples. To better extract the shape features of building footprints in the form of vector polygons, we adopt TriangleConv as the embedding module of the relation network. We validate the effectiveness of our method on a building footprint dataset with 10 typical shapes and compare it in accuracy with three classical few-shot learning methods. The results show that our method performs better for the classification of building footprint shapes with few labeled samples. For example, the accuracy reached 89.40% on the 2-way 5-shot classification task, in which there are only two classes of samples and five labeled samples per class.
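The relation-module idea described above can be sketched as scoring each (query, class-prototype) pair with a shared learned map. The linear scorer and all names below are simplifying assumptions; the actual relation module in the paper is a small neural network, and the weights here are untrained.

```python
import numpy as np

def relation_scores(query_emb, prototypes, relation_weight):
    """Few-shot relation-module sketch: concatenate the query embedding
    with each class prototype and score the pair with a shared linear map
    squashed to (0, 1). Prototypes are the mean embeddings of the few
    labelled support samples of each class."""
    n_way = prototypes.shape[0]
    pairs = np.concatenate(
        [np.repeat(query_emb[None, :], n_way, axis=0), prototypes], axis=1
    )
    logits = pairs @ relation_weight
    return 1.0 / (1.0 + np.exp(-logits))  # one relation score per class

# 2-way toy example with 2-D embeddings and a random (untrained) scorer
protos = np.array([[1.0, 0.0], [0.0, 1.0]])  # one prototype per class
query = np.array([0.9, 0.1])
rng = np.random.default_rng(2)
scores = relation_scores(query, protos, rng.standard_normal(4))
```

The predicted class is simply the `argmax` of the relation scores; training learns the scorer so that queries score highest against their own class prototype.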

5 citations

Journal ArticleDOI
TL;DR: In this paper, a metric learning model that learns similarity metrics directly from linear features is proposed to address the complexity of linear features by mapping vector lines to embeddings without format conversion or feature engineering.
Abstract: Measuring similarity is essential for classifying, clustering, retrieving, and matching linear features in geospatial data. However, the complexity of linear features challenges the formalization of characteristics and determination of the weight of each characteristic in similarity measurements. Additionally, traditional methods have limited adaptability to the variety of linear features. To address these challenges, this study proposes a metric learning model that learns similarity metrics directly from linear features. The model’s ability to learn allows no pre-determined characteristics and supports adaptability to different levels of complex linear features. LineStringNet functions as a feature encoder that maps vector lines to embeddings without format conversion or feature engineering. With a Siamese architecture, the learning process minimizes the contrastive loss, which brings similar pairs closer and pushes dissimilar pairs away in the embedding space. Finally, the proposed model calculates the Euclidean distance to measure the similarity between learned embeddings. Experiments on common linear features and building shapes indicated that the learned similarity metrics effectively supported retrieving, matching, and classifying lines and polygons, with higher precision and accuracy than traditional measures. Furthermore, the model ensures desired metric properties, including rotation and starting point invariances, by adjusting labeling strategies or preprocessing input data.
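The contrastive objective described above, which pulls similar pairs together and pushes dissimilar pairs apart until they are at least a margin away, can be sketched directly. The margin value and toy embeddings below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, label, margin=1.0):
    """Contrastive loss for a batch of Siamese pairs (sketch).
    label = 1 for similar pairs (pull together), 0 for dissimilar
    (push apart until their distance reaches `margin`)."""
    d = np.linalg.norm(emb_a - emb_b, axis=-1)  # Euclidean distance per pair
    return np.mean(label * d**2 + (1 - label) * np.maximum(margin - d, 0.0) ** 2)

# Toy batch: one identical (similar) pair, one far-apart (dissimilar) pair
a = np.array([[0.0, 0.0], [0.0, 0.0]])
b = np.array([[0.0, 0.0], [2.0, 0.0]])
y = np.array([1, 0])
loss = contrastive_loss(a, b, y)  # both pairs already satisfy the objective
```

Here the similar pair has zero distance and the dissimilar pair already exceeds the margin, so the loss is zero; gradients flow only when pairs violate these conditions.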

1 citation

Journal ArticleDOI
TL;DR: A novel pattern recognition and segmentation method for lines based on deep learning and shape context descriptors; in experiments, the lixel classification accuracy of the 1D-U-Net reached 90.42%, higher than that of two existing machine learning-based segmentation methods.
Abstract: Recognizing morphological patterns in lines and segmenting them into homogeneous segments is critical for line generalization and other applications. Due to the excessive dependence on handcrafted features in existing methods and their insufficient consideration of contextual information, we propose a novel pattern recognition and segmentation method for lines, based on deep learning and shape context descriptors. In this method, a line is divided into a series of consecutive linear units of equal length, termed lixels. A grid shape context descriptor (GSCD) was designed to extract the contextual features for each lixel. A one-dimensional convolutional neural network (1D-U-Net) was constructed to classify the pattern type of each lixel, and adjacent lixels with the same pattern types were fused to obtain segmentation results. The proposed method was applied to administrative boundaries, which were segmented into components with three different patterns. The experiments showed that the lixel classification accuracy of the 1D-U-Net reached 90.42%. The consistency ratio with the manual segmentation results was 92.41%, higher than that of either of two existing machine learning-based segmentation methods.
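The final fusion step described above, merging adjacent lixels that share a predicted pattern type into segments, is straightforward to sketch. The label strings and the (label, start, length) output format below are illustrative choices, not the paper's.

```python
from itertools import groupby

def fuse_lixels(labels):
    """Fuse consecutive lixels sharing a pattern label into segments
    (the post-processing step after per-lixel classification).
    Returns (label, start_index, length) triples."""
    segments, i = [], 0
    for label, run in groupby(labels):
        n = len(list(run))  # length of this run of identical labels
        segments.append((label, i, n))
        i += n
    return segments

segs = fuse_lixels(["smooth", "smooth", "jagged", "jagged", "jagged", "smooth"])
# → [("smooth", 0, 2), ("jagged", 2, 3), ("smooth", 5, 1)]
```

Each triple then maps back to a stretch of the original line via the fixed lixel length.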

1 citation

Journal ArticleDOI
TL;DR: This paper presents a thorough review of 34 publications on GNNs in construction, offering a comprehensive overview of the current research landscape and identifying opportunities and challenges for further advancing the application of GNNs in construction.
References
Proceedings Article
01 Jan 2015
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
Abstract: We introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments. The method is straightforward to implement, is computationally efficient, has little memory requirements, is invariant to diagonal rescaling of the gradients, and is well suited for problems that are large in terms of data and/or parameters. The method is also appropriate for non-stationary objectives and problems with very noisy and/or sparse gradients. The hyper-parameters have intuitive interpretations and typically require little tuning. Some connections to related algorithms, on which Adam was inspired, are discussed. We also analyze the theoretical convergence properties of the algorithm and provide a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework. Empirical results demonstrate that Adam works well in practice and compares favorably to other stochastic optimization methods. Finally, we discuss AdaMax, a variant of Adam based on the infinity norm.
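The Adam update summarized in the abstract follows a simple recipe: exponential moving averages of the gradient and its square, bias correction for both, then a step scaled by their ratio. A minimal sketch is below; the hyper-parameter defaults follow the paper's Algorithm 1, while the toy quadratic objective is our own.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update step (t is 1-indexed)."""
    m = b1 * m + (1 - b1) * grad        # biased first-moment estimate
    v = b2 * v + (1 - b2) * grad**2     # biased second-moment estimate
    m_hat = m / (1 - b1**t)             # bias-corrected first moment
    v_hat = v / (1 - b2**t)             # bias-corrected second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimise f(x) = x^2 starting from x = 1 (gradient is 2x)
x, m, v = np.array([1.0]), np.zeros(1), np.zeros(1)
for t in range(1, 3001):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.01)
```

Because the step is normalized by the second-moment estimate, the effective step size stays near `lr` regardless of the gradient's scale, which is why the learning rate is easy to tune.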

111,197 citations

Proceedings Article
01 Jan 2015
TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
Abstract: In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.

49,914 citations

Journal ArticleDOI
TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images, which has been run annually from 2010 to present, attracting participation from more than fifty institutions.
Abstract: The ImageNet Large Scale Visual Recognition Challenge is a benchmark in object category classification and detection on hundreds of object categories and millions of images. The challenge has been run annually from 2010 to present, attracting participation from more than fifty institutions. This paper describes the creation of this benchmark dataset and the advances in object recognition that have been possible as a result. We discuss the challenges of collecting large-scale ground truth annotation, highlight key breakthroughs in categorical object recognition, provide a detailed analysis of the current state of the field of large-scale image classification and object detection, and compare the state-of-the-art computer vision accuracy with human accuracy. We conclude with lessons learned in the 5 years of the challenge, and propose future directions and improvements.

30,811 citations

Posted Content
TL;DR: A scalable approach for semi-supervised learning on graph-structured data that is based on an efficient variant of convolutional neural networks which operate directly on graphs which outperforms related methods by a significant margin.
Abstract: We present a scalable approach for semi-supervised learning on graph-structured data that is based on an efficient variant of convolutional neural networks which operate directly on graphs. We motivate the choice of our convolutional architecture via a localized first-order approximation of spectral graph convolutions. Our model scales linearly in the number of graph edges and learns hidden layer representations that encode both local graph structure and features of nodes. In a number of experiments on citation networks and on a knowledge graph dataset we demonstrate that our approach outperforms related methods by a significant margin.
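The "localized first-order approximation of spectral graph convolutions" described above reduces to the propagation rule H' = σ(D̂^(-1/2) Â D̂^(-1/2) H W) with Â = A + I. A minimal sketch on a toy graph follows; the random weights and node features are illustrative, not trained values.

```python
import numpy as np

def gcn_layer(adj, features, weight):
    """One GCN propagation step (sketch):
    H' = ReLU( D^{-1/2} (A + I) D^{-1/2} H W )."""
    a_hat = adj + np.eye(adj.shape[0])            # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    norm = a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]  # symmetric normalization
    return np.maximum(norm @ features @ weight, 0.0)

# Toy usage: a 3-node path graph 0-1-2 with one-hot node features
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
feats = np.eye(3)
rng = np.random.default_rng(1)
out = gcn_layer(adj, feats, rng.standard_normal((3, 4)))  # (3, 4) node embeddings
```

Each layer mixes a node's features with its neighbours'; because the operation is a sparse matrix product, cost scales linearly in the number of edges, matching the abstract's claim.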

15,696 citations

Proceedings Article
01 Jan 2019
TL;DR: This paper details the principles that drove the implementation of PyTorch and how they are reflected in its architecture, and explains how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance.
Abstract: Deep learning frameworks have often focused on either usability or speed, but not both. PyTorch is a machine learning library that shows that these two goals are in fact compatible: it was designed from first principles to support an imperative and Pythonic programming style that supports code as a model, makes debugging easy and is consistent with other popular scientific computing libraries, while remaining efficient and supporting hardware accelerators such as GPUs. In this paper, we detail the principles that drove the implementation of PyTorch and how they are reflected in its architecture. We emphasize that every aspect of PyTorch is a regular Python program under the full control of its user. We also explain how the careful and pragmatic implementation of the key components of its runtime enables them to work together to achieve compelling performance. We demonstrate the efficiency of individual subsystems, as well as the overall speed of PyTorch on several commonly used benchmarks.

10,045 citations