Journal Article

Classification of sensor independent point cloud data of building objects using random forests

01 Jan 2019 - Journal of Building Engineering (Elsevier) - Vol. 21, pp. 468-477

TL;DR: A generic approach to automatically identify structural elements for the purposes of Scan-to-BIM by taking a set of planar primitives that are pre-segmented from the point cloud.

Abstract: The Architectural, Engineering and Construction (AEC) industry is looking to integrate Building Information Modeling (BIM) for existing buildings. Currently these as-built models are created manually, which is time-consuming. An important step in the automated Scan-to-BIM procedure is the interpretation and classification of point cloud data. This is computationally challenging due to the sheer size of point cloud data for an entire building. Additionally, the variety of objects makes classification problematic. Existing methods integrate prior knowledge from the sensors or environment to improve the results. However, these approaches are often case specific and thus have limited applicability. The goal of this research is to provide a method that is independent of any sensor or scene within a building environment. Furthermore, our method processes the entire building simultaneously, resulting in more distinct local and contextual features. This paper presents a generic approach to automatically identify structural elements for the purposes of Scan-to-BIM. More specifically, a Random Forests classifier is employed for the classification of the floors, ceilings, roofs, walls and beams. As input, our algorithm takes a set of planar primitives that are pre-segmented from the point cloud. This significantly reduces the data while maintaining accuracy. Both contextual and geometric features are used to describe the observed patches. The algorithm is evaluated using realistic data for a wide variety of existing buildings including houses, school facilities, a factory, a castle and a church. The experiments prove that the proposed algorithm is capable of properly labeling 87% of the structural elements with an average precision of 85% in highly cluttered environments without the support of the sensor's position. In future work, the classified patches will be processed by class-specific reconstruction algorithms to create BIM geometry.

Summary

1. Introduction

  • The implementation of Building Information Modelling (BIM) for existing buildings is gaining popularity.
  • Experiencing the advantage of BIM for new constructions, the industry now looks to implement as-built BIM.
  • These as-built models store an immense amount of information about a building at the varying stages of the construction’s life cycle [1].
  • More specifically, structural elements such as floors, ceilings, roofs, walls and beams are automatically identified in existing structures.
  • In Section 4 the methodology is presented.

2. Background

  • The procedure of converting point cloud data to BIM geometry is referred to as Scan-to-BIM.
  • Second, each cluster is provided with a class label.
  • Examples of local geometric features are the area, surface dimensions and orientation.
  • Heuristic models are based on user defined rules in a certain structure.
  • Alternatively, machine learning algorithms are employed such as Discriminant Analysis (DA), Decision Trees, Support Vector Machines (SVM), Neural Networks (NN), Probabilistic Graphical Models (PGM), etc. [8, 12, 13, 16, 17, 18, 19].

4.2. Model formulation

  • Each tree consists of a series of binary splits that separate the input variables.
  • The Random Forests model is trained using leave-p-out cross validation.
  • This intuitive procedural programming platform allows for flexible data processing and evaluation.
  • The classified patches are exported to the Rhinoceros model space for validation and further processing.
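
The leave-p-out cross validation mentioned above can be sketched as follows. This is a minimal illustration using only the Python standard library; the function name `leave_p_out`, the sample count and the value of p are assumptions for the example, since the summary states neither the value of p nor whether the held-out units are patches or whole buildings.

```python
from itertools import combinations
from typing import Iterator, List, Tuple


def leave_p_out(n_samples: int, p: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) for every way of holding out p samples."""
    if not 0 < p < n_samples:
        raise ValueError("p must satisfy 0 < p < n_samples")
    all_indices = range(n_samples)
    for held_out in combinations(all_indices, p):
        held = set(held_out)
        # Everything that is not held out is used for training in this round.
        train = [i for i in all_indices if i not in held]
        yield train, list(held_out)


# Hypothetical example: 5 samples, hold out 2 at a time (C(5, 2) = 10 rounds).
SPLITS = list(leave_p_out(n_samples=5, p=2))
for train, test in SPLITS[:3]:
    print("train:", train, "test:", test)
```

The number of rounds grows combinatorially with the sample count, so in practice p and the unit being held out are usually chosen to keep the number of model fits manageable.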

5. Experiments

  • 10 structures including houses, offices, industrial buildings and churches were used for training and testing (Fig. 5).
  • The test sites were acquired under realistic conditions including clutter, occlusions, traffic, etc.
  • Ghz with 4 cores and 4 hyperthreads and 32GB RAM.
  • Over 90,000 surfaces were computed for the projects.
  • All 17 predictors from Table 1 were considered for the classification of the observations.

5.1. Performance

  • The classification results are depicted in the confusion matrices in Fig. 7.
  • This is very accurate given the large variety of buildings and objects that were evaluated.
  • Increased confusion rates are observed between the walls and clutter classes as well as the ceiling and roof classes.
  • This is due to the fact that several data sets do not have roofs, making the top ceilings harder to interpret (Fig. 8b).
  • Several misclassifications are due to their sensor independent approach.
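
Per-class precision and recall, the two measures quoted for this method, can be read directly from a confusion matrix. The sketch below shows the arithmetic under an assumed convention (rows are true classes, columns are predicted classes). The class subset and all counts are made up for illustration and are not the paper's results.

```python
from typing import Dict, List, Tuple


def precision_recall(
    confusion: List[List[int]], labels: List[str]
) -> Dict[str, Tuple[float, float]]:
    """Return {label: (precision, recall)} for a square confusion matrix.

    Assumed convention: confusion[i][j] counts observations whose true
    class is labels[i] and whose predicted class is labels[j].
    """
    scores = {}
    for k, label in enumerate(labels):
        true_positive = confusion[k][k]
        predicted_as_k = sum(row[k] for row in confusion)  # column sum
        actually_k = sum(confusion[k])  # row sum
        precision = true_positive / predicted_as_k if predicted_as_k else 0.0
        recall = true_positive / actually_k if actually_k else 0.0
        scores[label] = (precision, recall)
    return scores


# Made-up counts for illustration only.
LABELS = ["wall", "ceiling", "clutter"]
CONFUSION = [
    [9, 0, 1],  # true walls
    [1, 3, 0],  # true ceilings
    [2, 1, 7],  # true clutter
]
SCORES = precision_recall(CONFUSION, LABELS)
for name, (precision, recall) in SCORES.items():
    print(f"{name}: precision={precision:.2f} recall={recall:.2f}")
```

In this toy matrix, clutter predicted as wall lowers the wall precision, while walls predicted as clutter lower the wall recall, which is the kind of wall/clutter confusion described above.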

5.2. Comparison

  • The authors compared the results of the Random Forests classifier with other common machine learning methods.
  • Table 2 depicts the results of the model performance for K-Nearest Neighbours (KNN), a multiceptron Neural Network (NN), Support Vector Machines (SVM) and boosted decision trees.
  • All models were tested with the same predictors and data as the proposed model.
  • This proves that the used predictors are both distinct and robust for the detection of structural elements in cluttered and noisy environments.
  • Since their approach focusses on post-processing applications, the training time is of lesser concern.

6. Discussion and Conclusion

  • More specifically, the data is pre-segmented and processed by machine learning algorithms to label the floors, ceilings, roofs, beams, walls and clutter in noisy and occluded environments.
  • This allows for the processing of larger data sets and provides additional features.
  • Some classes underperform due to the large variance in feature values within the class.
  • This will enhance the current classification and allows for the processing of non-planar classes such as cylindrical beams and pipes as well as furniture.


Classification of Sensor Independent Point Cloud Data of Building Objects using Random Forests
Maarten Bassier (1), Maarten Vergauwen (1), Bjorn Van Genechten (2)
(1) Dept. of Civil Engineering, TC Construction - Geomatics, KU Leuven, Belgium
(2) Leica Geosystems, Hexagon, Brussels, Belgium
{maarten.bassier, maarten.vergauwen}@kuleuven.be
Bjorn.VanGenechten@leica-geosystems.com
http://iiw.kuleuven.be/onderzoek/geomatics
http://www.leica-geosystems.be/
Preprint submitted to Journal of Building Engineering, September 20, 2017
Abstract
The Architectural, Engineering and Construction (AEC) industry is looking to integrate Building Information Modelling (BIM) for existing buildings. Currently these as-built models are created manually, which is time-consuming. An important step in the automated Scan-to-BIM procedure is the interpretation and classification of point cloud data. This is computationally challenging due to the sheer size of point cloud data of an entire building. Additionally, the variety of objects makes classification problematic. Existing methods focus on specific sensors or environments to improve their results. The goal of this research is to provide a method that is sensor independent and labels entire buildings at once.
This paper presents a method to automatically identify structural elements for the purposes of Scan-to-BIM. More specifically, a Random Forests classifier is employed for the classification of the floors, ceilings, roofs, walls and beams. First, the point cloud is pre-segmented into planar primitives. This significantly reduces the data while maintaining accuracy. Both contextual and geometric features are used to describe the observed patches. By pre-segmenting the data, more distinct features can be extracted from the input information. The algorithm is evaluated using realistic data of a wide variety of existing buildings including houses, school facilities, a factory, a castle and a church. The experiments prove that the proposed algorithm is capable of labelling structural elements with a reported precision of 85% and a recall of 87% in highly cluttered environments. In future work, the classified patches will be processed by class-specific reconstruction algorithms to create BIM geometry.
Keywords: Classification, Semantic Labelling, Scan-to-BIM, Pre-segmentation, Point Clouds, Building
1. Introduction
The implementation of Building Information Modelling (BIM) for existing buildings is gaining popularity. Experiencing the advantage of BIM for new constructions, the industry now looks to implement as-built BIM. The need for resource efficiency, planning and communication has prompted stakeholders to adopt intelligent models that reflect the state of the asset as it was built to properly manage their data. These as-built models store an immense amount of information about a building at the varying stages of the construction's life cycle [1]. This metric and non-metric information allows the different parties in the construction process to better operate, analyse and evaluate the asset [2]. As-built BIM is currently being employed for documentation, maintenance, quality control, etc. [3].
The production of BIM models with as-built conditions is labour intensive and error prone. The existing documentation of the asset is often sparse or non-existent. Moreover, it often does not match the as-design model of the building due to construction changes or renovations [4, 5]. Current procedures rely on dense spatial measurements for the modelling of the geometry. Typically, 3D point cloud data is acquired by 3D laser scanners or photogrammetry [6]. The automated reconstruction of BIM objects is still ongoing research [7]. A key step in the workflow is the identification of observations of structural elements. Several machine learning algorithms have been proposed to classify building objects. However, these methods often rely on specific sensors or environments [8, 9]. The goal of this research is to extend the existing approaches to be applicable to point cloud data of any source that includes both interior and exterior environments of a wide variety of buildings.
This paper presents an automated solution for the classification of point cloud data for the purposes of as-built BIM. More specifically, structural elements such as floors, ceilings, roofs, walls and beams are automatically identified in existing structures. Typically, these structures have a wide variety of elements, are heavily cluttered and have problems with occlusion as depicted in Fig. 1. Machine learning techniques are proposed for the labelling of the elements. The scope of this research is focussed on the processing of metric information as it is the most consistent in point cloud data.
The remainder of this work is structured as follows. Section 2 presents a background in as-built modelling. In Section 3 the related work is discussed. In Section 4 the methodology is presented. The test design and experimental results are presented in Section 5. Finally, the conclusions are presented in Section 6.
2. Background
The procedure of converting point cloud data to BIM geometry is referred to as Scan-to-BIM. Most automated workflows consist of three consecutive steps: segmentation, classification and reconstruction. First, the point cloud is segmented into point clusters. Segmentation is considered an instance of unsupervised pattern recognition and can be solved by several machine learning techniques [10]. Planar clusters are often proposed [11, 12] but other primitives are not excluded.
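To make the notion of a planar cluster concrete, the sketch below tests which points lie within a tolerance of a candidate plane. It is an illustration only: the text does not specify a segmentation algorithm, and the helper names `plane_through` and `planar_inliers`, the sample coordinates and the tolerance are all assumptions.

```python
import math
from typing import List, Sequence, Tuple

Point = Sequence[float]


def plane_through(p0: Point, p1: Point, p2: Point) -> Tuple[Tuple[float, ...], float]:
    """Return (unit normal n, offset d) of the plane n . x = d through three points."""
    u = [b - a for a, b in zip(p0, p1)]
    v = [b - a for a, b in zip(p0, p2)]
    # Cross product u x v gives a vector perpendicular to the plane.
    normal = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    length = math.sqrt(sum(c * c for c in normal))
    if length == 0.0:
        raise ValueError("the three points are collinear")
    unit = tuple(c / length for c in normal)
    offset = sum(n * c for n, c in zip(unit, p0))
    return unit, offset


def planar_inliers(
    points: List[Point], normal: Sequence[float], offset: float, tolerance: float
) -> List[int]:
    """Indices of points whose distance to the plane is at most `tolerance`."""
    return [
        i
        for i, p in enumerate(points)
        if abs(sum(n * c for n, c in zip(normal, p)) - offset) <= tolerance
    ]


# Hypothetical points: four lie near the plane z = 0, one is clearly off it.
POINTS = [
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (1.0, 1.0, 0.01),
    (0.5, 0.5, 0.8),
]
NORMAL, OFFSET = plane_through(POINTS[0], POINTS[1], POINTS[2])
INLIERS = planar_inliers(POINTS, NORMAL, OFFSET, tolerance=0.02)
print("normal:", NORMAL, "inliers:", INLIERS)
```

Grouping the inliers into one patch replaces many points with a single planar primitive, which is the kind of data reduction that pre-segmentation is meant to provide.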
Figure 1: Example point clouds of structures used during testing: Chemical facility (a), house (b), multi-storey school building (c) and a church (d).
Second, each cluster is provided with a class label. This step, referred to as classification, is considered an instance of supervised learning that identifies the class of new observations given a set of explanatory variables known as features [10, 13]. Commonly in literature both geometric and contextual features of the observations are employed [14, 15]. Examples of local geometric features are the area, surface dimensions and orientation. Geometric contextual features may describe the similarity, proximity, coplanarity and orthogonality. The set of features, grouped in a feature vector, is processed by a pre-trained classification model to predict the labels. These functions are referred to as classifiers. Both heuristics and machine learning algorithms have been proposed for classification. Heuristic models are based on user defined rules in a certain structure. These rules require no training of the model parameters as they are intuitively set. While being very efficient, heuristics are typically case specific. Alternatively, machine learning algorithms are employed such as Discriminant Analysis (DA), Decision Trees, Support Vector Machines (SVM), Neural Networks (NN), Probabilistic Graphical Models (PGM), etc. [8, 12, 13, 16, 17, 18, 19]. The parameters of these models are learned from known observations. Machine learning methods often generalise better than heuristics but require extensive training data to work adequately. Currently, classification is used in a wide variety of applications such as navigation, object recognition and remote sensing.
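As an illustration of how such features might be computed for planar patches, the sketch below derives one local feature (orientation relative to the vertical) and two contextual features (angle between two patch normals, and the offset of one patch from the plane of another) and groups them into a feature vector. The patch representation, the function names and the choice of features are assumptions made for this example and are not the paper's actual predictor set.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def normalise(v: Vector) -> Tuple[float, ...]:
    length = math.sqrt(dot(v, v))
    if length == 0.0:
        raise ValueError("zero-length vector")
    return tuple(x / length for x in v)


def tilt_from_vertical(normal: Vector) -> float:
    """Local feature: angle in degrees between the patch normal and the Z axis.

    Roughly 0 for horizontal patches (floors, ceilings), 90 for vertical ones.
    """
    n = normalise(normal)
    return math.degrees(math.acos(min(1.0, abs(n[2]))))


def angle_between(normal_a: Vector, normal_b: Vector) -> float:
    """Contextual feature: unsigned angle in degrees between two patch normals.

    Values near 90 indicate orthogonal patches, values near 0 parallel ones.
    """
    cosine = abs(dot(normalise(normal_a), normalise(normal_b)))
    return math.degrees(math.acos(min(1.0, cosine)))


def plane_offset(centroid_a: Vector, normal_a: Vector, centroid_b: Vector) -> float:
    """Contextual feature: distance from centroid b to the plane of patch a.

    Parallel patches with an offset near 0 are candidates for coplanarity.
    """
    n = normalise(normal_a)
    difference = [b - a for a, b in zip(centroid_a, centroid_b)]
    return abs(dot(n, difference))


# Hypothetical patches: a floor at z = 0 and a wall standing on it.
FLOOR = {"centroid": (0.0, 0.0, 0.0), "normal": (0.0, 0.0, 1.0)}
WALL = {"centroid": (2.0, 0.0, 1.5), "normal": (1.0, 0.0, 0.0)}

FEATURE_VECTOR: List[float] = [
    tilt_from_vertical(WALL["normal"]),
    angle_between(WALL["normal"], FLOOR["normal"]),
    plane_offset(FLOOR["centroid"], FLOOR["normal"], WALL["centroid"]),
]
print(FEATURE_VECTOR)
```

A vector like this, one per patch, is the kind of input a pre-trained classifier would consume to predict a label.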
In the third step, the labelled clusters are processed by class-specific reconstruction algorithms that create the BIM objects. Once the initial geometry has been constructed, the topology of the objects is adjusted to create a realistic BIM model. Afterwards, non-metric properties such as materials are added to the individual elements.
3. Related Work
Semantic labelling of building geometry has been a major research topic over the last decade. Both imagery and LIght Detection And Ranging (LIDAR) data are considered for the classification of structures. The terrestrial applications are split into the processing of indoor and outdoor scenery. Most indoor

Citations

Journal Article
TL;DR: A novel method for reconstructing parametric, volumetric, multi-story building models from unstructured, unfiltered indoor point clouds with oriented normals by means of solving an integer linear optimization problem.
Abstract: We present a novel method for reconstructing parametric, volumetric, multi-story building models from unstructured, unfiltered indoor point clouds with oriented normals by means of solving an integer linear optimization problem. Our approach overcomes limitations of previous methods in several ways: First, we drop assumptions about the input data such as the availability of separate scans as an initial room segmentation. Instead, a fully automatic room segmentation and outlier removal is performed on the unstructured point clouds. Second, restricting the solution space of our optimization approach to arrangements of volumetric wall entities representing the structure of a building enforces a consistent model of volumetric, interconnected walls fitted to the observed data instead of unconnected, paper-thin surfaces. Third, we formulate the optimization as an integer linear programming problem which allows for an exact solution instead of the approximations achieved with most previous techniques. Lastly, our optimization approach is designed to incorporate hard constraints which were difficult or even impossible to integrate before. We evaluate and demonstrate the capabilities of our proposed approach on a variety of complex real-world point clouds.

72 citations


Cites methods from "Classification of sensor independen..."

  • ...[31] employ a machine learning approach to classify structural elements such as walls, floors, ceilings, and beams in point cloud data....

    [...]


Journal Article
TL;DR: This paper aims to develop, explore and validate reliable and efficient automated procedures for the classification of 3D data (point clouds or polygonal mesh models) of heritage scenarios and demonstrates that the proposed approach is reliable and replicable and it is effective for restoration and documentation purposes.
Abstract: In recent years, the use of 3D models in cultural and archaeological heritage for documentation and dissemination purposes is increasing. The association of heterogeneous information to 3D data by means of automated segmentation and classification methods can help to characterize, describe and better interpret the object under study. Indeed, the high complexity of 3D data along with the large diversity of heritage assets themselves have constituted segmentation and classification methods as currently active research topics. Although machine learning methods brought great progress in this respect, few advances have been developed in relation to cultural heritage 3D data. Starting from the existing literature, this paper aims to develop, explore and validate reliable and efficient automated procedures for the classification of 3D data (point clouds or polygonal mesh models) of heritage scenarios. In more detail, the proposed solution works on 2D data (“texture-based” approach) or directly on the 3D data (“geometry-based” approach) with supervised or unsupervised machine learning strategies. The method was applied and validated on four different archaeological/architectural scenarios. Experimental results demonstrate that the proposed approach is reliable and replicable and it is effective for restoration and documentation purposes, providing metric information e.g. of damaged areas to be restored.

51 citations


Cites background from "Classification of sensor independen..."

  • ...2, is one of the most used supervised learning algorithms for classification problem [63,64]....

    [...]


Journal Article
Abstract: Growing climate change challenges and increasingly strict sustainability standards have led to a significant growth in the need for building refurbishment projects which are essentially focused on retrofitting in order to make them low carbon, energy efficient and environmentally friendly. The Waste and Resources Action Programme (WRAP) suggested that Building Information Modelling (BIM) should be used to achieve sustainability requirements during refurbishment projects as a correspondence to the National Audit Office (NAO) sustainability report. BIM is now widely advocated as the preferred tool for the management and co-ordination of design and construction data using object-oriented principles. The successful integration of environmental assessment into BIM for the whole of the construction lifecycle has not yet been achieved. The potential for using BIM in refurbishment projects specifically for achieving and managing sustainability requirements has not been yet critically reviewed or put into practice. This paper focuses on the use of BIM sustainability design tools in refurbishment projects, to achieve energy efficient buildings and achieve sustainability criteria for refurbishing non-domestic buildings. A critical lens is cast on the current literature in the domains of sustainable designs and the associated implications of the sustainability decision-support tools in BIM. The research also reviews the practicality of the existing sustainability decision-support tools that are currently used to assist with achieving environmental scheme certifications such as BREEAM and LEED for refurbishment projects.

28 citations


Journal Article
TL;DR: A full-fledged laser scanning framework for geometric data acquisition, comprising the entire spectrum from planning, surveying and data analysis is introduced, that details the necessary steps to acquire a point cloud that is applicable to BIM modelling.
Abstract: Laser scanning, as a rising topic within the Architecture, Engineering and Construction (AEC) industry, has been increasing both in importance and practice as a means of gathering in-situ geometric data. Several studies have covered possible applications of this technology, from construction monitoring to damage assessment, with Building Information Modelling (BIM) being one of its focus. Despite this, to present, no research was found to fully explore the laser scanning survey process, with most studies either focusing the process after the point cloud acquisition or after its conversion to BIM. To help fill this knowledge gap, the present article introduces a full-fledged laser scanning framework for geometric data acquisition, comprising the entire spectrum from planning, surveying and data analysis. The result is a framework that details the necessary steps to acquire a point cloud that is applicable to BIM modelling. The framework is validated through its application to a recently renewed bus station in Porto, Portugal. Relevant conclusions regarding setting selection, station positioning, optimization, point cloud decimation and treatment, required resolution, along other topics, are drawn through laboratory tests and the previously mentioned case study.

26 citations


Journal Article
TL;DR: A summary of the efforts of the past ten years in automating the digital modeling of existing buildings by applying reality capture devices and computer vision algorithms, with a particular focus on object recognition methods.
Abstract: Digital building representations enable and promote new forms of simulation, automation, and information sharing. However, creating and maintaining these representations is prohibitively expensive. In an effort to make the adoption of this technology easier, researchers have been automating the digital modeling of existing buildings by applying reality capture devices and computer vision algorithms. This article is a summary of the efforts of the past ten years, with a particular focus on object recognition methods. We rectify three limitations of existing review articles by describing the general structure and variations of object recognition systems and performing an extensive and quantitative comparative performance evaluation. The coverage of building component classes (i.e. semantic coverage) and recognition performances are reported in-depth and framed using a building taxonomy. Research programs demonstrate sparse semantic coverage with a clear bias towards recognizing floor, wall, ceiling, door, and window classes. Comprehensive semantic coverage of building infrastructure will require a radical scaling and diversification of efforts.

22 citations


References

Journal Article
01 Oct 2001
TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
Abstract: Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. The generalization error for forests converges a.s. to a limit as the number of trees in the forest becomes large. The generalization error of a forest of tree classifiers depends on the strength of the individual trees in the forest and the correlation between them. Using a random selection of features to split each node yields error rates that compare favorably to Adaboost (Y. Freund & R. Schapire, Machine Learning: Proceedings of the Thirteenth International conference, aaa, 148–156), but are more robust with respect to noise. Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the splitting. Internal estimates are also used to measure variable importance. These ideas are also applicable to regression.

58,232 citations


Book
01 Oct 2004
TL;DR: Introduction to Machine Learning is a comprehensive textbook on the subject, covering a broad array of topics not usually included in introductory machine learning texts, and discusses many methods from different fields, including statistics, pattern recognition, neural networks, artificial intelligence, signal processing, control, and data mining.
Abstract: The goal of machine learning is to program computers to use example data or past experience to solve a given problem. Many successful applications of machine learning exist already, including systems that analyze past sales data to predict customer behavior, optimize robot behavior so that a task can be completed using minimum resources, and extract knowledge from bioinformatics data. Introduction to Machine Learning is a comprehensive textbook on the subject, covering a broad array of topics not usually included in introductory machine learning texts. In order to present a unified treatment of machine learning problems and solutions, it discusses many methods from different fields, including statistics, pattern recognition, neural networks, artificial intelligence, signal processing, control, and data mining. All learning algorithms are explained so that the student can easily move from the equations in the book to a computer program. The text covers such topics as supervised learning, Bayesian decision theory, parametric methods, multivariate methods, multilayer perceptrons, local models, hidden Markov models, assessing and comparing classification algorithms, and reinforcement learning. New to the second edition are chapters on kernel machines, graphical models, and Bayesian estimation; expanded coverage of statistical tests in a chapter on design and analysis of machine learning experiments; case studies available on the Web (with downloadable results for instructors); and many additional exercises. All chapters have been revised and updated. Introduction to Machine Learning can be used by advanced undergraduates and graduate students who have completed courses in computer programming, probability, calculus, and linear algebra. It will also be of interest to engineers in the field who are concerned with the application of machine learning methods. Adaptive Computation and Machine Learning series

3,947 citations


Book Chapter
01 Jan 2019
Abstract: Machine learning is evolved from a collection of powerful techniques in AI areas and has been extensively used in data mining, which allows the system to learn the useful structural patterns and models from training data. Machine learning algorithms can be basically classified into four categories: supervised, unsupervised, semi-supervised and reinforcement learning. In this chapter, widely-used machine learning algorithms are introduced. Each algorithm is briefly explained with some examples.

1,664 citations


Journal Article
TL;DR: Results show scarce BIM implementation in existing buildings yet, due to challenges of (1) high modeling/conversion effort from captured building data into semantic BIM objects, (2) updating of information in BIM and (3) handling of uncertain data, objects and relations in BIM occurring in existing buildings.
Abstract: While BIM processes are established for new buildings, the majority of existing buildings is not maintained, refurbished or deconstructed with BIM yet. Promising benefits of efficient resource management motivate research to overcome uncertainties of building condition and deficient documentation prevalent in existing buildings. Due to rapid developments in BIM research, involved stakeholders demand a state-of-the-art overview of BIM implementation and research in existing buildings. This paper presents a review of over 180 recent publications on the topic. Results show scarce BIM implementation in existing buildings yet, due to challenges of (1) high modeling/conversion effort from captured building data into semantic BIM objects, (2) updating of information in BIM and (3) handling of uncertain data, objects and relations in BIM occurring in existing buildings. Despite fast developments and spreading standards, challenging research opportunities arise from process automation and BIM adaption to existing buildings' requirements.

1,204 citations


Posted Content
Abstract: Often we wish to predict a large number of variables that depend on each other as well as on other observed variables. Structured prediction methods are essentially a combination of classification and graphical modeling, combining the ability of graphical models to compactly model multivariate data with the ability of classification methods to perform prediction using large sets of input features. This tutorial describes conditional random fields, a popular probabilistic method for structured prediction. CRFs have seen wide application in natural language processing, computer vision, and bioinformatics. We describe methods for inference and parameter estimation for CRFs, including practical issues for implementing large scale CRFs. We do not assume previous knowledge of graphical modeling, so this tutorial is intended to be useful to practitioners in a wide variety of fields.

785 citations


Frequently Asked Questions (2)
Q1. What contributions have the authors mentioned in the paper "Classification of sensor independent point cloud data of building objects using random forests"?

The goal of this research is to provide a method that is sensor independent and labels entire buildings at once. This paper presents a method to automatically identify structural elements for the purposes of Scan-to-BIM. The experiments prove that the proposed algorithm is capable of labelling structural elements with a reported precision of 85% and a recall of 87% in highly cluttered environments.

In future work, the method will be investigated further to improve the labelling performance. Also, research will be performed towards the integration of probabilistic graphical models to increase the method's performance.