Proceedings ArticleDOI

DANIEL: A Deep Architecture for Automatic Analysis and Retrieval of Building Floor Plans

TL;DR: This paper proposes Deep Architecture for fiNdIng alikE Layouts (DANIEL), a novel deep learning framework for retrieving similar floor plan layouts from a repository, and introduces ROBIN, a new complex dataset with three broad categories and 510 real-world floor plans.
Abstract: Automatically finding existing building layouts in a repository helps an architect reuse designs and complete projects on time. In this paper, we propose Deep Architecture for fiNdIng alikE Layouts (DANIEL). Using DANIEL, an architect can search an existing repository of layouts (floor plans) and give accurate recommendations to buyers. Given a floor plan image, DANIEL is also capable of recommending to property buyers a rank-ordered list of similar layouts. DANIEL is based on the deep learning paradigm and extracts both low- and high-level semantic features from a layout image. The key contributions of the proposed approach are: (i) a novel deep learning framework to retrieve similar floor plan layouts from a repository; (ii) an analysis of the effect of individual deep convolutional neural network layers on the floor plan retrieval task; and (iii) the creation of a new complex dataset, ROBIN (Repository Of BuildIng plaNs), having three broad categories with 510 real-world floor plans. We have evaluated DANIEL through extensive experiments on ROBIN and compared our results with eight state-of-the-art methods to demonstrate DANIEL's effectiveness in challenging scenarios.
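At its core, this style of retrieval embeds each floor plan with a convolutional network and ranks the repository by similarity to the query embedding. The following is a minimal sketch of that idea, not DANIEL's actual pipeline: the backbone (an off-the-shelf ResNet-18), preprocessing, and file paths are stand-ins for illustration.

```python
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Pretrained CNN as a generic feature extractor (stand-in for DANIEL's network).
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = torch.nn.Identity()  # keep the 512-d pooled features, drop the classifier
model.eval()

preprocess = T.Compose([
    T.Grayscale(num_output_channels=3),  # floor plans are mostly binary/grayscale
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def embed(path: str) -> np.ndarray:
    x = preprocess(Image.open(path)).unsqueeze(0)
    with torch.no_grad():
        f = model(x).squeeze(0).numpy()
    return f / (np.linalg.norm(f) + 1e-8)  # L2-normalise so dot product = cosine

def rank(query_path: str, repo_paths: list[str]) -> list[tuple[str, float]]:
    """Return repository floor plans ordered by similarity to the query."""
    q = embed(query_path)
    scored = [(p, float(q @ embed(p))) for p in repo_paths]
    return sorted(scored, key=lambda s: s[1], reverse=True)
```
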
Citations
Journal ArticleDOI
TL;DR: In this paper, a floorplan embedding technique that uses an attributed graph to model the floorplans' geometric information, design semantics, and behavioral features as the node and edge attributes is presented.
Abstract: Floorplans are commonly used to represent the layout of buildings. Research on computational techniques that facilitate the design process, such as automated analysis and optimization, often uses simple floorplan representations that ignore the space's semantics and do not consider usage-related analytics. We present a floorplan embedding technique that uses an attributed graph to model the floorplans' geometric information, design semantics, and behavioral features as the node and edge attributes. A long short-term memory (LSTM) variational autoencoder (VAE) architecture is proposed and trained to embed attributed graphs as vectors in a continuous space. A user study is conducted to evaluate the coupling of similar floorplans retrieved from the embedding space for a given input (e.g., a design layout). The qualitative, quantitative, and user study evaluations show that our embedding framework produces meaningful and accurate vector representations for floorplans. Moreover, our proposed model is generative; we studied and showcased its effectiveness for generating new floorplans. We also release the dataset that we have constructed, including the design semantic attributes and simulation-generated human behavioral features for each floorplan, for further study in the community.
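For concreteness, here is a minimal PyTorch sketch of an LSTM-VAE of the kind the abstract describes; the dimensions and the serialisation of the attributed graph into a node-feature sequence are assumptions, not the paper's actual configuration.

```python
import torch
import torch.nn as nn

class GraphSeqVAE(nn.Module):
    """Sketch: LSTM variational autoencoder over a floorplan's attributed-graph
    nodes, serialised as a sequence of per-node feature vectors (assumed format)."""
    def __init__(self, feat_dim=16, hidden=64, latent=32):
        super().__init__()
        self.enc = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.to_mu = nn.Linear(hidden, latent)
        self.to_logvar = nn.Linear(hidden, latent)
        self.from_z = nn.Linear(latent, hidden)
        self.dec = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, feat_dim)

    def forward(self, seq):                       # seq: (B, T, feat_dim)
        _, (h, _) = self.enc(seq)
        mu, logvar = self.to_mu(h[-1]), self.to_logvar(h[-1])
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterisation
        h0 = self.from_z(z).unsqueeze(0)
        dec_out, _ = self.dec(seq, (h0, torch.zeros_like(h0)))   # teacher forcing
        return self.out(dec_out), mu, logvar      # mu serves as the floorplan embedding
```

Retrieval then reduces to nearest-neighbour search over the `mu` vectors, and sampling `z` from the prior makes the model generative.
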

3 citations

Book ChapterDOI
26 Jun 2019
TL;DR: This paper presents an application of a distributed AI-based methodology FLEA (Find, Learn, Explain, Adapt) to the task of room configuration during the early conceptual phases of architectural design.
Abstract: Artificial intelligence methods such as case-based reasoning and artificial neural networks have already been applied to the task of architectural design support in a multitude of specific approaches and tools. However, modern AI trends, such as Explainable AI (XAI), and additional features, such as providing contextual suggestions for the next step of the design process, were rarely considered an integral part of these approaches or were simply not available. In this paper, we present an application of a distributed AI-based methodology, FLEA (Find, Learn, Explain, Adapt), to the task of room configuration during the early conceptual phases of architectural design. The implementation of the methodology in the framework MetisCBR applies CBR-based methods for retrieval of similar floor plans to suggest possibly inspirational designs and to explain the returned results with specific explanation patterns. Furthermore, it makes use of a farm of recurrent neural networks to suggest contextually suitable next configuration steps and to present design variations that show how the designs may evolve in the future. The flexibility of FLEA allows for variational use of its components in order to activate only the currently required modules. The methodology originated during the basic research project Metis (funded by the German Research Foundation), in which the architectural semantic search patterns and a family of corresponding floor plan representations were developed. FLEA uses these patterns and representations as the base for its semantic search, explanation, next-step suggestion, and adaptation components. The methodology implementation was iteratively tested in quantitative evaluations and user studies with multiple floor plan datasets.
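The "Find" step of such a CBR pipeline boils down to scoring stored cases against a query configuration. A toy sketch of attribute-weighted case retrieval follows; the attribute names, weighting scheme, and flat-dictionary case format are invented for illustration and are not MetisCBR's actual similarity measures.

```python
def config_similarity(query: dict, case: dict, weights: dict) -> float:
    """Toy attribute-weighted similarity between two room configurations
    (illustrative only; not MetisCBR's actual measure)."""
    total = sum(weights.values())
    matched = sum(w for attr, w in weights.items()
                  if query.get(attr) == case.get(attr))
    return matched / total

def retrieve(query: dict, case_base: list, weights: dict, k: int = 5) -> list:
    """'Find' step: return the k stored cases most similar to the query."""
    return sorted(case_base,
                  key=lambda case: config_similarity(query, case, weights),
                  reverse=True)[:k]

# Hypothetical usage: configurations described by coarse semantic attributes.
weights = {"room_count": 2.0, "has_open_kitchen": 1.0, "orientation": 0.5}
query = {"room_count": 3, "has_open_kitchen": True, "orientation": "south"}
```
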

3 citations

Journal ArticleDOI
TL;DR: FloorplanGAN as discussed by the authors proposes an adversarial generative framework that combines vector generation and raster discrimination for residential floorplan generation tasks: the floorplan is first generated in vector format with room areas as constraints, then rasterized and discriminated visually using convolutional layers.
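The bridge between the two domains is a vector-to-raster step: generated room rectangles are painted into per-room-type channels so a convolutional discriminator can judge the layout as an image. The sketch below shows a hard rasterisation of that step for illustration only; the published method uses a differentiable renderer so gradients can reach the vector generator, which this simplification does not provide.

```python
import torch

def rasterize_rooms(boxes: torch.Tensor, types: list, n_types: int, size: int = 64):
    """Paint generated room rectangles into per-room-type channels.
    boxes: (N, 4) tensor of (x0, y0, x1, y1) in [0, 1]; types: N room-type ids.
    Hard rasterisation for illustration; it blocks gradients, unlike the
    differentiable renderer used in the actual method."""
    img = torch.zeros(n_types, size, size)
    px = (boxes * size).long().clamp(0, size - 1)
    for (x0, y0, x1, y1), t in zip(px, types):
        img[t, y0:y1 + 1, x0:x1 + 1] = 1.0
    return img  # (n_types, size, size), ready for a conv discriminator
```
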

3 citations

Proceedings ArticleDOI
01 Jul 2020
TL;DR: In this article, the authors tried to shed some light on how AI might change the face of the construction industry and tried to answer the question "will AI take over construction industry?" each from their own perspective including architectural, structural, and construction management.
Abstract: The late Stephen Hawking was reported to have said, “Computers will overtake humans with AI within the next 100 years. When that happens, we need to make sure the computers have goals aligned with ours.” This statement is frightening to most, as very few people may like the idea of seeing computers take over the world. However, what can be more frightening is for those few people who like the idea to also make use of Hawking’s suggestion and find a way to make sure the computers have goals that are strictly aligned with only theirs. There is a distinguishable apprehension among people of the role AI is set to play in the future of humanity, and this apprehension is transcending disciplinary boundaries. In the particular fields related to construction, there seems to be a genuine interest in integrating AI in each phase of a project to improve quality, enhance safety, and reduce costs, but this interest is countered by a legitimate concern that many types of jobs would be lost to AI-enhanced machines. In this paper, the authors tried to shed some light on how AI might change the face of the construction industry. The authors, spanning generations and disciplines in the industry, tried to answer the question “will AI take over the construction industry?” each from their own perspective including architectural, structural, and construction management. A synopsis of the status of the application of AI in construction and related fields is first provided, and then the authors offer their individual views with respect to how they expect AI to affect their side of the industry. This paper is an effort to gain insights into the perceptions of current and future construction related professionals of the role of AI and the impact it may have on the industry. © 2020 The Authors. Published by Budapest University of Technology and Economics & Diamond Congress Ltd Peer-review under responsibility of the Scientific Committee of the Creative Construction Conference 2020.

2 citations


Additional excerpts

  • ...Classification possibilities of CNN are often exploited to retrieve examples that match a design, for assisting a designer in a situation of case-based design to find relevant cases of previous designs (like for example the DANIEL project [14])....

    [...]

Book ChapterDOI
TL;DR: This paper proposes a new way of representing layout, called attributed paths, which admits a string edit distance-based match measure, and shows that layout-based retrieval using attributed paths is computationally efficient and more effective.
Abstract: A document is rich in its layout. The entities of interest can be scattered over the document page. Traditional layout matching has involved modeling layout structure as grids, graphs, and spatial histograms of patches. In this paper we propose a new way of representing layout, which we call attributed paths. This representation admits a string edit distance based match measure. Our experiments show that layout based retrieval using attributed paths is computationally efficient and more effective. It also offers flexibility in tuning the match criterion. We have demonstrated effectiveness of attributed paths in performing layout based retrieval tasks on datasets of floor plan images [14] and journal pages [1].
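Since attributed paths admit a string edit distance, matching two layouts reduces to a dynamic-programming alignment over their path symbols. Below is a minimal sketch of such a measure; the unit insertion/deletion costs and the equality-based substitution cost are placeholder choices, not the paper's actual cost model.

```python
def attributed_path_distance(p, q, sub_cost=None) -> float:
    """Edit distance between two attributed paths, i.e. sequences of attributed
    symbols (e.g. tuples of entity type and quantised geometry). Costs here
    are illustrative; tuning them tunes the match criterion."""
    if sub_cost is None:
        sub_cost = lambda a, b: 0.0 if a == b else 1.0
    m, n = len(p), len(q)
    d = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        d[i][0] = float(i)                       # delete all of p[:i]
    for j in range(1, n + 1):
        d[0][j] = float(j)                       # insert all of q[:j]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            d[i][j] = min(d[i - 1][j] + 1,       # deletion
                          d[i][j - 1] + 1,       # insertion
                          d[i - 1][j - 1] + sub_cost(p[i - 1], q[j - 1]))
    return d[m][n]

# Hypothetical usage: paths as sequences of (entity_type, size_bucket) tuples.
a = [("room", 2), ("door", 0), ("room", 1)]
b = [("room", 2), ("room", 1)]
print(attributed_path_distance(a, b))  # 1.0: one deletion
```
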

2 citations

References
Proceedings Article
03 Dec 2012
TL;DR: The authors trained a large, deep convolutional neural network consisting of five convolutional layers, some followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax, achieving state-of-the-art classification performance on ImageNet.
Abstract: We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state of the art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
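For reference, the layer stack described above can be written out as a PyTorch module; this sketch follows the standard torchvision variant of AlexNet (which omits the original's local response normalisation) and expects 224x224 RGB input.

```python
import torch.nn as nn

# AlexNet as described in the abstract: five conv layers, max-pooling after
# some of them, three fully-connected layers, dropout, final 1000-way output.
alexnet = nn.Sequential(
    nn.Conv2d(3, 64, kernel_size=11, stride=4, padding=2), nn.ReLU(),  # conv1
    nn.MaxPool2d(3, stride=2),
    nn.Conv2d(64, 192, kernel_size=5, padding=2), nn.ReLU(),           # conv2
    nn.MaxPool2d(3, stride=2),
    nn.Conv2d(192, 384, kernel_size=3, padding=1), nn.ReLU(),          # conv3
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),          # conv4
    nn.Conv2d(256, 256, kernel_size=3, padding=1), nn.ReLU(),          # conv5
    nn.MaxPool2d(3, stride=2),
    nn.Flatten(),
    nn.Dropout(0.5), nn.Linear(256 * 6 * 6, 4096), nn.ReLU(),          # fc6, dropout
    nn.Dropout(0.5), nn.Linear(4096, 4096), nn.ReLU(),                 # fc7, dropout
    nn.Linear(4096, 1000),                                             # fc8: 1000-way
)
```
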

73,978 citations

Proceedings ArticleDOI
20 Jun 2005
TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.
Abstract: We study the question of feature sets for robust visual object recognition; adopting linear SVM based human detection as a test case. After reviewing existing edge and gradient based descriptors, we show experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection. We study the influence of each stage of the computation on performance, concluding that fine-scale gradients, fine orientation binning, relatively coarse spatial binning, and high-quality local contrast normalization in overlapping descriptor blocks are all important for good results. The new approach gives near-perfect separation on the original MIT pedestrian database, so we introduce a more challenging dataset containing over 1800 annotated human images with a large range of pose variations and backgrounds.
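A HOG descriptor with the design choices the paper highlights (fine orientation binning, relatively coarse spatial cells, overlapping contrast-normalised blocks) can be computed with scikit-image; the input filename below is hypothetical.

```python
from skimage import io
from skimage.feature import hog

image = io.imread("pedestrian.png", as_gray=True)  # hypothetical input file
descriptor = hog(
    image,
    orientations=9,           # fine orientation binning
    pixels_per_cell=(8, 8),   # relatively coarse spatial binning
    cells_per_block=(2, 2),   # overlapping descriptor blocks
    block_norm="L2-Hys",      # local contrast normalisation
)
print(descriptor.shape)  # one long vector of block-normalised histograms
```
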

31,952 citations


"DANIEL: A Deep Architecture for Aut..." refers methods in this paper

  • ...We have shown the P-R plot for only proposed ROBIN Dataset, as for SESYD dataset our proposed and some of the existing state-of-the-art techniques [2], [3], [15] yielded flat PR curve (Precision value 1 for all Recall values)....

    [...]

  • ...Hence, when the same features (HOG, SIFT, RLH) are used under a canonical CBIR paradigm, they yield superior results than OASIS (see Tab....

    [...]

  • ...Several generic image retrieval systems were also proposed in the past [13], [14], where features like Histogram of Oriented Gradients (HOG) [15], Bag of Features (BOF) [16], Local Binary Pattern (LBP) [17] and Run-Length Histogram [18] have been used....

    [...]

Posted Content
TL;DR: Caffe as discussed by the authors is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.
Abstract: Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models. The framework is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures. Caffe fits industry and internet-scale media needs by CUDA GPU computation, processing over 40 million images a day on a single K40 or Titan GPU ($\approx$ 2.5 ms per image). By separating model representation from actual implementation, Caffe allows experimentation and seamless switching among platforms for ease of development and deployment from prototyping machines to cloud environments. Caffe is maintained and developed by the Berkeley Vision and Learning Center (BVLC) with the help of an active community of contributors on GitHub. It powers ongoing research projects, large-scale industrial applications, and startup prototypes in vision, speech, and multimedia.

12,531 citations

Journal ArticleDOI
TL;DR: A novel scale- and rotation-invariant detector and descriptor, coined SURF (Speeded-Up Robust Features), which approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, yet can be computed and compared much faster.
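For context, SURF keypoints and their 64-dimensional descriptors can be computed with OpenCV's contrib module; note that SURF lives in the non-free build of opencv-contrib-python, and the image filename is a placeholder.

```python
import cv2

# SURF detector/descriptor (requires opencv-contrib with nonfree enabled).
surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)
img = cv2.imread("floorplan.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file
keypoints, descriptors = surf.detectAndCompute(img, None)
print(len(keypoints), descriptors.shape)  # N keypoints, each a 64-d descriptor
```
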

12,449 citations


"DANIEL: A Deep Architecture for Aut..." refers background in this paper

  • ...Rotation and translation invariant features for symbol spotting in documents were proposed in [22], and [23]....

    [...]

Proceedings ArticleDOI
03 Nov 2014
TL;DR: Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.
Abstract: Caffe provides multimedia scientists and practitioners with a clean and modifiable framework for state-of-the-art deep learning algorithms and a collection of reference models. The framework is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures. Caffe fits industry and internet-scale media needs by CUDA GPU computation, processing over 40 million images a day on a single K40 or Titan GPU (approx 2 ms per image). By separating model representation from actual implementation, Caffe allows experimentation and seamless switching among platforms for ease of development and deployment from prototyping machines to cloud environments.Caffe is maintained and developed by the Berkeley Vision and Learning Center (BVLC) with the help of an active community of contributors on GitHub. It powers ongoing research projects, large-scale industrial applications, and startup prototypes in vision, speech, and multimedia.
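The per-layer analysis that DANIEL performs maps naturally onto pycaffe, where any named blob's activations can be read after a forward pass. A minimal sketch follows; the prototxt, weights file, input size, and layer name ("fc7") are placeholders, not DANIEL's actual configuration.

```python
import numpy as np
import caffe

# Load a deployed network definition and trained weights (paths are placeholders).
caffe.set_mode_cpu()
net = caffe.Net("deploy.prototxt", "weights.caffemodel", caffe.TEST)

image = np.random.rand(3, 227, 227).astype(np.float32)  # stand-in for a real input
net.blobs["data"].reshape(1, *image.shape)
net.blobs["data"].data[0] = image
net.forward()

features = net.blobs["fc7"].data.copy()  # activations of a chosen intermediate layer
print(features.shape)
```
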

10,161 citations


"DANIEL: A Deep Architecture for Aut..." refers background in this paper

  • ...layers is switched (pooling is done before normalization) [28]....

    [...]