Proceedings Article

ASYSST: A Framework for Synopsis Synthesis Empowering Visually Impaired

TL;DR: This work proposes an end-to-end framework (ASYSST) for textual description synthesis from digitized building floor plans and introduces a novel Bag of Decor feature to learn $5$ classes of a room from $1355$ samples under a supervised learning paradigm.
Abstract: In an indoor scenario, visually impaired people lack information about their surroundings and find it difficult to navigate from room to room. Sensor-based solutions are expensive and may not always be comfortable for end users. In this paper, we focus on the problem of synthesizing a textual description from a given floor plan image to assist the visually impaired. The textual description, combined with text-reading software, can aid a visually impaired person while moving inside a building. In this work, for the first time, we propose an end-to-end framework (ASYSST) for textual description synthesis from digitized building floor plans. We introduce a novel Bag of Decor (BoD) feature to learn $5$ classes of a room from $1355$ samples under a supervised learning paradigm. These learned labels are fed into a description synthesis framework to yield a holistic description of a floor plan image. Experimental analysis on a real, publicly available floor plan dataset demonstrates the superiority of our framework.
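The abstract does not define the Bag of Decor feature in detail. As a rough illustration only, the sketch below assumes it works like a bag-of-words count over decor symbols detected in a room, and pairs it with a toy 1-nearest-neighbour classifier to stand in for the supervised step. The symbol names, room labels, and classifier are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# Hypothetical decor vocabulary (assumption: the paper's actual symbols are not listed here).
DECOR_VOCAB = ["bed", "sofa", "sink", "tub", "dining_table", "stove"]


def bag_of_decor(symbols):
    """Return a count vector over DECOR_VOCAB for one room's detected symbols."""
    counts = Counter(symbols)
    # Symbols outside the vocabulary are ignored.
    return [counts[name] for name in DECOR_VOCAB]


def nearest_label(query, training):
    """Toy 1-nearest-neighbour over (vector, label) pairs, squared Euclidean distance."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(training, key=lambda pair: dist(query, pair[0]))[1]


# Invented labelled rooms, purely for illustration.
training = [
    (bag_of_decor(["bed"]), "bedroom"),
    (bag_of_decor(["sink", "tub"]), "bathroom"),
    (bag_of_decor(["sink", "stove"]), "kitchen"),
    (bag_of_decor(["sofa", "sofa"]), "living room"),
]

room = bag_of_decor(["tub", "sink"])
predicted = nearest_label(room, training)
print(room, predicted)
```

A count vector of this kind is order-independent, which suits rooms where only the presence and number of decor items matter, not their arrangement.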
Citations
Proceedings Article
01 Sep 2019
TL;DR: An extensive experimental study is presented for tasks like furniture localization in a floor plan, caption and description generation, on the proposed dataset showing the utility of BRIDGE.
Abstract: In this paper, a large-scale public dataset containing floor plan images and their annotations is presented. The BRIDGE (Building plan Repository for Image Description Generation, and Evaluation) dataset contains more than 13000 floor plan images and annotations collected from various websites, as well as publicly available floor plan images in the research domain. The images in BRIDGE also have annotations for symbols, region graphs, and paragraph descriptions. The BRIDGE dataset will be useful for symbol spotting, caption and description generation, scene graph synthesis, retrieval, and many other tasks involving building plan parsing. In this paper, we also present an extensive experimental study on the proposed dataset for tasks like furniture localization in a floor plan and caption and description generation, showing the utility of BRIDGE.

11 citations


Cites methods from "ASYSST: A Framework for Synopsis Sy..."

  • ...In [14], [15], authors have used handcrafted features for identifying decor symbol, room information and generating region wise caption generation....

    [...]

  • ...1) Template based: Paragraph based descriptions are generated by using technique proposed in [14]....

    [...]

01 Jan 2017
TL;DR: In this article, the authors present results of a user study into extending the functionality of an existing case-based search engine for similar architectural designs to a flexible process-oriented case-based support tool for the architectural conceptualization phase.
Abstract: This paper presents results of a user study into extending the functionality of an existing case-based search engine for similar architectural designs to a flexible process-oriented case-based support tool for the architectural conceptualization phase. Based on research examining the thinking and working processes of the target group (architects) during the early conceptualization phase, especially during the search for similar architectural references, we identified common features for defining retrieval strategies for a more flexible case-based search for similar building designs within our system. Furthermore, we were able to infer a definition for implementing these strategies in the early conceptualization process in architecture, that is, to outline a definition of this process as a wrapping structure for a user model. The study was conducted among representatives of the target group (architects, architecture students, and teaching personnel) by applying the paper prototyping method and Business Process Model and Notation (BPMN). The results of this work are intended as a foundation for our upcoming research, but we also think they could be of wider interest to the case-based design research area.

6 citations

Journal Article
TL;DR: In this paper, the authors proposed two models, description synthesis from image cue (DSIC) and transformer-based description generation (TBDG), for text generation from floor plan images.
Abstract: Image captioning is a widely known problem in the area of AI. Caption generation from floor plan images has applications in indoor path planning, real estate, and providing architectural solutions. Several methods have been explored in the literature for generating captions or semi-structured descriptions from floor plan images. Since a caption alone is insufficient to capture fine-grained details, researchers have also proposed generating descriptive paragraphs from images. However, these descriptions have a rigid structure and lack flexibility, making it difficult to use them in real-time scenarios. This paper offers two models, description synthesis from image cue (DSIC) and transformer-based description generation (TBDG), for text generation from floor plan images. Both models take advantage of modern deep neural networks for visual feature extraction and text generation. The two models differ in the way they take input from the floor plan image. The DSIC model takes only visual features automatically extracted by a deep neural network, while the TBDG model learns textual captions extracted from input floor plan images with paragraphs. The specific keywords generated in TBDG and understanding them with paragraphs make it more robust in a general floor plan image. Experiments were carried out on a large-scale publicly available dataset and compared with state-of-the-art techniques to show the proposed model’s superiority.

4 citations

Posted Content
TL;DR: In this paper, the authors proposed two models, Description Synthesis from Image Cue (DSIC) and Transformer Based Description Generation (TBDG), for floor plan image to text generation to fill the gaps in existing methods.
Abstract: Image captioning is a widely known problem in the area of AI. Caption generation from floor plan images has applications in indoor path planning, real estate, and providing architectural solutions. Several methods have been explored in the literature for generating captions or semi-structured descriptions from floor plan images. Since a caption alone is insufficient to capture fine-grained details, researchers have also proposed generating descriptive paragraphs from images. However, these descriptions have a rigid structure and lack flexibility, making it difficult to use them in real-time scenarios. This paper offers two models, Description Synthesis from Image Cue (DSIC) and Transformer Based Description Generation (TBDG), for floor plan image to text generation to fill the gaps in existing methods. Both models take advantage of modern deep neural networks for visual feature extraction and text generation. The two models differ in the way they take input from the floor plan image. The DSIC model takes only visual features automatically extracted by a deep neural network, while the TBDG model learns textual captions extracted from input floor plan images with paragraphs. The specific keywords generated in TBDG and understanding them with paragraphs make it more robust in a general floor plan image. Experiments were carried out on a large-scale publicly available dataset and compared with state-of-the-art techniques to show the proposed model's superiority.
References
Proceedings Article
01 Nov 2017
TL;DR: This paper proposes Deep Architecture for fiNdIng alikE Layouts (DANIEL), a novel deep learning framework to retrieve similar floor plan layouts from repository and creation of a new complex dataset ROBIN, having three broad dataset categories with 510 real world floor plans.
Abstract: Automatically finding existing building layouts in a repository helps an architect reuse designs and complete projects on time. In this paper, we propose Deep Architecture for fiNdIng alikE Layouts (DANIEL). Using DANIEL, an architect can search the existing project repository of layouts (floor plans) and give accurate recommendations to buyers. Given a floor plan image from a property buyer, DANIEL can also recommend a rank-ordered list of similar layouts. DANIEL is based on the deep learning paradigm to extract both low- and high-level semantic features from a layout image. The key contributions of the proposed approach are: (i) a novel deep learning framework to retrieve similar floor plan layouts from a repository; (ii) an analysis of the effect of individual deep convolutional neural network layers on the floor plan retrieval task; and (iii) the creation of a new complex dataset, ROBIN (Repository Of BuildIng plaNs), having three broad dataset categories with 510 real-world floor plans. We have evaluated DANIEL by performing extensive experiments on ROBIN and compared our results with eight different state-of-the-art methods to demonstrate DANIEL’s effectiveness in challenging scenarios.

41 citations


"ASYSST: A Framework for Synopsis Sy..." refers background or methods in this paper

  • ...We included 12 decor symbols used in the dataset [21]....

    [...]

  • ...In the third column, test results are shown for SESYD samples, by trained model of ROBIN images which are comparatively low....

    [...]

  • ...We have also text annotated the dataset ROBIN [21] by collecting 4 different descriptions from volunteers in order to compare the machine generated descriptions with human written descriptions....

    [...]

  • ...Table 4 shows the results of training and testing of room image samples taken from 2 datasets, ROBIN (R)[21] and SESYD (S)[5]....

    [...]

  • ...In the second column, testing is done using samples taken from ROBIN....

    [...]

Journal Article
TL;DR: A symbol spotting technique in graphical documents where graphs are used to represent the documents and a (sub)graph matching technique is used to detect the symbols in them and the effectiveness and efficiency are demonstrated.

36 citations


"ASYSST: A Framework for Synopsis Sy..." refers methods in this paper

  • ...In the same line [7] has proposed a symbol spotting technique in graphical documents, where a subgraph matching is used....

    [...]

Proceedings Article
18 Sep 2011
TL;DR: An improved method for text/graphics segmentation is proposed that has a recall of almost 99% and a precision greater than 97% and is more suitable for architectural floor plans.
Abstract: In this paper, we propose an improved method for text/graphics segmentation. Text/graphics separation is a crucial preprocessing step in document analysis before further analysis and recognition can be applied. Our proposed system extends the method of Tombre et al. with a number of improvements to make it more suitable for architectural floor plans. A crucial novel preprocessing step is the detection and removal of walls before the actual segmentation. Text components are then extracted by analyzing connected components, including text that overlaps with graphics. Finally, a smearing approach is used to remove noise and extract the final text components. Evaluation on the series of 90 floor plans that was also used in the reference work shows that our method has a recall of almost 99% and a precision greater than 97%.
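For readers unfamiliar with the two reported metrics, the sketch below shows how precision and recall are conventionally computed from detection counts. The counts are invented for illustration and are not the paper's data; the function name is hypothetical.

```python
def precision_recall(true_positives, false_positives, false_negatives):
    """Compute (precision, recall) from detection counts.

    precision = TP / (TP + FP): share of detected text components that are real text.
    recall    = TP / (TP + FN): share of real text components that were detected.
    """
    detected = true_positives + false_positives
    actual = true_positives + false_negatives
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    return precision, recall


# Hypothetical counts, chosen only to land in the range the abstract reports.
p, r = precision_recall(true_positives=990, false_positives=25, false_negatives=10)
print(f"precision={p:.3f} recall={r:.3f}")
```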

32 citations

Proceedings Article
01 Dec 2016
TL;DR: A framework for the matching and retrieval of similar architectural floorplans under the query by example paradigm is proposed and a novel graph spectral embedding feature is proposed to uniquely represent the layout of the architectural floorplan.
Abstract: An automatic lookup tool, which matches and retrieves similar floorplans from a large repository of digitized architectural floorplans can prove to be of immense help for the architects while designing new projects. In this paper, we have proposed a framework for the matching and retrieval of similar architectural floorplans under the query by example paradigm. We propose a room layout segmentation and adjacent room detection algorithm to represent layouts as an undirected graph. We have also proposed a novel graph spectral embedding feature to uniquely represent the layout of the architectural floorplan. This helps in effective and efficient matching of the room layouts. Room semantics in terms of both the room structures and room decor is used to retrieve similar floorplans from the repository. To match the semantic similarity between a pair of floorplans, we have proposed a two stage matching technique. We have validated the effectiveness of our proposed framework by performing experiments on publicly available floorplan dataset and achieved high retrieval accuracy.
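The abstract describes representing a layout as an undirected graph of adjacent rooms. A minimal sketch of that representation follows, assuming the adjacency pairs have already been produced by some detection step. The room names and helper function are hypothetical, and the paper's segmentation, adjacency detection, and graph spectral embedding are not reproduced here.

```python
from collections import defaultdict


def build_room_graph(adjacent_pairs):
    """Build an undirected adjacency map from (room_a, room_b) pairs."""
    graph = defaultdict(set)
    for a, b in adjacent_pairs:
        # Undirected: record the edge in both directions.
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


# Invented layout: which rooms share a wall or door.
pairs = [("living", "kitchen"), ("living", "bedroom"), ("bedroom", "bathroom")]
graph = build_room_graph(pairs)

# Node degrees give a crude structural summary of the layout.
degrees = {room: len(neighbours) for room, neighbours in graph.items()}
print(degrees)
```

A graph-level descriptor such as the spectral embedding mentioned in the abstract would be computed from a structure like this, but its exact definition is not given in the abstract.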

19 citations


"ASYSST: A Framework for Synopsis Sy..." refers background or methods in this paper

  • ...It can be seen that performance is comparable between ours and [4]....

    [...]

  • ...table, there was no recognition using [4], while with our method accuracy of 82....

    [...]

  • ...We have adopted the technique proposed in [4] for the room segmentation task....

    [...]

  • ...Table 1 shows the quantitative comparison between ours, [4], [10] and our technique using LBP (local binary pattern) feature [15]....

    [...]

  • ...Improved the decor characterization method proposed in [4]....

    [...]

Proceedings Article
13 Dec 2012
TL;DR: A new algorithm for image segmentation of ancient maps and floor plans is introduced that aims to remove most non-textual elements, leaving just the text, which allows further automatic identification of the map or plan through automatic character recognition techniques.
Abstract: Several kinds of information can be obtained from ancient documents. In general, image processing research on this subject works with images of letters or documents. Topographic maps and floor plans are also an important source of information about history. In this paper, we introduce a new algorithm for image segmentation of ancient maps and floor plans. It aims to remove most non-textual elements, leaving just the text. This allows further automatic identification of the map or plan through automatic character recognition techniques. The proposed method uses a new edge detection algorithm, thresholding, and connected component analysis. The results are analyzed both qualitatively and quantitatively by comparison with another technique.
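Two of the steps named in the abstract, thresholding and connected component analysis, are standard operations and can be sketched generically. The example below is a plain-Python illustration on a tiny invented grayscale grid. It assumes a fixed cutoff and 4-connectivity, and it does not implement the paper's edge detection algorithm or its specific parameters.

```python
from collections import deque


def threshold(gray, cutoff=128):
    """Binarize a 2D grayscale grid: dark pixels (below cutoff) become foreground (1)."""
    return [[1 if value < cutoff else 0 for value in row] for row in gray]


def connected_components(binary):
    """Return the 4-connected foreground regions of a 0/1 grid as lists of (row, col)."""
    rows = len(binary)
    cols = len(binary[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not binary[r][c] or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(pixels)
    return components


# Invented 3x4 grayscale image with two separate dark strokes.
gray = [
    [255, 0, 255, 255],
    [255, 0, 255, 0],
    [255, 255, 255, 0],
]
components = connected_components(threshold(gray))
sizes = sorted(len(pixels) for pixels in components)
print(len(components), sizes)
```

In a text/graphics pipeline, component properties such as size or aspect ratio would then typically be used to decide which regions are likely text; the criteria used in the paper are not stated in the abstract.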

16 citations


"ASYSST: A Framework for Synopsis Sy..." refers background in this paper

  • ...In [14], authors have proposed a new algorithm to segment the ancient maps and floor plans by removing non textual elements and recognizing characters to identify the plans....

    [...]