Author

J. Lin

Bio: J. Lin is an academic researcher from the University of California, Berkeley. The author has contributed to research in topics: Augmented reality & Object detection. The author has an h-index of 1 and has co-authored 1 publication receiving 2 citations.

Papers
Journal Article · DOI
TL;DR: Augmented Annotations is introduced, a novel approach to bounding box data annotation that carries out the scanning and annotation of an environment in parallel and can produce annotated 3D data faster than the state of the art.
Abstract: The proliferation of machine learning applied to 3D computer vision tasks such as object detection has heightened the need for large, high-quality datasets of labeled 3D scans for training and testing purposes. Current methods of producing these datasets require first scanning the environment, then transferring the resulting point cloud or mesh to a separate tool to be annotated with semantic information, both of which are time-consuming processes. In this paper, we introduce Augmented Annotations, a novel approach to bounding box data annotation that carries out the scanning and annotation of an environment in parallel. Leveraging knowledge of the user's position in 3D space during scanning, we use augmented reality (AR) to place persistent digital annotations directly on top of indoor real-world objects. We test our system with seven human subjects and demonstrate that this approach can produce annotated 3D data faster than the state of the art. Additionally, we show that Augmented Annotations can also be adapted to automatically produce 2D labeled image data from many viewpoints, a much-needed augmentation technique for 2D object detection and recognition. Finally, we release our work to the public as an open-source iPad application designed for efficient 3D data collection.

8 citations
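
The 2D label generation mentioned in the abstract amounts to re-projecting each persistent 3D box annotation into the camera frames recorded during scanning. The released tool is an iPad application, so the Python snippet below is only an illustrative sketch of that projection step under common pinhole-camera assumptions; the function and parameter names (project_box_to_2d, R_wc, t_wc, K) are placeholders, not the authors' API.

```python
# Illustrative sketch (not the authors' code): derive a 2D bounding-box label
# by projecting a persistent, axis-aligned 3D box annotation into one camera
# frame, given the tracked camera pose and intrinsics.
import numpy as np

def project_box_to_2d(center, extents, R_wc, t_wc, K, img_w, img_h):
    """center, extents: 3D box centre and full side lengths (world frame).
    R_wc, t_wc: world-to-camera rotation (3x3) and translation (3,).
    K: 3x3 intrinsics. Returns (xmin, ymin, xmax, ymax) in pixels, or None."""
    center = np.asarray(center, float)
    extents = np.asarray(extents, float)

    # Eight corners of the box in world coordinates.
    signs = np.array([[sx, sy, sz] for sx in (-1, 1)
                      for sy in (-1, 1) for sz in (-1, 1)], float)
    corners_w = center + 0.5 * signs * extents             # (8, 3)

    # Transform into the camera frame (camera looks down +Z).
    corners_c = (np.asarray(R_wc) @ corners_w.T).T + np.asarray(t_wc)
    in_front = corners_c[:, 2] > 1e-6
    if not in_front.any():
        return None                                        # box behind the camera

    # Perspective projection of the visible corners (approximate when the box
    # is only partially in front of the camera).
    pix = (np.asarray(K) @ corners_c[in_front].T).T
    pix = pix[:, :2] / pix[:, 2:3]

    xmin, ymin = pix.min(axis=0)
    xmax, ymax = pix.max(axis=0)
    xmin, xmax = np.clip([xmin, xmax], 0, img_w - 1)
    ymin, ymax = np.clip([ymin, ymax], 0, img_h - 1)
    if xmax <= xmin or ymax <= ymin:
        return None                                        # box outside the image
    return float(xmin), float(ymin), float(xmax), float(ymax)
```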


Cited by
Posted Content
TL;DR: This paper introduces SceneGen, a generative contextual augmentation framework that predicts virtual object positions and orientations within existing scenes, and formulates a novel spatial Scene Graph representation that encapsulates explicit topological properties between objects, object groups, and the room.
Abstract: Spatial computing experiences are constrained by the real-world surroundings of the user. In such experiences, augmenting existing scenes with virtual objects requires a contextual approach, where geometrical conflicts are avoided, and functional and plausible relationships to other objects are maintained in the target environment. Yet, due to the complexity and diversity of user environments, automatically calculating ideal positions of virtual content that is adaptive to the context of the scene is considered a challenging task. Motivated by this problem, in this paper we introduce SceneGen, a generative contextual augmentation framework that predicts virtual object positions and orientations within existing scenes. SceneGen takes a semantically segmented scene as input, and outputs positional and orientational probability maps for placing virtual content. We formulate a novel spatial Scene Graph representation, which encapsulates explicit topological properties between objects, object groups, and the room. We believe providing explicit and intuitive features plays an important role in informative content creation and user interaction in spatial computing settings, a quality that is not captured in implicit models. We use kernel density estimation (KDE) to build a multivariate conditional knowledge model trained using prior spatial Scene Graphs extracted from real-world 3D scanned data. To further capture orientational properties, we develop a fast pose annotation tool to extend current real-world datasets with orientational labels. Finally, to demonstrate our system in action, we develop an Augmented Reality application, in which objects can be contextually augmented in real-time.

18 citations
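
The abstract names kernel density estimation as the core of SceneGen's knowledge model. The sketch below is a minimal, hedged illustration of that idea only: 2D placement coordinates stand in for the paper's much richer Scene Graph features, and all data and names are assumptions made for the example.

```python
# Minimal KDE sketch: fit a density model on prior object placements and
# evaluate it on a grid to obtain a positional probability map, in the spirit
# of (but far simpler than) SceneGen's conditional knowledge model.
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
# Placeholder "prior placements" of one object class in room coordinates (m).
prior_positions = np.concatenate([
    rng.normal((1.0, 4.0), 0.3, size=(120, 2)),
    rng.normal((3.5, 1.0), 0.4, size=(80, 2)),
])

kde = gaussian_kde(prior_positions.T)                  # multivariate KDE

# Evaluate the density over a 5 m x 5 m room on a 50 x 50 grid.
xs, ys = np.meshgrid(np.linspace(0, 5, 50), np.linspace(0, 5, 50))
grid = np.vstack([xs.ravel(), ys.ravel()])
prob_map = kde(grid).reshape(xs.shape)
prob_map /= prob_map.sum()                             # positional probability map

row, col = np.unravel_index(prob_map.argmax(), prob_map.shape)
print(f"most likely placement near x={xs[row, col]:.2f} m, y={ys[row, col]:.2f} m")
```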

Journal Article · DOI
TL;DR: This contribution introduced an image-based backpack mobile mapping system and new georeferencing methods for capturing previously inaccessible outdoor locations, and showed great potential for complementing 3D image-based geospatial web services of cities as well as for creating such web services for forest applications.
Abstract: Advances in digitalization technologies lead to rapid and massive changes in infrastructure management. New collaborative processes and workflows require detailed, accurate and up-to-date 3D geodata. Image-based web services with 3D measurement functionality, for example, transfer dangerous and costly inspection and measurement tasks from the field to the office workplace. In this contribution, we introduced an image-based backpack mobile mapping system and new georeferencing methods for capturing previously inaccessible outdoor locations. We carried out large-scale performance investigations at two different test sites located in a city centre and in a forest area. We compared the performance of direct, SLAM-based and image-based georeferencing under demanding real-world conditions. Both test sites include areas with restricted GNSS reception, poor illumination, and uniform or ambiguous geometry, which create major challenges for reliable and accurate georeferencing. In our comparison of georeferencing methods, image-based georeferencing improved the median precision of coordinate measurement over direct georeferencing by a factor of 10–15, to 3 mm. Image-based georeferencing also showed superior performance in terms of absolute accuracy, with results in the range from 4.3 cm to 13.2 cm. Our investigations showed great potential for complementing 3D image-based geospatial web services of cities as well as for creating such web services for forest applications. In addition, such accurately georeferenced 3D imagery has enormous potential for future visual localization and augmented reality applications.

6 citations
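
The precision and accuracy figures quoted in the abstract are summary statistics over check-point measurements. Purely as an illustration of how such metrics can be computed, and not as the paper's actual evaluation pipeline, a sketch might look like this (function names and data layout are assumptions):

```python
# Hypothetical evaluation helpers: median precision from repeated measurements
# of check points, and absolute 3D accuracy against reference coordinates.
import numpy as np

def median_precision(repeated_xyz):
    """repeated_xyz: dict point_id -> (k, 3) array of repeated measurements.
    Per-point precision is taken as the RMS 3D spread about the point's mean."""
    spreads = []
    for meas in repeated_xyz.values():
        meas = np.asarray(meas, float)
        spreads.append(np.sqrt(((meas - meas.mean(axis=0)) ** 2).sum(axis=1).mean()))
    return float(np.median(spreads))

def absolute_accuracy(measured_xyz, reference_xyz):
    """3D Euclidean error of each measured check point against its reference;
    returns the median and maximum error."""
    errors = [np.linalg.norm(np.asarray(m, float) - np.asarray(reference_xyz[pid], float))
              for pid, m in measured_xyz.items()]
    return float(np.median(errors)), float(np.max(errors))
```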

Journal Article · DOI
TL;DR: In this article, the authors propose a mutual scene synthesis method that combines a mutual function optimization module with a deep-learning conditional scene augmentation process to generate a scene mutually and physically accessible to all participants of a mixed reality telepresence scenario.
Abstract: Remote telepresence via next-generation mixed reality platforms can provide higher levels of immersion for computer-mediated communications, allowing participants to engage in a wide spectrum of activities, previously not possible in 2D screen-based communication methods. However, as mixed reality experiences are limited to the local physical surrounding of each user, finding a common virtual ground where users can freely move and interact with each other is challenging. In this paper, we propose a novel mutual scene synthesis method that takes the participants' spaces as input, and generates a virtual synthetic scene that corresponds to the functional features of all participants' local spaces. Our method combines a mutual function optimization module with a deep-learning conditional scene augmentation process to generate a scene mutually and physically accessible to all participants of a mixed reality telepresence scenario. The synthesized scene can hold mutual walkable, sittable and workable functions, all corresponding to physical objects in the users' real environments. We perform experiments using the MatterPort3D dataset and conduct comparative user studies to evaluate the effectiveness of our system. Our results show that our proposed approach can be a promising research direction for facilitating contextualized telepresence systems for next-generation spatial computing platforms.

5 citations
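
The core constraint described here, a scene that remains physically accessible to every participant, can be conveyed with a heavily simplified sketch: intersect per-participant walkable-area grids and keep the largest shared region. The paper's actual method couples a mutual function optimization module with a learned conditional scene augmentation model; the grid representation and names below are illustrative assumptions only.

```python
# Simplified "mutual space" sketch: given boolean walkable-area grids for each
# participant (already aligned to a shared coordinate grid), keep the largest
# connected region that is walkable for everyone.
import numpy as np
from scipy import ndimage

def mutual_walkable_region(walkable_masks):
    """walkable_masks: list of equally shaped boolean grids, True = walkable."""
    mutual = np.logical_and.reduce(walkable_masks)        # walkable for everyone
    labels, n = ndimage.label(mutual)                     # connected components
    if n == 0:
        return np.zeros_like(mutual, dtype=bool)          # no shared space at all
    sizes = ndimage.sum(mutual, labels, index=range(1, n + 1))
    return labels == (int(np.argmax(sizes)) + 1)          # largest shared region
```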

Proceedings Article · DOI
24 Jul 2022
TL;DR: SemSpray, as presented in this paper, is a virtual reality application that provides users with intuitive and handy tools to produce semantic information on as-is 3D spatial data (mesh) of buildings.
Abstract: Capturing the as-is status of buildings in the form of 3D spatial data (e.g., point cloud or mesh) with the use of 3D sensing technologies is becoming predominant in the Architecture, Engineering, and Construction (AEC) industry. Although the act of acquiring this data has been progressively becoming more accurate and efficient with the availability of off-the-shelf solutions in the market, the act of extracting as-is information from it has not seen similar advancements. State-of-the-art practice requires experts to manually interact with the spatial data in a laborious and time-consuming process. We propose Semantic Spray (SemSpray), a Virtual Reality (VR) application that provides users with intuitive and handy tools to produce semantic information on as-is 3D spatial data (mesh) of buildings. The goal is to perform this task accurately and more efficiently by allowing users to experience, interact with, and immerse themselves in the data at different scales. SemSpray combines two labeling modes: user-dynamic and user-static. In the user-dynamic mode, the user is fully immersed in the 3D mesh and has the ability to walk and teleport within the model; in the user-static mode, the user can comfortably sit on a chair and handle a small-scale version of the 3D mesh to perform the labeling, in a manner similar to hand-held painting. We evaluated SemSpray's performance with a user study of ten participants and found that the combination of the different modes is able to keep the user entertained and to limit the side effects of VR on the sensory organs, including nausea, headache, and dizziness.
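
SemSpray's painting metaphor ultimately assigns a semantic class to the mesh faces under the user's brush. The snippet below is not the authors' VR implementation; it is a hedged sketch of that core labeling step, with the mesh layout and all function and parameter names assumed for illustration.

```python
# Illustrative "spray" labeling step: label every mesh face whose centroid lies
# within a brush radius of the point hit by the controller ray.
import numpy as np

def spray_label(vertices, faces, face_labels, hit_point, radius, class_id):
    """vertices: (V, 3) float array; faces: (F, 3) int array of vertex indices;
    face_labels: (F,) int array updated in place; hit_point: (3,) brush centre."""
    centroids = vertices[faces].mean(axis=1)              # (F, 3) face centroids
    within = np.linalg.norm(centroids - hit_point, axis=1) <= radius
    face_labels[within] = class_id                        # paint the brush region
    return int(within.sum())                              # number of faces labeled
```
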
Proceedings Article · DOI
01 Oct 2022
TL;DR: In this article, a mutual scene synthesis (MSS) system is proposed to synthesize a virtual scene that corresponds to the functional features of all participants' physical spaces by combining a function optimization module with a deep-learning conditional scene augmentation process.
Abstract: The emerging field of remote telepresence via spatial computing has opened many exciting opportunities for next-generation computer-mediated-communication platforms. Such techniques enable users to mutually engage in a wide spectrum of applications, previously not possible in 2D screen-based communication methods. Yet, calibrating and finding a mutual environment compatible with all remote participants' physical environments is considered a challenging task. In this paper, we elaborate on the mutual space finding problem and provide a high-level introduction of our proposed novel Mutual Scene Synthesis (MSS) system. The MSS system takes the participants' surrounding environments as input, and synthesizes a virtual scene that corresponds to the functional features of all participants' physical spaces. By combining a function optimization module with a deep-learning conditional scene augmentation process, the MSS can generate a scene compatible with all participants of a remote telepresence scenario. By performing early comparative user studies via the MatterPort3D dataset, we evaluate the effectiveness of our system and show that our proposed MSS approach can be a promising research direction for facilitating contextualized telepresence systems for next-generation spatial computing platforms.