Author

Li-Yi Wei

Bio: Li-Yi Wei is an academic researcher at Adobe Systems. He has contributed to research on topics including virtual machines and user interfaces, has an h-index of 8, and has co-authored 26 publications receiving 239 citations. His previous affiliations include the University of Hong Kong and the State University of New York System.

Papers
Journal ArticleDOI
TL;DR: It is demonstrated that saccades can significantly increase the rotation gains during redirection without introducing visual distortions or simulator sickness, which allows the method to apply to large open virtual spaces and small physical environments for room-scale VR.
Abstract: Redirected walking techniques can enhance the immersion and visual-vestibular comfort of virtual reality (VR) navigation, but are often limited by the size, shape, and content of the physical environments. We propose a redirected walking technique that can apply to small physical environments with static or dynamic obstacles. Via a head- and eye-tracking VR headset, our method detects saccadic suppression and redirects the users during the resulting temporary blindness. Our dynamic path planning runs in real-time on a GPU, and thus can avoid static and dynamic obstacles, including walls, furniture, and other VR users sharing the same physical space. To further enhance saccadic redirection, we propose subtle gaze direction methods tailored for VR perception. We demonstrate that saccades can significantly increase the rotation gains during redirection without introducing visual distortions or simulator sickness. This allows our method to apply to large open virtual spaces and small physical environments for room-scale VR. We evaluate our system via numerical simulations and real user studies.

152 citations
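As a rough illustration of the saccade-gated redirection idea described in the abstract above, the Python sketch below injects a larger, otherwise imperceptible camera rotation only while the eye tracker reports a saccade. The threshold, gain values, and function names are illustrative assumptions, not parameters from the paper.

# Illustrative sketch of saccade-gated redirection: while the gaze moves
# faster than a saccade threshold, visual sensitivity is suppressed, so a
# larger camera rotation (gain) can be injected without being noticed.
# Thresholds and gains are placeholders, not values from the paper.

SACCADE_VELOCITY_THRESHOLD = 180.0   # deg/s of eye rotation (assumed)
BASE_GAIN_DEG_PER_FRAME = 0.05       # subtle rotation allowed every frame
SACCADE_GAIN_DEG_PER_FRAME = 0.5     # larger rotation hidden by suppression

def redirection_rotation(eye_velocity_deg_s, steer_sign):
    """Extra yaw (degrees) to add to the virtual camera this frame.

    eye_velocity_deg_s: angular speed of the gaze from the eye tracker.
    steer_sign: +1 or -1, toward the direction the path planner wants to
    steer the user (e.g., away from a wall or another user).
    """
    if eye_velocity_deg_s > SACCADE_VELOCITY_THRESHOLD:
        gain = SACCADE_GAIN_DEG_PER_FRAME   # user is saccading: redirect more
    else:
        gain = BASE_GAIN_DEG_PER_FRAME      # otherwise stay below detection
    return steer_sign * gain

# Example: a 300 deg/s saccade while the planner steers to the right.
print(redirection_rotation(300.0, +1))  # 0.5
print(redirection_rotation(20.0, +1))   # 0.05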

Proceedings ArticleDOI
01 Oct 2019
TL;DR: In this article, a single convolutional neural network is trained to simultaneously detect salient junctions and straight lines, as well as predict their 3D depth and vanishing points, leading to better 2D wireframe detection.
Abstract: From a single view of an urban environment, we propose a method to effectively exploit the global structural regularities for obtaining a compact, accurate, and intuitive 3D wireframe representation. Our method trains a single convolutional neural network to simultaneously detect salient junctions and straight lines, as well as predict their 3D depth and vanishing points. Compared with state-of-the-art learning-based wireframe detection methods, our network is much simpler and more unified, leading to better 2D wireframe detection. With a global structural prior (such as Manhattan assumption), our method further reconstructs a full 3D wireframe model, a compact vector representation suitable for a variety of high-level vision tasks such as AR and CAD. We conduct extensive evaluations of our method on a large new synthetic dataset of urban scenes as well as real images. Our code and datasets will be published along with the paper.

39 citations
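To make the single-network, multi-output design concrete, here is a schematic PyTorch-style sketch of a shared backbone with separate heads for junction heatmaps, line heatmaps, per-pixel depth, and vanishing points. The layer sizes, head definitions, and output shapes are assumptions for illustration, not the paper's architecture.

import torch
import torch.nn as nn

# Schematic multi-task network: one shared feature extractor, four heads.
# All channel counts and layer choices are illustrative placeholders.

class WireframeNet(nn.Module):
    def __init__(self, num_vanishing_points=3):
        super().__init__()
        self.num_vp = num_vanishing_points
        self.backbone = nn.Sequential(            # shared feature extractor
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        self.junction_head = nn.Conv2d(64, 1, 1)  # junction heatmap
        self.line_head = nn.Conv2d(64, 1, 1)      # line heatmap
        self.depth_head = nn.Conv2d(64, 1, 1)     # per-pixel depth
        self.vp_head = nn.Sequential(             # global vanishing points
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 3 * num_vanishing_points),
        )

    def forward(self, image):
        feat = self.backbone(image)
        return {
            "junctions": self.junction_head(feat),
            "lines": self.line_head(feat),
            "depth": self.depth_head(feat),
            "vanishing_points": self.vp_head(feat).view(-1, self.num_vp, 3),
        }

outputs = WireframeNet()(torch.randn(1, 3, 128, 128))
print({name: tuple(t.shape) for name, t in outputs.items()})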

Proceedings ArticleDOI
TL;DR: RealitySketch is an augmented reality interface for sketching interactive graphics and visualizations; it offers a new way to embed dynamic, responsive graphics in the real world and contributes a set of interaction techniques for capturing, parameterizing, and visualizing real-world motion without pre-defined programs or configurations.
Abstract: We present RealitySketch, an augmented reality interface for sketching interactive graphics and visualizations. In recent years, an increasing number of AR sketching tools enable users to draw and embed sketches in the real world. However, with the current tools, sketched contents are inherently static, floating in mid-air without responding to the real world. This paper introduces a new way to embed dynamic and responsive graphics in the real world. In RealitySketch, the user draws graphical elements on a mobile AR screen and binds them with physical objects in real-time, improvisational ways, so that the sketched elements dynamically move with the corresponding physical motion. The user can also quickly visualize and analyze real-world phenomena through responsive graph plots or interactive visualizations. This paper contributes a set of interaction techniques that enable capturing, parameterizing, and visualizing real-world motion without pre-defined programs and configurations. Finally, we demonstrate our tool with several application scenarios, including physics education, sports training, and in-situ tangible interfaces.

36 citations
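The abstract above hinges on binding sketched parameters to tracked physical motion. The sketch below shows one plausible shape of such a binding in Python: an angle parameter defined by a pivot and two points updates whenever the tracker reports new positions, feeding a history that a responsive plot could display. The tracking callback and names are assumed for illustration and are not RealitySketch's API.

import math

def bound_angle_deg(pivot, tracked_point, reference_point):
    """Angle at `pivot` between the tracked point and a reference point."""
    def direction(p, q):
        return math.atan2(q[1] - p[1], q[0] - p[0])
    angle = direction(pivot, tracked_point) - direction(pivot, reference_point)
    return math.degrees(angle) % 360.0

history = []  # samples of the bound parameter, ready for a responsive plot

def on_tracking_update(pivot, tracked_point, reference_point, timestamp):
    """Hypothetical callback fired when the AR tracker reports new positions."""
    value = bound_angle_deg(pivot, tracked_point, reference_point)
    history.append((timestamp, value))  # drives the sketched graph in real time
    return value

# Example: a pendulum bob tracked relative to its pivot and rest direction.
print(on_tracking_update((0.0, 0.0), (0.5, -0.8), (0.0, -1.0), 0.033))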

Journal ArticleDOI
TL;DR: This work designs a computational model for predicting the magnitude of the discomfort, applies it to a new path-planning method that optimizes the input motion trajectory to reduce perceptual sickness, and evaluates the method's effectiveness in improving perceptual comfort in a series of user studies.
Abstract: Virtual reality provides an immersive environment but can induce cybersickness due to the discrepancy between visual and vestibular cues. To avoid this problem, the movement of the virtual camera needs to match the motion of the user in the real world. Unfortunately, this is usually difficult due to the mismatch between the size of the virtual environments and the space available to the users in the physical domain. The resulting constraints on the camera movement significantly hamper the adoption of virtual reality headsets in many scenarios and make the design of the virtual environments very challenging. In this work, we study how the characteristics of the virtual camera movement (e.g., translational acceleration and rotational velocity) and the composition of the virtual environment (e.g., scene depth) contribute to perceived discomfort. Based on the results from our user experiments, we devise a computational model for predicting the magnitude of the discomfort for a given scene and camera trajectory. We further apply our model to a new path planning method which optimizes the input motion trajectory to reduce perceptual sickness. We evaluate the effectiveness of our method in improving perceptual comfort in a series of user studies targeting different applications. The results indicate that our method can reduce the perceived discomfort while maintaining the fidelity of the original navigation, and perform better than simpler alternatives.

33 citations
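A minimal sketch of the two stages the abstract describes, under heavy simplification: a hand-weighted linear discomfort predictor over camera-motion and scene features, and a crude trajectory smoother standing in for the paper's optimization. The weights, features, and smoothing rule are placeholders, not the fitted model or planner from the paper.

# (1) Toy discomfort predictor; (2) toy trajectory adjustment that lowers it.
W_ACCEL, W_ROT, W_NEAR = 1.0, 0.5, 0.3   # assumed weights, not fitted values

def discomfort(translational_accel, rotational_velocity, inverse_scene_depth):
    """Predicted discomfort for one trajectory sample (higher is worse)."""
    return (W_ACCEL * translational_accel
            + W_ROT * rotational_velocity
            + W_NEAR * inverse_scene_depth)

def smooth_trajectory(speeds, strength=0.5):
    """Blend neighboring speeds to cut translational acceleration."""
    out = speeds[:]
    for i in range(1, len(speeds) - 1):
        out[i] = ((1 - strength) * speeds[i]
                  + strength * 0.5 * (speeds[i - 1] + speeds[i + 1]))
    return out

speeds = [0.0, 2.0, 0.0, 2.0, 0.0]   # jerky forward speeds (m/s per sample)

def total_discomfort(s):
    accels = [abs(b - a) for a, b in zip(s, s[1:])]
    return sum(discomfort(a, 0.0, 0.2) for a in accels)

print(total_discomfort(speeds))                     # before optimization
print(total_discomfort(smooth_trajectory(speeds)))  # after: lower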

Patent
14 Apr 2017
TL;DR: A system for generating a progressive representation associated with virtual and physical reality image data, in which the virtual scene map is folded into one or more free space areas of the real scene map, the free space areas surrounding or adjacent to the generated virtual barrier.
Abstract: A system of generating a progressive representation associated with virtual and physical reality image data. The system receives virtual image data associated with a virtual scene map of a virtual scene, and receives physical image data from the image capturing device, associated with a physical environment of a real scene. Perimeter information of the physical environment is determined. Boundary information of an obstacle associated with the physical environment is determined. A real scene map associated with the physical environment, including an obstacle and one or more free space areas, is generated. A corresponding virtual barrier is generated in the real scene map, the virtual barrier associated with the boundary information of the obstacle in the real scene map. A folding of the virtual scene map into the one or more free space areas of the real scene map is implemented, the free space areas surrounding or adjacent to the generated virtual barrier. A progressive representation is generated for the wearable communication device, associated with virtual scene pixels of the virtual map corresponding to real scene points of the real scene map in a time interval. A corresponding method and computer-readable device are also disclosed.

25 citations
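As a very rough sketch of the mapping the patent abstract describes, the Python below builds a gridded real-scene map, grows a one-cell virtual barrier around each obstacle, and folds virtual-scene cells into the remaining free-space cells. The grid representation and the round-robin folding rule are illustrative assumptions, not the patented method.

def virtual_barrier(obstacles, width, height):
    """Cells occupied by obstacles plus a 1-cell border (the virtual barrier)."""
    barrier = set()
    for (ox, oy) in obstacles:
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                x, y = ox + dx, oy + dy
                if 0 <= x < width and 0 <= y < height:
                    barrier.add((x, y))
    return barrier

def fold_virtual_into_free_space(virtual_cells, obstacles, width, height):
    """Assign every virtual-scene cell to some free-space cell of the real map."""
    blocked = virtual_barrier(obstacles, width, height)
    free = [(x, y) for x in range(width) for y in range(height)
            if (x, y) not in blocked]
    # Round-robin fold: virtual cells wrap around the free-space list.
    return {cell: free[i % len(free)] for i, cell in enumerate(virtual_cells)}

# Example: fold a 6x6 virtual scene into a 4x4 room with one obstacle.
virtual = [(vx, vy) for vx in range(6) for vy in range(6)]
mapping = fold_virtual_into_free_space(virtual, obstacles=[(1, 1)], width=4, height=4)
print(len(mapping), len(set(mapping.values())))  # 36 virtual cells -> 7 free cells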


Cited by
Journal ArticleDOI
This highly cited entry corresponds to the classic book "The Perception of the Visual World"; the indexed summary and abstract are placeholder text rather than content from the work itself.

2,250 citations

Journal ArticleDOI
TL;DR: This state‐of‐the‐art report investigates the background theory of perception and vision as well as the latest advancements in display engineering and tracking technologies involved in near‐eye displays.
Abstract: Virtual and augmented reality (VR/AR) are expected to revolutionise entertainment, healthcare, communication and the manufacturing industries among many others. Near-eye displays are an enabling vessel for VR/AR applications, which have to tackle many challenges related to ergonomics, comfort, visual quality and natural interaction. These challenges are rooted in the core elements of near-eye displays: the display hardware and the tracking technologies. In this state-of-the-art report, we investigate the background theory of perception and vision as well as the latest advancements in display engineering and tracking technologies. We begin our discussion by describing the basics of light and image formation. Later, we recount principles of visual perception by relating to the human visual system. We provide two structured overviews on state-of-the-art near-eye display and tracking technologies involved in such near-eye displays. We conclude by outlining unresolved research questions to inspire the next generation of researchers.

167 citations

Posted Content
TL;DR: This paper presents a new synthetic dataset, Structured3D, with the aim of providing large-scale photo-realistic images with rich 3D structure annotations for a wide spectrum of structured 3D modeling tasks, and takes advantage of the availability of professional interior designs to automatically extract 3D structures from them.
Abstract: Recently, there has been growing interest in developing learning-based methods to detect and utilize salient semi-global or global structures, such as junctions, lines, planes, cuboids, smooth surfaces, and all types of symmetries, for 3D scene modeling and understanding. However, the ground truth annotations are often obtained via human labor, which is particularly challenging and inefficient for such tasks due to the large number of 3D structure instances (e.g., line segments) and other factors such as viewpoints and occlusions. In this paper, we present a new synthetic dataset, Structured3D, with the aim of providing large-scale photo-realistic images with rich 3D structure annotations for a wide spectrum of structured 3D modeling tasks. We take advantage of the availability of professional interior designs and automatically extract 3D structures from them. We generate high-quality images with an industry-leading rendering engine. We use our synthetic dataset in combination with real images to train deep networks for room layout estimation and demonstrate improved performance on benchmark datasets.

87 citations
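The training recipe mentioned at the end of the abstract, mixing the large synthetic set with real images, can be sketched in a few lines of PyTorch. The tensors below are stand-ins for Structured3D-style images and layout annotations; the shapes and sizes are placeholders, not the actual data.

import torch
from torch.utils.data import ConcatDataset, DataLoader, TensorDataset

# Stand-in "synthetic" and "real" datasets: (image, layout-annotation) pairs.
synthetic = TensorDataset(torch.randn(1000, 3, 64, 64), torch.randn(1000, 8))
real = TensorDataset(torch.randn(100, 3, 64, 64), torch.randn(100, 8))

# One loader over the concatenation yields shuffled batches mixing both sources.
loader = DataLoader(ConcatDataset([synthetic, real]), batch_size=32, shuffle=True)

images, layouts = next(iter(loader))
print(images.shape, layouts.shape)  # batches mix synthetic and real samples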

Journal ArticleDOI
TL;DR: This work introduces a completely new approach to imperceptible position and orientation redirection that takes advantage of the fact that even healthy humans are functionally blind for circa ten percent of the time under normal circumstances, and shows the potential for RDW, whose performance could be improved by approximately 50% when using this technique.
Abstract: Immersive computer-generated environments (aka virtual reality, VR) are limited by the physical space around them, e.g., enabling natural walking in VR is only possible by perceptually-inspired locomotion techniques such as redirected walking (RDW). We introduce a completely new approach to imperceptible position and orientation redirection that takes advantage of the fact that even healthy humans are functionally blind for circa ten percent of the time under normal circumstances due to motor processes preventing light from reaching the retina (such as eye blinks) or perceptual processes suppressing degraded visual information (such as blink-induced suppression). During such periods of missing visual input, change blindness occurs, which denotes the inability to perceive a visual change such as the motion of an object or self-motion of the observer. We show that this phenomenon can be exploited in VR by synchronizing the computer graphics rendering system with the human visual processes for imperceptible camera movements, in particular to implement position and orientation redirection. We analyzed human sensitivity to such visual changes with detection thresholds, which revealed that commercial off-the-shelf eye trackers and head-mounted displays suffice to translate a user by circa 4–9 cm and rotate the user by circa 2–5 degrees in any direction, which could be accumulated each time the user blinks. Moreover, we show the potential for RDW, whose performance could be improved by approximately 50% when using our technique.

87 citations
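A minimal sketch of the blink-gated redirection described above: while the eye tracker reports closed eyes, the virtual camera is shifted and rotated within the detection thresholds the abstract reports (roughly 4–9 cm of translation and 2–5 degrees of rotation per blink). The event handling, names, and the choice of the conservative ends of those ranges are illustrative assumptions.

MAX_TRANSLATION_M = 0.04   # conservative end of the reported 4-9 cm range
MAX_ROTATION_DEG = 2.0     # conservative end of the reported 2-5 degree range

def redirect_on_blink(eyes_closed, steer_dir_xy, camera_yaw_deg, camera_pos_xy):
    """Apply one imperceptible redirection step if the user is blinking.

    steer_dir_xy: unit vector toward where the planner wants the user to
    drift (e.g., back toward the center of the tracked physical space).
    """
    if not eyes_closed:
        return camera_yaw_deg, camera_pos_xy   # no blink, no redirection
    new_pos = (camera_pos_xy[0] + MAX_TRANSLATION_M * steer_dir_xy[0],
               camera_pos_xy[1] + MAX_TRANSLATION_M * steer_dir_xy[1])
    return camera_yaw_deg + MAX_ROTATION_DEG, new_pos

print(redirect_on_blink(True, (1.0, 0.0), 90.0, (0.0, 0.0)))   # redirected
print(redirect_on_blink(False, (1.0, 0.0), 90.0, (0.0, 0.0)))  # unchanged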

Proceedings ArticleDOI
02 May 2019
TL;DR: This work creates a synthetic dataset using anatomically-informed eye and face models with variations in face shape, gaze direction, pupil and iris, skin tone, and external conditions, and trains neural networks that perform with sub-millisecond latency.
Abstract: Quality, diversity, and size of training data are critical factors for learning-based gaze estimators. We create two datasets satisfying these criteria for near-eye gaze estimation under infrared illumination: a synthetic dataset using anatomically-informed eye and face models with variations in face shape, gaze direction, pupil and iris, skin tone, and external conditions (2M images at 1280x960), and a real-world dataset collected with 35 subjects (2.5M images at 640x480). Using these datasets we train neural networks performing with sub-millisecond latency. Our gaze estimation network achieves 2.06(±0.44)° of accuracy across a wide 30°×40° field of view on real subjects excluded from training and 0.5° best-case accuracy (across the same FOV) when explicitly trained for one real subject. We also train a pupil localization network which achieves higher robustness than previous methods.

85 citations
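To illustrate why sub-millisecond inference is plausible, here is a schematic PyTorch sketch of a very small regressor mapping a near-eye infrared image to a 2D gaze direction. The architecture, input size, and output convention are illustrative guesses, not the paper's network.

import torch
import torch.nn as nn

class GazeNet(nn.Module):
    """Tiny gaze regressor: IR eye crop in, (yaw, pitch) in degrees out."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4),
        )
        self.regressor = nn.Sequential(
            nn.Flatten(), nn.Linear(32 * 4 * 4, 64), nn.ReLU(),
            nn.Linear(64, 2),   # (yaw, pitch)
        )

    def forward(self, ir_eye_image):
        return self.regressor(self.features(ir_eye_image))

gaze = GazeNet()(torch.randn(1, 1, 120, 160))  # one grayscale IR crop
print(gaze.shape)                              # torch.Size([1, 2])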