
Showing papers by "Andrew Rabinovich published in 2019"


Posted Content
TL;DR: SuperGlue matches two sets of local features by jointly finding correspondences and rejecting non-matchable points; assignments are estimated by solving a differentiable optimal transport problem whose costs are predicted by a graph neural network.
Abstract: This paper introduces SuperGlue, a neural network that matches two sets of local features by jointly finding correspondences and rejecting non-matchable points. Assignments are estimated by solving a differentiable optimal transport problem, whose costs are predicted by a graph neural network. We introduce a flexible context aggregation mechanism based on attention, enabling SuperGlue to reason about the underlying 3D scene and feature assignments jointly. Compared to traditional, hand-designed heuristics, our technique learns priors over geometric transformations and regularities of the 3D world through end-to-end training from image pairs. SuperGlue outperforms other learned approaches and achieves state-of-the-art results on the task of pose estimation in challenging real-world indoor and outdoor environments. The proposed method performs matching in real-time on a modern GPU and can be readily integrated into modern SfM or SLAM systems. The code and trained weights are publicly available at this https URL.

613 citations
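The differentiable optimal-transport step described above can be illustrated with a minimal Sinkhorn iteration. This is a generic sketch in plain Python, not the paper's implementation; SuperGlue additionally augments the score matrix with a learned "dustbin" row and column for unmatched points, which is omitted here.

```python
import math

def sinkhorn(scores, n_iters=50):
    """Alternate row and column normalization of exp(scores) -- the classic
    Sinkhorn iteration for entropy-regularized optimal transport."""
    P = [[math.exp(s) for s in row] for row in scores]
    for _ in range(n_iters):
        # normalize rows to sum to 1
        P = [[v / sum(row) for v in row] for row in P]
        # normalize columns to sum to 1
        cols = [sum(row[j] for row in P) for j in range(len(P[0]))]
        P = [[row[j] / cols[j] for j in range(len(row))] for row in P]
    return P

# toy pairwise match scores between three features in image A and three in image B
scores = [[5.0, 0.0, 0.0],
          [0.0, 0.0, 4.0],
          [0.0, 3.0, 0.0]]
P = sinkhorn(scores)
# read off the most likely correspondence for each feature in image A
matches = [max(range(3), key=lambda j: P[i][j]) for i in range(3)]
```

After enough iterations the result is (approximately) doubly stochastic, so each feature distributes one unit of "matching mass" across the other image, which is what makes the assignment differentiable end-to-end.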


Patent
TL;DR: gradient adversarial training increases the robustness of a network to adversarial attacks, is able to better distill the knowledge from a teacher network to a student network compared to soft targets, and boosts multi-task learning by aligning the gradient tensors derived from the task specific loss functions.
Abstract: Systems and methods for gradient adversarial training of a neural network are disclosed. In one aspect of gradient adversarial training, an auxiliary neural network can be trained to classify a gradient tensor that is evaluated during backpropagation in a main neural network that provides a desired task output. The main neural network can serve as an adversary to the auxiliary network in addition to a standard task-based training procedure. The auxiliary neural network can pass an adversarial gradient signal back to the main neural network, which can use this signal to regularize the weight tensors in the main neural network. Gradient adversarial training of the neural network can provide improved gradient tensors in the main network. Gradient adversarial techniques can be used to train multitask networks, knowledge distillation networks, and adversarial defense networks.

31 citations
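The effect of mixing a sign-reversed auxiliary gradient into the main update can be caricatured on a scalar toy problem. This is a hypothetical sketch: the auxiliary network's backpropagated signal is simply assumed proportional to the weight, which is not the patent's actual procedure.

```python
def great_step(w, lr=0.1, lam=0.5):
    """One gradient-adversarial update on a scalar toy problem.

    Task loss L(w) = (w - 2)^2, so the task gradient is 2*(w - 2).
    The auxiliary network's reversed signal is assumed (purely for
    illustration) to be proportional to w; mixing it into the update
    regularizes the main weight.
    """
    task_grad = 2.0 * (w - 2.0)
    aux_signal = w  # stand-in for the backpropagated auxiliary gradient
    return w - lr * (task_grad - lam * aux_signal)

w = 0.0
for _ in range(200):
    w = great_step(w)
# w converges to the regularized fixed point 8/3, shifted away from the
# pure task optimum 2.0 by the reversed auxiliary signal
```

The point of the toy is only that the adversarial term acts as a pull on the main weights rather than a separate loss head, which is how the abstract describes the regularization.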


Proceedings ArticleDOI
01 Oct 2019
TL;DR: This work presents EyeNet, the first single deep neural network which solves multiple heterogeneous tasks related to eye gaze estimation for an off-axis camera setting, including eye segmentation, IR LED glint detection, and pupil and cornea center estimation.
Abstract: Eye gaze estimation is a crucial component in Virtual and Mixed Reality. In head-mounted VR/MR devices the eyes are imaged off-axis to avoid blocking the user's gaze; this viewpoint makes drawing eye-related inferences very challenging. In this work, we present EyeNet, the first single deep neural network which solves multiple heterogeneous tasks related to eye gaze estimation for an off-axis camera setting. The tasks include eye segmentation, IR LED glint detection, and pupil and cornea center estimation. We benchmark all tasks on MagicEyes, a large new dataset of 587 subjects with varying morphology, gender, skin color, make-up, and imaging conditions.

17 citations


Posted Content
TL;DR: DeepPerimeter is a deep-learning-based pipeline for inferring a full indoor perimeter from a sequence of posed RGB images; it achieves excellent visual and quantitative performance on the popular ScanNet and FloorNet datasets and handles room shapes of various complexities as well as multi-room scenarios.
Abstract: We present DeepPerimeter, a deep-learning-based pipeline for inferring a full indoor perimeter (i.e., the exterior boundary map) from a sequence of posed RGB images. Our method relies on robust deep methods for depth estimation and wall segmentation to generate an exterior boundary point cloud, and then uses deep unsupervised clustering to fit wall planes and obtain a final boundary map of the room. We demonstrate that DeepPerimeter achieves excellent visual and quantitative performance on the popular ScanNet and FloorNet datasets and works for room shapes of various complexities as well as in multi-room scenarios. We also establish important baselines for future work on indoor perimeter estimation, a topic that will become increasingly relevant as application areas like augmented reality and robotics grow in significance.

9 citations
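The final geometric step of the pipeline above, fitting wall lines to clustered boundary points and intersecting them to recover corners, can be sketched as follows. The deep depth, segmentation, and clustering stages are assumed to have already produced the per-wall point sets; the 2D total-least-squares fit is a stand-in for the paper's plane fitting.

```python
import math

def principal_direction(pts):
    """Centroid and dominant direction of a 2D point cluster via the
    closed-form eigenvector of its 2x2 covariance (total least squares)."""
    n = len(pts)
    mx = sum(p[0] for p in pts) / n
    my = sum(p[1] for p in pts) / n
    cxx = sum((p[0] - mx) ** 2 for p in pts)
    cyy = sum((p[1] - my) ** 2 for p in pts)
    cxy = sum((p[0] - mx) * (p[1] - my) for p in pts)
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    return (mx, my), (math.cos(theta), math.sin(theta))

def intersect(p1, d1, p2, d2):
    """Intersection of two parametric lines p + t*d (Cramer's rule)."""
    det = d1[0] * (-d2[1]) - (-d2[0]) * d1[1]
    rx, ry = p2[0] - p1[0], p2[1] - p1[1]
    t = (rx * (-d2[1]) - (-d2[0]) * ry) / det
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])

# boundary points from two perpendicular walls (already clustered for brevity)
wall_a = [(x * 0.5, 0.0) for x in range(1, 7)]   # wall along the x-axis
wall_b = [(0.0, y * 0.5) for y in range(1, 7)]   # wall along the y-axis
ca, da = principal_direction(wall_a)
cb, db = principal_direction(wall_b)
corner = intersect(ca, da, cb, db)               # recovered room corner
```

Chaining such intersections around all fitted walls yields the closed perimeter polygon.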


Posted Content
TL;DR: EyeNet is presented, the first single deep neural network which solves multiple heterogeneous tasks related to eye gaze estimation and semantic user understanding for an off-axis camera setting.
Abstract: Eye gaze estimation and simultaneous semantic understanding of a user through eye images is a crucial component in Virtual and Mixed Reality, enabling energy-efficient rendering, multi-focal displays, and effective interaction with 3D content. In head-mounted VR/MR devices the eyes are imaged off-axis to avoid blocking the user's gaze; this viewpoint makes drawing eye-related inferences very challenging. In this work, we present EyeNet, the first single deep neural network which solves multiple heterogeneous tasks related to eye gaze estimation and semantic user understanding for an off-axis camera setting. The tasks include eye segmentation, blink detection, emotive expression classification, IR LED glint detection, and pupil and cornea center estimation. To train EyeNet end-to-end we employ both hand-labelled supervision and model-based supervision. We benchmark all tasks on MagicEyes, a large new dataset of 587 subjects with varying morphology, gender, skin color, make-up, and imaging conditions.

8 citations
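The single-network, multi-head design can be caricatured with hard parameter sharing: one encoder feeds several task heads, and training minimizes a weighted sum of per-task losses. The task names and the linear "layers" below are illustrative stand-ins, not EyeNet's architecture.

```python
def shared_encoder(x, enc_w):
    # stand-in for the shared CNN backbone: elementwise feature scaling
    return [w * xi for w, xi in zip(enc_w, x)]

def forward(x, enc_w, heads):
    # every task head is a linear readout over the same shared feature
    feat = shared_encoder(x, enc_w)
    return {task: sum(w * f for w, f in zip(head_w, feat))
            for task, head_w in heads.items()}

def multitask_loss(preds, targets, weights):
    # weighted sum of per-task squared errors
    return sum(weights[t] * (preds[t] - targets[t]) ** 2 for t in preds)

heads = {"pupil": [1.0, 0.0], "blink": [0.0, 1.0]}
preds = forward([1.0, 2.0], [0.5, 0.5], heads)
loss = multitask_loss(preds,
                      {"pupil": 1.0, "blink": 1.0},   # targets
                      {"pupil": 1.0, "blink": 2.0})   # per-task weights
```

The per-task weights are where the paper's mix of hand-labelled and model-based supervision would enter: each supervision source contributes its own weighted term to the total loss.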


Patent
22 Apr 2019
TL;DR: A head-mounted augmented reality device receives different types of sensor data from a plurality of sensors (e.g., an IMU, cameras, a microphone) and uses a hydra neural network to determine events such as face recognition, visual search, gesture identification, semantic segmentation, object detection, and relocalization.
Abstract: A head-mounted augmented reality (AR) device can receive different types of sensor data from a plurality of sensors (e.g., an inertial measurement unit, an outward-facing camera, a depth-sensing camera, an eye-imaging camera, or a microphone); and can include a hardware processor programmed to determine an event among a plurality of events using a hydra neural network (e.g., face recognition, visual search, gesture identification, semantic segmentation, object detection, lighting detection, simultaneous localization and mapping, relocalization) and the different types of sensor data.

2 citations


Posted Content
TL;DR: This work discusses the data, architecture, and training procedure necessary to deploy extremely efficient 2.5D hand pose estimation on embedded devices with highly constrained memory and compute envelope, such as AR/VR wearables, and proposes an auxiliary multi-task training strategy needed to compensate for the small capacity of the network.
Abstract: 2D keypoint estimation is an important precursor to 3D pose estimation problems for the human body and hands. In this work, we discuss the data, architecture, and training procedure necessary to deploy extremely efficient 2.5D hand pose estimation on embedded devices with a highly constrained memory and compute envelope, such as AR/VR wearables. Our 2.5D hand pose estimation consists of 2D keypoint estimation of joint positions on an egocentric image captured by a depth sensor, lifted to 2.5D using the corresponding depth values. Our contributions are twofold: (a) we discuss the data labeling and augmentation strategies and the modules in the network architecture that collectively lead to $3\%$ of the flop count and $2\%$ of the number of parameters of the state-of-the-art MobileNetV2 architecture; (b) we propose an auxiliary multi-task training strategy needed to compensate for the small capacity of the network while achieving performance comparable to MobileNetV2. Our 32-bit trained model has a memory footprint of less than 300 kilobytes and operates at more than 50 Hz while requiring less than 35 MFLOPs.

2 citations
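The lifting of 2D keypoints to 2.5D using per-keypoint depth can be sketched with standard pinhole back-projection. The intrinsics (fx, fy, cx, cy) and the sample values are assumptions for illustration; the paper does not specify them here.

```python
def lift_to_25d(keypoints_2d, depths, fx, fy, cx, cy):
    """Back-project 2D pixel keypoints to camera-space points using the
    per-keypoint depth and pinhole intrinsics (fx, fy, cx, cy)."""
    points = []
    for (u, v), z in zip(keypoints_2d, depths):
        x = (u - cx) * z / fx   # pixel offset from principal point, scaled by depth
        y = (v - cy) * z / fy
        points.append((x, y, z))
    return points

# one joint detected at pixel (164, 64) with 0.5 m depth from the sensor
joints = lift_to_25d([(164.0, 64.0)], [0.5], fx=100.0, fy=100.0, cx=64.0, cy=64.0)
```

This step is cheap, which is consistent with the paper's framing: the expensive part is the 2D keypoint network, and the depth sensor supplies the third coordinate almost for free.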


Patent
15 Nov 2019
TL;DR: Systems and methods for estimating a room layout (the locations of the floor, walls, and ceiling) from an image, using a convolutional neural network with encoder, decoder, and side sub-networks that predicts two-dimensional ordered keypoints from which the three-dimensional room layout is determined.
Abstract: Systems and methods for estimating the layout of a room are disclosed. The room layout includes the locations of a floor, one or more walls, and a ceiling. In one aspect, a neural network can analyze an image of a portion of the room to determine the room layout. The neural network can comprise a convolutional neural network having an encoder sub-network, a decoder sub-network, and a side sub-network. The neural network can determine a three-dimensional room layout using two-dimensional ordered keypoints associated with a room type. The room layout can be used in applications such as augmented or mixed reality, robotics, autonomous indoor navigation, and the like.

1 citation


Patent
30 Jan 2019
TL;DR: A method for generating neural-network training inputs from an image: a patch of the image is identified, its corners are perturbed to define a modified patch, a homography is determined from the correspondence between the two patches, and the homography is applied to the image to produce a transformed patch at the same position.
Abstract: A method for generating inputs for a neural network based on an image includes receiving the image, identifying a position within the image, and identifying a subset of the image at the position. The subset of the image is defined by a first set of corners. The method also includes perturbing at least one of the first set of corners to form a second set of corners. The second set of corners defines a modified subset of the image. The method further includes determining a homography based on a comparison between the subset of the image and the modified subset of the image, generating a transformed image by applying the homography to the image, and identifying a subset of the transformed image at the position.
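The corner-perturbation procedure described in this patent can be sketched with the standard four-point direct linear transform (DLT): perturb the patch corners, then solve the 8x8 linear system for the homography that maps one corner set to the other. The patch size and perturbation range below are illustrative assumptions.

```python
import random

def solve_linear(A, b):
    """Gaussian elimination with partial pivoting for an n x n system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def homography_from_points(src, dst):
    """DLT with h33 fixed to 1: each correspondence gives two linear equations."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    h = solve_linear(A, b) + [1.0]
    return [h[0:3], h[3:6], h[6:9]]

def apply_h(H, p):
    """Apply a 3x3 homography to a 2D point (homogeneous divide)."""
    x, y = p
    d = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / d,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / d)

# patch corners and a randomly perturbed copy, as in the patent's input generation
random.seed(0)
src = [(0, 0), (128, 0), (128, 128), (0, 128)]
dst = [(x + random.uniform(-16, 16), y + random.uniform(-16, 16)) for x, y in src]
H = homography_from_points(src, dst)
```

Warping the image with `H` and re-cropping at the original position then yields the second patch of the training pair, with the known homography as the label.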