
Showing papers by "Sony Broadcast & Professional Research Laboratories" published in 2018


Proceedings ArticleDOI
01 Jan 2018
TL;DR: In this paper, a multi-layer perceptron operating on pixel coordinates rather than directly on the image is proposed to learn to find good correspondences for wide-baseline stereo.
Abstract: We develop a deep architecture to learn to find good correspondences for wide-baseline stereo. Given a set of putative sparse matches and the camera intrinsics, we train our network in an end-to-end fashion to label the correspondences as inliers or outliers, while simultaneously using them to recover the relative pose, as encoded by the essential matrix. Our architecture is based on a multi-layer perceptron operating on pixel coordinates rather than directly on the image, and is thus simple and small. We introduce a novel normalization technique, called Context Normalization, which allows us to process each data point separately while embedding global information in it, and also makes the network invariant to the order of the correspondences. Our experiments on multiple challenging datasets demonstrate that our method is able to drastically improve the state of the art with little training data.

456 citations
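
The abstract above does not define Context Normalization. The following is a minimal sketch, assuming it normalizes each feature channel by its mean and standard deviation taken across all correspondences of one image pair. The function name `context_normalize` and the `eps` parameter are illustrative choices, not taken from the paper.

```python
import math


def context_normalize(features, eps=1e-8):
    """Normalize each channel across all correspondences of one image pair.

    features: list of N rows, one per correspondence, each a list of C floats.
    Assumption: the statistics are a plain per-channel mean and standard deviation.
    """
    n = len(features)
    channels = len(features[0])
    # Global statistics: computed over every correspondence, per channel.
    means = [sum(row[c] for row in features) / n for c in range(channels)]
    stds = [
        math.sqrt(sum((row[c] - means[c]) ** 2 for row in features) / n)
        for c in range(channels)
    ]
    # Each row is transformed independently, but with the shared statistics.
    return [
        [(row[c] - means[c]) / (stds[c] + eps) for c in range(channels)]
        for row in features
    ]
```

Because the statistics are sums over all rows, reordering the correspondences only reorders the output rows, which is one way to obtain the order invariance the abstract mentions.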


Posted Content
TL;DR: A novel deep architecture and training strategy that learn a local feature pipeline from scratch, using collections of images without the need for human supervision; the inherently non-differentiable step is confined to one branch of a two-branch setup, preserving differentiability in the other.
Abstract: We present a novel deep architecture and a training strategy to learn a local feature pipeline from scratch, using collections of images without the need for human supervision. To do so we exploit depth and relative camera pose cues to create a virtual target that the network should achieve on one image, provided the outputs of the network for the other image. While this process is inherently non-differentiable, we show that we can optimize the network in a two-branch setup by confining it to one branch, while preserving differentiability in the other. We train our method on both indoor and outdoor datasets, with depth data from 3D sensors for the former, and depth estimates from an off-the-shelf Structure-from-Motion solution for the latter. Our models outperform the state of the art on sparse feature matching on both datasets, while running at 60+ fps for QVGA images.

275 citations


Proceedings Article
01 Jan 2018
TL;DR: In this article, the authors exploit depth and relative camera pose cues to create a virtual target that the network should achieve on one image, provided the outputs of the network for the other image.
Abstract: We present a novel deep architecture and a training strategy to learn a local feature pipeline from scratch, using collections of images without the need for human supervision. To do so we exploit depth and relative camera pose cues to create a virtual target that the network should achieve on one image, provided the outputs of the network for the other image. While this process is inherently non-differentiable, we show that we can optimize the network in a two-branch setup by confining it to one branch, while preserving differentiability in the other. We train our method on both indoor and outdoor datasets, with depth data from 3D sensors for the former, and depth estimates from an off-the-shelf Structure-from-Motion solution for the latter. Our models outperform the state of the art on sparse feature matching on both datasets, while running at 60+ fps for QVGA images.

227 citations



Proceedings ArticleDOI
01 Sep 2018
TL;DR: To further enhance MMDenseNet, this paper proposed a novel architecture that integrates long short-term memory (LSTM) in multiple scales with skip connections to efficiently model long-term structures within an audio context.
Abstract: Deep neural networks have become an indispensable technique for audio source separation (SS). It was recently reported that a CNN architecture variant called MMDenseNet was successfully employed to solve the SS problem of estimating source amplitudes, and state-of-the-art results were obtained for the DSD100 dataset. To further enhance MMDenseNet, here we propose a novel architecture that integrates long short-term memory (LSTM) in multiple scales with skip connections to efficiently model long-term structures within an audio context. The experimental results show that the proposed method outperforms MMDenseNet, LSTM and a blend of the two networks. The number of parameters and processing time of the proposed model are significantly less than those for simple blending. Furthermore, the proposed method yields better results than those obtained using ideal binary masks for a singing voice separation task.

130 citations


Proceedings ArticleDOI
02 Sep 2018
TL;DR: Experimental results show that the classification-based approach successfully recovers the phase of the target source in the discretized domain, improves signal-to-distortion ratio (SDR) over the regression-based approach in both a speech enhancement task and a music source separation (MSS) task, and outperforms state-of-the-art MSS.
Abstract: Previous research on audio source separation based on deep neural networks (DNNs) mainly focuses on estimating the magnitude spectrum of target sources, and typically the phase of the mixture signal is combined with the estimated magnitude spectra in an ad hoc way. Although recovering the target phase is assumed to be important for improving separation quality, it can be difficult to handle the periodic nature of the phase with a regression approach. Unwrapping the phase is one way to eliminate the phase discontinuity; however, it increases the range of values with the number of unwrappings, making it difficult for DNNs to model. To overcome this difficulty, we propose to treat the phase estimation problem as a classification problem by discretizing phase values and assigning class indices to them. Experimental results show that our classification-based approach 1) successfully recovers the phase of the target source in the discretized domain, 2) improves signal-to-distortion ratio (SDR) over the regression-based approach in both a speech enhancement task and a music source separation (MSS) task, and 3) outperforms state-of-the-art MSS.

75 citations
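
The discretization step described in the abstract above can be illustrated with a minimal sketch. It assumes uniform bins over [-π, π) and decoding to the bin centre; the abstract does not state the bin layout or the number of classes, and the function names are hypothetical.

```python
import math


def phase_to_class(phase, num_classes):
    """Map a phase in radians (any real value) to a class index in [0, num_classes)."""
    # Wrapping first makes phase and phase + 2*pi land in the same class,
    # so the periodicity that troubles regression disappears.
    wrapped = (phase + math.pi) % (2 * math.pi)  # now in [0, 2*pi)
    index = int(wrapped / (2 * math.pi) * num_classes)
    return min(index, num_classes - 1)  # guard against floating-point edge cases


def class_to_phase(index, num_classes):
    """Return the bin-centre phase in [-pi, pi) for a class index."""
    width = 2 * math.pi / num_classes
    return -math.pi + (index + 0.5) * width
```

With this layout the round-trip error for a phase in [-π, π) is at most half a bin width, π / num_classes, so the number of classes trades quantization error against the size of the classification output.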


Patent
05 Jan 2018
TL;DR: In this article, a WTRU is configured to detect a beam failure condition, identify a candidate beam for resolving the beam failure condition, and send a beam failure recovery request to a network entity.
Abstract: Systems, methods and instrumentalities are disclosed for WTRU-initiated beam recovery including beam switching and/or beam sweeping. A WTRU may be configured to detect a beam failure condition, identify a candidate beam for resolving the beam failure condition, and send a beam failure recovery request to a network entity. The WTRU may include the candidate beam in the beam failure recovery request and may receive a response from the network entity regarding the request and/or a solution for the beam failure condition. WTRU-initiated beam recovery may be used to resolve radio link failures and improve system performance by avoiding the necessity to perform an acquisition procedure. Additionally, beam sweeping may be performed at a sub-time unit level to provide a fast sweeping mechanism.

65 citations


Patent
08 May 2018
TL;DR: In this article, the light guiding plate allows the light from the LED indicator to penetrate to a part of the light-guiding plate exposed from the storage case by diffusing the light received using the concave surface portion.
Abstract: A communication substrate is provided with an LED indicator which emits light. A light guiding plate has a concave surface portion which is a concave surface to cover the LED indicator, and receives the light from the LED indicator using the concave surface portion. A storage case stores the communication substrate and the light guiding plate in a state where a part of the light guiding plate is exposed. The light guiding plate allows the light from the LED indicator to penetrate to a part of the light guiding plate exposed from the storage case by diffusing the light received using the concave surface portion. For example, the present disclosure is applicable to the light emitting device that emits light using an LED or the like.

62 citations


Journal ArticleDOI
TL;DR: In this paper, a 3.2-MP four-directional polarization image sensor with air-gap wire-grid polarizer is described, which is suitable for various megapixel fusion-imaging applications, such as reducing surface reflections, highly accurate depth mapping, and condition-robust surveillance.
Abstract: A 3.2-MP four-directional polarization image sensor with air-gap wire-grid polarizer is described. The image sensor was fabricated using a wafer process and incorporates back-illumination and an antireflection layer to minimize optical flaring and ghosting problems. In testing, the sensor achieved a polarization transmittance of 63.3% and an extinction ratio of 85 at 550 nm, thereby outperforming conventional polarization sensors. The proposed sensor also exhibited good oblique-incidence characteristics, even with small polarization pixels of 2.5 µm. Based on these results, the proposed image sensor is suitable for various megapixel fusion-imaging applications, such as reducing surface reflections, highly accurate depth mapping, and condition-robust surveillance.

53 citations


Journal ArticleDOI
TL;DR: The proposed structure enabled GaN-based VCSELs to be constructed with cavities as long as 28.3 µm, which greatly simplifies the fabrication process owing to longitudinal mode spacings of less than a few nanometers and should help the implementation of these devices in practice.
Abstract: We demonstrate the lateral optical confinement of GaN-based vertical-cavity surface-emitting lasers (GaN-VCSELs) with a cavity containing a curved mirror that is formed monolithically on a GaN wafer. The output wavelength of the devices is 441-455 nm. The threshold current is 40 mA (Jth = 141 kA/cm²) under pulsed current injection (Wp = 100 ns; duty = 0.2%) at room temperature. We confirm the lateral optical confinement by recording near-field images and investigating the dependence of threshold current on aperture size. The beam profile can be fitted with a Gaussian having a theoretical standard deviation of σ = 0.723 µm, which is significantly smaller than previously reported values for GaN-VCSELs with plane mirrors. Lateral optical confinement with this structure theoretically allows aperture miniaturization to the diffraction limit, resulting in threshold currents far lower than sub-milliamperes. The proposed structure enabled GaN-based VCSELs to be constructed with cavities as long as 28.3 µm, which greatly simplifies the fabrication process owing to longitudinal mode spacings of less than a few nanometers and should help the implementation of these devices in practice.

48 citations



Journal ArticleDOI
TL;DR: In this paper, the authors developed a unique production process of a full-color plastic holographic waveguide combiner with a light-weight and see-through capability, which enables them to increase design flexibility in the eyewear and to expand the market for augmented reality.
Abstract: We have developed a unique production process of a full-color plastic holographic waveguide combiner with a light-weight and see-through capability. The novel plastic waveguide technology enables us to increase design flexibility in the eyewear and to expand the market for augmented reality (AR). This paper presents the approach to production.

Patent
05 Apr 2018
TL;DR: In this article, the authors present a solid-state imaging device and an electronic device with which it is possible, in a pixel configuration in which a plurality of unit pixels are configured from two or more sub-pixels, to achieve both a dynamic range operation and an auto focus operation.
Abstract: The purpose of the present invention is to provide a solid-state imaging device and an electronic device with which it is possible, in a pixel configuration in which a plurality of unit pixels are configured from two or more sub-pixels, to achieve both a dynamic range operation and an auto focus operation. Provided is a solid-state imaging device provided with: a first pixel separating region separating a plurality of unit pixels configured from two or more sub-pixels; a second pixel separating region separating each of the plurality of unit pixels separated by the first pixel separating region; and an overflow region for causing signal charges accumulated in the sub-pixels to overflow with at least one of adjacent sub-pixels. The overflow region is formed between a first sub-pixel and a second sub-pixel.

Journal ArticleDOI
03 Aug 2018-ACS Nano
TL;DR: This work performs a fully quantitative analysis which allows us to probe the charge density distributions inside atoms, including both the positive nuclear and the screening electronic charges, with subatomic resolution and in real space, and maps the spatial distribution of the electron cloud within individual atomic columns.
Abstract: Probing the charge density distributions in materials at atomic scale remains an extremely demanding task, particularly in real space. However, recent advances in differential phase contrast-scanning transmission electron microscopy (DPC-STEM) bring this possibility closer by directly visualizing the atomic electric field. DPC-STEM at atomic resolutions measures how a sub-angstrom electron probe passing through a material is affected by the atomic electric field, the field between the nucleus and the surrounding electrons. Here, we perform a fully quantitative analysis which allows us to probe the charge density distributions inside atoms, including both the positive nuclear and the screening electronic charges, with subatomic resolution and in real space. By combining state-of-the-art DPC-STEM experiments with advanced electron scattering simulations we are able to map the spatial distribution of the electron cloud within individual atomic columns. This work constitutes a crucial step toward the direct a...

Patent
20 Sep 2018
TL;DR: In this paper, a surgical information processing apparatus is presented whose circuitry obtains position information of a surgical imaging device, indicating its displacement from a predetermined position, and uses it in a registration mode to determine the position of a surgical component.
Abstract: A surgical information processing apparatus, including circuitry that obtains position information of a surgical imaging device, the position information indicating displacement of the surgical imaging device from a predetermined position; in a registration mode, obtains first image information from the surgical imaging device regarding a position of a surgical component; determines the position of the surgical component based on the first image information and the position information; and, in an imaging mode, obtains second image information from the surgical imaging device of the surgical component based on the determined position.

Patent
13 Sep 2018
TL;DR: In this paper, a plurality of resolution modes are made available to deal with a possible conflict between the reflex action and the deliberate action, and the resolution mode used to resolve the conflict is specified in advance.
Abstract: During autonomous driving, a reflex action is determined as a simplified action on the basis of detection results from a variety of sensors provided in a vehicle, and a deliberate action ranked higher than a reflex action is determined through elaborate processing. A plurality of resolution modes are made available to deal with a possible conflict between the reflex action and the deliberate action, and the mode to be used is specified in advance so that the conflict is resolved by the specified resolution mode. The present disclosure is applicable to motor vehicles that drive autonomously.

Journal ArticleDOI
TL;DR: In this paper, it was shown that zinc blende magnesium sulfide is observed as a reaction product after discharging in magnesium-sulfur batteries, which is not related to the most stable rock salt phase of magnesium sulfide.
Abstract: Magnesium-sulfur batteries are one of the most promising next-generation battery systems due to their high energy density, low cost, and high level of safety. However, the reaction mechanisms are not well understood, and in particular, the discharge reaction products have not yet been identified. Here we show that zinc blende magnesium sulfide is observed as a reaction product after discharging in magnesium-sulfur batteries. When magnesium reacts electrochemically with sulfur in a sulfone-based magnesium electrolyte, the sulfur in the cathode becomes an amorphous material consisting of magnesium and sulfur. In this study, it has been found that the amorphous material has an unusual local structure, which is not related to the most stable rock salt phase of magnesium sulfide but rather to the metastable zinc blende phase. The results indicate that this material realizes the reversibility of magnesium-sulfur batteries.

Patent
27 Feb 2018
TL;DR: In this article, a method and system to localize surgical tools during anatomical surgery is described, which is implemented in an image processing engine coupled to an image-capturing device that captures one or more video frames.
Abstract: Various aspects of a method and system to localize surgical tools during anatomical surgery are disclosed herein. In accordance with an embodiment of the disclosure, the method is implementable in an image-processing engine, which is communicatively coupled to an image-capturing device that captures one or more video frames. The method includes determination of one or more physical characteristics of one or more surgical tools present in the one or more video frames, based on one or more color and geometric constraints. Thereafter, two-dimensional (2D) masks of the one or more surgical tools are detected, based on the one or more physical characteristics of the one or more surgical tools. Further, poses of the one or more surgical tools are estimated, when the 2D masks of the one or more surgical tools are occluded at tips and/or ends of the one or more surgical tools.

Patent
04 Jan 2018
TL;DR: In this article, a display device according to the present disclosure has a resonator structure in which a light reflector and a semi-transmissive plate are disposed at a distance that differs for each luminescent color.
Abstract: A display device according to the present disclosure has a resonator structure in which a light reflector and a semi-transmissive plate are disposed at a distance that differs for each luminescent color. In this resonator structure, a light-emitting function layer including a light-emitting layer, a transparent cathode electrode, and a protective film that protects the cathode electrode are laminated in order between the light reflector and the semi-transmissive plate. In addition, the semi-transmissive plate is formed on the protective film.

Patent
31 May 2018
TL;DR: In this article, a system for an autonomous vehicle that receives driving environment information corresponding to a driving environment provided by another autonomous vehicle, and determines a navigation route based on a degree of reliability of the driving environment information provided by the other autonomous vehicle is presented.
Abstract: A system for an autonomous vehicle that receives driving environment information corresponding to a driving environment provided by another autonomous vehicle, and determines a navigation route based on a degree of reliability of the driving environment information provided by the other autonomous vehicle.

Patent
15 Feb 2018
TL;DR: In this article, a control unit allocates a first control channel to be commonly transmitted to the plurality of terminal devices to a first region overlapping between the respective channels of the terminal devices in a region in which the resources are allocated.
Abstract: A communication device includes: a communication unit configured to perform wireless communication; and a control unit configured to allocate respective resources for communication with a plurality of terminal devices in which at least any of bandwidths or central frequencies of channels to be used is different. The control unit allocates a first control channel to be commonly transmitted to the plurality of terminal devices to a first region overlapping between the respective channels of the plurality of terminal devices in a region in which the resources are allocated. The control unit allocates a second control channel to be individually transmitted to each of the plurality of terminal devices to a second region different from the first region.

Patent
26 Apr 2018
TL;DR: In this paper, the authors propose to generate a rights blockchain storing rights of a user, including: receiving an enrollment request and a public key from the user; verifying that the user has a private key corresponding to the public key; generating a user identifier using the public key; and generating and delivering the rights blockchain having a genesis block including the user identifier to the user.
Abstract: Generating a rights blockchain storing rights of a user, including: receiving an enrollment request and a public key from the user; verifying that the user has a private key corresponding to the public key; generating a user identifier using the public key; and generating and delivering the rights blockchain having a genesis block including the user identifier to the user.
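
The enrollment steps in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the identifier is derived or how private-key possession is verified, so the SHA-256 digest, the block fields, and the `proves_possession` callback (a stand-in for a real challenge-response signature check) are all hypothetical.

```python
import hashlib
import json


def make_user_identifier(public_key: bytes) -> str:
    # Assumption: the identifier is the SHA-256 digest of the public key.
    return hashlib.sha256(public_key).hexdigest()


def create_rights_blockchain(public_key: bytes, proves_possession):
    """Return a new chain whose genesis block carries the user identifier.

    proves_possession: hypothetical callable(public_key) -> bool that stands in
    for verifying that the user holds the matching private key.
    """
    if not proves_possession(public_key):
        raise ValueError("user did not prove possession of the private key")
    genesis = {
        "index": 0,
        "previous_hash": None,  # a genesis block has no predecessor
        "user_id": make_user_identifier(public_key),
        "rights": [],  # rights would be appended in later blocks
    }
    # Hash the block contents so later blocks can link back to it.
    payload = json.dumps(genesis, sort_keys=True).encode("utf-8")
    genesis["hash"] = hashlib.sha256(payload).hexdigest()
    return [genesis]
```

For example, `create_rights_blockchain(b"demo-key", lambda key: True)` returns a one-block chain; a real system would replace the lambda with signature verification from a cryptography library.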

Proceedings ArticleDOI
20 Apr 2018
TL;DR: In this article, a mode-domain feed-forward active noise control (ANC) method is proposed to attenuate the noise field over a large space while reducing the number of microphones required.
Abstract: Active noise control (ANC) over a sizeable space requires a large number of reference and error microphones to satisfy the spatial Nyquist sampling criterion, which limits the feasibility of practical realization of such systems. This paper proposes a mode-domain feedforward ANC method to attenuate the noise field over a large space while reducing the number of microphones required. We adopt a sparse reference signal representation to precisely calculate the reference mode coefficients. The proposed system consists of circular reference and error microphone arrays, which capture the reference noise signal and residual error signal, respectively, and a circular loudspeaker array to drive the anti-noise signal. Experimental results indicate that above the spatial Nyquist frequency, our proposed method can perform well compared to conventional methods. Moreover, the proposed method can even reduce the number of reference microphones while achieving better noise attenuation.

Proceedings ArticleDOI
01 Sep 2018
TL;DR: This paper presents the Grus framework, a framework to support latency SLOs in GPU-accelerated NFV systems that can significantly reduce latency variation and satisfy 4.5× more SLO terms than state-of-the-art solutions.
Abstract: Graphics Processing Unit (GPU) has been recently exploited as a hardware accelerator to improve the performance of Network Function Virtualization (NFV). However, GPU-accelerated NFV systems suffer from significant latency variation when multiple network functions (NFs) are co-located in the same machine, which prevents operators from supporting latency Service Level Objectives (SLOs). Existing research efforts to address this problem can only guarantee a limited number of SLOs with very low resource utilization efficiency. In this paper, we present the Grus framework to support latency SLOs in GPU-accelerated NFV systems. Grus thoroughly analyzes the sources of latency variation and proposes three design principles: (1) dynamic batch size setting is needed to bound packet batching latency in CPU; (2) a reordering mechanism for data transfer over PCI-E is required to guarantee the stalling time; and (3) maximizing concurrency in GPU is necessary to avoid NF execution waiting time. Guided by the principles, Grus consists of two logical layers including an infrastructure layer and a scheduling layer. The infrastructure layer is equipped with an in-CPU Reorder-able Worker Pool that could adjust batching size and packet transfer order, and in-GPU Controllable Concurrent Executors to provide maximized concurrency. The scheduling layer runs a heuristic algorithm to perform accurate and fast scheduling to guarantee SLOs based on our prediction models. We have implemented a prototype of Grus. Extensive evaluations demonstrate that Grus can significantly reduce latency variation and satisfy 4.5× more SLO terms than state-of-the-art solutions.

Patent
18 Dec 2018
TL;DR: In this paper, the authors present a technique for restricting or controlling communications devices, which are performing D2D communications, so as to reduce or restrict access to communications resources for D2D communications in accordance with a priority given to the communications devices.
Abstract: A communications device and method of communicating using a communications device are disclosed for performing device-to-device communications. The communications device is configured with at least one of an indication of first communications resources of the wireless access interface which can be allocated on request to the communications device by a mobile communications network for transmitting signals to one or more other communications devices in accordance with a first mode of operation or an indication of second communications resources of the wireless access interface, which can be used by the communications device for transmitting signals to one or more other communications devices in accordance with a second mode of operation using a device-to-device communications protocol. The method includes receiving, from the mobile communications network, an indication of whether the communications device can use at least one of the first communications resources of the wireless access interface for performing device-to-device communications in accordance with the first mode of operation or the second communications resources of the wireless access interface for performing device-to-device communications in accordance with the second mode of operation, and depending on the indication provided by the mobile communications network and the configuration of the first communications resources and/or the second communications resources, transmitting signals to the one or more other communications devices via the first communications resources or transmitting signals to the one or more other communications devices via the second communications resources in accordance with the second mode of operation using the device-to-device communications protocol. Embodiments of the present technique can provide an arrangement for restricting or controlling communications devices which are performing D2D communications, so as to reduce or restrict access to communications resources for D2D communications in accordance with a priority given to the communications devices.

Patent
03 May 2018
TL;DR: In this article, the authors present a wireless communication system, and a device and method in the wireless communication system. The device comprises a channel information acquiring unit, configured to acquire first channel information about a channel between a first communication device and a second communication device; a pre-coding unit, configured to pre-code a first reference signal based on the first channel information; a measurement configuration information generating unit, configured to generate measurement configuration information for the second communication device, comprising measurement instructions on the pre-coded first reference signal; and a control unit, configured to control transmission of a data signal.
Abstract: Disclosed are a wireless communication system, and a device and method in the wireless communication system. The device comprises: a channel information acquiring unit, configured to acquire first channel information about a channel between a first communication device and a second communication device; a pre-coding unit, configured to pre-code a first reference signal based on the first channel information; a measurement configuration information generating unit, configured to generate measurement configuration information for the second communication device, wherein the measurement configuration information comprises measurement instructions on the pre-coded first reference signal; and a control unit, configured to control, based on the second communication device, according to the measurement configuration information and aiming at second channel information fed back by the pre-coded first reference signal, transmission of a data signal. According to the embodiments of the present invention, interference between user equipments can be effectively removed, the operation complexity is reduced, and the whole performance of the system is optimized.

Patent
09 Aug 2018
TL;DR: In this paper, a video processing device acquires view point position information indicating the range of movement of a viewpoint position for an object in an input image frame where the object is imaged in chronological order at multiple different view point positions.
Abstract: The present technology relates to a video processing device, a video processing method, and a program for providing a bullet time video centering on a moving object. The video processing device acquires view point position information indicating the range of movement of a view point position for an object in an input image frame where the object is imaged in chronological order at multiple different view point positions. Time information indicates a time range in the chronological order of imaging the input image frame. The device processes the input image frame such that the object is at a specific position on an output image frame when the view point position moves within the time range indicated by the time information and the view point position movement range indicated by the view point position information.

Patent
10 Aug 2018
TL;DR: In this article, a three-phase transmitter that sets voltages of first, second, and third output terminals based on first, second, and third signals is proposed, with each of its three transmitting sections setting one output terminal voltage from a pair of those signals.
Abstract: A three-phase transmitter that sets voltages of first, second, and third output terminals based on first, second, and third signals. The transmitter includes a first transmitting section configured to set the voltage of the first output terminal based on the first and third signals; a second transmitting section configured to set the voltage of the second output terminal based on the first and second signals; and a third transmitting section configured to set the voltage of the third output terminal based on the second and third signals.

Patent
13 Dec 2018
TL;DR: In this article, an object recognition unit subjects a video of the surroundings of the vehicle and a video of the inside of the vehicle room to object recognition, the videos having been obtained by the video obtaining unit, and detects, on the basis of the recognition result, near miss characteristics related to a collision between objects, abnormal driving of an own vehicle, illegal driving of the own vehicle or surrounding vehicles, or the like.
Abstract: An object recognition unit subjects a video of surroundings of the vehicle and a video of the inside of the vehicle room to the object recognition, the videos having been obtained by the video obtaining unit, and detects, on the basis of the result of the recognizing, near miss characteristics related to a collision between objects, abnormal driving of an own vehicle, illegal driving of the own vehicle or surrounding vehicles, or the like. Basically, according to a detection signal of near miss characteristics output from the object recognition unit, the control unit controls trigger recording of a video of the in-vehicle camera obtained by the video obtaining unit.