Tae-Hyun Oh

Researcher at Pohang University of Science and Technology

Publications -  89
Citations -  1901

Tae-Hyun Oh is an academic researcher from Pohang University of Science and Technology. The author has contributed to research in topics including Computer science and Closed captioning. The author has an h-index of 17, and has co-authored 69 publications receiving 1123 citations. Previous affiliations of Tae-Hyun Oh include Microsoft and Massachusetts Institute of Technology.

Papers
Proceedings ArticleDOI

Learning to Localize Sound Source in Visual Scenes

TL;DR: The authors propose a two-stream network structure that handles each modality with an attention mechanism for sound source localization, and show that a simple modification extends it to a unified architecture covering the supervised and semi-supervised learning settings as well.
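The attention mechanism described here can be sketched compactly: an audio embedding scores every location of a visual feature map, and the normalized scores form a sound-source localization map. The following is a minimal PyTorch sketch with assumed toy backbones and feature sizes; it is not the paper's actual networks.

```python
# Minimal sketch (assumed layer sizes, not the paper's exact backbones):
# a two-stream network where an audio embedding attends over the spatial
# grid of visual features to produce a sound-source localization map.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoStreamLocalizer(nn.Module):
    def __init__(self, dim=128):
        super().__init__()
        # Visual stream: toy conv backbone -> H x W grid of dim-d features.
        self.visual = nn.Sequential(
            nn.Conv2d(3, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Audio stream: toy conv backbone over a spectrogram -> dim-d embedding.
        self.audio = nn.Sequential(
            nn.Conv2d(1, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )

    def forward(self, image, spectrogram):
        v = F.normalize(self.visual(image), dim=1)       # (B, D, H, W)
        a = F.normalize(self.audio(spectrogram), dim=1)  # (B, D)
        # Attention: cosine similarity between the audio embedding and every
        # spatial location, softmax-normalized into a localization map.
        sim = torch.einsum('bdhw,bd->bhw', v, a)         # (B, H, W)
        return F.softmax(sim.flatten(1), dim=1).view_as(sim)

if __name__ == "__main__":
    net = TwoStreamLocalizer()
    img = torch.randn(2, 3, 224, 224)
    spec = torch.randn(2, 1, 96, 64)
    print(net(img, spec).shape)   # torch.Size([2, 56, 56])
```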
Proceedings ArticleDOI

Listen to Look: Action Recognition by Previewing Audio

TL;DR: An attention-based long short-term memory network is proposed to iteratively select useful moments in untrimmed videos by previewing their audio, reducing long-term temporal redundancy for efficient video-level recognition.
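The moment-selection idea lends itself to a short sketch: a recurrent state attends over cheap per-clip preview features and decides which clip to process next. The PyTorch code below is a simplified, hypothetical version with made-up dimensions, not the paper's exact model.

```python
# Minimal sketch (hypothetical feature sizes): an attention-based LSTM that,
# at each step, attends over cheap per-clip "preview" features of an
# untrimmed video to pick which moment to look at next.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AudioPreviewSkimmer(nn.Module):
    def __init__(self, feat_dim=256, hidden=256, steps=4):
        super().__init__()
        self.steps = steps
        self.cell = nn.LSTMCell(feat_dim, hidden)
        self.query = nn.Linear(hidden, feat_dim)   # hidden state -> attention query

    def forward(self, preview_feats):
        # preview_feats: (B, T, D) cheap features for T clips of an untrimmed video
        B, T, D = preview_feats.shape
        h = preview_feats.new_zeros(B, self.cell.hidden_size)
        c = torch.zeros_like(h)
        selected = []
        for _ in range(self.steps):
            q = self.query(h)                                  # (B, D)
            scores = torch.einsum('btd,bd->bt', preview_feats, q)
            attn = F.softmax(scores, dim=1)                    # which clip matters now
            idx = attn.argmax(dim=1)                           # pick the most useful clip
            picked = preview_feats[torch.arange(B), idx]       # (B, D)
            h, c = self.cell(picked, (h, c))
            selected.append(idx)
        return torch.stack(selected, dim=1)                    # indices of chosen clips

if __name__ == "__main__":
    skimmer = AudioPreviewSkimmer()
    clips = torch.randn(2, 30, 256)        # 30 candidate moments per video
    print(skimmer(clips))                  # indices of the selected moments
```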
Proceedings ArticleDOI

Speech2Face: Learning the Face Behind a Voice

TL;DR: This paper designs and trains a deep neural network to reconstruct a facial image of a person from a short audio recording of that person speaking, and evaluates and numerically quantifies how closely these Speech2Face reconstructions, obtained directly from audio, resemble the true face images of the speakers.
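At a high level, the idea is to train a voice encoder to regress a face embedding that a fixed, pretrained face model can render into an image. The sketch below is a rough approximation with assumed dimensions and a simplified loss; the paper's actual encoder, decoder, and pretrained face model are not reproduced here.

```python
# Minimal sketch (assumed dimensions, simplified loss): a voice encoder is
# trained to regress a face embedding from a speech spectrogram, so that a
# frozen, pretrained face decoder could then render a canonical face from it.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VoiceEncoder(nn.Module):
    def __init__(self, face_dim=4096):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.fc = nn.Linear(128, face_dim)

    def forward(self, spectrogram):              # (B, 1, F, T)
        return self.fc(self.conv(spectrogram))   # predicted face embedding

def training_step(encoder, spectrogram, target_face_feat):
    # target_face_feat: embedding of the speaker's true face image, produced
    # by a frozen, pretrained face recognition network (assumed available).
    pred = encoder(spectrogram)
    # Match the target feature in both magnitude and direction.
    loss = F.l1_loss(pred, target_face_feat) + \
           (1 - F.cosine_similarity(pred, target_face_feat, dim=1)).mean()
    return loss

if __name__ == "__main__":
    enc = VoiceEncoder()
    spec = torch.randn(2, 1, 257, 300)
    face_feat = torch.randn(2, 4096)
    print(training_step(enc, spec, face_feat).item())
```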
Journal ArticleDOI

Semantic soft segmentation

TL;DR: This work introduces semantic soft segments, a set of layers that correspond to semantically meaningful regions in an image with accurate soft transitions between different objects, and proposes a graph structure that embeds texture and color features from the image as well as higher-level semantic information generated by a neural network.
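The spirit of the graph construction can be illustrated with a toy spectral example: combine color and semantic-feature affinities into one affinity matrix, then take the smoothest eigenvectors of its Laplacian as soft layers. The NumPy snippet below uses simplified Gaussian affinities rather than the paper's matting Laplacian and learned features.

```python
# Toy sketch (simplified Gaussian affinities, not the paper's formulation):
# combine color and semantic-feature affinities into one graph, then use the
# low-frequency eigenvectors of its Laplacian as soft segmentation layers.
import numpy as np

def soft_segments(colors, semantic_feats, n_layers=4, sigma_c=0.1, sigma_s=0.5):
    # colors: (N, 3) per-pixel (or per-superpixel) colors in [0, 1]
    # semantic_feats: (N, D) per-pixel features from a semantic network
    def affinity(x, sigma):
        d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * sigma ** 2))

    W = affinity(colors, sigma_c) * affinity(semantic_feats, sigma_s)
    D = np.diag(W.sum(1))
    L = D - W                                   # unnormalized graph Laplacian
    vals, vecs = np.linalg.eigh(L)
    layers = vecs[:, :n_layers]                 # smoothest eigenvectors
    # Softmax across layers so each pixel's memberships sum to 1.
    e = np.exp(layers - layers.max(1, keepdims=True))
    return e / e.sum(1, keepdims=True)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    seg = soft_segments(rng.random((100, 3)), rng.random((100, 8)))
    print(seg.shape, seg.sum(1)[:3])            # (100, 4), rows sum to ~1
```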
Posted Content

Learning to Localize Sound Source in Visual Scenes

TL;DR: A novel unsupervised algorithm is presented for localizing sound sources in visual scenes, and a two-stream network structure that handles each modality with an attention mechanism is developed for sound source localization.