Author

Nath Tan Nguyen

Bio: Nath Tan Nguyen is an academic researcher from Laval University. The author has contributed to research in the topics of video tracking and rendering (computer graphics). The author has an h-index of 1 and has co-authored 1 publication, which has received 24 citations.

Papers
Journal ArticleDOI
TL;DR: The paper provides the main conclusions of consultations with producers of video description regarding their practices and with end-users regarding their needs, as well as an analysis of described productions that leads to a proposed video description typology.
Abstract: This paper presents the status of an R&D project targeting the development of computer-vision tools to assist humans in generating and rendering video description for people with vision loss. Three principal issues are discussed: (1) production practices, (2) needs of people with vision loss, and (3) current system design, core technologies and implementation. The paper provides the main conclusions of consultations with producers of video description regarding their practices and with end-users regarding their needs, as well as an analysis of described productions that leads to a proposed video description typology. The current status of a prototype software (audio-vision manager) is also presented; it uses many computer-vision technologies (shot transition detection, key-frame identification, key-face recognition, key-text spotting, visual motion, gait/gesture characterization, key-place identification, key-object spotting and image categorization) to automatically extract visual content, associate textual descriptions and add them to the audio track with a synthetic voice. A proof of concept is also briefly described for a first adaptive video description player, which allows end users to select various levels of video description.

28 citations
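
As an illustration of the kind of pipeline the abstract outlines, the sketch below detects shot transitions from colour-histogram changes, keeps the first frame of each shot as a key frame, and voices a description with a synthetic voice. This is only a minimal sketch, assuming OpenCV and pyttsx3; the describe stub stands in for the recognition stages (faces, text, places, objects) of the actual audio-vision manager.

```python
# Illustrative sketch, NOT the authors' audio-vision manager.
# Assumptions: OpenCV (pip install opencv-python) and pyttsx3 (pip install pyttsx3),
# and a simple histogram-correlation threshold for shot-cut detection.
import cv2
import pyttsx3

def detect_shots(video_path, hist_threshold=0.5):
    """Return (frame_index, key_frame) pairs, one per detected shot."""
    cap = cv2.VideoCapture(video_path)
    shots, prev_hist, idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
        hist = cv2.calcHist([hsv], [0, 1], None, [50, 60], [0, 180, 0, 256])
        cv2.normalize(hist, hist)
        # A low correlation with the previous frame's histogram suggests a shot cut.
        if prev_hist is None or cv2.compareHist(prev_hist, hist, cv2.HISTCMP_CORREL) < hist_threshold:
            shots.append((idx, frame))  # take the first frame of the shot as its key frame
        prev_hist, idx = hist, idx + 1
    cap.release()
    return shots

def describe(key_frame):
    # Placeholder for the recognition stages (key-face, key-text, key-place, key-object ...).
    return "New scene."

def render_descriptions(video_path):
    engine = pyttsx3.init()
    for idx, frame in detect_shots(video_path):
        engine.say(f"Shot at frame {idx}: {describe(frame)}")
    engine.runAndWait()
```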


Cited by
Proceedings ArticleDOI
24 Oct 2011
TL;DR: The main results are that earcons can be used together with speech synthesis to enhance understanding of videos; that earcons should be accompanied by explanations; and that a potential side effect of earcons is related to video rhythm perception.
Abstract: Our approach to address the question of online video accessibility for people with sensory disabilities is based on video annotations that are rendered as video enrichments during the playing of the video. We present an exploratory work that focuses on video accessibility for blind people with audio enrichments composed of speech synthesis and earcons (i.e., nonverbal audio messages). Our main results are that earcons can be used together with speech synthesis to enhance understanding of videos; that earcons should be accompanied by explanations; and that a potential side effect of earcons is related to video rhythm perception.

30 citations
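
The sketch below shows one way timed audio enrichments of this kind could be rendered during playback: an earcon cues the type of event, then a spoken explanation follows. It is a minimal sketch assuming pyttsx3 for speech synthesis; the Annotation type, the timestamps, and the earcon stub are hypothetical and not taken from the paper.

```python
# Illustrative sketch of rendering timed audio enrichments (earcon + speech).
# Assumptions: pyttsx3 for speech synthesis; earcon playback is stubbed out.
import time
from dataclasses import dataclass
import pyttsx3

@dataclass
class Annotation:
    at_seconds: float  # position in the video where the enrichment plays
    earcon: str        # identifier of a short nonverbal audio cue
    text: str          # spoken explanation accompanying the earcon

def play_earcon(name):
    print(f"[earcon: {name}]")  # stub: a real player would trigger an audio sample

def render_enrichments(annotations):
    engine = pyttsx3.init()
    start = time.monotonic()
    for ann in sorted(annotations, key=lambda a: a.at_seconds):
        # Wait until the annotation's timestamp, then cue the event and explain it.
        time.sleep(max(0.0, ann.at_seconds - (time.monotonic() - start)))
        play_earcon(ann.earcon)
        engine.say(ann.text)
        engine.runAndWait()

render_enrichments([
    Annotation(2.0, "scene-change", "The scene moves to a busy street."),
    Annotation(6.5, "on-screen-text", "A sign reads: Main Street Station."),
])
```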

Proceedings ArticleDOI
03 Jul 2020
TL;DR: The HILML approach facilitates human-machine collaboration to produce high-quality video descriptions while keeping a low barrier to entry for volunteer describers, and was significantly faster and easier to use for first-time video describers compared to a human-only control condition with no machine learning assistance.
Abstract: Video accessibility is crucial for blind and visually impaired individuals for education, employment, and entertainment purposes. However, professional video descriptions are costly and time-consuming. Volunteer-created video descriptions could be a promising alternative; however, they can vary in quality and can be intimidating for novice describers. We developed a Human-in-the-Loop Machine Learning (HILML) approach to video description by automating video text generation and scene segmentation while allowing humans to edit the output. The HILML approach facilitates human-machine collaboration to produce high-quality video descriptions while keeping a low barrier to entry for volunteer describers. Our HILML system was significantly faster and easier to use for first-time video describers compared to a human-only control condition with no machine learning assistance. Blind and visually impaired users rated both the quality of the video descriptions created with the HILML system and their understanding of the topic as significantly higher than for the human-only condition.

24 citations
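
The sketch below illustrates the human-in-the-loop flow described in the abstract: the machine proposes a scene segmentation and a draft description, and a volunteer accepts or edits each draft. The segment_scenes and generate_draft functions are hypothetical stand-ins, not the authors' models.

```python
# Minimal sketch of a human-in-the-loop description workflow.
# segment_scenes and generate_draft are placeholder stand-ins for ML components.
def segment_scenes(video_path):
    # Stand-in: pretend the video splits into three scenes (start, end in seconds).
    return [(0.0, 12.0), (12.0, 30.5), (30.5, 48.0)]

def generate_draft(video_path, scene):
    # Stand-in for an automatic video-to-text model.
    start, end = scene
    return f"Scene from {start:.0f}s to {end:.0f}s."

def describe_with_hilml(video_path):
    descriptions = []
    for scene in segment_scenes(video_path):
        draft = generate_draft(video_path, scene)
        # The volunteer accepts the draft (Enter) or types a corrected description.
        edited = input(f"Draft: {draft!r}  (Enter to accept, or type a fix) ") or draft
        descriptions.append((scene, edited))
    return descriptions
```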

Proceedings ArticleDOI
13 Jun 2010
TL;DR: An application of video indexing/summarization to produce Videodescription (VD) for the blind is presented, along with the main outcomes of this R&D activity, started 5 years ago in the laboratory.
Abstract: We present an application of video indexing/summarization to produce Videodescription (VD) for the blind. Audio and computer vision technologies can automatically detect and recognize many elements that are pertinent to VD, which can speed up the VD production process. We have developed and integrated many of them into a first computer-assisted VD production software. The paper presents the main outcomes of this R&D activity, started 5 years ago in our laboratory. Up to now, usability performance on various video and TV series types has shown a reduction of up to 50% in VD production time.

21 citations

Proceedings ArticleDOI
06 May 2021
TL;DR: In this article, the authors report that most online videos are inaccessible to blind and visually impaired (BVI) people, and that, to find accessible videos, BVI people use a time-consuming trial-and-error approach: clicking on a video, watching a portion, leaving the video, and repeating the process.
Abstract: User-generated videos are an increasingly important source of information online, yet most online videos are inaccessible to blind and visually impaired (BVI) people. To find videos that are accessible, or understandable without additional description of the visual content, BVI people in our formative studies reported that they used a time-consuming trial-and-error approach: clicking on a video, watching a portion, leaving the video, and repeating the process. BVI people also reported video accessibility heuristics that characterize accessible and inaccessible videos. We instantiate 7 of the identified heuristics (2 audio-related, 2 video-related, and 3 audio-visual) as automated metrics to assess video accessibility. We collected a dataset of accessibility ratings of videos by BVI people and found that our automatic video accessibility metrics correlated with the accessibility ratings (adjusted R² = 0.642). We augmented a video search interface with our video accessibility metrics and predictions. BVI people using our augmented video search interface selected an accessible video more efficiently than when using the original search interface. By integrating video accessibility metrics, video hosting platforms could help people surface accessible videos and encourage content creators to author more accessible products, improving video accessibility for all.

19 citations
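
To make the reported correlation concrete, the sketch below regresses placeholder accessibility ratings on seven placeholder metric values and computes the adjusted R² figure the abstract cites. It assumes NumPy, scikit-learn, and synthetic random data; it does not reproduce the paper's metrics or dataset.

```python
# Sketch of validating automated accessibility metrics against user ratings.
# Data is random placeholder data, not the study's dataset.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n_videos, n_metrics = 50, 7                    # 2 audio, 2 video, 3 audio-visual metrics
X = rng.random((n_videos, n_metrics))          # placeholder metric values per video
y = X @ rng.random(n_metrics) + rng.normal(0, 0.1, n_videos)  # placeholder ratings

model = LinearRegression().fit(X, y)
r2 = model.score(X, y)
# Adjusted R^2 penalizes the plain R^2 for the number of predictors used.
adj_r2 = 1 - (1 - r2) * (n_videos - 1) / (n_videos - n_metrics - 1)
print(f"R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}")
```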

Proceedings ArticleDOI
25 Apr 2020
TL;DR: A Human-in-the-Loop Machine Learning (HILML) approach to video description is developed by automating video text generation and scene segmentation while allowing humans to edit the output.
Abstract: Video accessibility is crucial for blind and visually impaired individuals for education, employment, and entertainment purposes. However, professional video descriptions are costly and time-consuming. Volunteer-created video descriptions could be a promising alternative; however, they can vary in quality and can be intimidating for novice describers. We developed a Human-in-the-Loop Machine Learning (HILML) approach to video description by automating video text generation and scene segmentation while allowing humans to edit the output. Our HILML system was significantly faster and easier to use for first-time video describers compared to a human-only control condition with no machine learning assistance. Blind and visually impaired users rated both the quality of the video descriptions created with the HILML system and their understanding of the topic as significantly higher than for the human-only condition.

15 citations