
Showing papers on "Closed captioning published in 2012"


Proceedings ArticleDOI
07 Oct 2012
TL;DR: This paper introduces a new approach in which groups of non-expert captionists (people who can hear and type) collectively caption speech in real-time on-demand, and presents Legion:Scribe, an end-to-end system that allows deaf people to request captions at any time.
Abstract: Real-time captioning provides deaf and hard of hearing people immediate access to spoken language and enables participation in dialogue with others. Low latency is critical because it allows speech to be paired with relevant visual cues. Currently, the only reliable source of real-time captions is expensive stenographers who must be recruited in advance and who are trained to use specialized keyboards. Automatic speech recognition (ASR) is less expensive and available on-demand, but its low accuracy, high noise sensitivity, and need for training beforehand render it unusable in real-world situations. In this paper, we introduce a new approach in which groups of non-expert captionists (people who can hear and type) collectively caption speech in real-time on-demand. We present Legion:Scribe, an end-to-end system that allows deaf people to request captions at any time. We introduce an algorithm for merging partial captions into a single output stream in real-time, and a captioning interface designed to encourage coverage of the entire audio stream. Evaluation with 20 local participants and 18 crowd workers shows that non-experts can provide an effective solution for captioning, accurately covering an average of 93.2% of an audio stream with only 10 workers and an average per-word latency of 2.9 seconds. More generally, our model in which multiple workers contribute partial inputs that are automatically merged in real-time may be extended to allow dynamic groups to surpass constituent individuals (even experts) on a variety of human performance tasks.
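The paper's own merging algorithm is not reproduced in the abstract; as a minimal illustrative sketch (assuming each worker submits timestamped words, which is an assumption rather than the Legion:Scribe input format), partial captions from several workers could be combined by bucketing words in time and voting:

```python
from collections import Counter, defaultdict

def merge_partial_captions(partial_streams, bucket_ms=500):
    """Merge timestamped (time_ms, word) pairs from several workers.

    Illustrative sketch only: words are grouped into fixed-width time
    buckets and the most frequently typed word in each bucket wins.
    The actual Legion:Scribe merging algorithm is more sophisticated.
    """
    buckets = defaultdict(Counter)
    for stream in partial_streams:          # one stream per worker
        for time_ms, word in stream:
            buckets[time_ms // bucket_ms][word.lower()] += 1

    merged = []
    for bucket in sorted(buckets):
        word, _count = buckets[bucket].most_common(1)[0]
        merged.append(word)
    return " ".join(merged)

# Example: three workers each caught part of "real time captions help"
workers = [
    [(0, "real"), (500, "time"), (1500, "help")],
    [(0, "real"), (1000, "captions")],
    [(500, "time"), (1000, "captions"), (1500, "help")],
]
print(merge_partial_captions(workers))  # -> "real time captions help"
```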

213 citations


Patent
22 Jun 2012
TL;DR: In this paper, real-time metadata tracks recorded to media streams allow search and analysis operations in a variety of contexts and allow insertion of content specific advertising during appropriate portions of a media stream based on the content of the metadata tracks.
Abstract: Real-time metadata tracks recorded to media streams allow search and analysis operations in a variety of contexts. Search queries can be performed using information in real-time metadata tracks such as closed captioning, subtitle, statistical, and miscellaneous data tracks. Media streams can also be augmented with additional tracks. The metadata tracks not only allow efficient searching and indexing, but also allow insertion of content-specific advertising during appropriate portions of a media stream based on the content of the metadata tracks.

36 citations


Proceedings ArticleDOI
16 Apr 2012
TL;DR: This approach couples the usage of off-the-shelf ASR (Automatic Speech Recognition) software with a novel caption alignment mechanism that smartly introduces unique audio markups into the audio stream before giving it to the ASR and transforms the plain transcript produced by theASR into a timecoded transcript.
Abstract: The simple act of listening or of taking notes while attending a lesson may represent an insuperable burden for millions of people with some form of disabilities (e.g., hearing impaired, dyslexic and ESL students). In this paper, we propose an architecture that aims at automatically creating captions for video lessons by exploiting advances in speech recognition technologies. Our approach couples the usage of off-the-shelf ASR (Automatic Speech Recognition) software with a novel caption alignment mechanism that smartly introduces unique audio markups into the audio stream before giving it to the ASR and transforms the plain transcript produced by the ASR into a timecoded transcript.
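The abstract does not detail the markup mechanism, so the following sketch only illustrates the alignment idea under assumed conditions: a unique marker is injected every few seconds before recognition, the ASR emits it as a known token, and words between markers receive interpolated timecodes.

```python
def timecode_transcript(asr_tokens, marker_token="MARKER", marker_interval_s=10.0):
    """Turn a plain ASR transcript into a roughly timecoded one.

    Sketch under the assumption that a unique audio markup was injected
    every `marker_interval_s` seconds before recognition and that the ASR
    emits it as `marker_token`.  Words between two markers get evenly
    spaced timestamps inside that interval (a simplification of the
    paper's alignment mechanism).
    """
    timecoded, segment, marker_index = [], [], 0
    for token in asr_tokens + [marker_token]:      # flush the last segment too
        if token == marker_token:
            start = marker_index * marker_interval_s
            step = marker_interval_s / max(len(segment), 1)
            for i, word in enumerate(segment):
                timecoded.append((round(start + i * step, 2), word))
            segment, marker_index = [], marker_index + 1
        else:
            segment.append(token)
    return timecoded

tokens = "hello class MARKER today we discuss captions MARKER".split()
print(timecode_transcript(tokens))
```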

36 citations


Proceedings ArticleDOI
29 Feb 2012
TL;DR: The development and evaluation of the ICS Videos framework and an assessment of its value as an academic learning resource are reported.
Abstract: Videos of classroom lectures have proven to be a popular and versatile learning resource. This paper reports on videos featuring Indexing, Captioning, and Search capability (ICS Videos). The goal is to allow a user to rapidly search and access a topic of interest, a key shortcoming of the standard video format. A lecture is automatically divided into logical indexed video segments by analyzing video frames. Text is automatically identified with OCR technology enhanced with image transformations to drive keyword search. Captions can be added to videos. The ICS video player integrates indexing, search, and captioning in video playback and has been used by dozens of courses and thousands of students. This paper reports on the development and evaluation of the ICS Videos framework and an assessment of its value as an academic learning resource.
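As a rough sketch of how OCR-derived text could drive keyword search over indexed segments (the data layout here is an assumption, not the ICS Videos implementation):

```python
def build_keyword_index(segments):
    """Build an inverted index from OCR text to video segments.

    Sketch of the search side: each indexed segment carries the text
    recognised in its frames, and a keyword maps to the start times of
    the segments in which it appears.
    """
    index = {}
    for segment in segments:
        for word in set(segment["ocr_text"].lower().split()):
            index.setdefault(word, []).append(segment["start_s"])
    return index

segments = [
    {"start_s": 0,   "ocr_text": "Binary search trees overview"},
    {"start_s": 310, "ocr_text": "Balancing binary trees AVL rotations"},
]
index = build_keyword_index(segments)
print(index["binary"])   # -> [0, 310]: jump points for the keyword "binary"
```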

32 citations


Journal ArticleDOI
TL;DR: This article considers the current state of closed captioning for online videos, in the U.S. context, and argues that captions and deafness have long been associated with the private, complicating their advancement under civil rights laws concerned with the public sphere and facilitating advancement through telecommunications laws and notions of consumer choice.
Abstract: This article considers the current state of closed captioning for online videos, in the U.S. context. As media access is foundational to cultural citizenship, captions and similar accessibility fea...

30 citations


Proceedings ArticleDOI
22 Oct 2012
TL;DR: This paper presents methods for quickly identifying workers who are producing good partial captions and estimating the quality of their input and evaluates these methods in experiments run on Mechanical Turk.
Abstract: Approaches for real-time captioning of speech are either expensive (professional stenographers) or error-prone (automatic speech recognition). As an alternative approach, we have been exploring whether groups of non-experts can collectively caption speech in real-time. In this approach, each worker types as much as they can and the partial captions are merged together in real-time automatically. This approach works best when partial captions are correct and received within a few seconds of when they were spoken, but these assumptions break down when engaging workers on-demand from existing sources of crowd work like Amazon's Mechanical Turk. In this paper, we present methods for quickly identifying workers who are producing good partial captions and estimating the quality of their input. We evaluate these methods in experiments run on Mechanical Turk in which a total of 42 workers captioned 20 minutes of audio. The methods introduced in this paper were able to raise overall accuracy from 57.8% to 81.22% while keeping coverage of the ground truth signal nearly unchanged.
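The paper's quality-estimation methods are not spelled out in the abstract; one simple proxy, sketched below under the assumption that workers' words can be bucketed in time, scores each worker by agreement with the rest of the group:

```python
def worker_agreement_scores(worker_words):
    """Estimate input quality by agreement with the other workers.

    `worker_words` maps a worker id to the set of (time_bucket, word)
    pairs they typed.  A worker's score is the fraction of their pairs
    that at least one other worker also produced -- a rough proxy for
    the quality-estimation methods described in the paper, not the
    actual algorithm.
    """
    scores = {}
    for worker, words in worker_words.items():
        others = set().union(*(w for k, w in worker_words.items() if k != worker))
        scores[worker] = len(words & others) / len(words) if words else 0.0
    return scores

workers = {
    "w1": {(0, "real"), (1, "time"), (2, "captions")},
    "w2": {(0, "real"), (2, "captions"), (3, "xyzzy")},   # one likely error
    "w3": {(1, "time"), (2, "captions")},
}
print(worker_agreement_scores(workers))  # w2 scores lowest
```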

29 citations


Proceedings ArticleDOI
22 Oct 2012
TL;DR: This study asked 48 deaf and hearing readers to evaluate transcripts produced by a professional captionist, ASR, and crowd captioning software, respectively, and found that the readers preferred crowd captions over professional captions and ASR.
Abstract: Deaf and hard of hearing individuals need accommodations that transform aural to visual information, such as captions that are generated in real-time to enhance their access to spoken information in lectures and other live events. The captions produced by professional captionists work well in general events such as community or legal meetings, but are often unsatisfactory in specialized content events such as higher education classrooms. In addition, it is hard to hire professional captionists, especially those with experience in specialized content areas, as they are scarce and expensive. The captions produced by commercial automatic speech recognition (ASR) software are far cheaper, but are often perceived as unreadable due to ASR's sensitivity to accents and background noise and its slow response time. We ran a study to evaluate the readability of captions generated by a new crowd captioning approach versus professional captionists and ASR. In this approach, captions are typed by classmates into a system that aligns and merges the multiple incomplete caption streams into a single, comprehensive real-time transcript. Our study asked 48 deaf and hearing readers to evaluate transcripts produced by a professional captionist, ASR, and crowd captioning software, respectively, and found that the readers preferred crowd captions over professional captions and ASR.

25 citations


Patent
31 Dec 2012
TL;DR: In this article, the system accepts captioning data and determines a number of errors in the caption data, as well as the number of words per minute across the entirety of an event corresponding to the captioning and time intervals of the event.
Abstract: A captioning evaluation system. The system accepts captioning data and determines a number of errors in the captioning data, as well as the number of words per minute across the entirety of an event corresponding to the captioning data and time intervals of the event. The errors may be used to determine the accuracy of the captioning and the words per minute, both for the entire event and the time intervals, used to determine a cadence and/or rhythm for the captioning. The accuracy and cadence may be used to score the captioning data and captioner.
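The patent does not publish its scoring formula; the sketch below combines an accuracy measure with a words-per-minute cadence measure using an assumed weighting and target rate, purely for illustration:

```python
def evaluate_captioning(caption_words, reference_words, interval_wpms,
                        target_wpm=160.0, accuracy_weight=0.7):
    """Score captioning data from accuracy and cadence (illustrative sketch).

    The patent combines an error-based accuracy measure with words-per-minute
    cadence over the event and its time intervals; the exact formula is not
    public, so the target rate and weighting here are assumptions.
    """
    # Accuracy: fraction of reference words matched at the same position,
    # a crude stand-in for a real error count.
    hits = sum(1 for i, word in enumerate(reference_words)
               if i < len(caption_words) and caption_words[i] == word)
    accuracy = hits / len(reference_words) if reference_words else 0.0

    # Cadence: penalise intervals whose WPM strays far from the target rate.
    if interval_wpms:
        deviation = sum(abs(wpm - target_wpm) / target_wpm for wpm in interval_wpms)
        cadence = max(0.0, 1.0 - deviation / len(interval_wpms))
    else:
        cadence = 0.0

    return accuracy_weight * accuracy + (1.0 - accuracy_weight) * cadence

score = evaluate_captioning(["the", "cat", "sat"],
                            ["the", "cat", "sat", "down"],
                            [150.0, 170.0, 140.0])
print(round(score, 3))  # ~0.8 with these assumed weights
```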

24 citations


Journal ArticleDOI
TL;DR: The findings do suggest that the images acted as procedural facilitators, triggering recall of vocabulary and details, in individuals who are d/Deaf and hard of hearing.
Abstract: Using a nonexperimental design, the researchers explored the effect of captioning as part of the writing process of individuals who are d/Deaf and hard of hearing. Sixty-nine d/Deaf and hard of hearing middle school students composed responses to four writing-to-learn activities in a word processor. Two compositions were revised and published with software that displayed texts as captions to digital images; two compositions were revised with a word processor and published on paper. Analysis showed increases in content-area vocabulary, text length, and inclusion of main ideas and details for texts revised in the captioning software. Given the nonexperimental design, it is not possible to determine the extent to which the results could be attributed to captioned revisions. However, the findings do suggest that the images acted as procedural facilitators, triggering recall of vocabulary and details.

17 citations


Patent
20 Nov 2012
TL;DR: In this paper, Cue points are developed with respect to a video and enhancement information is aligned with the cue points such that the cue point and the enhancement information may be maintained separate from the video and applied to any version of a video.
Abstract: Methods and apparatus are presented for providing enhancement information associated with video, for example subtitles or closed captions. Cue points are developed with respect to a video, and enhancement information is aligned with the cue points such that the cue points and enhancement information may be maintained separate from the video and applied to any version of a video. Some disclosed embodiments relate to using groups of volunteers to provide and edit enhancement information in a five-stage process. The volunteer groups may be operated in a crowdsourcing fashion.

15 citations


Patent
09 Jan 2012
TL;DR: In this article, a method is presented for generating a plurality of fragments based on the text, which are then used to convert the video data into a second format to be provided as an output based on the video data that was received.
Abstract: A method is provided in one example and includes receiving video data from a video source in a first format, where the video data includes associated text to be overlaid on the video data as part of a video stream. The method also includes generating a plurality of fragments based on the text. The fragments include respective regions having a designated time duration. The method also includes using the plurality of fragments to convert the video data into a second format to be provided as an output, which is based on the video data that was received. In more specific embodiments, the first format is associated with a Paint-On caption or a Roll-Up caption, and the second format is associated with a Pop-On caption. The first format can also be associated with subtitles.

Patent
19 Dec 2012
TL;DR: In this paper, a system and a method for using the system for targeted commerce in network broadcasting are provided. The system includes an interface device configured to receive a multimedia stream from a network, wherein the multimedia stream includes a closed captioning string and the interface device is further configured to process the multimedia stream by providing advertisements in the multimedia stream according to a correlation between the closed captioning string and a plurality of vendor keywords; and a viewing device configured to receive the processed multimedia stream and display it to a viewer.
Abstract: A system and a method for using the system for targeted commerce in network broadcasting are provided. The system includes an interface device configured to receive a multimedia stream from a network, wherein the multimedia stream includes a closed captioning string and wherein the interface device is further configured to process the multimedia stream by providing advertisements in the multimedia stream according to a correlation between the closed captioning string and a plurality of vendor keywords; and a viewing device configured to receive the processed multimedia stream and display it to a viewer.
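A minimal sketch of the correlation step, assuming each advertisement carries a set of vendor keywords that are matched against the current caption text:

```python
def pick_advertisement(caption_text, vendor_keywords):
    """Pick the ad whose vendor keywords best match the caption text (sketch)."""
    words = set(caption_text.lower().split())
    best_ad, best_hits = None, 0
    for ad_id, keywords in vendor_keywords.items():
        hits = len(words & {k.lower() for k in keywords})
        if hits > best_hits:
            best_ad, best_hits = ad_id, hits
    return best_ad  # None when no keyword matches

ads = {"coffee-ad": {"coffee", "espresso"}, "car-ad": {"car", "engine", "drive"}}
print(pick_advertisement("i could really use a coffee or maybe an espresso", ads))
# -> "coffee-ad"
```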

Proceedings Article
11 Nov 2012
TL;DR: This demo enables the automatic creation of semantically annotated YouTube media fragments: a video is first ingested into the Synote system, and NERD is then used to extract named entities from the transcripts, which are temporally aligned with the video.
Abstract: This demo enables the automatic creation of semantically annotated YouTube media fragments. A video is first ingested into the Synote system, and a new method retrieves its associated subtitles or closed captions. Next, NERD is used to extract named entities from the transcripts, which are then temporally aligned with the video. The entities are disambiguated in the LOD cloud, and a user interface lets users browse the entities detected in a video or obtain more information. We evaluated our application with 60 videos from 3 YouTube channels.

Proceedings ArticleDOI
09 Sep 2012
TL;DR: The concept of re-speaking using only one re-speaker, with enhanced re-speaker tasks fully integrated into the recognition system and captioning software, is described, and a three-level evaluation method of the final re-speaker's skills is proposed.
Abstract: A novel approach to live captioning through re-speaking is introduced in this paper. We describe our concept of re-speaking using only one re-speaker, with enhanced re-speaker tasks fully integrated into the recognition system and captioning software. New techniques for instant correction of recognition output, punctuation mark introduction, and new word addition are presented. Our real-time recognition system for the Czech language, with a vocabulary containing more than one million words, is described, and the architecture of the captioning system that we operate is illustrated. The last part of the paper is dedicated to the re-speaker training methodology, and a three-level evaluation method of the final re-speaker's skills is proposed.

Proceedings ArticleDOI
13 Nov 2012
TL;DR: The experimental results show that the proposed method for constructing an N-gram language model based on multi-word expressions from web retrieval results can improve the recognition performance and the closed-captioning accuracy.
Abstract: Automatic speech recognition is generally used to align closed caption text to video data, and it is important to increase the speech recognition accuracy for accurate closed-captioning. This paper proposes a method for constructing an N-gram language model based on multi-word expressions (MWEs) from web retrieval results to improve speech recognition performance. A web retrieval experiment examining the distribution of web counts for MWEs and a speech recognition experiment investigating the effectiveness of MWEs are conducted. The experimental results show that the proposed method can improve the recognition performance and the closed-captioning accuracy.
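The paper's language-model construction is not detailed in the abstract; the sketch below only illustrates the underlying idea of treating MWEs as single tokens when collecting N-gram counts (the MWE list is passed in directly rather than retrieved from the web):

```python
from collections import Counter

def bigram_counts_with_mwes(sentences, mwes):
    """Count bigrams after merging multi-word expressions into single tokens.

    Rough sketch of the underlying idea: MWEs (here given directly rather
    than gathered from web retrieval results) are treated as single units
    so that the N-gram model captures them reliably.
    """
    def merge(tokens):
        merged, i = [], 0
        while i < len(tokens):
            for mwe in sorted(mwes, key=len, reverse=True):  # longest match first
                if tuple(tokens[i:i + len(mwe)]) == mwe:
                    merged.append("_".join(mwe))
                    i += len(mwe)
                    break
            else:
                merged.append(tokens[i])
                i += 1
        return merged

    counts = Counter()
    for sentence in sentences:
        tokens = ["<s>"] + merge(sentence.lower().split()) + ["</s>"]
        counts.update(zip(tokens, tokens[1:]))
    return counts

mwes = [("closed", "captioning"), ("speech", "recognition")]
print(bigram_counts_with_mwes(["speech recognition aligns closed captioning text"], mwes))
```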

Proceedings ArticleDOI
04 Jul 2012
TL;DR: Three important new enhancements to Synote are explained: crowdsourcing correction of speech recognition errors allows for sustainable captioning of lectures, while the development of an integrated mobile speech recognition application enables synchronized live verbal contributions from the class to also be captured through captions.
Abstract: This paper explains three new important enhancements to Synote, the freely available, award winning, open source, web based application that makes web hosted recordings easier to access, search, manage, and exploit for learners, teachers and other users. Synote uniquely achieves this through the creation of synchronized notes, bookmarks, tags, links, images and text captions, enabling users to easily find, or associate their notes or resources with, any part of a recording available on the web. Students surveyed would like to be able to access all their lectures through Synote. The facility to convert and import narrated PowerPoint PPTX files means that teachers can capture their lectures without requiring institution-wide expensive lecture capture systems. Crowdsourcing correction of speech recognition errors allows for sustainable captioning of the lecture while the development of an integrated mobile speech recognition application enables synchronized live verbal contributions from the class to also be captured through captions.

Patent
28 Dec 2012
TL;DR: In this article, a content receiver detects that a volume of an audio of a video presentation has been adjusted by a user, and determines the adjusted audio volume level that results from the adjustment.
Abstract: Methods and apparatus are provided for control of closed captioning based on an audio volume level. A content receiver detects that the volume of the audio of a video presentation has been adjusted by a user, and determines the adjusted audio volume level that results from the adjustment. The content receiver compares the resulting adjusted audio volume level to a threshold level. When the content receiver determines that the adjusted audio volume level is under the threshold level, it enables closed captioning of the video presentation, thus presenting the user with both audio and closed captioning. When the content receiver determines that the adjusted audio volume level is above the threshold level, it disables closed captioning for the video presentation. The content receiver may use a microphone to determine the adjusted audio volume level.
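A minimal sketch of the described behaviour, with an assumed threshold value:

```python
class CaptionController:
    """Toggle closed captioning from the adjusted audio volume level.

    Minimal sketch of the behaviour described in the patent: captions are
    enabled when the user turns the volume below a threshold and disabled
    when it is raised above it.  The threshold value is an assumption.
    """
    def __init__(self, threshold=0.15):
        self.threshold = threshold
        self.captions_enabled = False

    def on_volume_adjusted(self, volume_level):
        self.captions_enabled = volume_level < self.threshold
        return self.captions_enabled

receiver = CaptionController()
print(receiver.on_volume_adjusted(0.05))   # near-muted volume -> captions on
print(receiver.on_volume_adjusted(0.60))   # normal volume     -> captions off
```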

Patent
24 Apr 2012
TL;DR: In this paper, closed captioning information can be toggled on/off using menu options and preferences as well as automatically managed by intelligently monitoring the environment surrounding a device.
Abstract: Media content typically includes closed captioning information such as subtitles in domestic and foreign languages. Techniques and mechanisms provide that closed captioning information may be toggled on/off using menu options and preferences as well as automatically managed by intelligently monitoring the environment surrounding a device. Device sensors such as microphones and vibration monitors determine the noise level of an environment as well as the spectral characteristics of the noise to determine whether the noise profile would interfere with the video playback experience. A particular environmental noise profile could automatically trigger the display of closed captioning information or present an easy access, otherwise unavailable toggle to display closed captioning information associated with a video stream.
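As a sketch of the automatic trigger, assuming raw microphone samples are available (the spectral analysis mentioned in the abstract is omitted and the threshold is an assumption):

```python
import math

def should_show_captions(mic_samples, rms_threshold=0.2):
    """Decide whether ambient noise warrants showing captions.

    Sketch of the idea in the patent: device sensors measure the noise
    level around the viewer, and captions are offered automatically when
    it would interfere with playback.  Only the RMS level is used here.
    """
    if not mic_samples:
        return False
    rms = math.sqrt(sum(s * s for s in mic_samples) / len(mic_samples))
    return rms >= rms_threshold

quiet_room = [0.01, -0.02, 0.015, -0.01]
noisy_cafe = [0.4, -0.35, 0.5, -0.45]
print(should_show_captions(quiet_room), should_show_captions(noisy_cafe))
# -> False True
```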

Book ChapterDOI
03 Sep 2012
TL;DR: This paper describes the recognition system design with advanced re-speaker interaction, the distributed captioning system architecture, and neglected re-speaker training; some evaluation of skilled re-speakers is also presented.
Abstract: In this paper we introduce our complete solution for captioning of live TV programs used by Czech Television, the public service broadcaster in the Czech Republic. Live captioning using speech recognition and re-speaking is on the increase and widely used, for example at the BBC; however, many specific issues have to be solved each time a new captioning system is put in operation. Our concept of re-speaking assumes a complex integration of the re-speaker's skills, not only verbatim repetition with fully automatic processing. This paper describes the recognition system design with advanced re-speaker interaction, the distributed captioning system architecture, and neglected re-speaker training. Some evaluation of our skilled re-speakers is presented too.

Patent
13 Mar 2012
TL;DR: In this article, an approach for providing synchronized playback of media streams and corresponding closed captions is described, where one or more portions of a media stream and closed caption data is received, at a virtual video server resident on a user device, from an external video server.
Abstract: An approach for providing synchronized playback of media streams and corresponding closed captions is described. One or more portions of a media stream and corresponding closed caption data is received, at a virtual video server resident on a user device, from an external video server. The one or more portions of the media stream and the corresponding closed caption data is buffered by the virtual video server. The one or more portions of the media stream is delivered to a video player application and the corresponding closed caption data is delivered to a rendering application as to synchronize playback of the one or more portions of the media stream and the corresponding closed caption data by the respective applications, wherein the video player application and the rendering application are resident on the user device.

Patent
21 Sep 2012
TL;DR: In this paper, the automatic processing and indexing of video and audio source files including the automatic generation of and maintenance of video, audio, concordance, text and closed caption files corresponding to the media content of the source files.
Abstract: The automatic processing and indexing of video and audio source files including the automatic generation of and maintenance of video, audio, concordance, text and closed caption files corresponding to the media content of the source files. Generating and maintaining the files in such a way that the content of these files remains aligned so that the timing synchronization of the audio, the video, the text and closed caption information during play back is strictly maintained, even after text is edited and/or translated to another language.

Patent
30 Dec 2012
TL;DR: In this article, a method and apparatus for content augmentation in an audio video system is presented, which concerns storing embedded data, such as close captioning or metadata, and displaying that embedded data concerning a past event in response to a user request.
Abstract: The present invention concerns a method and apparatus for content augmentation in an audio video system. In particular, the invention concerns storing embedded data, such as close captioning or metadata, and displaying that embedded data concerning a past event in response to a user request. The user request way be received from a remote control, via voice recognition, or facial recognition. In addition, the apparatus is operative to facilitate the viewer to scroll through buffered embedded data independent of any video being displayed. Thus the viewer may review closed captioning information for video which had previously been displayed.
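A minimal sketch of buffering embedded caption data for later review, with assumed buffer and page sizes:

```python
from collections import deque

class CaptionHistory:
    """Buffer embedded caption data so a viewer can scroll back through it.

    Sketch of the content-augmentation idea in the patent: recently
    received caption lines are kept in a bounded buffer independent of
    the video being displayed, and a user request returns a page of past
    lines.  Buffer size and page size are assumptions.
    """
    def __init__(self, max_lines=500):
        self.lines = deque(maxlen=max_lines)

    def on_caption(self, timestamp_s, text):
        self.lines.append((timestamp_s, text))

    def review(self, page=0, page_size=5):
        """Return one page of past captions, newest page first."""
        ordered = list(self.lines)[::-1]
        return ordered[page * page_size:(page + 1) * page_size]

history = CaptionHistory()
for t in range(12):
    history.on_caption(t, f"caption line {t}")
print(history.review(page=0))   # the five most recent lines
```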

Patent
27 Apr 2012
TL;DR: In this article, closed captioning, social media content, and tags associated with various media segments are analyzed to allow identification of particular entities depicted in the different media segments, and image recognition and audio recognition algorithms are also performed to further identify entities or validate results from the analysis of metadata.
Abstract: Mechanisms are provided to allow for improved media content navigation. Metadata such as closed captioning, social media content, and tags associated with various media segments are analyzed to allow identification of particular entities depicted in the various media segments. Image recognition and audio recognition algorithms can also be performed to further identify entities or validate results from the analysis of metadata.

Proceedings ArticleDOI
Ira R. Forman, Ben J. Fletcher, John G. Hartley, Bill Rippon, Allen Keith Wilson
22 Oct 2012
TL;DR: The system that was developed for personal computers is explained and the experiments to include mobile devices and multi-participant meeting rooms are described.
Abstract: Blue Herd is a project in IBM Research to investigate automated captioning for videoconferences. Today videoconferences are held among meeting participants connected with a variety of devices: personal computers, mobile devices, and multi-participant meeting rooms. Blue Herd is charged with studying automated real-time captioning in that context. This poster explains the system that was developed for personal computers and describes our experiments to include mobile devices and multi-participant meeting rooms.

01 Jan 2012
TL;DR: Results indicated that while captions may aid one in comprehension, they also tend to limit one’s interpretations, reaffirming the nature of written language as an authoritative source of information.
Abstract: This paper investigates the effectiveness of closed captioning in aiding Saudi students who are learning ESL (English as a second language). Research was carried out in a qualitative manner, and participants were 12 Saudi students pursuing their studies at Indiana University of Pennsylvania, USA (IUP). Participants in the study were asked to compose a narrative after viewing a 5-minute film segment, both with and without captioning. Their responses were then analyzed, and results indicated that while captions may aid one in comprehension, they also tend to limit one’s interpretations, reaffirming the nature of written language as an authoritative source of information.

Book ChapterDOI
11 Jul 2012
TL;DR: Three new important enhancements to Synote, the freely available, award winning, open source, web based application that makes web hosted recordings easier to access, search, manage, and exploit for learners, teachers and other users are explained.
Abstract: This paper explains three new important enhancements to Synote, the freely available, award winning, open source, web based application that makes web hosted recordings easier to access, search, manage, and exploit for learners, teachers and other users. The facility to convert and import narrated PowerPoint PPTX files means that teachers can capture and caption their lectures without requiring institution-wide expensive lecture capture or captioning systems. Crowdsourcing correction of speech recognition errors allows for sustainable captioning of any originally uncaptioned lecture while the development of an integrated mobile speech recognition application enables synchronized live verbal contributions from the class to also be captured through captions.

Patent
12 Nov 2012
TL;DR: In this paper, a closed caption content search and ranking system is proposed to allow for content discovery: search mechanisms analyze titles, descriptions, social media content, metadata, etc., and intelligently organize content for presentation to a viewer.
Abstract: Mechanisms are provided to allow for content discovery using closed caption content search and ranking. Search mechanisms analyze titles, descriptions, social media content, metadata, etc., and intelligently organize content for presentation to a viewer. Image recognition and audio recognition algorithms can also be performed to further identify entities or validate results from the analysis of metadata. Other closed captioning content may be analyzed to determine the relevance of a piece of media content to a particular search term found in the piece of media content. Results are ranked based on the prominence of search and related terms in titles, descriptions, and closed caption contents along with the popularity of the media content itself.
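A sketch of the ranking idea, with assumed field weights and popularity scaling:

```python
def rank_media(query, items, weights=(3.0, 2.0, 1.0)):
    """Rank media items by how prominently the query terms appear.

    Sketch of the ranking described in the patent: occurrences in the
    title, the description, and the closed caption text are weighted
    differently and combined with a popularity signal.  Field weights
    and the popularity scaling are assumptions.
    """
    terms = query.lower().split()
    title_w, desc_w, caption_w = weights
    scored = []
    for item in items:
        score = 0.0
        for term in terms:
            score += title_w * item["title"].lower().count(term)
            score += desc_w * item["description"].lower().count(term)
            score += caption_w * item["captions"].lower().count(term)
        score *= 1.0 + item.get("popularity", 0.0)
        scored.append((score, item["title"]))
    return [title for score, title in sorted(scored, reverse=True)]

items = [
    {"title": "Cooking pasta", "description": "A quick dinner",
     "captions": "boil the pasta for ten minutes", "popularity": 0.2},
    {"title": "Engine repair", "description": "Fixing a pasta maker",
     "captions": "no pasta here", "popularity": 0.9},
]
print(rank_media("pasta", items))
```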

Proceedings ArticleDOI
20 Mar 2012
TL;DR: The paper discusses design and implementation issues for several modules like video scaler, video captioning and also the generation of video outputs signals (VGA or composite PAL-M) and implementation results using a FPGA-based hardware platform.
Abstract: In this paper a video processing architecture for use in a set top box (STB) compatible with the Brazilian Digital Television System (SBTVD) is presented. After the decoding process, a video frame is stored in the STB memory and is scanned by the output subsystem while executing several operations in order to fit the external display. The paper discusses design and implementation issues for several modules such as the video scaler, video captioning, and the generation of video output signals (VGA or composite PAL-M). Implementation results using an FPGA-based hardware platform are also provided. The goal is to go to silicon implementation after the FPGA validation phase.

Patent
19 Oct 2012
TL;DR: In this article, the authors propose a system for providing a media content item to a user, where the user profile information associated with the user of the device is stored on the device, and a supplementary information requester is used to transmit the identifying data which identifies the media content items, the user profiles and a request for supplementary information related to the media contents to a server over a network.
Abstract: A device 10 for providing a media content item to a user comprises: a media content item receiver operable to receive the media content item as a data stream MED, where the data stream may be a TV programme which comprises identifying data. The device stores user profile information associated with the user of the device, and further includes a supplementary information requester operable to transmit the identifying data which identifies: the media content item, the user profile information, and a request for supplementary information related to the media content item to a server over a network. The device receives supplementary information SI related to the media content item from the server over the network, such as sub-titles, closed captioning or foreign language dubbing. The type of supplementary information is determined by the server on the basis of the transmitted identifying data which identifies the media content item and the transmitted user profile information. The received supplementary information is transmitted to a separate personal user device 50, such as headphones 50C, a tablet computer 50B or glasses 50A, as specified by the stored user profile information.