Topic
Closed captioning
About: Closed captioning is a research topic. Over its lifetime, 3,011 publications on this topic have received 64,494 citations. The topic is also known as: CC.
Papers published on a yearly basis
Papers
01 Jun 2019
TL;DR: A new metric for measuring the diversity of image captions is proposed, derived from latent semantic analysis and kernelized to use CIDEr similarity; the paper also shows that balancing the cross-entropy loss and CIDEr reward in reinforcement learning during training can effectively control the tradeoff between diversity and accuracy of the generated captions.
Abstract: Recently, the state-of-the-art models for image captioning have overtaken human performance based on the most popular metrics, such as BLEU, METEOR, ROUGE and CIDEr. Does this mean we have solved the task of image captioning? The above metrics only measure the similarity of the generated caption to the human annotations, which reflects its accuracy. However, an image contains many concepts and multiple levels of detail, and thus there is a variety of captions that express different concepts and details that might be interesting for different humans. Therefore only evaluating accuracy is not sufficient for measuring the performance of captioning models --- the diversity of the generated captions should also be considered. In this paper, we propose a new metric for measuring the diversity of image captions, which is derived from latent semantic analysis and kernelized to use CIDEr similarity. We conduct extensive experiments to re-evaluate recent captioning models in the context of both diversity and accuracy. We find that there is still a large gap between the model and human performance in terms of both accuracy and diversity, and the models that have optimized accuracy (CIDEr) have low diversity. We also show that balancing the cross-entropy loss and CIDEr reward in reinforcement learning during training can effectively control the tradeoff between diversity and accuracy of the generated captions.
61 citations
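The LSA-based diversity metric described above can be illustrated with a simplified sketch: pairwise caption similarities form a kernel matrix, and the spread of its eigenvalues measures how diverse the caption set is (identical captions give a rank-one kernel, orthogonal captions give an identity kernel). The bag-of-words cosine below is only a stand-in for the paper's CIDEr kernel, and the normalization is illustrative, not the authors' exact formula.

```python
import numpy as np
from collections import Counter

def cosine(a, b):
    # Bag-of-words cosine similarity; a stand-in for CIDEr similarity.
    ca, cb = Counter(a.split()), Counter(b.split())
    num = sum(ca[w] * cb[w] for w in ca)
    den = (sum(v * v for v in ca.values()) ** 0.5) * \
          (sum(v * v for v in cb.values()) ** 0.5)
    return num / den if den else 0.0

def diversity(captions):
    # Kernel matrix of pairwise caption similarities.
    m = len(captions)
    K = np.array([[cosine(a, b) for b in captions] for a in captions])
    eigvals = np.clip(np.linalg.eigvalsh(K), 0, None)
    # Ratio of the dominant eigenvalue to the total spectrum:
    # 1.0 for identical captions, 1/m for mutually orthogonal ones.
    r = eigvals.max() / eigvals.sum()
    # Normalize so the score runs from 0 (no diversity) to 1.
    return -np.log(r) / np.log(m)
```

Three identical captions score 0, while three captions with no words in common score 1.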
TL;DR: The authors proposed a unified and extensible framework to jointly leverage multiple kinds of visual features and semantic attributes, and achieved state-of-the-art performance on the MSVD and MSR-VTT datasets.
Abstract: Video captioning has attracted an increasing amount of interest, due in part to its potential for improved accessibility and information retrieval. While existing methods rely on different kinds of visual features and model architectures, they do not make full use of pertinent semantic cues. We present a unified and extensible framework to jointly leverage multiple sorts of visual features and semantic attributes. Our novel architecture builds on LSTMs with two multi-faceted attention layers. These first learn to automatically select the most salient visual features or semantic attributes, and then yield overall representations for the input and output of the sentence generation component via custom feature scaling operations. Experimental results on the challenging MSVD and MSR-VTT datasets show that our framework outperforms previous work and performs robustly even in the presence of added noise to the features and attributes.
61 citations
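The attention layers described above are, at their core, soft attention over a set of feature vectors: score each feature against the decoder state, softmax the scores, and take the weighted sum. A minimal sketch (generic bilinear dot-product attention, not the paper's exact multi-faceted layer) might look like:

```python
import numpy as np

def attend(features, query, W):
    """Soft attention over feature vectors.

    features: (n, d) array of visual features or attribute embeddings
    query:    (q,) decoder hidden state
    W:        (d, q) learned bilinear scoring matrix
    Returns the (d,) attended representation.
    """
    scores = features @ W @ query            # (n,) relevance scores
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ features                # weighted sum of features
```

In the paper's setting, one such layer would select among visual features and another among semantic attributes before sentence generation.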
01 Dec 2006
TL;DR: In this paper, a multimedia server distributes closed captioning over a network to a client device running a media player that does not support standardized closed captioning, such as CEA-608-B or CEA-708-B, Advanced Television Systems Committee ATSC A/53, or Society of Cable Telecommunications Engineers SCTE 20 and/or SCTE 21.
Abstract: A multimedia server distributes closed captioning over a network to a client device running a media player that does not support standardized closed captioning. The multimedia server receives a media stream including closed captioning that is encoded according to a closed captioning standard such as Consumer Electronics Association CEA-608-B or CEA-708-B, Advanced Television Systems Committee ATSC A/53 or the Society of Cable Telecommunications Engineers SCTE 20 and/or SCTE 21. The multimedia server transcodes the closed captioning into a format that is usable by the media player and transmits the transcoded closed captioning to the client device over the network so that the media player can render the closed captioning synchronously with programming content included in the media stream.
61 citations
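The transcoding step described in this abstract amounts to converting timed caption cues decoded from a broadcast encoding into a format the client player can render. As an illustration only (the patent does not specify a target format; WebVTT is used here purely as an example of a player-friendly text format), decoded cues could be serialized like this:

```python
def to_webvtt(cues):
    """Serialize (start_seconds, end_seconds, text) cues as WebVTT.

    A hypothetical final stage of a caption-transcoding pipeline.
    """
    def ts(seconds):
        h = int(seconds // 3600)
        m = int(seconds % 3600 // 60)
        s = seconds % 60
        return f"{h:02d}:{m:02d}:{s:06.3f}"  # HH:MM:SS.mmm

    lines = ["WEBVTT", ""]
    for start, end, text in cues:
        lines.append(f"{ts(start)} --> {ts(end)}")
        lines.append(text)
        lines.append("")
    return "\n".join(lines)
```

The server would then stream this text alongside the media so the player can display each cue at the right time.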
21 May 2002
TL;DR: In this article, an exemplary television signal system uses closed caption (CC) data from a standard definition signal, processes the CC data, and overlays it at the video rate of a higher definition signal selected for viewing that does not carry its own embedded closed caption data.
Abstract: A system as described herein enables a user to access auxiliary information when viewing an enhanced performance television signal or program. Particularly, a television signal system is operative, configured, and/or enabled to allow a user to access and/or utilize auxiliary information when viewing a high definition or progressive-scan television signal. Briefly, an exemplary television signal system receives the auxiliary information/data (e.g. closed caption data) on a selected interlaced standard definition input, processes the auxiliary data, and combines or overlays the auxiliary data with a television (video) signal received on a selected input that does not have its own embedded auxiliary information/data. More particularly, an exemplary television signal system such as described herein involves using closed caption (CC) data from a standard definition signal, processing the CC data, and overlaying the CC data at a video rate of a higher definition signal selected for viewing that does not carry its own embedded closed caption data.
61 citations
22 Aug 2001
TL;DR: A novel statistical approach, called the weighted voting method, is presented for automatic news video story categorization based on the closed captioned text.
Abstract: In this paper, we present a novel statistical approach, called the weighted voting method, for automatic news video story categorization based on the closed captioned text. News video is initially segmented into stories using the demarcations in the closed captioned text, then a set of
61 citations
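Although the abstract is cut off, the general shape of a weighted-voting text categorizer can be sketched: each word in a story's closed-caption text casts votes for categories, weighted by how strongly the word was associated with each category in training. This is a generic illustration, not the authors' exact weighting scheme:

```python
from collections import defaultdict

def train_weights(labeled_docs):
    # weight(word, category) = fraction of the word's training
    # occurrences that fell in that category.
    counts = defaultdict(lambda: defaultdict(int))
    for text, cat in labeled_docs:
        for w in text.lower().split():
            counts[w][cat] += 1
    weights = {}
    for w, cats in counts.items():
        total = sum(cats.values())
        weights[w] = {c: n / total for c, n in cats.items()}
    return weights

def classify(text, weights):
    # Each word votes for its associated categories with its weights;
    # the category with the largest total vote wins.
    votes = defaultdict(float)
    for w in text.lower().split():
        for cat, wt in weights.get(w, {}).items():
            votes[cat] += wt
    return max(votes, key=votes.get) if votes else None
```

In the paper's pipeline, the input to such a classifier would be the closed-caption text of a story segmented at the caption demarcations.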