Closed captioning

About: Closed captioning is a research topic. Over its lifetime, 3,011 publications have been published within this topic, receiving 64,494 citations. The topic is also known as CC.

Papers published on a yearly basis (chart; yearly bins from 1977 to 2023)

Papers

Patent

Closed-captioning system and method

Mark Gilmore Mears

27 Jan 2006

TL;DR: A method and apparatus for processing closed caption information associated with a video program by identifying a parameter associated with the video program and formatting the appearance of the caption information in response to the identified parameter.

Abstract: A method and apparatus for processing closed caption information associated with a video program by identifying a parameter associated with the video program and formatting the appearance of the closed caption information in response to the identified parameter. The parameter may comprise genre information, and may be identified from program and system information protocol signals, extended data service information, or program guide data.

28 citations

Journal Article

Dual-CNN: A Convolutional language decoder for paragraph image captioning

Ruifan Li¹,², Haoyu Liang¹, Shi Yihui¹, Fangxiang Feng¹, Xiaojie Wang¹,² • Institutions (2)

Beijing University of Posts and Telecommunications¹, Chinese Ministry of Education²

05 Jul 2020-Neurocomputing

TL;DR: Proposes a Dual-CNN decoder with long-term memory ability and parallel computation that can produce a semantically coherent paragraph for an image and achieves results comparable to state-of-the-art models.

28 citations

Journal Article

Integration of textual cues for fine-grained image captioning using deep CNN and LSTM

Neeraj Gupta¹, Anand Singh Jalal¹•Institutions (1)

GLA University¹

01 Dec 2020-Neural Computing and Applications

TL;DR: This paper proposes a model that incorporates a deep convolutional neural network and long short-term memory to boost the accuracy of image captioning by fusing text features available in an image with the visual features extracted by state-of-the-art methods.

Abstract: The automatic narration of a natural scene is an important trait in artificial intelligence that unites computer vision and natural language processing. Caption generation is a challenging task in scene understanding. Most state-of-the-art methods use deep convolutional neural network models to extract visual features of the entire image, based on which the parallel structures between images and sentences are exploited using recurrent neural networks for image captioning. However, in such models, only visual features are exploited for caption generation. This work investigates how fusion of the text available in an image can give more fine-grained captioning of a scene. In this paper, we propose a model that incorporates a deep convolutional neural network and long short-term memory to boost the accuracy of image captioning by fusing text features available in an image with the visual features extracted by state-of-the-art methods. We validate the effectiveness of the proposed model on the benchmark datasets (Flickr8k and Flickr30k). The experimental outcomes illustrate that the proposed model outperforms the state-of-the-art methods for image captioning.

27 citations

Proceedings Article

X-LXMERT: Paint, Caption and Answer Questions with Multi-Modal Transformers

Jaemin Cho¹, Jiasen Lu², Dustin Schwenk², Hannaneh Hajishirzi², Aniruddha Kembhavi² • Institutions (2)

University of North Carolina at Chapel Hill¹, Allen Institute for Artificial Intelligence²

23 Sep 2020

TL;DR: X-LXMERT is an extension to LXMERT with training refinements including discretizing visual representations, using uniform masking with a large range of masking ratios, and aligning the right pre-training datasets to the right objectives, which together enable it to paint.

Abstract: Mirroring the success of masked language models, vision-and-language counterparts like ViLBERT, LXMERT and UNITER have achieved state-of-the-art performance on a variety of multimodal discriminative tasks like visual question answering and visual grounding. Recent work has also successfully adapted such models towards the generative task of image captioning. This begs the question: Can these models go the other way and generate images from pieces of text? Our analysis of a popular representative from this model family – LXMERT – finds that it is unable to generate rich and semantically meaningful imagery with its current training setup. We introduce X-LXMERT, an extension to LXMERT with training refinements including: discretizing visual representations, using uniform masking with a large range of masking ratios, and aligning the right pre-training datasets to the right objectives, which enables it to paint. X-LXMERT’s image generation capabilities rival state-of-the-art generative models while its question answering and captioning abilities remain comparable to LXMERT. Finally, we demonstrate the generality of these training refinements by adding image generation capabilities into UNITER to produce X-UNITER.

27 citations

Posted Content

Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training

Yingwei Pan, Yehao Li, Jianjie Luo, Jun Xu, Ting Yao, Tao Mei

05 Jul 2020-arXiv: Computer Vision and Pattern Recognition

TL;DR: Presents a detailed analysis of the Auto-captions on GIF dataset in comparison to existing video-sentence datasets, and evaluates a Transformer-based encoder-decoder structure for vision-language pre-training, which is further adapted to the video captioning downstream task and shows compelling generalizability on MSR-VTT.

Abstract: In this work, we present Auto-captions on GIF, which is a new large-scale pre-training dataset for generic video understanding. All video-sentence pairs are created by automatically extracting and filtering video caption annotations from billions of web pages. The Auto-captions on GIF dataset can be utilized to pre-train the generic feature representation or encoder-decoder structure for video captioning, and other downstream tasks (e.g., sentence localization in videos, video question answering, etc.) as well. We present a detailed analysis of the Auto-captions on GIF dataset in comparison to existing video-sentence datasets. We also provide an evaluation of a Transformer-based encoder-decoder structure for vision-language pre-training, which is further adapted to the video captioning downstream task and yields compelling generalizability on MSR-VTT. The dataset is available at http://www.auto-video-captions.top/2020/dataset.

27 citations

Network Information

Performance Metrics

Papers: 4,575
Citations: 96,790

No. of papers in the topic in previous years
Year	Papers
2023	536
2022	1,030
2021	504
2020	530
2019	448
2018	334
