Topic

Closed captioning

About: Closed captioning is a research topic. Over its lifetime, 3,011 publications have been published on this topic, receiving 64,494 citations. The topic is also known as: CC.


Papers
Proceedings ArticleDOI
01 Jan 2016
TL;DR: This paper fine-tunes an additional convolutional neural network devoted solely to sentiment, with the sentiment dataset built through a data-driven, multi-label approach, and shows that the method generates image captions with sentiment terms more compatible with the images than captions relying only on object-classification features, while preserving the semantics.
Abstract: The image captioning task has become a highly competitive research area with the successful application of convolutional and recurrent neural networks, especially since the advent of the long short-term memory (LSTM) architecture. However, its primary focus has been the factual description of images, including the objects, movements, and their relations. While this focus has demonstrated competence, describing images along with nonfactual elements, namely the sentiments of the images expressed via adjectives, has mostly been neglected. We attempt to address this issue by fine-tuning an additional convolutional neural network devoted solely to sentiment, where the sentiment dataset is built through a data-driven, multi-label approach. Our experimental results show that our method can generate image captions with sentiment terms that are more compatible with the images than relying solely on features devoted to object classification, while preserving the semantics.
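As a rough illustration of the two-stream idea in this abstract, the sketch below (PyTorch) fuses features from an object-classification CNN and a separately fine-tuned sentiment CNN to initialize an LSTM caption decoder. The layer sizes and the concatenation-based fusion are invented for illustration; the paper's exact architecture is not specified here.

```python
# Minimal sketch: fuse object-CNN and sentiment-CNN features, then decode a
# caption with an LSTM. Dimensions and fusion scheme are assumptions.
import torch
import torch.nn as nn

class SentimentAwareCaptioner(nn.Module):
    def __init__(self, vocab_size, obj_dim=2048, sent_dim=512,
                 embed_dim=256, hidden_dim=512):
        super().__init__()
        # Project the fused image features into the LSTM's hidden space.
        self.fuse = nn.Linear(obj_dim + sent_dim, hidden_dim)
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, obj_feats, sent_feats, captions):
        # obj_feats:  (B, obj_dim)  from an object-classification CNN
        # sent_feats: (B, sent_dim) from the sentiment-tuned CNN
        # captions:   (B, T) token ids of the ground-truth caption
        fused = torch.tanh(self.fuse(torch.cat([obj_feats, sent_feats], dim=1)))
        h0 = fused.unsqueeze(0)            # (1, B, hidden) initial hidden state
        c0 = torch.zeros_like(h0)          # initial cell state
        emb = self.embed(captions)         # (B, T, embed_dim)
        out, _ = self.lstm(emb, (h0, c0))
        return self.out(out)               # (B, T, vocab_size) logits

# Smoke test with random features standing in for real CNN outputs.
model = SentimentAwareCaptioner(vocab_size=10000)
logits = model(torch.randn(4, 2048), torch.randn(4, 512),
               torch.randint(0, 10000, (4, 12)))
print(logits.shape)  # torch.Size([4, 12, 10000])
```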

16 citations

Proceedings ArticleDOI
09 Sep 2019
TL;DR: This work creates FigCAP, a new dataset for figure captioning, and proposes the Label Maps Attention Model to achieve accurate generation of labels in figures; experiments show the method outperforms the baselines.
Abstract: Figures are human-friendly but difficult for computers to process automatically. In this work, we investigate the problem of figure captioning: the goal is to automatically generate a natural language description of a given figure. We create a new dataset for figure captioning, FigCAP. To achieve accurate generation of labels in figures, we propose the Label Maps Attention Model. Extensive experiments show that our method outperforms the baselines. A successful solution to this task would make figure content accessible to people with visual impairments by providing input to a text-to-speech system, and would enable automatic parsing of vast repositories of documents where figures are pervasive.
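The abstract names the Label Maps Attention Model but does not give its formulation; as one hedged sketch of the general idea, the following additive-attention module attends over per-label feature vectors while decoding. The scoring function and all dimensions are assumptions, not the paper's specification.

```python
# Hedged sketch: additive attention over per-label feature vectors,
# conditioned on the caption decoder's hidden state.
import torch
import torch.nn as nn

class LabelMapAttention(nn.Module):
    def __init__(self, label_dim=256, hidden_dim=512, attn_dim=256):
        super().__init__()
        self.w_label = nn.Linear(label_dim, attn_dim)
        self.w_state = nn.Linear(hidden_dim, attn_dim)
        self.score = nn.Linear(attn_dim, 1)

    def forward(self, label_feats, dec_state):
        # label_feats: (B, L, label_dim) one vector per detected figure label
        # dec_state:   (B, hidden_dim)   current decoder hidden state
        e = self.score(torch.tanh(
            self.w_label(label_feats) + self.w_state(dec_state).unsqueeze(1)))
        alpha = torch.softmax(e, dim=1)          # (B, L, 1) attention weights
        return (alpha * label_feats).sum(dim=1)  # (B, label_dim) context vector

attn = LabelMapAttention()
ctx = attn(torch.randn(2, 7, 256), torch.randn(2, 512))
print(ctx.shape)  # torch.Size([2, 256])
```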

16 citations

Patent
04 Nov 2015
TL;DR: The authors present a system that obtains content over the Internet, identifies text within the content (such as closed captioning or recipe text) or creates text from it using technologies such as speech recognition, analyzes the text for actionable directions, and translates those directions into instructions suitable for network-connected cooking appliances.
Abstract: Systems and methods for obtaining content over the Internet, identifying text within the content (e.g., closed captioning or recipe text) or creating text from the content using technologies such as speech recognition, analyzing the text for actionable directions, and translating those directions into instructions suitable for network-connected cooking appliances. Certain embodiments provide additional guidance to avoid or correct mistakes in the cooking process, and allow recipes to be customized to address, e.g., dietary restrictions, culinary preferences, or translation into a foreign language.
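To make the pipeline in this patent abstract concrete, here is a toy sketch that scans recipe or caption text for actionable directions and maps them to structured appliance commands. The verb list, regexes, and command schema are entirely invented for illustration; the patent does not disclose these details.

```python
# Toy sketch: extract actionable cooking directions from text and emit
# structured commands. ACTIONS and the command names are hypothetical.
import re

ACTIONS = {"preheat": "SET_OVEN_TEMP", "bake": "BAKE_TIMED",
           "simmer": "SET_BURNER_LOW"}

def extract_instructions(text):
    commands = []
    for sentence in re.split(r"[.;]\s*", text.lower()):
        for verb, command in ACTIONS.items():
            if verb in sentence:
                temp = re.search(r"(\d+)\s*(?:degrees|°)?\s*f", sentence)
                mins = re.search(r"(\d+)\s*min", sentence)
                commands.append({
                    "command": command,
                    "temp_f": int(temp.group(1)) if temp else None,
                    "minutes": int(mins.group(1)) if mins else None,
                })
    return commands

print(extract_instructions(
    "Preheat the oven to 350 degrees F. Bake for 25 minutes."))
# [{'command': 'SET_OVEN_TEMP', 'temp_f': 350, 'minutes': None},
#  {'command': 'BAKE_TIMED', 'temp_f': None, 'minutes': 25}]
```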

16 citations

Proceedings ArticleDOI
03 Jun 2018
TL;DR: This paper uses image captioning to produce a textual description of an image, exploits a natural language processing algorithm to extract the main components of that description, and generates a graph of the detected components showing objects and the pairwise relationships between them.
Abstract: Image captioning is the process of analyzing an image and generating a textual description according to the objects and actions in the image. Thus, both image processing and natural language understanding are required for an image captioning system. Applications of image captioning vary from assisting visually impaired people to detecting fake news on social media. One significant use of image captioning would be the detection of particular actions in images. In this paper, we use image captioning to produce a textual description from an image. We then exploit a natural language processing algorithm to extract the main components of the produced description. Finally, we generate a general graph from the detected components in the descriptions of the image. The generated graph shows objects and the pairwise relationships between them, along with their attributes; it can thus be used to determine whether a particular relation appears in a sequence of input images.
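A minimal sketch of the post-captioning stage described above: parse a generated caption with a dependency parser and emit (subject, relation, object) edges for a scene-style graph. The captioning model and the paper's exact extraction rules are not reproduced; spaCy's English pipeline stands in as the NLP component, and the rules below are assumptions.

```python
# Sketch: turn a generated caption into graph edges via dependency parsing.
import spacy

nlp = spacy.load("en_core_web_sm")

def caption_to_edges(caption):
    doc = nlp(caption)
    edges = []
    for token in doc:
        if token.pos_ == "VERB":
            subjects = [c for c in token.children if c.dep_ == "nsubj"]
            objects = [c for c in token.children
                       if c.dep_ in ("dobj", "obj", "pobj")]
            for s in subjects:
                for o in objects:
                    # Edge: subject --verb--> object
                    edges.append((s.text, token.lemma_, o.text))
    return edges

print(caption_to_edges("A man rides a horse on the beach"))
# [('man', 'ride', 'horse')]
```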

16 citations

Proceedings ArticleDOI
02 Jun 2017
TL;DR: This paper presents a simple encoder-decoder implementation with significant modifications and optimizations that enable these models to run on the low-end hardware of hand-held devices, and implements a first-of-its-kind Android application to demonstrate the realtime applicability of the approach.
Abstract: The recent advances in Deep Learning-based Machine Translation and Computer Vision have led to excellent Image Captioning models using advanced techniques like Deep Reinforcement Learning. While these models are very accurate, they often rely on expensive computation hardware, making it difficult to apply them in realtime scenarios where their actual applications can be realised. In this paper, we carefully follow some of the core concepts of Image Captioning and its common approaches, and present our simple encoder- and decoder-based implementation with significant modifications and optimizations that enable us to run these models on the low-end hardware of hand-held devices. We also compare our results, evaluated using various metrics, with state-of-the-art models, and analyze why and where our model trained on the MSCOCO dataset falls short due to the trade-off between computation speed and quality. Using the state-of-the-art TensorFlow framework by Google, we also implement a first-of-its-kind Android application to demonstrate the realtime applicability and optimizations of our approach.
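The abstract does not spell out the specific optimizations; as one hedged illustration of shrinking an encoder-decoder captioner for low-end hardware, the sketch below applies post-training dynamic quantization to a placeholder decoder (shown in PyTorch as a stand-in, though the paper used TensorFlow). The model and its sizes are invented, not the authors' implementation.

```python
# Sketch: int8 dynamic quantization of a small caption decoder. Weights are
# stored in int8 while activations stay float, which typically cuts model
# size roughly 4x and speeds up CPU inference at some cost in quality.
import torch
import torch.nn as nn

class TinyCaptionDecoder(nn.Module):
    def __init__(self, vocab_size=5000, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        out, _ = self.lstm(self.embed(tokens))
        return self.out(out)

model = TinyCaptionDecoder().eval()
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear, nn.LSTM}, dtype=torch.qint8)
logits = quantized(torch.randint(0, 5000, (1, 10)))
print(logits.shape)  # torch.Size([1, 10, 5000])
```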

16 citations


Network Information
Related Topics (5)
Feature vector: 48.8K papers, 954.4K citations (83% related)
Object detection: 46.1K papers, 1.3M citations (82% related)
Convolutional neural network: 74.7K papers, 2M citations (82% related)
Deep learning: 79.8K papers, 2.1M citations (82% related)
Unsupervised learning: 22.7K papers, 1M citations (81% related)
Performance
Metrics
No. of papers in the topic in previous years:
Year: Papers
2023: 536
2022: 1,030
2021: 504
2020: 530
2019: 448
2018: 334