Proceedings Article

Image Captioning Methods and Metrics

TLDR
This paper presents a concise overview of state-of-the-art image captioning and a deep learning method for caption generation, compares the proposed system experimentally with numerous existing systems, and shows the effectiveness of the system.
Abstract
Image captioning is one of the emerging research topics in the field of AI. It combines Computer Vision (CV) and Natural Language Processing (NLP) to derive features from an image, use this information to identify objects, actions, and their relationships, and generate a description of the image. It is an important concept in artificial intelligence, with applications such as aids for the blind, self-driving cars, and many more. In this paper we present a concise overview of state-of-the-art image captioning and a method for caption generation using deep learning concepts. We also describe an approach to image caption generation using Convolutional Neural Network (CNN) and Generative Adversarial Network (GAN) models in a deep learning framework. With this approach, the system is intelligent enough to create sentences for images. It uses an encoder-decoder architecture, in which a CNN generates the image vector and a Long Short-Term Memory (LSTM) network generates a logical sentence using NLP concepts. Finally, we evaluate the proposed system experimentally against numerous existing systems and show the effectiveness of the system.
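The abstract does not say which metrics the evaluation uses. BLEU, which appears in the reference list, is a common choice for scoring generated captions against human-written ones, so the following is a minimal sketch of it under that assumption. The function names (`ngrams`, `modified_precision`, `brevity_penalty`, `bleu`) and the example captions are illustrative and are not taken from the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams that occur in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def modified_precision(candidate, references, n):
    """Clipped n-gram precision: a candidate n-gram is credited at most as
    many times as it occurs in the reference where it is most frequent."""
    cand_counts = ngrams(candidate, n)
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref, n).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


def brevity_penalty(candidate, references):
    """Penalize candidates shorter than the closest reference length."""
    c = len(candidate)
    if c == 0:
        return 0.0
    # Closest reference length; ties are broken toward the shorter reference.
    r = min((len(ref) for ref in references), key=lambda length: (abs(length - c), length))
    return 1.0 if c > r else math.exp(1 - r / c)


def bleu(candidate, references, max_n=4):
    """Sentence-level BLEU with uniform weights and no smoothing (range 0 to 1)."""
    precisions = [modified_precision(candidate, references, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0  # without smoothing, one empty n-gram order zeroes the score
    log_mean = sum(math.log(p) for p in precisions) / max_n
    return brevity_penalty(candidate, references) * math.exp(log_mean)


# Hypothetical generated caption and human reference, already tokenized.
caption = "a dog runs on the beach".split()
reference = "a dog is running on the beach".split()
score = bleu(caption, [reference], max_n=2)
```

The sketch returns a value between 0 and 1; scores are often reported on a 0 to 100 scale instead. Short captions frequently have no matching 4-grams, which is why the example uses `max_n=2`; practical evaluations usually apply smoothing or corpus-level aggregation.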


Citations
Proceedings Article

A Comparative Study of Machine Learning Based Image Captioning Models

TL;DR: Zhang et al. perform a comparative analysis of three Machine Learning (ML) algorithms: k-Nearest Neighbor (KNN), Convolutional Neural Network (CNN) with Long Short-Term Memory (LSTM), and attention-based LSTM.
Proceedings Article

Sequential Memory Modelling for Video Captioning

TL;DR: This paper uses an end-in-frame encoder-decoder network based on a deep learning approach to generate video subtitles, and describes the model, dataset, and parameters used to evaluate it.
Journal Article

Arabic Captioning for Images of Clothing Using Deep Learning

Rasha AL-Malki, +1 more - 01 Apr 2023
TL;DR: This article proposes a model for captioning images of clothing in the Arabic language using deep learning; the model achieves a BLEU-1 score of 88.52.
References
Proceedings Article

Bleu: a Method for Automatic Evaluation of Machine Translation

TL;DR: This paper proposes a method of automatic machine translation evaluation that is quick, inexpensive, and language-independent, that correlates highly with human evaluation, and that has little marginal cost per run.
Proceedings Article

Deep visual-semantic alignments for generating image descriptions

TL;DR: This paper presents a model that generates natural language descriptions of images and their regions, based on a novel combination of Convolutional Neural Networks over image regions, bidirectional Recurrent Neural Networks over sentences, and a structured objective that aligns the two modalities through a multimodal embedding.
Posted Content

Show and Tell: A Neural Image Caption Generator

TL;DR: This paper presents a generative model based on a deep recurrent architecture that combines recent advances in computer vision and machine translation and that can be used to generate natural sentences describing an image.
Journal Article

Deep Visual-Semantic Alignments for Generating Image Descriptions

TL;DR: This paper presents a model that generates natural language descriptions of images and their regions, based on a novel combination of Convolutional Neural Networks over image regions, bidirectional Recurrent Neural Networks over sentences, and a structured objective that aligns the two modalities through a multimodal embedding.

Association for the Advancement of Artificial Intelligence

TL;DR: AIIDE is the premier conference on artificial intelligence in computer games and interactive entertainment that brings together technical leaders to examine how computer games can be improved using AI technologies, and to promote new approaches and commercial developments.