Deep Visual-Semantic Alignments for Generating Image Descriptions
Citations
49,914 citations
[...]
38,208 citations
5,896 citations
2,466 citations
Cites background from "Deep Visual-Semantic Alignments for..."
...(Mikolov et al., 2010, 2011; Sutskever et al., 2011), machine translation (Liu et al., 2014; Auli et al., 2013; Sutskever et al., 2014), speech recognition (Robinson et al., 1996; Graves et al., 2013; Graves and Jaitly, 2014; Sak et al., 2014), image captioning (Karpathy and Fei-Fei, 2015), etc....
[...]
...This template is naturally suited for many NLP tasks such as language modeling [2], [63], [64], machine translation [65]–[67], speech recognition [68]–[71], image captioning [72]....
[...]
2,445 citations
References
73,978 citations
72,897 citations
"Deep Visual-Semantic Alignments for..." refers background or methods in this paper
...We achieved the best results using RMSprop [45], which is an adaptive step size method that scales the gradient of each weight by a running average of its gradient magnitudes....
[...]
...The results of these experiments can be found in Table 1, and example retrievals in Figure 5....
[...]
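The RMSprop excerpt above describes scaling each weight's gradient by a running average of its gradient magnitudes. A minimal sketch of that idea follows, in the common formulation that keeps a running average of squared gradients and divides by its square root. The function name `rmsprop_step`, the hyperparameter values (`lr`, `decay`, `eps`), and the toy quadratic objective are illustrative assumptions, not settings taken from the paper.

```python
import math

def rmsprop_step(weights, grads, cache, lr=0.001, decay=0.9, eps=1e-8):
    """Apply one RMSprop update to a list of scalar weights.

    cache holds the running average of squared gradients, one per weight.
    lr, decay and eps are assumed defaults, not values from the paper.
    """
    new_weights, new_cache = [], []
    for w, g, c in zip(weights, grads, cache):
        # Running average of the squared gradient for this weight.
        c = decay * c + (1.0 - decay) * g * g
        # Divide the gradient by the root of that average before stepping.
        w = w - lr * g / (math.sqrt(c) + eps)
        new_weights.append(w)
        new_cache.append(c)
    return new_weights, new_cache

# Toy objective (hypothetical): f(w) = w0^2 + 10 * w1^2,
# whose gradients differ by a factor of ten between the two weights.
weights = [1.0, 1.0]
cache = [0.0, 0.0]
for _ in range(500):
    grads = [2.0 * weights[0], 20.0 * weights[1]]
    weights, cache = rmsprop_step(weights, grads, cache, lr=0.01)
```

Because each gradient is normalized by its own running magnitude, both weights move by similar step sizes even though their raw gradients differ tenfold, which is the sense in which the method is an adaptive step size method.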
55,235 citations
"Deep Visual-Semantic Alignments for..." refers methods in this paper
...We achieved the best results using RMSprop [45], which is an adaptive step size method that scales the gradient of each weight by a running average of its gradient magnitudes....
[...]
49,914 citations
49,639 citations
"Deep Visual-Semantic Alignments for..." refers background in this paper
...Then we introduce our novel objective, which learns the embedding representations so that semantically similar concepts across the two modalities occupy nearby regions of the space....
[...]