Recurrent Models of Visual Attention
Citations
14,807 citations
Cites background from "Recurrent Models of Visual Attentio..."
...Attention can be viewed, broadly, as a tool to bias the allocation of available processing resources towards the most informative components of an input signal [17, 18, 22, 29, 32]....
[...]
10,913 citations
8,055 citations
Cites background from "Recurrent Models of Visual Attentio..."
...allowing models to learn alignments between different modalities, e.g., between image objects and agent actions in the dynamic control problem (Mnih et al., 2014), between speech frames and text in the speech recognition task (Chorowski et al., 2014), or between visual features of a picture...
[...]
6,485 citations
Cites background or methods or result from "Recurrent Models of Visual Attentio..."
...In particular however, our work directly extends the work of Bahdanau et al. (2014); Mnih et al. (2014); Ba et al. (2014)....
[...]
...Similar, but more complicated variance reduction techniques have previously been used by Mnih et al. (2014) and Ba et al. (2014)....
[...]
...As pointed out and used in Ba et al. (2014) and Mnih et al. (2014), this formulation is equivalent to the REINFORCE learning rule (Williams, 1992), where the reward for the attention choosing a sequence of actions is a real value proportional to the log likelihood of the target sentence under...
[...]
...by recent advances in caption generation and inspired by recent success in employing attention in machine translation (Bahdanau et al., 2014) and object recognition (Ba et al., 2014; Mnih et al., 2014), we investigate models that can attend to salient part of an image while generating its caption....
[...]
5,411 citations
References
73,978 citations
72,897 citations
"Recurrent Models of Visual Attentio..." refers methods in this paper
...The experiment done on a dynamic environment used a core of LSTM units [10]....
[...]
21,729 citations
18,620 citations
10,525 citations