
Yi Bin

Researcher at University of Electronic Science and Technology of China

Publications - 22
Citations - 691

Yi Bin is an academic researcher from the University of Electronic Science and Technology of China. The author has contributed to research in topics including computer science and closed captioning, has an h-index of 9, and has co-authored 16 publications receiving 376 citations.

Papers
Journal ArticleDOI

Describing Video With Attention-Based Bidirectional LSTM

TL;DR: A novel video captioning framework that integrates bidirectional long short-term memory (BiLSTM) with a soft attention mechanism to generate better global representations for videos and to enhance the recognition of lasting motions.
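As a rough illustration of the idea, the sketch below encodes frame features with a bidirectional LSTM and lets a soft (additive) attention weight the encoder states at each decoding step. It is a minimal sketch only, not the authors' implementation; the class name, layer sizes, and single-layer structure are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BiLSTMAttnCaptioner(nn.Module):
    """Bidirectional LSTM video encoder + soft-attention LSTM decoder (illustrative)."""

    def __init__(self, feat_dim=2048, hidden=512, embed=300, vocab=10000):
        super().__init__()
        # Encoder reads the frame features forwards and backwards for a global view.
        self.encoder = nn.LSTM(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.embed = nn.Embedding(vocab, embed)
        self.decoder = nn.LSTMCell(embed + 2 * hidden, hidden)
        # Additive (soft) attention over the bidirectional encoder states.
        self.attn_enc = nn.Linear(2 * hidden, hidden)
        self.attn_dec = nn.Linear(hidden, hidden)
        self.attn_v = nn.Linear(hidden, 1)
        self.out = nn.Linear(hidden, vocab)

    def forward(self, frames, captions):
        # frames: (B, T, feat_dim) per-frame features; captions: (B, L) token ids
        enc, _ = self.encoder(frames)                  # (B, T, 2*hidden)
        B = frames.size(0)
        h = frames.new_zeros(B, self.decoder.hidden_size)
        c = frames.new_zeros(B, self.decoder.hidden_size)
        logits = []
        for t in range(captions.size(1)):
            # Score every frame against the current decoder state, then pool.
            scores = self.attn_v(torch.tanh(self.attn_enc(enc)
                                            + self.attn_dec(h).unsqueeze(1)))  # (B, T, 1)
            alpha = F.softmax(scores, dim=1)
            context = (alpha * enc).sum(dim=1)         # attended video summary
            x = torch.cat([self.embed(captions[:, t]), context], dim=-1)
            h, c = self.decoder(x, (h, c))
            logits.append(self.out(h))
        return torch.stack(logits, dim=1)              # (B, L, vocab) word scores
```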
Journal ArticleDOI

Video Captioning by Adversarial LSTM

TL;DR: This paper adopts a standard generative adversarial network (GAN) architecture, characterized by an interplay of two competing processes: a "generator" that produces textual sentences given the visual content of a video and a "discriminator" that controls the accuracy of the generated sentences.
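The adversarial interplay can be sketched as follows: a discriminator scores (video, sentence) pairs, and the generator is trained both to fit the reference captions and to fool the discriminator. This is an illustrative sketch under assumptions; in particular, the Gumbel-softmax relaxation used here to keep the sampled sentence differentiable is a common workaround for discrete text, not necessarily the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentenceDiscriminator(nn.Module):
    """Scores whether a sentence is a plausible caption for the given video (illustrative)."""

    def __init__(self, vocab=10000, embed=300, feat_dim=2048, hidden=512):
        super().__init__()
        # A linear "embedding" over (soft) one-hot tokens so relaxed samples can pass through.
        self.embed = nn.Linear(vocab, embed, bias=False)
        self.rnn = nn.LSTM(embed, hidden, batch_first=True)
        self.score = nn.Linear(hidden + feat_dim, 1)

    def forward(self, sent_onehot, video_feat):
        # sent_onehot: (B, L, vocab) one-hot (real) or relaxed (generated); video_feat: (B, feat_dim)
        _, (h, _) = self.rnn(self.embed(sent_onehot))
        return self.score(torch.cat([h[-1], video_feat], dim=-1)).squeeze(-1)

def adversarial_step(generator_logits, real_onehot, video_feat, D, d_opt):
    """One sketch of the D/G interplay; only generator parameters would be stepped on g_loss."""
    ones = real_onehot.new_ones(real_onehot.size(0))
    zeros = real_onehot.new_zeros(real_onehot.size(0))
    # Gumbel-softmax keeps the generated sentence differentiable (an assumption for this sketch).
    fake = F.gumbel_softmax(generator_logits, tau=1.0, hard=False)
    # Discriminator update: accept real captions, reject generated ones.
    d_loss = (F.binary_cross_entropy_with_logits(D(real_onehot, video_feat), ones)
              + F.binary_cross_entropy_with_logits(D(fake.detach(), video_feat), zeros))
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()
    # Generator objective: fool the discriminator (added to the usual cross-entropy caption loss).
    g_loss = F.binary_cross_entropy_with_logits(D(fake, video_feat), ones)
    return d_loss.item(), g_loss
```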
Proceedings ArticleDOI

Graph-to-Tree Learning for Solving Math Word Problems

TL;DR: Graph2Tree is proposed, a novel deep learning architecture that combines the merits of a graph-based encoder and a tree-based decoder to generate better solution expressions for math word problems (MWPs).
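The encoder/decoder combination can be sketched as follows: token states are refined with a graph convolution over a problem graph (for example, relations among quantities), and the expression tree is decoded top-down, with operator nodes recursing into left and right sub-goals and leaves emitting quantity slots. All class names, dimensions, and the single-layer design here are illustrative assumptions, not the Graph2Tree implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphEncoder(nn.Module):
    """One graph-convolution layer over problem tokens (illustrative)."""

    def __init__(self, vocab=5000, dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.gcn = nn.Linear(dim, dim)

    def forward(self, tokens, adj):
        # tokens: (B, N) ids; adj: (B, N, N) normalized adjacency of the problem graph
        h = self.embed(tokens)
        return F.relu(self.gcn(torch.bmm(adj, h)))   # aggregate each token's neighborhood

class TreeDecoder(nn.Module):
    """Top-down expression-tree decoding: operators recurse, leaves emit quantity slots."""

    OPS = ["+", "-", "*", "/"]

    def __init__(self, dim=256, n_consts=10):
        super().__init__()
        self.node_clf = nn.Linear(dim, len(self.OPS) + n_consts)  # operator vs. quantity/constant
        self.left = nn.Linear(dim, dim)    # derives the left-child sub-goal
        self.right = nn.Linear(dim, dim)   # derives the right-child sub-goal

    def decode(self, goal, depth=0, max_depth=4):
        # goal: (dim,) vector describing the sub-expression to generate
        choice = self.node_clf(goal).argmax().item()
        if choice < len(self.OPS) and depth < max_depth:
            left = self.decode(torch.tanh(self.left(goal)), depth + 1, max_depth)
            right = self.decode(torch.tanh(self.right(goal)), depth + 1, max_depth)
            return [self.OPS[choice], left, right]
        return f"q{max(choice - len(self.OPS), 0)}"   # leaf: index into the problem's quantities
```

In a full model the root goal would be derived from the encoder output (for instance, by pooling the token states), and the pipeline would be trained against gold expression trees.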
Proceedings ArticleDOI

Bidirectional Long-Short Term Memory for Video Description

TL;DR: A novel video captioning framework, termed "Bidirectional Long-Short Term Memory" (BiLSTM), is proposed to deeply capture the bidirectional global temporal structure in video.
Proceedings ArticleDOI

Adaptively Attending to Visual Attributes and Linguistic Knowledge for Captioning

TL;DR: This work designs a key control unit, termed the visual gate, to adaptively decide "when" and "what" the language generator attends to during word generation, and employs a bottom-up workflow to learn a pool of semantic attributes that serve as the attention resources.
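A minimal sketch of the gating idea follows: a scalar gate computed from the decoder state decides how much the next word relies on attended visual attributes ("what") versus the language context itself ("when"). Names, dimensions, and the specific gating form are assumptions for illustration, not the paper's implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VisualGate(nn.Module):
    """Adaptive gate mixing attended visual attributes with the linguistic state (illustrative)."""

    def __init__(self, hidden=512, attr_dim=300):
        super().__init__()
        self.attr_proj = nn.Linear(attr_dim, hidden)   # project attribute embeddings
        self.query = nn.Linear(hidden, hidden)          # "what": attention query over attributes
        self.gate = nn.Linear(hidden, 1)                # "when": scalar gate from the decoder state

    def forward(self, h_dec, attr_embs):
        # h_dec: (B, hidden) language-generator state; attr_embs: (B, K, attr_dim) attribute pool
        attrs = self.attr_proj(attr_embs)                                   # (B, K, hidden)
        scores = torch.bmm(attrs, self.query(h_dec).unsqueeze(-1))          # (B, K, 1)
        visual = (F.softmax(scores, dim=1) * attrs).sum(dim=1)              # attended attribute summary
        g = torch.sigmoid(self.gate(h_dec))                                 # 0 = rely on language, 1 = rely on vision
        return g * visual + (1 - g) * h_dec                                 # context fed to the word predictor
```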