Jinjin Zhang

Researcher at Chinese Academy of Sciences

Publications -  5
Citations -  62

Jinjin Zhang is an academic researcher at the Chinese Academy of Sciences. The author has contributed to research on the topics Encoder and Text processing. The author has an h-index of 2 and has co-authored 5 publications receiving 42 citations. Previous affiliations of Jinjin Zhang include Beihang University.

Papers
Proceedings Article

Lock3DFace: A large-scale database of low-cost Kinect 3D faces

TL;DR: A large-scale database of low-cost Kinect 3D face videos, namely Lock3DFace, is presented for 3D face analysis, particularly 3D Face Recognition (FR), and a standard experimental protocol for low-cost 3D FR is designed.
Book Chapter

Representation and Correlation Enhanced Encoder-Decoder Framework for Scene Text Recognition

TL;DR: In this article, a representation and correlation enhanced encoder-decoder framework (RCEED) is proposed to enhance the correlation between the scene and text feature spaces by aligning local visual features, global context features, and position information.
Posted Content

Pose-Based Two-Stream Relational Networks for Action Recognition in Videos.

TL;DR: This work approaches action recognition by combining 2D human pose extracted from raw video with image appearance, and proposes a pose-object relational network (PSRN) to model the relationship between human poses and action-related objects.
Posted Content

A Feasible Framework for Arbitrary-Shaped Scene Text Recognition

TL;DR: This paper proposes a feasible framework for multilingual arbitrary-shaped STR, comprising instance-segmentation-based text detection and a language-model-based attention mechanism for text recognition.
Posted Content

Representation and Correlation Enhanced Encoder-Decoder Framework for Scene Text Recognition

TL;DR: In this paper, a representation and correlation enhanced encoder-decoder framework (RCEED) is proposed to enhance the correlation between the scene and text feature spaces by aligning local visual features, global context features, and position information.