Home
/
Authors
/
Hyun-Soo Choi

Author

Hyun-Soo Choi

Bio: Hyun-Soo Choi is an academic researcher from Samsung. The author has contributed to research in topics: Deep learning & Wearable computer. The author has an hindex of 7, co-authored 28 publications receiving 403 citations.

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

ICDAR2017 Robust Reading Challenge on Multi-Lingual Scene Text Detection and Script Identification - RRC-MLT

[...]

Nibal Nayef, Fei Yin¹, Imen Bizid, Hyun-Soo Choi², Yuan Feng¹, Dimosthenis Karatzas³, Zhenbo Luo², Umapada Pal, Christophe Rigaud, Joseph Chazalon, Wafa Khlif, Muhammad Muzzamil Luqman, Jean-Christophe Burie, Cheng-Lin Liu, Jean-Marc Ogier - Show less +11 more•Institutions (3)

Chinese Academy of Sciences¹, Samsung², CVC Capital Partners³

01 Nov 2017

TL;DR: This paper presents the dataset, the tasks and the findings of this RRC-MLT challenge, which aims at assessing the ability of state-of-the-art methods to detect Multi-Lingual Text in scene images, such as in contents gathered from the Internet media and in modern cities where multiple cultures live and communicate together.

...read moreread less

Abstract: Text detection and recognition in a natural environment are key components of many applications, ranging from business card digitization to shop indexation in a street. This competition aims at assessing the ability of state-of-the-art methods to detect Multi-Lingual Text (MLT) in scene images, such as in contents gathered from the Internet media and in modern cities where multiple cultures live and communicate together. This competition is an extension of the Robust Reading Competition (RRC) which has been held since 2003 both in ICDAR and in an online context. The proposed competition is presented as a new challenge of the RRC. The dataset built for this challenge largely extends the previous RRC editions in many aspects: the multi-lingual text, the size of the dataset, the multi-oriented text, the wide variety of scenes. The dataset is comprised of 18,000 images which contain text belonging to 9 languages. The challenge is comprised of three tasks related to text detection and script classification. We have received a total of 16 participations from the research and industrial communities. This paper presents the dataset, the tasks and the findings of this RRC-MLT challenge.

...read moreread less

321 citations

Proceedings Article•DOI•

Arbitrary Shape Scene Text Detection With Adaptive Text Region Representation

[...]

Xiaobing Wang¹, Yingying Jiang¹, Zhenbo Luo¹, Cheng-Lin Liu², Hyun-Soo Choi¹, Sungjin Kim¹ - Show less +2 more•Institutions (2)

Samsung¹, Chinese Academy of Sciences²

01 Jun 2019

TL;DR: Recurrent neural network based adaptive text region representation is proposed for text region refinement, where a pair of boundary points are predicted each time step until no new points are found, and text regions of arbitrary shapes are detected and represented with adaptive number of boundary Points.

...read moreread less

Abstract: Scene text detection attracts much attention in computer vision, because it can be widely used in many applications such as real-time text translation, automatic information entry, blind person assistance, robot sensing and so on. Though many methods have been proposed for horizontal and oriented texts, detecting irregular shape texts such as curved texts is still a challenging problem. To solve the problem, we propose a robust scene text detection method with adaptive text region representation. Given an input image, a text region proposal network is first used for extracting text proposals. Then, these proposals are verified and refined with a refinement network. Here, recurrent neural network based adaptive text region representation is proposed for text region refinement, where a pair of boundary points are predicted each time step until no new points are found. In this way, text regions of arbitrary shapes are detected and represented with adaptive number of boundary points. This gives more accurate description of text regions. Experimental results on five benchmarks, namely, CTW1500, TotalText, ICDAR2013, ICDAR2015 and MSRA-TD500, show that the proposed method achieves state-of-the-art in scene text detection.

...read moreread less

162 citations

Posted Content•

Arbitrary Shape Scene Text Detection with Adaptive Text Region Representation

[...]

Xiaobing Wang¹, Yingying Jiang¹, Zhenbo Luo¹, Cheng-Lin Liu², Hyun-Soo Choi¹, Sungjin Kim¹ - Show less +2 more•Institutions (2)

Samsung¹, Chinese Academy of Sciences²

15 May 2019-arXiv: Computer Vision and Pattern Recognition

TL;DR: In this paper, a text region proposal network is used to extract text proposals and then these proposals are verified and refined with a refinement network, where a pair of boundary points are predicted each time step until no new points are found.

...read moreread less

Abstract: Scene text detection attracts much attention in computer vision, because it can be widely used in many applications such as real-time text translation, automatic information entry, blind person assistance, robot sensing and so on. Though many methods have been proposed for horizontal and oriented texts, detecting irregular shape texts such as curved texts is still a challenging problem. To solve the problem, we propose a robust scene text detection method with adaptive text region representation. Given an input image, a text region proposal network is first used for extracting text proposals. Then, these proposals are verified and refined with a refinement network. Here, recurrent neural network based adaptive text region representation is proposed for text region refinement, where a pair of boundary points are predicted each time step until no new points are found. In this way, text regions of arbitrary shapes are detected and represented with adaptive number of boundary points. This gives more accurate description of text regions. Experimental results on five benchmarks, namely, CTW1500, TotalText, ICDAR2013, ICDAR2015 and MSRATD500, show that the proposed method achieves state-of-the-art in scene text detection.

...read moreread less

37 citations

Patent•

Method and device for providing image

[...]

Moon-Sik Jeong¹, Hye-Sun Kim¹, Su-jung Bae¹, Seong-Oh Lee¹, Hyeon-hee Cha¹, Sung-Do Choi¹, Hyun-Soo Choi¹ - Show less +3 more•Institutions (1)

Samsung¹

28 Jul 2015

TL;DR: An image providing method as discussed by the authors includes displaying a first image, the first image including an object and a background; receiving a user input selecting the object or the background as a region of interest; acquiring first identification information associated with the region of the interest based on first attribute information of the image; acquiring a second image from a target image; and generating an effect image based on at least one image and the second image.

...read moreread less

Abstract: An image providing method includes displaying a first image, the first image including an object and a background; receiving a user input selecting the object or the background as a region of interest; acquiring first identification information associated with the region of interest based on first attribute information of the first image; acquiring a second image from a target image, the second image including second identification information, the second identification information being the same as the first identification information; and generating an effect image based on at least one of the first image and the second image

...read moreread less

16 citations

Book Chapter•DOI•

Pivot Correlational Neural Network for Multimodal Video Categorization

[...]

Sunghun Kang¹, Junyeong Kim¹, Hyun-Soo Choi², Sungjin Kim², Chang D. Yoo¹ - Show less +1 more•Institutions (2)

KAIST¹, Samsung²

08 Sep 2018

TL;DR: From the experimental results, Pivot CorrNN achieves the best performance on the FCVID database and performance comparable to the state-of-the-art on YouTube-8M database.

...read moreread less

Abstract: This paper considers an architecture for multimodal video categorization referred to as Pivot Correlational Neural Network (Pivot CorrNN). The architecture consists of modal-specific streams dedicated exclusively to one specific modal input as well as modal-agnostic pivot stream that considers all modal inputs without distinction, and the architecture tries to refine the pivot prediction based on modal-specific predictions. The Pivot CorrNN consists of three modules: (1) maximizing pivot-correlation module that maximizes the correlation between the hidden states as well as the predictions of the modal-agnostic pivot stream and modal-specific streams in the network, (2) contextual Gated Recurrent Unit (cGRU) module which extends the capability of a generic GRU to take multimodal inputs in updating the pivot hidden-state, and (3) adaptive aggregation module that aggregates all modal-specific predictions as well as the modal-agnostic pivot predictions into one final prediction. We evaluate the Pivot CorrNN on two publicly available large-scale multimodal video categorization datasets, FCVID and YouTube-8M. From the experimental results, Pivot CorrNN achieves the best performance on the FCVID database and performance comparable to the state-of-the-art on YouTube-8M database.

...read moreread less

14 citations

1
2
3
4
…
5
6

Collapse

Cited by

PDF

Open Access

More filters

Patent•

Mobile terminal and method for controlling the same

[...]

Jong Hwan Kim¹•Institutions (1)

LG Electronics¹

13 Mar 2015

TL;DR: In this article, a mobile terminal including a body; a touchscreen provided to a front and extending to side of the body and configured to display content; and a controller configured to detect one side of a body when it comes into contact with a side of an external terminal, display a first area on the touchscreen corresponding to a contact area of body and the external terminal and a second area including the content.

...read moreread less

Abstract: A mobile terminal including a body; a touchscreen provided to a front and extending to side of the body and configured to display content; and a controller configured to detect one side of the body comes into contact with one side of an external terminal, display a first area on the touchscreen corresponding to a contact area of the body and the external terminal and a second area including the content, receive an input of moving the content displayed in the second area to the first area, display the content in the first area, and share the content in the first area with the external terminal.

...read moreread less

1,441 citations

Proceedings Article•DOI•

Character Region Awareness for Text Detection

[...]

Young Min Baek¹, Bado Lee¹, Dongyoon Han¹, Sangdoo Yun¹, Hwalsuk Lee¹ - Show less +1 more•Institutions (1)

Naver Corporation¹

15 Jun 2019

TL;DR: Zhang et al. as mentioned in this paper proposed a new scene text detection method to effectively detect text area by exploring each character and affinity between characters, which significantly outperforms the state-of-the-art detectors.

...read moreread less

Abstract: Scene text detection methods based on neural networks have emerged recently and have shown promising results. Previous methods trained with rigid word-level bounding boxes exhibit limitations in representing the text region in an arbitrary shape. In this paper, we propose a new scene text detection method to effectively detect text area by exploring each character and affinity between characters. To overcome the lack of individual character level annotations, our proposed framework exploits both the given character-level annotations for synthetic images and the estimated character-level ground-truths for real images acquired by the learned interim model. In order to estimate affinity between characters, the network is trained with the newly proposed representation for affinity. Extensive experiments on six benchmarks, including the TotalText and CTW-1500 datasets which contain highly curved texts in natural images, demonstrate that our character-level text detection significantly outperforms the state-of-the-art detectors. According to the results, our proposed method guarantees high flexibility in detecting complicated scene text images, such as arbitrarily-oriented, curved, or deformed texts.

...read moreread less

635 citations

Journal Article•DOI•

Gliding Vertex on the Horizontal Bounding Box for Multi-Oriented Object Detection

[...]

Yongchao Xu¹, Fu Mingtao¹, Qimeng Wang¹, Yukang Wang¹, Kai Chen², Gui-Song Xia³, Xiang Bai¹ - Show less +3 more•Institutions (3)

Huazhong University of Science and Technology¹, Shanghai Jiao Tong University², Wuhan University³

01 Apr 2021-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: An obliquity factor based on area ratio between the object and its horizontal bounding box, guiding the selection of horizontal or oriented detection for each object is introduced, and five extra target variables are added to the regression head of faster R-CNN, which requires ignorable extra computation time.

...read moreread less

Abstract: Object detection has recently experienced substantial progress. Yet, the widely adopted horizontal bounding box representation is not appropriate for ubiquitous oriented objects such as objects in aerial images and scene texts. In this paper, we propose a simple yet effective framework to detect multi-oriented objects. Instead of directly regressing the four vertices, we glide the vertex of the horizontal bounding box on each corresponding side to accurately describe a multi-oriented object. Specifically, We regress four length ratios characterizing the relative gliding offset on each corresponding side. This may facilitate the offset learning and avoid the confusion issue of sequential label points for oriented objects. To further remedy the confusion issue for nearly horizontal objects, we also introduce an obliquity factor based on area ratio between the object and its horizontal bounding box, guiding the selection of horizontal or oriented detection for each object. We add these five extra target variables to the regression head of faster R-CNN, which requires ignorable extra computation time. Extensive experimental results demonstrate that without bells and whistles, the proposed method achieves superior performances on multiple multi-oriented object detection benchmarks including object detection in aerial images, scene text detection, pedestrian detection in fisheye images.

...read moreread less

395 citations

Posted Content•

Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes

[...]

Pengyuan Lyu¹, Minghui Liao¹, Cong Yao, Wenhao Wu, Xiang Bai¹ - Show less +1 more•Institutions (1)

Huazhong University of Science and Technology¹

06 Jul 2018-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper investigates the problem of scene text spotting, which aims at simultaneous text detection and recognition in natural images, and proposes an end-to-end trainable neural network model, named as Mask TextSpotter, which is inspired by the newly published work Mask R-CNN.

...read moreread less

Abstract: Recently, models based on deep neural networks have dominated the fields of scene text detection and recognition. In this paper, we investigate the problem of scene text spotting, which aims at simultaneous text detection and recognition in natural images. An end-to-end trainable neural network model for scene text spotting is proposed. The proposed model, named as Mask TextSpotter, is inspired by the newly published work Mask R-CNN. Different from previous methods that also accomplish text spotting with end-to-end trainable deep neural networks, Mask TextSpotter takes advantage of simple and smooth end-to-end learning procedure, in which precise text detection and recognition are acquired via semantic segmentation. Moreover, it is superior to previous methods in handling text instances of irregular shapes, for example, curved text. Experiments on ICDAR2013, ICDAR2015 and Total-Text demonstrate that the proposed method achieves state-of-the-art results in both scene text detection and end-to-end text recognition tasks.

...read moreread less

326 citations

Posted Content•

R3Det: Refined Single-Stage Detector with Feature Refinement for Rotating Object

[...]

Xue Yang¹, Junchi Yan¹, Ziming Feng, Tao He•Institutions (1)

Shanghai Jiao Tong University¹

15 Aug 2019-arXiv: Computer Vision and Pattern Recognition

TL;DR: The key idea of feature refinement module is to re-encode the position information of the current refined bounding box to the corresponding feature points through feature interpolation to realize feature reconstruction and alignment.

...read moreread less

Abstract: Rotation detection is a challenging task due to the difficulties of locating the multi-angle objects and separating them effectively from the background. Though considerable progress has been made, for practical settings, there still exist challenges for rotating objects with large aspect ratio, dense distribution and category extremely imbalance. In this paper, we propose an end-to-end refined single-stage rotation detector for fast and accurate object detection by using a progressive regression approach from coarse to fine granularity. Considering the shortcoming of feature misalignment in existing refined single-stage detector, we design a feature refinement module to improve detection performance by getting more accurate features. The key idea of feature refinement module is to re-encode the position information of the current refined bounding box to the corresponding feature points through pixel-wise feature interpolation to realize feature reconstruction and alignment. For more accurate rotation estimation, an approximate SkewIoU loss is proposed to solve the problem that the calculation of SkewIoU is not derivable. Experiments on three popular remote sensing public datasets DOTA, HRSC2016, UCAS-AOD as well as one scene text dataset ICDAR2015 show the effectiveness of our approach. Tensorflow and Pytorch version codes are available at this https URL and this https URL, and R3Det is also integrated in our open source rotation detection benchmark: this https URL.

...read moreread less

286 citations

1
2
3
4
…
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106

Collapse