Institution

CVC Capital Partners

About: CVC Capital Partners is an institution known for research contributions in the topics of Image retrieval & Metric (mathematics). The organization has 8 authors who have published 12 publications receiving 498 citations.

Papers
Proceedings ArticleDOI
01 Nov 2017
TL;DR: This paper presents the dataset, the tasks and the findings of this RRC-MLT challenge, which aims at assessing the ability of state-of-the-art methods to detect Multi-Lingual Text in scene images, such as in contents gathered from the Internet media and in modern cities where multiple cultures live and communicate together.
Abstract: Text detection and recognition in a natural environment are key components of many applications, ranging from business card digitization to shop indexation in a street. This competition aims at assessing the ability of state-of-the-art methods to detect Multi-Lingual Text (MLT) in scene images, such as in contents gathered from the Internet media and in modern cities where multiple cultures live and communicate together. This competition is an extension of the Robust Reading Competition (RRC) which has been held since 2003 both in ICDAR and in an online context. The proposed competition is presented as a new challenge of the RRC. The dataset built for this challenge largely extends the previous RRC editions in many aspects: the multi-lingual text, the size of the dataset, the multi-oriented text, the wide variety of scenes. The dataset is comprised of 18,000 images which contain text belonging to 9 languages. The challenge is comprised of three tasks related to text detection and script classification. We have received a total of 16 participations from the research and industrial communities. This paper presents the dataset, the tasks and the findings of this RRC-MLT challenge.
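The challenge's detection tasks are typically scored by matching predicted boxes against ground truth. As a minimal sketch of how such matching works (the axis-aligned box format and the 0.5 IoU threshold are assumptions for illustration, not the official RRC-MLT protocol, which handles oriented quadrilaterals):

```python
# Hedged sketch: IoU-based greedy matching, as commonly used to score
# text-detection benchmarks. Boxes are (x1, y1, x2, y2) tuples.

def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((ax2 - ax1) * (ay2 - ay1)
             + (bx2 - bx1) * (by2 - by1) - inter)
    return inter / union if union > 0 else 0.0

def match_detections(gt_boxes, pred_boxes, thresh=0.5):
    """Greedy one-to-one matching; returns (TP count, precision, recall)."""
    matched_gt = set()
    tp = 0
    for pred in pred_boxes:
        best_j, best_iou = -1, thresh
        for j, gt in enumerate(gt_boxes):
            if j in matched_gt:
                continue
            score = iou(pred, gt)
            if score >= best_iou:
                best_j, best_iou = j, score
        if best_j >= 0:
            matched_gt.add(best_j)
            tp += 1
    precision = tp / len(pred_boxes) if pred_boxes else 0.0
    recall = tp / len(gt_boxes) if gt_boxes else 0.0
    return tp, precision, recall
```

Per-image precision and recall computed this way are then aggregated over the dataset into an F-score-style ranking metric.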

321 citations

Journal ArticleDOI
TL;DR: In this article, an adaptive structural SVM (A-SSVM) is proposed to adapt a pre-learned classifier between different domains by taking into account the inherent structure in feature space (e.g., the parts in a DPM).
Abstract: The accuracy of object classifiers can significantly drop when the training data (source domain) and the application scenario (target domain) have inherent differences. Therefore, adapting the classifiers to the scenario in which they must operate is of paramount importance. We present novel domain adaptation (DA) methods for object detection. As proof of concept, we focus on adapting the state-of-the-art deformable part-based model (DPM) for pedestrian detection. We introduce an adaptive structural SVM (A-SSVM) that adapts a pre-learned classifier between different domains. By taking into account the inherent structure in feature space (e.g., the parts in a DPM), we propose a structure-aware A-SSVM (SA-SSVM). Neither A-SSVM nor SA-SSVM needs to revisit the source-domain training data to perform the adaptation. Rather, a low number of target-domain training examples (e.g., pedestrians) are used. To address the scenario where there are no target-domain annotated samples, we propose a self-adaptive DPM based on a self-paced learning (SPL) strategy and a Gaussian Process Regression (GPR). Two types of adaptation tasks are assessed: from both synthetic pedestrians and general persons (PASCAL VOC) to pedestrians imaged from an on-board camera. Results show that our proposals avoid accuracy drops as high as 15 points when comparing adapted and non-adapted detectors.
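The core idea of adapting a pre-learned classifier without revisiting source data can be sketched for a plain linear SVM: regularize the adapted weights toward the source weights rather than toward zero, and fit the hinge loss only on the few target examples. This is a simplified illustration of the adaptive-SVM idea, not the paper's structure-aware A-SSVM/SA-SSVM formulation (function name and hyperparameters are invented for the sketch):

```python
import numpy as np

def adapt_linear_svm(w_src, X_tgt, y_tgt, C=1.0, lr=0.01, epochs=100):
    """Minimize 0.5*||w - w_src||^2 + C * hinge(X_tgt, y_tgt)
    by subgradient descent. w_src: pre-learned source-domain weights;
    X_tgt, y_tgt (+1/-1): a small set of target-domain examples."""
    w = w_src.copy()
    for _ in range(epochs):
        margins = y_tgt * (X_tgt @ w)      # functional margins
        viol = margins < 1                 # hinge-loss violators
        # subgradient: pull toward w_src, push violators to margin 1
        grad = (w - w_src) - C * (y_tgt[viol, None] * X_tgt[viol]).sum(axis=0)
        w -= lr * grad
    return w
```

The paper's SA-SSVM extends this by decomposing the regularizer over the structural parts of the DPM so that each part can adapt at its own rate.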

133 citations

Proceedings ArticleDOI
15 Jun 2019
TL;DR: This paper proposes two new loss functions that model the communication of a deep teacher network to a small student network and shows that embeddings computed using small student networks perform significantly better than those computed using standard networks of similar size.
Abstract: Metric learning networks are used to compute image embeddings, which are widely used in many applications such as image retrieval and face recognition. In this paper, we propose to use network distillation to efficiently compute image embeddings with small networks. Network distillation has been successfully applied to improve image classification, but has hardly been explored for metric learning. To do so, we propose two new loss functions that model the communication of a deep teacher network to a small student network. We evaluate our system in several datasets, including CUB-200-2011, Cars-196, Stanford Online Products and show that embeddings computed using small student networks perform significantly better than those computed using standard networks of similar size. Results on a very compact network (MobileNet-0.25), which can be used on mobile devices, show that the proposed method can greatly improve Recall@1 results from 27.5% to 44.6%. Furthermore, we investigate various aspects of distillation for embeddings, including hint and attention layers, semi-supervised learning and cross quality distillation. (Code is available at https://github.com/yulu0724/EmbeddingDistillation).
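Teacher-to-student communication for embeddings can take two natural forms: match the teacher's coordinates directly, or match only its pairwise distance structure. The NumPy sketch below illustrates both flavors under assumed names; it is a toy loss computation, not the paper's training objectives:

```python
import numpy as np

def absolute_distillation_loss(student_emb, teacher_emb):
    """Pull each student embedding toward the teacher's embedding
    of the same image (mean squared error over coordinates)."""
    return np.mean((student_emb - teacher_emb) ** 2)

def relative_distillation_loss(student_emb, teacher_emb):
    """Match the pairwise-distance structure of the teacher space
    instead of the coordinates themselves, so the student is free
    to choose its own frame."""
    def pdist(e):
        diff = e[:, None, :] - e[None, :, :]
        return np.sqrt((diff ** 2).sum(-1) + 1e-12)
    return np.mean((pdist(student_emb) - pdist(teacher_emb)) ** 2)
```

Note the relative loss is invariant to translating (and rotating) the student space, which matters when the student's embedding dimension or scale differs from the teacher's.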

96 citations

Proceedings ArticleDOI
01 Nov 2017
TL;DR: This paper presents the LSDE string representation and its application to handwritten word spotting and shows how such a representation produces a more semantically interpretable retrieval from the user's perspective than other state of the art ones such as PHOC and DCToW.
Abstract: In this paper we present the LSDE string representation and its application to handwritten word spotting. LSDE is a novel embedding approach for representing strings that learns a space in which distances between projected points are correlated with the Levenshtein edit distance between the original strings. We show how such a representation produces a more semantically interpretable retrieval from the user's perspective than other state-of-the-art ones such as PHOC and DCToW. We also conduct a preliminary handwritten word spotting experiment on the George Washington dataset.
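The target quantity LSDE's learned space is trained to reflect is the classic Levenshtein edit distance, computable by dynamic programming:

```python
def levenshtein(a, b):
    """Minimum number of single-character insertions, deletions and
    substitutions turning string a into string b (two-row DP)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]
```

LSDE then learns an embedding phi such that the Euclidean distance between phi(s1) and phi(s2) correlates with levenshtein(s1, s2), letting nearest-neighbor retrieval in the embedded space approximate edit-distance ranking.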

27 citations

Proceedings Article
23 Feb 2013
TL;DR: A novel approach for the automatic text localization in scanned comics book pages, an essential step towards a fully automatic comics book understanding, focuses on speech text as it is semantically important and represents the majority of the text present in comics.
Abstract: Comic books constitute an important cultural heritage asset in many countries. Digitization combined with subsequent document understanding enable direct content-based search as opposed to metadata only search (e.g. album title or author name). Few studies have been done in this direction. In this work we detail a novel approach for the automatic text localization in scanned comics book pages, an essential step towards a fully automatic comics book understanding. We focus on speech text as it is semantically important and represents the majority of the text present in comics. The approach is compared with existing methods of text localization found in the literature and results are presented.

25 citations


Authors

Showing all 8 results

Network Information
Related Institutions (5)
Australian Artificial Intelligence Institute
152 papers, 3.7K citations

75% related

Laboratoire d'Informatique Fondamentale de Lille
1K papers, 17.1K citations

75% related

Zebra Technologies
248 papers, 2.6K citations

73% related

VM Labs
28 papers, 2.8K citations

72% related

Gracenote
202 papers, 5.4K citations

72% related

Performance
Metrics
No. of papers from the Institution in previous years
Year    Papers
2019    3
2017    4
2016    2
2014    1
2013    1
1998    1