scispace - formally typeset
Search or ask a question

Showing papers by "Federico Pernici published in 2021"


Proceedings ArticleDOI
10 Jan 2021
TL;DR: In this article, an event aggregation strategy is proposed to convert the output of an event camera into frames processable by traditional computer vision algorithms, which can encode temporal information directly into pixel values, which are then interpreted by deep learning models.
Abstract: In this paper we present an event aggregation strategy to convert the output of an event camera into frames processable by traditional Computer Vision algorithms. The proposed method first generates sequences of intermediate binary representations, which are then losslessly transformed into a compact format by simply applying a binary-to-decimal conversion. This strategy allows us to encode temporal information directly into pixel values, which are then interpreted by deep learning models. We apply our strategy, called Temporal Binary Representation, to the task of Gesture Recognition, obtaining state of the art results on the popular DVS128 Gesture Dataset. To underline the effectiveness of the proposed method compared to existing ones, we also collect an extension of the dataset under more challenging conditions on which to perform experiments.

17 citations


Journal ArticleDOI
TL;DR: In this article, the authors show that the stationarity of the embedding and its maximal separated representation can be theoretically justified by setting the weights of the fixed classifier to values taken from the coordinate vertices of the three regular polytopes available in Rd.
Abstract: Neural networks are widely used as a model for classification in a large variety of tasks. Typically, a learnable transformation (i.e., the classifier) is placed at the end of such models returning a value for each class used for classification. This transformation plays an important role in determining how the generated features change during the learning process. In this work, we argue that this transformation not only can be fixed (i.e., set as nontrainable) with no loss of accuracy and with a reduction in memory usage, but it can also be used to learn stationary and maximally separated embeddings. We show that the stationarity of the embedding and its maximal separated representation can be theoretically justified by setting the weights of the fixed classifier to values taken from the coordinate vertices of the three regular polytopes available in Rd, namely, the d-Simplex, the d-Cube, and the d-Orthoplex. These regular polytopes have the maximal amount of symmetry that can be exploited to generate stationary features angularly centered around their corresponding fixed weights. Our approach improves and broadens the concept of a fixed classifier, recently proposed by Hoffer et al., to a larger class of fixed classifier models. Experimental results confirm the theoretical analysis, the generalization capability, the faster convergence, and the improved performance of the proposed method. Code will be publicly available.

12 citations


Proceedings ArticleDOI
10 Jan 2021
TL;DR: In this paper, a fixed classifier is proposed for class-incremental learning, where a number of pre-allocated output nodes are subject to the classification loss right from the beginning of the learning phase.
Abstract: In class-incremental learning, a learning agent faces a stream of data with the goal of learning new classes while not forgetting previous ones. Neural networks are known to suffer under this setting, as they forget previously acquired knowledge. To address this problem, effective methods exploit past data stored in an episodic memory while expanding the final classifier nodes to accommodate the new classes. In this work, we substitute the expanding classifier with a novel fixed classifier in which a number of pre-allocated output nodes are subject to the classification loss right from the beginning of the learning phase. Contrarily to the standard expanding classifier, this allows: (a) the output nodes of future unseen classes to firstly see negative samples since the beginning of learning together with the positive samples that incrementally arrive; (b) to learn features that do not change their geometric configuration as novel classes are incorporated in the learning model. Experiments with public datasets show that the proposed approach is as effective as the expanding classifier while exhibiting novel intriguing properties of the internal feature representation that are otherwise not-existent. Our ablation study on pre-allocating a large number of classes further validates the approach.

10 citations


Book ChapterDOI
10 Jan 2021
TL;DR: In this article, the authors review the relationship between pseudo-labeling and semi-supervised learning and evaluate their performance on Fine-Grained Visual Classification datasets, and show that entropy-minimization slightly outperforms a recent proposed method based on pseudolabeling.
Abstract: Pseudo-labeling is a simple and well known strategy in Semi-Supervised Learning with neural networks. The method is equivalent to entropy minimization as the overlap of class probability distribution can be reduced minimizing the entropy for unlabeled data. In this paper we review the relationship between the two methods and evaluate their performance on Fine-Grained Visual Classification datasets. We include also the recent released iNaturalist-Aves that is specifically designed for Semi-Supervised Learning. Experimental results show that although in some cases supervised learning may still have better performance than the semi-supervised methods, Semi Supervised Learning shows effective results. Specifically, we observed that entropy-minimization slightly outperforms a recent proposed method based on pseudo-labeling.

5 citations


Posted Content
TL;DR: In this paper, the authors exploit Semi-Supervised Learning (SSL) to increase the amount of training data to improve the performance of Fine-Grained Visual Categorization (FGVC), which has not been investigated in the past in spite of prohibitive annotation costs that FGVC requires.
Abstract: In this paper we exploit Semi-Supervised Learning (SSL) to increase the amount of training data to improve the performance of Fine-Grained Visual Categorization (FGVC). This problem has not been investigated in the past in spite of prohibitive annotation costs that FGVC requires. Our approach leverages unlabeled data with an adversarial optimization strategy in which the internal features representation is obtained with a second-order pooling model. This combination allows to back-propagate the information of the parts, represented by second-order pooling, onto unlabeled data in an adversarial training setting. We demonstrate the effectiveness of the combined use by conducting experiments on six state-of-the-art fine-grained datasets, which include Aircrafts, Stanford Cars, CUB-200-2011, Oxford Flowers, Stanford Dogs, and the recent Semi-Supervised iNaturalist-Aves. Experimental results clearly show that our proposed method has better performance than the only previous approach that examined this problem; it also obtained higher classification accuracy with respect to the supervised learning methods with which we compared.

Posted Content
TL;DR: In this paper, the authors propose a method to learn internal feature representation models that are compatible with previously learned ones by encouraging stationarity to the learned representation model without relying on previously learned models.
Abstract: In this paper, we propose a novel method to learn internal feature representation models that are \textit{compatible} with previously learned ones. Compatible features enable for direct comparison of old and new learned features, allowing them to be used interchangeably over time. This eliminates the need for visual search systems to extract new features for all previously seen images in the gallery-set when sequentially upgrading the representation model. Extracting new features is typically quite expensive or infeasible in the case of very large gallery-sets and/or real time systems (i.e., face-recognition systems, social networks, life-long learning systems, robotics and surveillance systems). Our approach, called Compatible Representations via Stationarity (CoReS), achieves compatibility by encouraging stationarity to the learned representation model without relying on previously learned models. Stationarity allows features' statistical properties not to change under time shift so that the current learned features are inter-operable with the old ones. We evaluate single and sequential multi-model upgrading in growing large-scale training datasets and we show that our method improves the state-of-the-art in achieving compatible features by a large margin. In particular, upgrading ten times with training data taken from CASIA-WebFace and evaluating in Labeled Face in the Wild (LFW), we obtain a 49\% increase in measuring the average number of times compatibility is achieved, which is a 544\% relative improvement over previous state-of-the-art.