Author

Giuseppe Valenzise

Other affiliations: University of Alberta, Leonardo, Polytechnic University of Milan
Bio: Giuseppe Valenzise is an academic researcher at Université Paris-Saclay. The author has contributed to research on the topics of point clouds and image quality. The author has an h-index of 20 and has co-authored 114 publications, which have received 1654 citations. Previous affiliations of Giuseppe Valenzise include University of Alberta and Leonardo.


Papers
Proceedings ArticleDOI
05 Sep 2007
TL;DR: An audio-based video surveillance system that automatically detects anomalous audio events in a public square, such as screams or gunshots, and localizes the acoustic source so that a video camera can be steered toward it.
Abstract: This paper describes an audio-based video surveillance system which automatically detects anomalous audio events in a public square, such as screams or gunshots, and localizes the position of the acoustic source so that a video camera can be steered toward it. The system employs two parallel GMM classifiers for discriminating screams from noise and gunshots from noise, respectively. Each classifier is trained using different features, chosen from a set of both conventional and innovative audio features. The location of the acoustic source which has produced the sound event is estimated by computing the time differences of arrival of the signal at a microphone array and using a linear-correction least-squares localization algorithm. Experimental results show that our system can detect events with a precision of 93% at a false rejection rate of 5% when the SNR is 10 dB, while the source direction can be estimated with a precision of one degree. A real-time implementation of the system is going to be installed in a public square of Milan.

363 citations
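
Illustration (not from the paper): the abstract above localizes the source from time differences of arrival at a microphone array. The sketch below shows only the simplest version of that idea, for a single pair of microphones under a far-field assumption. It is not the linear-correction least-squares algorithm the abstract names. The function names, the speed of sound, and the brute-force cross-correlation are all assumptions made for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at roughly 20 degrees C


def estimate_delay(ref, sig, sample_rate, max_lag):
    """Delay of `sig` relative to `ref`, in seconds, via brute-force cross-correlation."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, r in enumerate(ref):
            m = n + lag
            if 0 <= m < len(sig):
                score += r * sig[m]
        # Keep the lag with the highest correlation score.
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


def direction_of_arrival(delay, mic_spacing):
    """Far-field angle (degrees from broadside) for one microphone pair."""
    ratio = SPEED_OF_SOUND * delay / mic_spacing
    ratio = max(-1.0, min(1.0, ratio))  # clamp against noise-induced overshoot
    return math.degrees(math.asin(ratio))
```

With more than two microphones, each pair yields one such delay, and a least-squares solver combines them into a position estimate; that step is what the abstract's localization algorithm addresses.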

Proceedings ArticleDOI
01 Sep 2019
TL;DR: The authors propose a data-driven geometry compression method for static point clouds based on learned convolutional transforms and uniform quantization, and cast the decoding process as a binary classification of the point cloud occupancy map.
Abstract: Efficient point cloud compression is fundamental to enable the deployment of virtual and mixed reality applications, since the number of points to code can range in the order of millions. In this paper, we present a novel data-driven geometry compression method for static point clouds based on learned convolutional transforms and uniform quantization. We perform joint optimization of both rate and distortion using a trade-off parameter. In addition, we cast the decoding process as a binary classification of the point cloud occupancy map. Our method outperforms the MPEG reference solution in terms of rate-distortion on the Microsoft Voxelized Upper Bodies dataset with 51.5% BDBR savings on average. Moreover, while octree-based methods face exponential diminution of the number of points at low bitrates, our method still produces high resolution outputs even at low bitrates. Code and supplementary material are available at https://github.com/mauriceqch/pcc_geo_cnn.

118 citations
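
Illustration (not from the paper): the abstract above describes joint rate-distortion optimization with a trade-off parameter and decoding as binary classification of voxel occupancy. A minimal sketch of such an objective follows. It assumes plain binary cross-entropy as the distortion term and the form `trade_off * distortion + rate`; the abstract specifies neither, and every name here is hypothetical.

```python
import math


def occupancy_bce(probs, targets, eps=1e-7):
    """Mean binary cross-entropy between predicted occupancy probabilities and 0/1 voxels."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(targets)


def rd_loss(probs, targets, rate_bits, trade_off):
    """Joint objective (assumed form): trade_off * distortion + rate."""
    return trade_off * occupancy_bce(probs, targets) + rate_bits


def decode_occupancy(probs, threshold=0.5):
    """Turn per-voxel probabilities into a binary occupancy map."""
    return [1 if p > threshold else 0 for p in probs]
```

A larger `trade_off` weights reconstruction quality more heavily against bitrate, which is how a single model family can cover several operating points.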

Proceedings ArticleDOI
03 Sep 2007
TL;DR: An audio event detection system which automatically classifies an audio event as ambient noise, scream or gunshot, and is able to guarantee a precision of 90% at a false rejection rate of 8%.
Abstract: This paper describes an audio event detection system which automatically classifies an audio event as ambient noise, scream or gunshot. The classification system uses two parallel GMM classifiers for discriminating screams from noise and gunshots from noise. Each classifier is trained using different features, appropriately chosen from a set of 47 audio features, which are selected according to a 2-step process. First, feature subsets of increasing size are assembled using filter selection heuristics. Then, a classifier is trained and tested with each feature subset. The obtained classification performance is used to determine the optimal feature vector dimension. This allows a noticeable speed-up w.r.t. wrapper feature selection methods. In order to validate the proposed detection algorithm, we carried out extensive experiments on a rich set of gunshots and screams mixed with ambient noise at different SNRs. Our results demonstrate that the system is able to guarantee a precision of 90% at a false rejection rate of 8%.

97 citations
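
Illustration (not from the paper): the two-step selection described above (filter heuristics to build feature subsets of increasing size, then a classifier trained and tested on each subset) can be sketched as follows. The Fisher-style score and the nearest-centroid classifier are stand-ins chosen for brevity; the abstract does not name the filter heuristic, and the paper's classifiers are GMMs.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Filter heuristic (assumed): class-mean separation over within-class variance."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    return (mean(a) - mean(b)) ** 2 / (pvariance(a) + pvariance(b) + 1e-12)


def rank_features(rows, labels):
    """Step 1: order feature indices from most to least discriminative."""
    n_features = len(rows[0])
    scores = [fisher_score([r[j] for r in rows], labels) for j in range(n_features)]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)


def centroid_accuracy(train, train_y, test, test_y, feats):
    """Train and test a nearest-centroid classifier restricted to `feats`."""
    cents = {}
    for c in (0, 1):
        members = [r for r, y in zip(train, train_y) if y == c]
        cents[c] = [mean(r[j] for r in members) for j in feats]
    hits = 0
    for r, y in zip(test, test_y):
        dist = {c: sum((r[j] - m) ** 2 for j, m in zip(feats, cents[c])) for c in cents}
        hits += min(dist, key=dist.get) == y
    return hits / len(test)


def select_dimension(train, train_y, test, test_y):
    """Step 2: evaluate the top-k subsets and keep the smallest k with the best accuracy."""
    ranking = rank_features(train, train_y)
    results = [
        (centroid_accuracy(train, train_y, test, test_y, ranking[:k]), k)
        for k in range(1, len(ranking) + 1)
    ]
    best_acc = max(acc for acc, _ in results)
    best_k = min(k for acc, k in results if acc == best_acc)
    return ranking[:best_k], best_acc
```

Only one classifier is trained per subset size, rather than one per candidate subset, which is the source of the speed-up over wrapper methods that the abstract mentions.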

Journal ArticleDOI
TL;DR: DeepTMO is a deep tone mapping operator based on a conditional generative adversarial network (cGAN) that learns to adapt to a wide range of scene content (e.g., outdoor, indoor, human, structures) and tackles HDR-related scene-specific challenges such as contrast and brightness while preserving fine-grained details.
Abstract: A computationally fast tone mapping operator (TMO) that can quickly adapt to a wide spectrum of high dynamic range (HDR) content is essential for visualization on varied low dynamic range (LDR) output devices such as movie screens or standard displays. Existing TMOs can successfully tone-map only a limited range of HDR content and require extensive parameter tuning to yield the best subjective-quality tone-mapped output. In this paper, we address this problem by proposing a fast, parameter-free and scene-adaptable deep tone mapping operator (DeepTMO) that yields a high-resolution and high-subjective-quality tone-mapped output. Based on a conditional generative adversarial network (cGAN), DeepTMO not only learns to adapt to vast scenic content (e.g., outdoor, indoor, human, structures, etc.) but also tackles the HDR-related scene-specific challenges such as contrast and brightness, while preserving the fine-grained details. We explore 4 possible combinations of Generator-Discriminator architectural designs to specifically address some prominent issues in HDR-related deep-learning frameworks like blurring, tiling patterns and saturation artifacts. By exploring different influences of scales, loss functions and normalization layers under a cGAN setting, we conclude by adopting a multi-scale model for our task. To further leverage the large-scale availability of unlabeled HDR data, we train our network by generating targets using an objective HDR quality metric, namely the Tone Mapping Image Quality Index (TMQI). We demonstrate results both quantitatively and qualitatively, and show that our DeepTMO generates high-resolution, high-quality output images over a large spectrum of real-world scenes. Finally, we evaluate the perceived quality of our results by conducting a pair-wise subjective study which confirms the versatility of our method.

93 citations

Journal ArticleDOI
TL;DR: An image hashing algorithm based on compressive sensing principles is proposed, which solves both the authentication and the tampering identification problems and is robust to moderate content-preserving transformations including cropping, scaling, and rotation.
Abstract: In the last decade, the increased possibility to produce, edit, and disseminate multimedia contents has not been adequately balanced by similar advances in protecting these contents from unauthorized diffusion of forged copies. When the goal is to detect whether or not a digital content has been tampered with in order to alter its semantics, the use of multimedia hashes turns out to be an effective solution to offer proof of legitimacy and to possibly identify the introduced tampering. We propose an image hashing algorithm based on compressive sensing principles, which solves both the authentication and the tampering identification problems. The original content producer generates a hash using a small bit budget by quantizing a limited number of random projections of the authentic image. The content user receives the (possibly altered) image and uses the hash to estimate the mean square error distortion between the original and the received image. In addition, if the introduced tampering is sparse in some orthonormal basis or redundant dictionary, an approximation is given in the pixel domain. We emphasize that the hash is universal, e.g., the same hash signature can be used to detect and identify different types of tampering. At the cost of additional complexity at the decoder, the proposed algorithm is robust to moderate content-preserving transformations including cropping, scaling, and rotation. In addition, in order to keep the size of the hash small, hash encoding/decoding takes advantage of distributed source codes.

86 citations
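
Illustration (not from the paper): the abstract above builds a hash by quantizing a few random projections of the image and uses it at the receiver to estimate the mean square error. A minimal sketch of that idea follows. It assumes Gaussian projections regenerated from a shared seed and a uniform quantizer with step `step`, and it omits the distributed source coding and tampering localization stages. All names are hypothetical.

```python
import random


def projection_matrix(num_proj, num_pixels, seed):
    """Pseudo-random Gaussian projections; the seed is assumed shared by both parties."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(num_pixels)] for _ in range(num_proj)]


def project(matrix, pixels):
    """Inner product of each projection row with the flattened image."""
    return [sum(a * x for a, x in zip(row, pixels)) for row in matrix]


def make_hash(pixels, num_proj, seed, step):
    """Producer side: quantize the random projections of the authentic image."""
    y = project(projection_matrix(num_proj, len(pixels), seed), pixels)
    return [round(v / step) for v in y]


def estimate_mse(hash_values, received, seed, step):
    """User side: estimate the MSE between the original and the received image.

    For unit-variance Gaussian rows, the expected squared projection of the
    difference equals its squared norm, so averaging over projections and
    dividing by the pixel count approximates the MSE.
    """
    m = len(hash_values)
    y_rx = project(projection_matrix(m, len(received), seed), received)
    energy = sum((q * step - v) ** 2 for q, v in zip(hash_values, y_rx))
    return energy / (m * len(received))
```

The estimate is statistical: its relative spread shrinks roughly with the square root of the number of projections, so the hash length trades off against estimation accuracy.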


Cited by
Reference EntryDOI
15 Oct 2004

2,118 citations

Book ChapterDOI
Eric V. Denardo
01 Jan 2011
TL;DR: This chapter shows how the simplex method simplifies when it is applied to a class of optimization problems known as “network flow models,” and that when such a model has integer-valued data the simplex method finds an integer-valued optimal solution.
Abstract: In this chapter, you will see how the simplex method simplifies when it is applied to a class of optimization problems that are known as “network flow models.” You will also see that if a network flow model has “integer-valued data,” the simplex method finds an optimal solution that is integer-valued.

828 citations

Book
16 Nov 1998

766 citations

Journal Article
TL;DR: Methods for learning dictionaries that are appropriate for the representation of given classes of signals and multisensor data are described, and dimensionality reduction based on dictionary representation is shown to extend to specific tasks such as data analysis or classification.
Abstract: We describe methods for learning dictionaries that are appropriate for the representation of given classes of signals and multisensor data. We further show that dimensionality reduction based on dictionary representation can be extended to address specific tasks such as data analysis or classification when the learning includes a class separability criterion in the objective function. The benefits of dictionary learning clearly show that a proper understanding of causes underlying the sensed world is key to task-specific representation of relevant information in high-dimensional data sets.

705 citations