scispace - formally typeset
Topic

Upsampling

About: Upsampling is a research topic. Over the lifetime, 2426 publications have been published within this topic receiving 57613 citations.


Papers
Posted Content
TL;DR: In this paper, a WaveNet-based model, conditioned on a log-mel spectrogram representation of a bandwidth-constrained 8 kHz speech signal with artifacts from GSM full-rate (FR) compression, was proposed to reconstruct the higher-resolution signal.
Abstract: Large-scale mobile communication systems tend to contain legacy transmission channels with narrowband bottlenecks, resulting in characteristic "telephone-quality" audio. While higher-quality codecs exist, the scale and heterogeneity of these networks make transmitting higher-sample-rate audio with modern high-quality codecs difficult in practice. This paper proposes an approach in which a communication node instead extends the bandwidth of a band-limited incoming speech signal that may have been passed through a low-rate codec. To this end, we propose a WaveNet-based model, conditioned on a log-mel spectrogram representation of a bandwidth-constrained 8 kHz speech signal and of audio with artifacts from GSM full-rate (FR) compression, to reconstruct the higher-resolution signal. In our experimental MUSHRA evaluation, we show that a model trained to upsample to 24 kHz speech signals passed through the 8 kHz GSM-FR codec reconstructs audio only slightly lower in quality than the Adaptive Multi-Rate Wideband (AMR-WB) codec at 16 kHz, and closes around half the gap in perceptual quality between the original encoded signal and the original speech sampled at 24 kHz. We further show that when the same model is passed 8 kHz audio that has not been compressed, it again reconstructs audio of slightly better quality than 16 kHz AMR-WB in the same MUSHRA evaluation.

10 citations
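The model above is conditioned on a log-mel spectrogram of the narrowband input. A minimal numpy-only sketch of computing such a conditioning feature for 8 kHz speech; the FFT size, hop, and 40-band mel filterbank here are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def log_mel_spectrogram(x, sr=8000, n_fft=256, hop=128, n_mels=40):
    # Frame the signal, window it, and take the magnitude STFT.
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1))   # (n_frames, n_fft//2 + 1)

    # Triangular mel filterbank spanning 0 .. sr/2.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        lo, c, hi = bins[m - 1], bins[m], bins[m + 1]
        for k in range(lo, c):
            fbank[m - 1, k] = (k - lo) / max(c - lo, 1)
        for k in range(c, hi):
            fbank[m - 1, k] = (hi - k) / max(hi - c, 1)

    # Log-compress; small floor avoids log(0).
    return np.log(mag @ fbank.T + 1e-6)         # (n_frames, n_mels)

x = np.sin(2 * np.pi * 440 * np.arange(8000) / 8000)  # 1 s of a 440 Hz tone
S = log_mel_spectrogram(x)
print(S.shape)
```

The resulting (frames x mels) matrix is the kind of low-rate feature map a WaveNet conditioning stack would upsample back to the audio rate.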

Patent
24 Oct 2002
TL;DR: In this article, it is shown that if a sampling-rate conversion operation is preceded by an up-sampling operation and only followed afterwards by a down-sampling operation to the wanted sampling frequency, the complexity in terms of the ultimate number of calculations, in particular multiplications and additions, is reduced.
Abstract: A time discrete filter comprises a sampling rate converter provided with an input and an output, and a down-sampler having a down-sampling factor nd. The time discrete filter further comprises an up-sampler having an up-sampling factor nu, whereby the up-sampler is coupled to the converter input, and the converter output is coupled to the down-sampler. It has been found that if a sampling rate conversion operation is preceded by an up-sampling operation and only followed afterwards by a down-sampling operation to a wanted sampling frequency, then the complexity in terms of the ultimate number of calculations, in particular multiplications and additions, is reduced. This leads to a decrease in the number of instructions per second, which is a measure of the complexity of a Digital Signal Processing (DSP) algorithm. In addition, this leads to an associated decrease in the power consumed by a DSP, as applied in, for example, audio, video, and (tele)communication devices, as well as radio and television apparatus.

10 citations
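The ordering the patent describes (up-sample by nu, filter, then down-sample by nd) is the standard rational-rate conversion scheme. A plain numpy sketch; the windowed-sinc filter design and tap count are my assumptions, not the patent's:

```python
import numpy as np

def upsample(x, nu):
    # Expander: insert nu-1 zeros between consecutive samples.
    y = np.zeros(len(x) * nu)
    y[::nu] = x
    return y

def downsample(x, nd):
    # Decimator: keep every nd-th sample.
    return x[::nd]

def resample_rational(x, nu, nd, num_taps=63):
    # Up-sample first, low-pass filter at the tighter of the
    # interpolation/decimation cutoffs, then down-sample.
    up = upsample(x, nu)
    fc = 0.5 / max(nu, nd)  # normalized cutoff at the high rate
    n = np.arange(num_taps) - (num_taps - 1) / 2
    h = 2 * fc * np.sinc(2 * fc * n) * np.hamming(num_taps)  # windowed sinc
    filtered = np.convolve(up, h * nu)  # gain nu compensates the zero-insertion
    return downsample(filtered, nd)

x = np.sin(2 * np.pi * 0.05 * np.arange(200))
y = resample_rational(x, nu=3, nd=2)  # 3/2 rate conversion
print(len(y))
```

In practice the filtering is folded into a polyphase structure so that the zero-valued samples are never multiplied, which is where the claimed reduction in multiplications and additions comes from.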

Journal ArticleDOI
Wei Jia1, Li Li, Anique Akhtar1, Zhu Li1, Shan Liu2 
TL;DR: This work proposes using a convolutional neural network (CNN) to improve the quality of the point cloud reconstructed by video-based point cloud compression (V-PCC), and introduces the reconstructed geometry video as a second input to the CNN to provide more useful information about the occupancy map video.
Abstract: In video-based point cloud compression (V-PCC), a dynamic point cloud is projected onto geometry and attribute videos patch by patch for compression. In addition to the geometry and attribute videos, an occupancy map video is compressed into a V-PCC bitstream to indicate whether a two-dimensional (2D) point in the projected geometry video corresponds to any point in three-dimensional (3D) space. The occupancy map video is usually downsampled before compression to obtain a tradeoff between the bitrate and the reconstructed point cloud quality. Due to the accuracy loss in the downsampling process, some noisy points are generated, which leads to severe objective and subjective quality degradation of the reconstructed point cloud. To improve the quality of the reconstructed point cloud, we propose using a convolutional neural network (CNN) to improve the accuracy of the occupancy map video. We mainly make the following contributions. First, we improve the accuracy of the occupancy map video by formulating the problem as a binary segmentation problem, since the pixel values of the occupancy map video are either 0 or 1. Second, in addition to the downsampled occupancy map video, we introduce a reconstructed geometry video as the other input of the CNN to provide more useful information about the occupancy map video. To the best of our knowledge, this is the first learning-based work to improve the performance of V-PCC. Compared to state-of-the-art schemes, our proposed CNN-based approach achieves much more accurate occupancy map videos and significant bitrate savings.

10 citations
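A toy numpy illustration of the problem the paper's CNN addresses: lossy downsampling of a binary occupancy map, followed by naive upsampling at the decoder, marks empty pixels as occupied, and each such pixel becomes a noisy 3D point. The 8x8 map, disc shape, and 2x2 block pooling here are invented for illustration:

```python
import numpy as np

# Toy 8x8 occupancy map: a roughly disc-shaped occupied region.
yy, xx = np.mgrid[0:8, 0:8]
occ = ((yy - 3.5) ** 2 + (xx - 3.5) ** 2 <= 8).astype(np.uint8)

# Lossy downsampling: a 2x2 block is marked occupied if ANY pixel in it is,
# so no real point is dropped, at the cost of over-marking.
down = occ.reshape(4, 2, 4, 2).max(axis=(1, 3))

# Decoder-side nearest-neighbour upsampling back to 8x8.
up = np.repeat(np.repeat(down, 2, axis=0), 2, axis=1)

# Pixels occupied after the round trip but empty originally become
# noisy points in the reconstructed cloud.
noisy = int(((up == 1) & (occ == 0)).sum())
print(noisy)
```

A binary-segmentation CNN taking `up` (plus the reconstructed geometry video) as input would be trained to predict the original `occ`, removing exactly these over-marked pixels.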

Patent
16 Aug 2012
TL;DR: In this article, an edge-based interpolation method is proposed for upsampling by determining an edge characteristic associated with an interpolation point, the edge characteristic having an edge magnitude and an edge angle.
Abstract: Edge-based interpolation for upsampling. One method may include determining an edge characteristic associated with an interpolation point, the edge characteristic having an edge magnitude and an edge angle; selecting an interpolation filter in response to the edge angle; and determining a pixel value at the interpolation point using the selected interpolation filter. Other embodiments include edge-based interpolation followed by an adaptive sharpening filter. The sharpening filter is controlled by the edge-based interpolation parameters that determine the pixels to be sharpened and the sharpening strength.

9 citations
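The patent's pipeline (edge magnitude and angle at the interpolation point, filter selected by angle) can be sketched with Sobel gradients and a two-way directional choice. The threshold, the angle buckets, and the averaging filters below are illustrative stand-ins, not the patent's actual filters:

```python
import numpy as np

def edge_characteristic(patch):
    # 3x3 Sobel gradients give the edge magnitude and angle
    # at the interpolation point (the patch centre).
    gx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
    dx = (patch * gx).sum()
    dy = (patch * gx.T).sum()
    return np.hypot(dx, dy), np.degrees(np.arctan2(dy, dx)) % 180

def interpolate(patch, threshold=50.0):
    mag, angle = edge_characteristic(patch)
    if mag < threshold:
        # Flat region: average the four axis neighbours.
        return (patch[0, 1] + patch[2, 1] + patch[1, 0] + patch[1, 2]) / 4
    if 45 <= angle < 135:
        # Gradient mostly vertical => edge runs horizontally:
        # interpolate along the row to avoid blurring across it.
        return (patch[1, 0] + patch[1, 2]) / 2
    # Gradient mostly horizontal => edge runs vertically:
    # interpolate along the column.
    return (patch[0, 1] + patch[2, 1]) / 2

patch = np.array([[0, 0, 100],
                  [0, 0, 100],
                  [0, 0, 100]], float)  # sharp vertical edge
print(interpolate(patch))
```

Because the filter averages along the detected edge rather than across it, the vertical edge above stays sharp; an angle-agnostic bilinear filter would smear it. The same magnitude/angle pair could then drive the adaptive sharpening stage the patent describes.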

Proceedings ArticleDOI
10 May 1992
TL;DR: A systematic approach to the complexity minimization of digital square timing recovery is discussed and it is demonstrated that, for narrowband data signals, a prefilter is mandatory to suppress the pattern noise.
Abstract: A systematic approach to the complexity minimization of digital square timing recovery is discussed. It is demonstrated that, for narrowband data signals, a prefilter is mandatory to suppress the pattern noise. A least-squared-error approach is used to determine the tap weights of an FIR prefilter implementation. The data interpolator coefficients are optimized using a minimum mean-squared-error method. Two versions of the clock recovery are considered. One is based on a processing rate of two samples per symbol for the prefilter and data interpolator; the squarer and postprocessor (time extractor) part require four samples/symbol, and a prefilter interpolator is used for upsampling. The other version processes four samples per symbol in all blocks. A postprocessing method is proposed which reduces the complexity of the postprocessor block by 25% in comparison with the DFT approach. Depending on the excess bandwidth factor r of the data signal, 15-21 multiply-and-accumulate operations per symbol are sufficient for a blockwise timing estimation.

9 citations
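The core of square timing recovery is the spectral-line (Oerder-Meyr style) estimator: squaring the signal envelope creates a tone at the symbol rate whose phase encodes the timing offset. A minimal numpy sketch with an idealised envelope standing in for a prefiltered, oversampled data signal; the test waveform is my construction, not the paper's:

```python
import numpy as np

def square_timing_estimate(x, sps):
    # Square the signal (|x|^2) and correlate against a complex
    # exponential at the symbol rate (1/sps cycles/sample); the phase
    # of the result is proportional to the timing offset.
    n = np.arange(len(x))
    c = np.sum(np.abs(x) ** 2 * np.exp(-2j * np.pi * n / sps))
    return -np.angle(c) / (2 * np.pi)  # offset in symbols, in (-0.5, 0.5]

sps = 8      # samples per symbol
tau = 0.25   # true timing offset, in symbols
n = np.arange(sps * 100)
# Idealised envelope whose square is 1 + cos(2*pi*(n/sps - tau)):
# a clean symbol-rate spectral line delayed by tau symbols.
x = np.sqrt(np.maximum(0.0, 1 + np.cos(2 * np.pi * (n / sps - tau))))
print(square_timing_estimate(x, sps))
```

The blockwise structure in the paper amounts to computing this sum over a block of symbols; the prefilter matters because with small excess bandwidth the symbol-rate line is weak and easily buried in pattern noise.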


Network Information
Related Topics (5)
Convolutional neural network
74.7K papers, 2M citations
90% related
Image segmentation
79.6K papers, 1.8M citations
90% related
Feature extraction
111.8K papers, 2.1M citations
89% related
Deep learning
79.8K papers, 2.1M citations
88% related
Feature (computer vision)
128.2K papers, 1.7M citations
87% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  469
2022  859
2021  330
2020  322
2019  298
2018  236