Proceedings ArticleDOI

Data-Adaptive Packing Method for Compression of Dynamic Point Cloud Sequences

08 Jul 2019-pp 904-909
TL;DR: The proposed data-adaptive packing method adaptively divides the input sequences into groups within which patches in all frames are packed consistently, which greatly benefits the subsequent 2D video encoding.
Abstract: This paper proposes a data-adaptive packing method for compression of dynamic point cloud sequences. Owing to the rising popularity of emerging immersive applications, interest in representing the virtual world with dynamic point cloud sequences has never been higher. However, compressing these sequences with high efficiency remains challenging. The proposed method adaptively divides the input sequences into groups within which patches in all frames are packed consistently. Thus, the occupancy maps, as well as the corresponding depth and texture images, are highly consistent both temporally and spatially, which greatly benefits the subsequent 2D video encoding. Experimental results show that the proposed method achieves significant improvement in coding efficiency over the reference method of MPEG Point Cloud Compression (PCC).
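The grouping idea above can be sketched in a few lines: divide a sequence into runs of frames whose patch layouts are similar, then pack patches in the same order (and hence at consistent atlas positions) within each group. The similarity test, threshold, and all names below are illustrative assumptions, not the authors' algorithm.

```python
# Hypothetical sketch of data-adaptive grouping: a new group starts whenever
# consecutive frames' patch layouts diverge too much to pack consistently.

def patch_similarity(frame_a, frame_b):
    """Fraction of patches whose (w, h) sizes match between two frames."""
    matched = sum(1 for a, b in zip(frame_a, frame_b) if a == b)
    return matched / max(len(frame_a), len(frame_b))

def group_frames(frames, threshold=0.5):
    """Greedily extend the current group while consecutive frames stay similar."""
    groups, current = [], [frames[0]]
    for prev, cur in zip(frames, frames[1:]):
        if patch_similarity(prev, cur) >= threshold:
            current.append(cur)
        else:
            groups.append(current)
            current = [cur]
    groups.append(current)
    return groups

# Each frame is a list of patch sizes (w, h); frame 3 changes layout abruptly.
frames = [
    [(8, 8), (4, 6)],   # frame 0
    [(8, 8), (4, 6)],   # frame 1
    [(8, 8), (4, 7)],   # frame 2: one patch grew slightly, still similar
    [(2, 2), (9, 3)],   # frame 3: layout changed, so a new group starts
]
groups = group_frames(frames)
print([len(g) for g in groups])  # → [3, 1]
```

Within a group, packing every frame's patches in one fixed order is what keeps the occupancy, depth, and texture images temporally consistent for the 2D encoder.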
Citations
Journal ArticleDOI
Siyang Yu, Si Sun, Wei Yan, Guangshuai Liu, Xurui Li 
01 Feb 2022-Sensors
TL;DR: This work proposes an improved dynamic point cloud compression method based on curvature estimation and a hierarchical strategy to meet the demands of real-world scenarios, achieving better compression performance and faster runtime than traditional video-based dynamic point cloud compression.
Abstract: As an information-intensive 3D representation, point clouds are developing rapidly in immersive applications, which has sparked new attention in point cloud compression. The most popular dynamic methods ignore the characteristics of point clouds and use an exhaustive neighborhood search, which seriously impacts the encoder's runtime. Therefore, we propose an improved compression method for dynamic point clouds based on curvature estimation and a hierarchical strategy to meet the demands of real-world scenarios. This method includes an initial segmentation derived from the similarity between normals, an iterative curvature-based hierarchical refining process, and image generation and video compression technology based on de-redundancy without performance loss. The curvature-based hierarchical refining module divides the voxel point cloud into high-curvature points and low-curvature points and optimizes the initial clusters hierarchically. The experimental results show that our method achieves improved compression performance and faster runtime than traditional video-based dynamic point cloud compression.
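The high-/low-curvature split described above can be illustrated with a discrete curvature proxy: estimate how sharply the geometry turns at each point and partition points by a threshold. The proxy (turning angle on a polyline) and threshold are demonstration assumptions, not the paper's estimator.

```python
import math

# Illustrative curvature-based split: interior points of a polyline are
# classified by their turning angle (0 = flat, pi/2 = right-angle corner).

def turn_curvature(p_prev, p, p_next):
    """Curvature proxy: exterior turning angle at p (0 for collinear points)."""
    ax, ay = p[0] - p_prev[0], p[1] - p_prev[1]
    bx, by = p_next[0] - p[0], p_next[1] - p[1]
    cos_t = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
    return math.acos(max(-1.0, min(1.0, cos_t)))

def split_by_curvature(points, threshold=0.3):
    """Return indices of high- and low-curvature interior points."""
    high, low = [], []
    for i in range(1, len(points) - 1):
        k = turn_curvature(points[i - 1], points[i], points[i + 1])
        (high if k >= threshold else low).append(i)
    return high, low

# An L-shaped polyline: only the corner at index 2 turns sharply.
pts = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
high, low = split_by_curvature(pts)
print(high, low)  # → [2] [1, 3]
```

In the paper's setting the same idea operates on voxelized 3D neighborhoods, and the two sets are then refined hierarchically rather than in one pass.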

9 citations

Journal ArticleDOI
TL;DR: In this article, a low-latency synchronous rate control structure is designed to reduce the overhead of pre-coding, and the basic unit (BU) parameters are predicted accurately based on a CNN-LSTM neural network.
Abstract: Due to limited transmission resources and storage capacity, efficient rate control is important in Video-based Point Cloud Compression (V-PCC). In this paper, we propose a learning-based rate control method to improve the rate-distortion (RD) performance of V-PCC. A low-latency synchronous rate control structure is designed to reduce the overhead of pre-coding. The basic unit (BU) parameters are predicted accurately based on our proposed CNN-LSTM neural network, instead of the online updating approach, which can be inaccurate due to low consistency between adjacent 2D frames in V-PCC. When determining the quantization parameters for the BU, a patch-based clipping method is proposed to avoid unnecessary clipping. This approach is able to improve the RD performance and subjective dynamic point cloud quality. Experiments show that our proposed rate control method outperforms present approaches.
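The patch-based clipping step above can be sketched as follows. The idea, as we read it, is that a BU's predicted quantization parameter is clipped toward the frame-level QP only where the BU actually carries occupied patch pixels, avoiding needless clipping elsewhere. All names, ranges, and the occupancy test are illustrative assumptions, not the authors' algorithm.

```python
# Hypothetical sketch of occupancy-aware QP clipping for a basic unit (BU).

def clip_bu_qp(predicted_qp, frame_qp, occupied, max_delta=3, qp_min=0, qp_max=51):
    """Clip a BU's predicted QP toward the frame QP only for occupied BUs."""
    if occupied:
        # Keep the BU's QP within +/- max_delta of the frame-level QP so
        # quality stays consistent across patch content.
        lo = max(qp_min, frame_qp - max_delta)
        hi = min(qp_max, frame_qp + max_delta)
        return max(lo, min(hi, predicted_qp))
    # Unoccupied BUs carry no patch data, so only the legal QP range applies.
    return max(qp_min, min(qp_max, predicted_qp))

print(clip_bu_qp(40, 32, occupied=True))   # → 35 (clipped to frame QP + 3)
print(clip_bu_qp(40, 32, occupied=False))  # → 40 (left as predicted)
```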

4 citations

Proceedings ArticleDOI
12 Oct 2020
TL;DR: This work develops a novel volumetric video security mechanism, namely VVSec, which makes benign use of adversarial perturbations to obfuscate the security and privacy-sensitive 3D face models and ensures that the 3D models cannot be exploited to bypass deep learning-based face authentications.
Abstract: Volumetric video (VV) streaming has drawn an increasing amount of interest recently with the rapid advancements in consumer VR/AR devices and the relevant multimedia and graphics research. While the resource and performance challenges in volumetric video streaming have been actively investigated by the multimedia community, the potential security and privacy concerns with this new type of multimedia have not been studied. We for the first time identify an effective threat model that extracts 3D face models from volumetric videos and compromises face ID-based authentications. To defend against such attacks, we develop a novel volumetric video security mechanism, namely VVSec, which makes benign use of adversarial perturbations to obfuscate the security- and privacy-sensitive 3D face models. Such obfuscation ensures that the 3D models cannot be exploited to bypass deep learning-based face authentications. Meanwhile, the injected perturbations are not perceivable by the end-users, maintaining the original quality of experience in volumetric video streaming. We evaluate VVSec using two datasets, including a set of frames extracted from an empirical volumetric video and a public RGB-D face image dataset. Our evaluation results demonstrate the effectiveness of both the proposed attack and defense mechanisms in volumetric video streaming.
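The bounded-injection aspect of the obfuscation above can be illustrated simply: every point of the face model is displaced by at most a perceptibility budget eps. VVSec derives the perturbation adversarially; the random noise below only demonstrates the bounded-displacement constraint, not their attack-aware optimization.

```python
import random

# Sketch: inject a small, bounded perturbation into 3D points so that the
# L-infinity displacement of every point stays within a budget eps.

def perturb_points(points, eps=0.01, seed=0):
    """Return points displaced by uniform noise in [-eps, eps] per coordinate."""
    rng = random.Random(seed)
    return [
        tuple(c + rng.uniform(-eps, eps) for c in p)
        for p in points
    ]

cloud = [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)]
noisy = perturb_points(cloud)
max_shift = max(abs(a - b) for p, q in zip(cloud, noisy) for a, b in zip(p, q))
print(max_shift <= 0.01)  # → True
```

The defense's key property is exactly this bound: features used by recognition networks shift, while per-point displacements remain below what viewers can perceive.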

4 citations


Cites background from "Data-Adaptive Packing Method for Co..."

  • ...ing on addressing various resource and performance challenges in volumetric video capturing [14, 27, 29], encoding [55, 63, 70], and streaming [36, 64, 76], to make it deployable under the existing net-...


Patent
18 Jun 2020
TL;DR: In this paper, a point cloud encoding and decoding method and a codec are presented, which help to improve the parallel processing efficiency of upsampling so as to improve decoding efficiency.
Abstract: The present application relates to the technical field of encoding and decoding, and discloses a point cloud encoding and decoding method and a codec, which help to improve the parallel processing efficiency of upsampling so as to improve encoding and decoding efficiency. The point cloud coding (including encoding and decoding) method comprises: obtaining a target processing mode for a pixel block to be processed according to a lookup table of first occupied codewords, wherein the first occupied codewords are used to indicate whether a current reference pixel block is an occupied pixel block, and/or whether a plurality of spatially adjacent pixel blocks of the current reference pixel block are respectively occupied pixel blocks, in which the current reference pixel block is a pixel block in a first occupancy map of a point cloud to be coded, the pixel block to be processed is a pixel block in a second occupancy map of the point cloud to be coded, and the current reference pixel block corresponds to the pixel block to be processed; performing a filling process on the pixel block to be processed according to the target processing mode, so as to obtain a filled pixel block; and reconstructing the point cloud to be coded according to the filled second occupancy map, wherein the filled second occupancy map comprises the filled pixel block.
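The lookup-table mechanism in the patent abstract can be sketched as follows: the occupancy of a reference pixel block and its spatially adjacent blocks is packed into a small codeword, and the codeword indexes a table of fill modes for the corresponding block in the second occupancy map. The bit layout, table contents, and mode names below are invented for illustration.

```python
# Hypothetical occupancy-codeword lookup for choosing a fill mode.

def occupancy_codeword(center, neighbors):
    """Bit 0 = reference block occupied; bits 1..4 = left/right/up/down."""
    code = 1 if center else 0
    for i, occ in enumerate(neighbors):
        if occ:
            code |= 1 << (i + 1)
    return code

# Illustrative policy: fully surrounded blocks get a solid fill, isolated
# ones a conservative fill, everything else an edge-aware fill.
FILL_TABLE = {0b11111: "fill_solid", 0b00001: "fill_isolated"}

def fill_mode(code):
    return FILL_TABLE.get(code, "fill_edge_aware")

code = occupancy_codeword(True, [True, True, True, True])
print(code, fill_mode(code))  # → 31 fill_solid
print(fill_mode(occupancy_codeword(True, [False, False, False, False])))  # → fill_isolated
```

Because each block's mode is a pure table lookup on local occupancy, blocks can be processed independently, which is what enables the parallel upsampling the abstract claims.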

1 citation

Posted Content
TL;DR: In this article, an advanced geometry surface coding (AGSC) method was proposed for dynamic point clouds (DPC) compression, which consists of two modules, including an error projection model-based RDO and an occupancy map-based merge prediction.
Abstract: In video-based dynamic point cloud compression (V-PCC), 3D point clouds are projected onto 2D images for compressing with the existing video codecs. However, the existing video codecs are originally designed for natural visual signals, and it fails to account for the characteristics of point clouds. Thus, there are still problems in the compression of geometry information generated from the point clouds. Firstly, the distortion model in the existing rate-distortion optimization (RDO) is not consistent with the geometry quality assessment metrics. Secondly, the prediction methods in video codecs fail to account for the fact that the highest depth values of a far layer is greater than or equal to the corresponding lowest depth values of a near layer. This paper proposes an advanced geometry surface coding (AGSC) method for dynamic point clouds (DPC) compression. The proposed method consists of two modules, including an error projection model-based (EPM-based) RDO and an occupancy map-based (OM-based) merge prediction. Firstly, the EPM model is proposed to describe the relationship between the distortion model in the existing video codec and the geometry quality metric. Secondly, the EPM-based RDO method is presented to project the existing distortion model on the plane normal and is simplified to estimate the average normal vectors of coding units (CUs). Finally, we propose the OM-based merge prediction approach, in which the prediction pixels of merge modes are refined based on the occupancy map. Experiments tested on the standard point clouds show that the proposed method achieves an average 9.84\% bitrate saving for geometry compression.
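The error-projection idea above amounts to measuring geometry distortion as the displacement component along the surface normal (a point-to-plane style error), rather than as raw per-pixel error. The function name below is illustrative; only the projection itself reflects the described technique.

```python
import math

# Sketch: distortion along the (estimated average) normal of a coding unit.

def project_error(displacement, normal):
    """Magnitude of a 3D displacement's component along a given normal."""
    n = math.sqrt(sum(c * c for c in normal))
    unit = [c / n for c in normal]
    return abs(sum(d * u for d, u in zip(displacement, unit)))

# A displacement tangent to the surface contributes nothing to the projected
# error, while the same magnitude along the normal contributes fully — which
# is why this tracks geometry quality metrics better than pixel distortion.
normal = (0.0, 0.0, 1.0)
print(project_error((0.5, 0.0, 0.0), normal))  # → 0.0
print(project_error((0.0, 0.0, 0.5), normal))  # → 0.5
```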
References
Journal ArticleDOI
TL;DR: A subjective study in a state-of-the-art mixed reality system shows that introduced prediction distortions are negligible compared with the original reconstructed point clouds and shows the benefit of reconstructed point cloud video as a representation in the 3D virtual world.
Abstract: We present a generic and real-time time-varying point cloud codec for 3D immersive video. This codec is suitable for mixed reality applications in which 3D point clouds are acquired at a fast rate. In this codec, intra frames are coded progressively in an octree subdivision. To further exploit inter-frame dependencies, we present an inter-prediction algorithm that partitions the octree voxel space into N × N × N macroblocks (N = 8, 16, 32). The algorithm codes points in these blocks in the predictive frame as a rigid transform applied to the points in the intra-coded frame. The rigid transform is computed using the iterative closest point algorithm and compactly represented in a quaternion quantization scheme. To encode the color attributes, we defined a mapping of per-vertex color attributes in the traversed octree to an image grid and use a legacy image coding method based on JPEG. As a result, a generic compression framework suitable for real-time 3D tele-immersion is developed. This framework has been optimized to run in real time on commodity hardware for both the encoder and decoder. Objective evaluation shows that a higher rate-distortion performance is achieved compared with available point cloud codecs. A subjective study in a state-of-the-art mixed reality system shows that the introduced prediction distortions are negligible compared with the original reconstructed point clouds. In addition, it shows the benefit of reconstructed point cloud video as a representation in the 3D virtual world. The codec is available as open source for integration in immersive and augmented communication applications and serves as a base reference software platform in JTC1/SC29/WG11 (MPEG) for the further development of standardized point cloud compression solutions.
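The quaternion quantization step mentioned above can be illustrated with a simple uniform quantizer: each component of the unit rotation quaternion is stored in a few bits, giving a compact rigid-transform representation per macroblock. The 8-bit uniform scheme below is an assumed illustration, not the codec's exact quantizer.

```python
import math

# Sketch: uniform quantization of unit-quaternion components to 8 bits.

def quantize(q, bits=8):
    """Map each component from [-1, 1] to an integer code in [0, 2^bits - 1]."""
    levels = (1 << bits) - 1
    return [round((c + 1.0) / 2.0 * levels) for c in q]

def dequantize(codes, bits=8):
    """Inverse mapping from integer codes back to [-1, 1]."""
    levels = (1 << bits) - 1
    return [code / levels * 2.0 - 1.0 for code in codes]

# A 90-degree rotation about Z: q = (cos 45°, 0, 0, sin 45°).
q = (math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
q_hat = dequantize(quantize(q))
err = max(abs(a - b) for a, b in zip(q, q_hat))
print(err < 0.01)  # → True: round-trip error stays below one quantizer step
```

Four 8-bit codes per block is what makes the per-macroblock rigid transform cheap to signal compared with re-coding the moved points.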

346 citations


"Data-Adaptive Packing Method for Co..." refers background in this paper

  • ...Unfortunately, compression methods for these sequences are suffering from poor temporal and spatial compression performance [1, 2]....


Proceedings ArticleDOI
14 May 2012
TL;DR: This work presents a novel lossy compression approach for point cloud streams which exploits spatial and temporal redundancy within the point data and presents a technique for comparing the octree data structures of consecutive point clouds.
Abstract: We present a novel lossy compression approach for point cloud streams which exploits spatial and temporal redundancy within the point data. Our proposed compression framework can handle general point cloud streams of arbitrary and varying size, point order and point density. Furthermore, it allows for controlling coding complexity and coding precision. To compress the point clouds, we perform a spatial decomposition based on octree data structures. Additionally, we present a technique for comparing the octree data structures of consecutive point clouds. By encoding their structural differences, we can successively extend the point clouds at the decoder. In this way, we are able to detect and remove temporal redundancy from the point cloud data stream. Our experimental results show a strong compression performance of a ratio of 14 at 1 mm coordinate precision and up to 40 at a coordinate precision of 9 mm.
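The structural-differencing idea above can be reduced to its essence: instead of re-coding each frame's full spatial structure, encode only the voxels whose occupancy changed between consecutive frames. Real octree serialization is omitted here; voxels are shown as plain coordinate tuples, and the encoding is a sketch rather than the paper's bitstream format.

```python
# Sketch of temporal de-redundancy by differencing consecutive occupancy sets.

def encode_diff(prev_voxels, cur_voxels):
    """Encode only the structural changes between two frames."""
    return {
        "added": sorted(cur_voxels - prev_voxels),
        "removed": sorted(prev_voxels - cur_voxels),
    }

def apply_diff(prev_voxels, diff):
    """Reconstruct the current frame at the decoder from the previous one."""
    return (prev_voxels - set(diff["removed"])) | set(diff["added"])

frame0 = {(0, 0, 0), (1, 0, 0), (1, 1, 0)}
frame1 = {(0, 0, 0), (1, 1, 0), (1, 1, 1)}  # one voxel moved

diff = encode_diff(frame0, frame1)
print(diff)  # → {'added': [(1, 1, 1)], 'removed': [(1, 0, 0)]}
print(apply_diff(frame0, diff) == frame1)  # → True
```

When most of the scene is static, the diff is tiny relative to the full frame, which is where the reported compression ratios come from.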

341 citations


"Data-Adaptive Packing Method for Co..." refers background in this paper

  • ...Temporal and spatial redundancies in successive point clouds are firstly exploited in [10]....


Proceedings ArticleDOI
01 Oct 2014
TL;DR: This paper constructs graphs on small neighborhoods of the point cloud by connecting nearby points, and treats the attributes as signals over the graph, and adopts graph transform, which is equivalent to Karhunen-Loève Transform on such graphs, to decorrelate the signal.
Abstract: Compressing attributes on 3D point clouds such as colors or normal directions has been a challenging problem, since these attribute signals are unstructured. In this paper, we propose to compress such attributes with graph transform. We construct graphs on small neighborhoods of the point cloud by connecting nearby points, and treat the attributes as signals over the graph. The graph transform, which is equivalent to Karhunen-Loeve Transform on such graphs, is then adopted to decorrelate the signal. Experimental results on a number of point clouds representing human upper bodies demonstrate that our method is much more efficient than traditional schemes such as octree-based methods.
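A minimal worked instance of the graph transform above: for a two-point "neighborhood", the graph Laplacian of a single edge has eigenvectors (1,1)/√2 (DC) and (1,-1)/√2 (difference), so transforming an attribute signal separates its average from its variation. This hand-rolled 2×2 case is for illustration only; real codecs eigendecompose the Laplacian of each nearest-neighbor graph numerically.

```python
import math

# Sketch: the graph transform on a single-edge graph (two connected points).

def graph_transform_2pt(signal):
    """Project a 2-point attribute signal onto the Laplacian eigenvectors."""
    s = 1.0 / math.sqrt(2.0)
    a, b = signal
    return (s * (a + b), s * (a - b))  # (low-frequency, high-frequency)

# Two nearby points with similar color values: nearly all signal energy
# compacts into the first (DC) coefficient, which is what makes the
# transform effective for coding smooth attribute fields.
coeffs = graph_transform_2pt((100.0, 102.0))
print(coeffs)
```

The transform is orthonormal, so total energy is preserved while being concentrated into few coefficients, which the entropy coder then exploits.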

229 citations


"Data-Adaptive Packing Method for Co..." refers background in this paper

  • ...[8] treat color as signals and construct a color graph for each level of the octree....


Journal ArticleDOI
TL;DR: This is the first paper that exploits both the spatial correlation inside each frame and the temporal correlation between the frames (through the motion estimation) to compress the color and the geometry of 3D point cloud sequences in an efficient way.
Abstract: This paper addresses the problem of compression of 3D point cloud sequences that are characterized by moving 3D positions and color attributes. As temporally successive point cloud frames are similar, motion estimation is key to effective compression of these sequences. It, however, remains a challenging problem as the point cloud frames have varying numbers of points without explicit correspondence information. We represent the time-varying geometry of these sequences with a set of graphs, and consider 3D positions and color attributes of the point clouds as signals on the vertices of the graphs. We then cast motion estimation as a feature matching problem between successive graphs. The motion is estimated on a sparse set of representative vertices using new spectral graph wavelet descriptors. A dense motion field is eventually interpolated by solving a graph-based regularization problem. The estimated motion is finally used for removing the temporal redundancy in the predictive coding of the 3D positions and the color characteristics of the point cloud sequences. Experimental results demonstrate that our method is able to accurately estimate the motion between consecutive frames. Moreover, motion estimation is shown to bring significant improvement in terms of the overall compression performance of the sequence. To the best of our knowledge, this is the first paper that exploits both the spatial correlation inside each frame (through the graph) and the temporal correlation between the frames (through the motion estimation) to compress the color and the geometry of 3D point cloud sequences in an efficient way.
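The sparse matching step above can be sketched as nearest-descriptor assignment between the representative vertices of two successive frames. Descriptors here are plain scalars for illustration; the paper uses spectral graph wavelet descriptors, and its dense motion field comes from graph-based regularization rather than this crude matching alone.

```python
# Sketch: match representative vertices across frames by nearest descriptor.

def match_sparse(desc_prev, desc_cur):
    """For each previous vertex, find the current vertex with the closest descriptor."""
    return {
        i: min(range(len(desc_cur)), key=lambda j: abs(d - desc_cur[j]))
        for i, d in enumerate(desc_prev)
    }

# Two vertices whose positions swapped between frames: descriptor matching
# still recovers the correspondence, which coordinate order alone would miss.
prev_desc = [0.1, 0.9]
cur_desc = [0.88, 0.12]
print(match_sparse(prev_desc, cur_desc))  # → {0: 1, 1: 0}
```

Once sparse correspondences are fixed, per-vertex motion vectors follow, and the predictive coder only has to transmit residuals.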

193 citations


"Data-Adaptive Packing Method for Co..." refers background in this paper

  • ...[11] proposed a time-varying framework aimed at addressing the problem of motion estimation and motion compensation in consecutive point clouds compression....

