
Why did MPEG make two codecs for Point Cloud Compression?

In 2020, MPEG approved the FDIS (Final Draft International Standard, the last step in standards preparation) of two codecs for Point Cloud Compression: the first called V-PCC (Video-based Point Cloud Compression) and the second called G-PCC (Geometry-based Point Cloud Compression). A legitimate question is: why two codecs at the same time? Are these codecs in competition with each other, or are they designed for different purposes?

The short answer is that V-PCC and G-PCC are each designed for a different type of point cloud: V-PCC for dense point clouds such as those representing natural objects (reconstructed from a set of images) or graphics objects (designed by artists), and G-PCC for sparse point clouds such as those obtained by LIDARs acquiring large spaces. Of course, one may use V-PCC for sparse point clouds and G-PCC for dense ones, but this is the wrong usage and leads to sub-optimal results.

Let’s take some time to explain the reasons behind this two-codec strategy. Point clouds are now used in various applications with various requirements. The figure below shows two examples:

Both images represent point clouds: the left one (by 8i Labs) is obtained by 3D reconstruction from a set of images captured by cameras positioned around the person, and the right one (by MERL) is obtained by using a LIDAR mounted on a car. Both point clouds are dynamic. Even a visual inspection shows that the first point cloud is dense and the second is sparse.
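To make the dense/sparse distinction less subjective, one can simply measure the average spacing between neighbouring points. The short sketch below is only an illustrative Python example (not part of any MPEG tool): it uses a brute-force nearest-neighbour search and two synthetic clouds standing in for the two scales shown above.

```python
import numpy as np

def mean_nearest_neighbour_distance(points: np.ndarray) -> float:
    """Brute-force mean distance from each point to its closest neighbour.

    points: (N, 3) array of XYZ coordinates. Fine for a few thousand points;
    a real tool would use a spatial index (k-d tree) instead.
    """
    diff = points[:, None, :] - points[None, :, :]   # (N, N, 3) pairwise differences
    dist = np.linalg.norm(diff, axis=-1)             # (N, N) pairwise distances
    np.fill_diagonal(dist, np.inf)                    # ignore distance to self
    return float(dist.min(axis=1).mean())

# Toy comparison: points packed into a small volume vs. scattered over a large one.
rng = np.random.default_rng(0)
dense_cloud  = rng.uniform(0.0, 1.0,   size=(1_000, 3))   # "scanned object" scale
sparse_cloud = rng.uniform(0.0, 100.0, size=(1_000, 3))   # "LIDAR street" scale

print(mean_nearest_neighbour_distance(dense_cloud))   # small spacing -> dense
print(mean_nearest_neighbour_distance(sparse_cloud))  # large spacing -> sparse
```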

When the (3D) point cloud is dense, its projection onto 2D planes remains dense enough (no or few holes in the projected image), so it is possible to transform the dense 3D point cloud into a set of images (or videos) and to encode this 2D representation instead of the original 3D one. Applying the same method (3D-to-2D projection) to sparse point clouds is also possible, but produces poor images (full of holes) that traditional image/video codecs encode badly. V-PCC is based on this mechanism: projecting the 3D point cloud into a set of 2D patches, combining them into a 2D video, and encoding that video with traditional video codecs. The first lesson here is: do not try to encode sparse point clouds with V-PCC; the result will be sub-optimal.
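As a toy illustration of why projection works for dense content but not for sparse content, the sketch below (plain Python/NumPy, only the intuition, not the actual V-PCC patch-generation algorithm) projects a point cloud orthographically onto a single depth image and counts the empty pixels:

```python
import numpy as np

def project_to_depth_map(points: np.ndarray, resolution: int = 64) -> np.ndarray:
    """Orthographic projection of a point cloud onto the XY plane.

    Returns a (resolution x resolution) depth image; pixels hit by no point
    stay NaN ("holes"). Illustrative only: real V-PCC segments the cloud into
    patches and projects each patch onto the best-suited plane.
    """
    mins, maxs = points.min(axis=0), points.max(axis=0)
    scale = (resolution - 1) / np.maximum(maxs[:2] - mins[:2], 1e-9)
    uv = np.clip(((points[:, :2] - mins[:2]) * scale).astype(int), 0, resolution - 1)
    depth = np.full((resolution, resolution), np.nan)
    for (u, v), z in zip(uv, points[:, 2]):
        # keep the nearest point per pixel, like a depth buffer
        if np.isnan(depth[v, u]) or z < depth[v, u]:
            depth[v, u] = z
    return depth

rng = np.random.default_rng(1)
dense_cloud  = rng.uniform(size=(50_000, 3))   # many points per pixel
sparse_cloud = rng.uniform(size=(500, 3))      # few points

for name, cloud in [("dense", dense_cloud), ("sparse", sparse_cloud)]:
    holes = np.isnan(project_to_depth_map(cloud)).mean()
    print(f"{name}: {holes:.0%} of projected pixels are empty")
```

With many points per pixel the depth image comes out almost complete; with few points most pixels stay empty, which is exactly the kind of image a video codec handles poorly.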

When the (3D) point cloud is sparse, another representation space is needed to obtain good compression. There are many methods proposed in the literature, and all of them try to exploit the 3D geometry to construct a predictable space. One method, generic enough, is the decomposition of the 3D space into an octree: a hierarchical segmentation of the volume whose benefit is to quickly set aside large empty regions and to refine the partitioning only where information exists (non-empty blocks). Such an approach is used in G-PCC and explains its name, Geometry-based Point Cloud Compression. The second lesson here is: G-PCC is designed for sparse point clouds. The question is: can G-PCC be used for dense point clouds as well? The answer is yes, but be aware that this usage is sub-optimal.
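To give a flavour of the octree idea, here is a minimal sketch (again plain Python, illustrating only the principle; the real G-PCC codec traverses the tree breadth-first and entropy-codes the occupancy bytes with adaptive contexts):

```python
import numpy as np

def octree_occupancy(points: np.ndarray, depth: int = 8) -> list[int]:
    """Depth-first octree occupancy coding of point-cloud geometry.

    Every occupied cube emits one 8-bit byte telling which of its 8 children
    contain points; empty children are never refined, so large empty regions
    cost almost nothing.
    """
    mins = points.min(axis=0)
    size = (points.max(axis=0) - mins).max() + 1e-9
    # quantise to integer cell coordinates at the finest level, drop duplicates
    cells = np.unique(((points - mins) / size * (1 << depth)).astype(np.int64), axis=0)

    def recurse(cube_cells: np.ndarray, shift: int) -> list[int]:
        if shift < 0:
            return []
        half = (cube_cells >> shift) & 1                      # x/y/z half for each point
        child = half[:, 0] * 4 + half[:, 1] * 2 + half[:, 2]  # child index 0..7
        occupied = np.unique(child)
        codes = [int(np.bitwise_or.reduce(1 << occupied))]    # 8-bit occupancy pattern
        for idx in occupied:                                   # refine occupied children only
            codes += recurse(cube_cells[child == idx], shift - 1)
        return codes

    return recurse(cells, depth - 1)

rng = np.random.default_rng(2)
sparse_cloud = rng.uniform(0.0, 100.0, size=(1_000, 3))
codes = octree_occupancy(sparse_cloud, depth=8)
print(len(codes), "occupancy bytes encode the geometry of", len(sparse_cloud), "points")
```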

Bottom line: V-PCC is designed (and optimal) for dense point clouds, G-PCC is designed (and optimal) for sparse point clouds. Comparing V-PCC with G-PCC on the same content will systematically show that V-PCC is better if the content is dense and that G-PCC is better if the content is sparse.