YujunHuang063/3D-GP-LMVIC

3D-LMVIC: Learning-based Multi-View Image Compression with 3D Gaussian Geometric Priors

Abstract: Multi-view image compression is vital for 3D-related applications. To effectively model correlations between views, existing methods typically predict disparity between two views on a 2D plane, which works well for small disparities, such as in stereo images, but struggles with larger disparities caused by significant view changes. To address this, we propose a novel approach: learning-based multi-view image compression with 3D Gaussian geometric priors (3D-LMVIC). Our method leverages 3D Gaussian Splatting to derive geometric priors of the 3D scene, enabling more accurate disparity estimation across views within the compression model. Additionally, we introduce a depth map compression model to reduce redundancy in geometric information between views. A multi-view sequence ordering method is also proposed to enhance correlations between adjacent views. Experimental results demonstrate that 3D-LMVIC surpasses both traditional and learning-based methods in performance, while maintaining fast encoding and decoding speeds.

Setup

To install the required dependencies, run:

pip install -r requirements.txt

Additionally, please install diff-gaussian-rasterization-w-depth and simple-knn. We recommend installing them in an environment with CUDA 11.x and GCC 9.4.0, as higher versions of GCC may lead to installation issues.
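Before building the two CUDA extensions, it can help to confirm which compiler and toolkit are on the PATH. The following is a minimal sketch, not part of this repository: the helper names are hypothetical, and it assumes that `gcc --version` prints an `x.y.z` version and that `nvcc --version` prints a `release X.Y` line.

```python
import re
import shutil
import subprocess


def gcc_version(output):
    """Return the first x.y.z version found in `gcc --version` output, or None."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    return tuple(int(g) for g in match.groups()) if match else None


def cuda_release(output):
    """Return (major, minor) from the 'release X.Y' line of `nvcc --version`, or None."""
    match = re.search(r"release (\d+)\.(\d+)", output)
    return tuple(int(g) for g in match.groups()) if match else None


def version_output(tool):
    """Run `<tool> --version` and return its stdout; empty string if the tool is absent."""
    if shutil.which(tool) is None:
        return ""
    result = subprocess.run([tool, "--version"], capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Only reports what is installed; the recommended versions come from the text above.
    print("GCC :", gcc_version(version_output("gcc")), "(recommended: 9.4.0)")
    print("CUDA:", cuda_release(version_output("nvcc")), "(recommended: 11.x)")
```

A value of `None` means the tool was not found or its output did not match the assumed format.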

Data Preparation

Using the Auditorium scene from the Tanks&Temples dataset as an example, the expected data structure is:

Tanks&Temples
|---Auditorium
    |---images
        |---<image 0>
        |---<image 1>
        |---...
    |---sparse
        |---0
            |---cameras.bin
            |---images.bin
            |---points3D.bin
    |---scene_params
        |---point_cloud
            |---iteration_30000
                |---point_cloud.ply
|---...
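
The layout can be checked before training with a short script. This is a sketch rather than a tool shipped with the repository: `missing_entries` is a hypothetical helper, the scene path under gaussian_splatting/data is inferred from the instructions in this section, and the `.bin` files are assumed to sit inside `sparse/0`, as in the usual COLMAP output.

```python
from pathlib import Path

# Inputs that must exist before the preparation script is run.
# Assumption: the COLMAP .bin files live inside sparse/0.
EXPECTED = [
    "images",
    "sparse/0/cameras.bin",
    "sparse/0/images.bin",
    "sparse/0/points3D.bin",
]
# Produced later by the preparation script, so it is reported separately.
GENERATED = ["scene_params/point_cloud/iteration_30000/point_cloud.ply"]


def missing_entries(scene_dir, entries=EXPECTED):
    """Return the relative paths from `entries` that do not exist under scene_dir."""
    root = Path(scene_dir)
    return [rel for rel in entries if not (root / rel).exists()]


if __name__ == "__main__":
    # Assumed location of the example scene; adjust to your checkout.
    scene = Path("gaussian_splatting/data/Tanks&Temples/Auditorium")
    for rel in missing_entries(scene):
        print(f"missing input: {rel}")
    for rel in missing_entries(scene, GENERATED):
        print(f"not generated yet: {rel}")
```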

The sparse folder is generated by COLMAP; see the COLMAP documentation for usage instructions. After placing the Tanks&Temples dataset in the gaussian_splatting/data directory, run gaussian_splatting/prepare_Tanks&Temples.sh to generate the scene_params folder.
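
Because the script name contains `&`, which a POSIX shell treats as a control operator, the path has to be quoted when typed at a prompt. The sketch below shows two ways to handle this from Python. It assumes the script is run with `bash` from the repository root, which the instructions here do not state explicitly, and the function names are hypothetical.

```python
import shlex
import subprocess

# Script path as given above; '&' would otherwise split the command in a shell.
SCRIPT = "gaussian_splatting/prepare_Tanks&Temples.sh"


def shell_command(script=SCRIPT):
    """Return a command line that is safe to paste into a POSIX shell."""
    return "bash " + shlex.quote(script)


def run_script(script=SCRIPT):
    """Run the script without a shell, so '&' is never interpreted."""
    # Passing an argument list bypasses shell parsing entirely.
    return subprocess.run(["bash", script], check=True)


if __name__ == "__main__":
    print(shell_command())
```

`shell_command()` wraps the path in single quotes, and `run_script()` avoids the quoting problem altogether by not invoking a shell.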

Training and Evaluation

Please refer to train.sh and eval.sh for the training and evaluation scripts, respectively. When evaluating with entropy coding, pass the --dep_decoder_last_layer_double flag; without it, numerical errors may cause entropy decoding to fail. The flag makes the last layer of the depth map compression model's decoder run in double precision.
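
To see why precision matters here, note that entropy decoding generally only succeeds if the decoder reproduces the encoder's values exactly. The sketch below is a generic illustration and not this repository's code. It assumes that the numerical errors in question come from floating-point rounding that depends on evaluation order, and it emulates single precision with the standard `struct` module.

```python
import struct


def to_f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def accumulate(values, single_precision):
    """Sum values left to right, rounding to float32 after each step if requested."""
    total = 0.0
    for v in values:
        total = total + v
        if single_precision:
            total = to_f32(total)
    return total


# The same three numbers in two evaluation orders, as two kernels might use.
order_a = [1e8, 1.0, -1e8]
order_b = [1e8, -1e8, 1.0]

f32 = (accumulate(order_a, True), accumulate(order_b, True))
f64 = (accumulate(order_a, False), accumulate(order_b, False))

print("float32:", f32)  # the two orders disagree
print("float64:", f64)  # the two orders agree
```

In single precision the two orders give different sums because `1e8 + 1.0` rounds back to `1e8`. In double precision both orders give the same result for this input. Double precision does not remove rounding. It only makes a mismatch between encoder and decoder far less likely.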
