LGM
[ECCV 2024 Oral] LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation.
This is the official implementation of *LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation*. The project is written primarily in Python, distributed under the MIT License, and was first published in 2024. It has 2,086 stars and 139 forks on GitHub. Key topics include gaussian-splatting, image-to-3d, and text-to-3d.
Large Multi-View Gaussian Model
Project Page | arXiv | Weights | Gradio Demo: https://huggingface.co/spaces/ashawkey/LGM
https://github.com/3DTopia/LGM/assets/25863658/cf64e489-29f3-4935-adba-e393a24c26e8
News
[2024.4.3] Thanks to @yxymessi and @florinshen, we have fixed a severe bug in rotation normalization. We finetuned the model with the correct normalization for 30 more epochs and uploaded new checkpoints.
Replicate Demo:
Thanks to @camenduru!
Install
```bash
# xformers is required! please refer to https://github.com/facebookresearch/xformers for details.
# for example, we use torch 2.1.0 + cuda 11.8
pip install torch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 --index-url https://download.pytorch.org/whl/cu118
pip install -U xformers --index-url https://download.pytorch.org/whl/cu118

# a modified gaussian splatting (+ depth, alpha rendering)
git clone --recursive https://github.com/ashawkey/diff-gaussian-rasterization
pip install ./diff-gaussian-rasterization

# for mesh extraction
pip install git+https://github.com/NVlabs/nvdiffrast

# other dependencies
pip install -r requirements.txt
```
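After installing, it can help to confirm that the packages are visible to the interpreter before launching anything heavier. The snippet below is a minimal sketch, not part of the repository: the helper name `find_missing_modules` is hypothetical, and the import names in `REQUIRED_MODULES` are assumed to match the packages installed above. It only checks that each module can be located; it does not verify CUDA or version compatibility.

```python
import importlib.util

# Import names assumed to correspond to the packages installed above.
REQUIRED_MODULES = ["torch", "xformers", "diff_gaussian_rasterization", "nvdiffrast"]


def find_missing_modules(names=REQUIRED_MODULES):
    """Return the subset of `names` that the import system cannot locate."""
    missing = []
    for name in names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

An empty result only means the modules are installed; a mismatched torch/CUDA build can still fail at import time.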
Pretrained Weights
Our pretrained weights can be downloaded from Hugging Face.
For example, to download the fp16 model for inference:
```bash
mkdir pretrained && cd pretrained
wget https://huggingface.co/ashawkey/LGM/resolve/main/model_fp16_fixrot.safetensors
cd ..
```
For MVDream and ImageDream, we use a diffusers implementation.
Their weights will be downloaded automatically.
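To check that the LGM checkpoint downloaded completely and really holds half-precision tensors, the header of the `.safetensors` file can be read without loading any weights. This is a sketch under stated assumptions rather than repository code: the helper names are hypothetical, and it relies only on the public safetensors layout (an 8-byte little-endian header length followed by a JSON header).

```python
import json
import struct


def read_safetensors_header(path):
    """Parse the JSON header of a .safetensors file without reading tensor data."""
    with open(path, "rb") as f:
        # The first 8 bytes hold the header length as a little-endian unsigned 64-bit integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def summarize_dtypes(header):
    """Count tensors per dtype string (for example "F16" or "F32")."""
    counts = {}
    for info in header.values():
        counts[info["dtype"]] = counts.get(info["dtype"], 0) + 1
    return counts


if __name__ == "__main__":
    import os

    # Path taken from the download commands above.
    ckpt = "pretrained/model_fp16_fixrot.safetensors"
    if os.path.exists(ckpt):
        print(summarize_dtypes(read_safetensors_header(ckpt)))
```

A truncated download typically shows up here as a JSON decoding error or an implausible header length.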
Inference
Inference takes about 10 GB of GPU memory (with ImageDream, MVDream, and our LGM all loaded).
```bash
### gradio app for both text/image to 3D
python app.py big --resume pretrained/model_fp16_fixrot.safetensors

### test
# --workspace: folder to save output (*.ply and *.mp4)
# --test_path: path to a folder containing images, or a single image
python infer.py big --resume pretrained/model_fp16_fixrot.safetensors --workspace workspace_test --test_path data_test

### local gui to visualize saved ply
python gui.py big --output_size 800 --test_path workspace_test/saved.ply

### mesh conversion
python convert.py big --test_path workspace_test/saved.ply
```
For more options, please check options.
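Before opening a saved result in the GUI or converting it to a mesh, a quick look at the PLY header shows whether the file was written and how many points it declares. The following is a minimal sketch and not part of the repository: `read_ply_vertex_count` is a hypothetical helper, and it assumes the saved file follows the standard PLY header layout with the Gaussians stored as a `vertex` element, which is the usual convention for Gaussian splatting exports.

```python
def read_ply_vertex_count(path):
    """Return the vertex count declared in a PLY header, or None if absent."""
    with open(path, "rb") as f:
        # Every PLY file starts with the magic line "ply".
        if f.readline().strip() != b"ply":
            raise ValueError(f"{path} does not start with a PLY magic line")
        for raw in f:
            line = raw.strip()
            if line == b"end_header":
                # Stop before the (possibly binary) payload.
                break
            parts = line.split()
            if len(parts) == 3 and parts[0] == b"element" and parts[1] == b"vertex":
                return int(parts[2])
    return None


if __name__ == "__main__":
    import os

    # Path taken from the commands above.
    ply_path = "workspace_test/saved.ply"
    if os.path.exists(ply_path):
        print("declared vertices:", read_ply_vertex_count(ply_path))
```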
Training
NOTE:
Since the dataset used in our training is based on AWS, it cannot be directly used for training in a new environment.
We provide the necessary training code framework; please check and modify the dataset implementation!
We also provide the ~80K subset of Objaverse used to train LGM in objaverse_filter.
```bash
# debug training
accelerate launch --config_file acc_configs/gpu1.yaml main.py big --workspace workspace_debug

# training (use slurm for multi-nodes training)
accelerate launch --config_file acc_configs/gpu8.yaml main.py big --workspace workspace
```
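When adapting the dataset implementation to local storage, as the note above asks, a first step might be to check which objects from the filtered list are actually present on disk. The sketch below is illustrative only and makes two loud assumptions that the README does not confirm: that the list is a plain-text file with one object identifier per line, and that each object's renderings live in a folder named after its identifier under a local root. Both helper names are hypothetical.

```python
from pathlib import Path


def load_object_ids(list_path):
    """Read object identifiers, one per line, skipping blank lines (assumed file format)."""
    ids = []
    for line in Path(list_path).read_text().splitlines():
        line = line.strip()
        if line:
            ids.append(line)
    return ids


def split_available(object_ids, data_root):
    """Partition identifiers by whether a matching folder exists under data_root (assumed layout)."""
    root = Path(data_root)
    found, missing = [], []
    for oid in object_ids:
        if (root / oid).is_dir():
            found.append(oid)
        else:
            missing.append(oid)
    return found, missing
```

The `found` list could then feed a local replacement for the AWS-backed loading in the dataset class, with `missing` reported up front instead of failing mid-epoch.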
Acknowledgement
This work is built on many amazing research works and open-source projects. Thanks a lot to all the authors for sharing!
Citation
@article{tang2024lgm,
title={LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation},
author={Tang, Jiaxiang and Chen, Zhaoxi and Chen, Xiaokang and Wang, Tengfei and Zeng, Gang and Liu, Ziwei},
journal={arXiv preprint arXiv:2402.05054},
year={2024}
}
