Geom3D

Geom3D: Geometric Modeling on 3D Structures, NeurIPS 2023

From chao1224 · Updated June 11, 2026

Authors: Shengchao Liu, Weitao Du, Yanjing Li, Zhuoxinran Li, Zhiling Zheng, Chenru Duan, Zhiming Ma, Omar Yaghi, Anima Anandkumar, Christian Borgs, Jennifer Chayes, Hongyu Guo, Jian Tang

The project is written primarily in Python, is distributed under the MIT License, and was first published in 2023. Key topics include 3d, 3d-structures, ai4science, biology, and chemistry.

Symmetry-Informed Geometric Representation for Molecules, Proteins, and Crystalline Materials

[ArXiv]

This is Geom3D, a platform for geometric modeling on 3D structures:

<p align="center"> <img src="./figure/pipeline.jpg" /> </p>

Environment

Conda

Set up Anaconda:

bash
wget https://repo.continuum.io/archive/Anaconda3-2019.10-Linux-x86_64.sh
bash Anaconda3-2019.10-Linux-x86_64.sh -b
export PATH=$PWD/anaconda3/bin:$PATH

Packages

Start with some basic packages.

bash
conda create -n Geom3D python=3.7
conda activate Geom3D
conda install -y -c rdkit rdkit
conda install -y numpy networkx scikit-learn
conda install -y -c conda-forge -c pytorch pytorch=1.9.1
conda install -y -c pyg -c conda-forge pyg=2.0.2
pip install ogb==1.2.1
pip install sympy
pip install ase
pip install lie_learn  # for TFN and SE3-Trans
pip install packaging  # for SEGNN
pip3 install e3nn  # for SEGNN
pip install transformers  # for smiles
pip install selfies  # for selfies
pip install atom3d  # for Atom3D
pip install cffi  # for Atom3D
pip install biopython  # for Atom3D
pip install cython  # for pyximport
conda install -y -c conda-forge py-xgboost-cpu  # for XGB
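After installation, it can help to confirm that the packages are importable from the new environment. The following is a minimal sketch, not part of Geom3D: the helper `find_missing` and the list `REQUIRED_MODULES` are hypothetical names, and the import names are assumptions mapped from the install commands above (for example, `scikit-learn` is imported as `sklearn`, `pyg` as `torch_geometric`, and `biopython` as `Bio`).

```python
import importlib.util

# Import names assumed to correspond to the packages installed above.
# Package names and import names can differ, so adjust as needed.
REQUIRED_MODULES = [
    "rdkit", "numpy", "networkx", "sklearn", "torch", "torch_geometric",
    "ogb", "sympy", "ase", "lie_learn", "packaging", "e3nn",
    "transformers", "selfies", "atom3d", "cffi", "Bio", "Cython", "xgboost",
]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in modules:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # Treat lookup errors the same as "not found".
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
```

This only checks that each module can be located; it does not verify versions such as `pytorch=1.9.1` or `pyg=2.0.2`.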

Datasets

We cover three types of structures (small molecules, proteins, and materials), grouped into four dataset categories:

  • Small Molecules
    • QM9
    • MD17
    • rMD17
    • COLL
  • Proteins
    • EC
    • FOLD
  • Small Molecules and Proteins
    • LBA
    • LEP
  • Materials
    • MatBench
    • QMOF

For dataset acquisition:

  • We provide a set of raw and processed datasets on HuggingFace. You can download the data by running python download_data.py under ./data.
  • Please refer to the data folder for more details.
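If you prefer to trigger the download from Python (for example, inside a setup script), the step above could be wrapped as in the sketch below. The functions `build_download_command` and `download` are hypothetical helpers, not part of Geom3D, and the sketch assumes `download_data.py` takes no required arguments, which the text above does not confirm.

```python
import subprocess
import sys
from pathlib import Path


def build_download_command(repo_root="."):
    """Return (command, working_directory) for running download_data.py under ./data."""
    data_dir = Path(repo_root) / "data"
    # Use the current interpreter so the active conda environment is respected.
    return [sys.executable, "download_data.py"], data_dir


def download(repo_root=".", dry_run=True):
    """Run the download script, or only print the command when dry_run is True."""
    command, data_dir = build_download_command(repo_root)
    if dry_run:
        print("Would run:", " ".join(command), "in", data_dir)
        return None
    # check=True raises CalledProcessError if the script exits with an error.
    return subprocess.run(command, cwd=data_dir, check=True)


if __name__ == "__main__":
    download(dry_run=True)
```

Calling `download(dry_run=False)` from the repository root would run the script in `./data`.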

Overview of Models

Representation Models

Geom3D includes the following representation models:

We also include the following 7 1D models and 11 2D models (specifically for small molecules):

Note that no pretraining is considered at this stage. For geometric pretraining models, please see the following section.

Geometric Pretraining

We include the following 14 geometric pretraining methods:

Scripts

The Python scripts can be found in examples_3D. The bash scripts (with hyperparameters) are listed in scripts. For example, the bash script for SchNet on QM9 is:

cd examples_3D

export model_3d=SchNet
export dataset=QM9
export task_list=(mu alpha homo lumo gap r2 zpve u0 u298 h298 g298 cv)

export lr_list=(5e-4)
export lr_scheduler_list=(CosineAnnealingLR)
export split=customized_01
export seed=42
export emb_dim_list=(128 300)
export batch_size_list=(128)

export epochs=1000

for task in "${task_list[@]}"; do
for lr in "${lr_list[@]}"; do
for lr_scheduler in "${lr_scheduler_list[@]}"; do
for emb_dim in "${emb_dim_list[@]}"; do
for batch_size in "${batch_size_list[@]}"; do

    export output_model_dir=output/random/"$model_3d"/"$dataset"/"$task"_"$split"_"$seed"/"$lr"_"$lr_scheduler"_"$emb_dim"_"$batch_size"_"$epochs"
    export output_file="$output_model_dir"/result.out
    mkdir -p "$output_model_dir"

    python finetune_QM9.py \
    --model_3d="$model_3d" --dataset="$dataset" --epochs="$epochs" \
    --task="$task" \
    --split="$split" --seed="$seed" \
    --batch_size="$batch_size" \
    --emb_dim="$emb_dim" \
    --lr="$lr" --lr_scheduler="$lr_scheduler" --no_eval_train --print_every_epoch=1 --num_workers=8 \
    --output_model_dir="$output_model_dir" \
    > "$output_file"
    
done
done
done
done
done
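The nested loops above enumerate a hyperparameter grid: 12 tasks, 1 learning rate, 1 scheduler, 2 embedding dimensions, and 1 batch size, for 24 runs in total. As an illustration only, the same grid can be sketched in Python with `itertools.product`. The function `build_runs` is a hypothetical helper, not part of Geom3D; it only builds the output directories and command lines and does not launch any training.

```python
from itertools import product

model_3d = "SchNet"
dataset = "QM9"
task_list = ["mu", "alpha", "homo", "lumo", "gap", "r2",
             "zpve", "u0", "u298", "h298", "g298", "cv"]
lr_list = ["5e-4"]  # kept as strings so directory names match the bash script
lr_scheduler_list = ["CosineAnnealingLR"]
emb_dim_list = [128, 300]
batch_size_list = [128]
split = "customized_01"
seed = 42
epochs = 1000


def build_runs():
    """Return a list of (output_model_dir, command) pairs, one per grid point."""
    runs = []
    # product iterates in the same order as the nested bash loops (task outermost).
    for task, lr, lr_scheduler, emb_dim, batch_size in product(
        task_list, lr_list, lr_scheduler_list, emb_dim_list, batch_size_list
    ):
        output_model_dir = (
            f"output/random/{model_3d}/{dataset}/{task}_{split}_{seed}/"
            f"{lr}_{lr_scheduler}_{emb_dim}_{batch_size}_{epochs}"
        )
        command = [
            "python", "finetune_QM9.py",
            f"--model_3d={model_3d}", f"--dataset={dataset}", f"--epochs={epochs}",
            f"--task={task}",
            f"--split={split}", f"--seed={seed}",
            f"--batch_size={batch_size}",
            f"--emb_dim={emb_dim}",
            f"--lr={lr}", f"--lr_scheduler={lr_scheduler}",
            "--no_eval_train", "--print_every_epoch=1", "--num_workers=8",
            f"--output_model_dir={output_model_dir}",
        ]
        runs.append((output_model_dir, command))
    return runs


if __name__ == "__main__":
    for output_model_dir, command in build_runs():
        print(output_model_dir)
```

Each command could then be passed to `subprocess.run` from within examples_3D, with standard output redirected to `result.out` in the corresponding directory, mirroring the bash script.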

Currently, only the bash scripts for QM9 are available. We will release the complete version soon, together with a Notebook demo. Please stay tuned.

Checkpoints

Checkpoints for all the pretraining and downstream tasks will be released soon.

Cite us

Feel free to cite this work if you find it useful!

@article{liu2023symmetry,
    title={Symmetry-Informed Geometric Representation for Molecules, Proteins, and Crystalline Materials},
    author={Liu, Shengchao and Du, Weitao and Li, Yanjing and Li, Zhuoxinran and Zheng, Zhiling and Duan, Chenru and Ma, Zhiming and Yaghi, Omar and Anandkumar, Anima and Borgs, Christian and others},
    journal={arXiv preprint arXiv:2306.09375},
    year={2023}
}

This article is auto-generated from chao1224/Geom3D via the GitHub API. Last fetched: 6/19/2026