Image Generation models
Generative models (GAN, VAE, Diffusion Models, Autoregressive Models) implemented with PyTorch, PyTorch Lightning and Hydra.
The project is written primarily in Python and was first published in 2021. Key topics include: computer-vision, diffusion-models, gan-pytorch, gans, generative-adversarial-network.
:img-size: 200
:toc: macro
image:https://img.shields.io/badge/-Python 3.7--3.9-blue?style=for-the-badge&logo=python&logoColor=white[python, link=https://pytorch.org/get-started/locally/]
image:https://img.shields.io/badge/-PyTorch 1.8+-ee4c2c?style=for-the-badge&logo=pytorch&logoColor=white[pytorch, link=https://pytorch.org/]
image:https://img.shields.io/badge/-Lightning 1.3+-792ee5?style=for-the-badge&logo=pytorchlightning&logoColor=white[pytorch_lightning, link=https://www.pytorchlightning.ai/]
image:https://img.shields.io/badge/config-hydra 1.1-89b8cd?style=for-the-badge&labelColor=gray[hydra, link=https://hydra.cc/]

An easily scalable and hierarchical framework that includes many image generation methods across various datasets.

== Highlights 💡
- Various types of image generation methods (continuously updated):
** GANs: WGAN, InfoGAN, BiGAN
** VAEs: VQ-VAE, Beta-VAE, FactorVAE
** Autoregressive Models: PixelCNN
** Diffusion Models: DDPM
- Decomposition of model training, datasets and networks:
+
[source, bash]
python run.py model=wgan networks=conv64 datamodule=celeba exp_name=wgan/celeba_conv64
- Hierarchical configuration of experiments in YAML files:
** Change configs manually in `configs/model`, `configs/datamodule` and `configs/networks`
** Run predefined experiments in `configs/experiment`
+
[source, bash]
python run.py experiment=vanilla_gan/cifar10
** Override hyperparameters from the command line
+
[source, bash]
python run.py experiment=vanilla_gan/cifar10 model.lrG=1e-3 model.lrD=1e-3 exp_name=vanilla_gan/custom_lr
- Run multiple experiments at the same time:
** Grid search of hyperparameters (sweeps need `-m`; single quotes stop the shell from expanding `${model.lr}`):
+
[source, bash]
python run.py -m experiment=vae/mnist_conv model.lr=1e-3,5e-4,1e-4 'exp_name=vae/lr_${model.lr}'
** Run multiple experiments from config files:
+
[source, bash]
python run.py -m experiment=vae/mnist_conv,vae/cifar10,vae/celeba
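
Hydra's multirun mode (`-m`) launches one job per combination of the comma-separated values. The sketch below mimics that expansion in plain Python so the resulting job list is easy to picture. It is not Hydra's implementation; `expand_sweep` and `apply_override` are hypothetical helpers introduced only for illustration, and values are kept as strings.

```python
from itertools import product


def expand_sweep(overrides):
    """Expand comma-separated override values into one override list per job."""
    keys, choices = [], []
    for item in overrides:
        key, _, value = item.partition("=")
        keys.append(key)
        choices.append(value.split(","))
    # One job per element of the Cartesian product of all value lists.
    return [
        [f"{key}={value}" for key, value in zip(keys, combo)]
        for combo in product(*choices)
    ]


def apply_override(config, override):
    """Set a dotted key such as 'model.lr' inside a nested dict."""
    dotted_key, _, value = override.partition("=")
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = value
    return config


jobs = expand_sweep(["experiment=vae/mnist_conv", "model.lr=1e-3,5e-4,1e-4"])
for job in jobs:
    config = {}
    for override in job:
        apply_override(config, override)
    print(config)
```

Sweeping two keys at once, for example `model.lr=1e-3,1e-4 model.beta=1,4` (the second key is made up here), would yield four jobs under the same rule.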
== Setup
- Clone this repo
+
[source, bash]
git clone https://github.com/Victarry/Image-Generation-models.git
- Create a new Python environment with conda and install the requirements
+
[source, bash]
conda create -n image-generation python=3.10
conda activate image-generation
pip install -r requirement.txt
- Run your first experiment ✔️
+
[source, bash]
python run.py experiment=vae/mnist_conv
For different datasets, refer to documentation of datasets.
== Project Structure
== Generative Adversarial Networks (GANs)
=== GAN
Generative Adversarial Nets. +
Ian J. Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, Yoshua Bengio. +
NeurIPS 2014. [https://arxiv.org/abs/1406.2661[PDF]] [https://arxiv.org/abs/1701.00160[Tutorial]]
[cols="4*", options="header"]
|===
^| Dataset
^| MNIST
^| CelebA
^| CIFAR10
^.^| Results
| image:assets/gan/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/gan/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/gan/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===
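
As a rough illustration of the objective in the paper, the sketch below computes the discriminator's binary cross-entropy loss and the commonly used non-saturating generator loss from discriminator outputs. It uses plain Python lists instead of tensors and is not the repository's training code; the sample probabilities are made up.

```python
import math


def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Binary cross-entropy: real samples labelled 1, generated samples labelled 0."""
    real_term = -sum(math.log(p + eps) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p + eps) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss: minimise -log D(G(z))."""
    return -sum(math.log(p + eps) for p in d_fake) / len(d_fake)


# Hypothetical discriminator outputs (probability of "real") for one batch.
d_real = [0.9, 0.8, 0.95]
d_fake = [0.1, 0.3, 0.2]
print("D loss:", discriminator_loss(d_real, d_fake))
print("G loss:", generator_loss(d_fake))
```

The generator loss shrinks as the discriminator assigns higher probabilities to generated samples, which is the signal the generator trains on.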
=== LSGAN
Least Squares Generative Adversarial Networks. +
Xudong Mao, Qing Li, Haoran Xie, Raymond Y.K. Lau, Zhen Wang, Stephen Paul Smolley. +
ICCV 2017. [https://arxiv.org/abs/1611.04076[PDF]]
[cols="4*", options="header"]
|===
^| Dataset
^| MNIST
^| CelebA
^| CIFAR10
^.^| Results
| image:assets/lsgan/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/lsgan/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/lsgan/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===
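
LSGAN replaces the cross-entropy terms with squared errors against target labels. A minimal sketch, assuming the common label choice of 1 for real and 0 for fake and raw (unbounded) discriminator scores; the function names are illustrative and not taken from the repository.

```python
def mean(values):
    return sum(values) / len(values)


def lsgan_discriminator_loss(d_real, d_fake, real_label=1.0, fake_label=0.0):
    """Push scores of real samples towards real_label and fake ones towards fake_label."""
    real_term = mean([(d - real_label) ** 2 for d in d_real])
    fake_term = mean([(d - fake_label) ** 2 for d in d_fake])
    return 0.5 * real_term + 0.5 * fake_term


def lsgan_generator_loss(d_fake, target=1.0):
    """Push scores of generated samples towards the label the discriminator uses for real data."""
    return 0.5 * mean([(d - target) ** 2 for d in d_fake])


# Hypothetical discriminator scores for one batch.
print(lsgan_discriminator_loss([0.9, 1.1], [0.2, -0.1]))
print(lsgan_generator_loss([0.2, -0.1]))
```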
=== WGAN
Wasserstein GAN +
Martin Arjovsky, Soumith Chintala, Léon Bottou. +
ICML 2017. [https://arxiv.org/abs/1701.07875[PDF]]
[cols="4*", options="header"]
|===
^| Dataset
^| MNIST
^| CelebA
^| CIFAR10
^.^| Results
| image:assets/wgan/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/wgan/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/wgan/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===
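
In WGAN the critic outputs unbounded scores, its loss is the difference between the mean score on generated and real samples, and its weights are clipped after each update to keep it roughly Lipschitz. The sketch below shows those three pieces in plain Python; the clipping bound 0.01 is the default reported in the paper, and the helper names are illustrative only.

```python
def critic_loss(f_real, f_fake):
    """Minimising this maximises E[f(real)] - E[f(fake)]."""
    return sum(f_fake) / len(f_fake) - sum(f_real) / len(f_real)


def generator_loss(f_fake):
    """The generator tries to raise the critic's score on generated samples."""
    return -sum(f_fake) / len(f_fake)


def clip_weights(weights, c=0.01):
    """Clamp every critic weight into [-c, c]."""
    return [max(-c, min(c, w)) for w in weights]


# Hypothetical critic scores and weights.
print(critic_loss([2.0, 4.0], [1.0, 1.0]))
print(clip_weights([0.5, -0.5, 0.005]))
```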
=== WGAN-GP
Improved Training of Wasserstein GANs. +
Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, Aaron Courville +
NeurIPS 2017. [https://arxiv.org/abs/1704.00028[PDF]]
[cols="4*", options="header"]
|===
^| Dataset
^| MNIST
^| CelebA
^| CIFAR10
^.^| Results
| image:assets/wgangp/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/wgangp/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/wgangp/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===
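
WGAN-GP drops weight clipping and instead penalises the critic when the norm of its gradient at a random point between a real and a generated sample deviates from 1. The sketch below is a toy version in plain Python: it estimates the gradient with finite differences, whereas a real implementation would use automatic differentiation. The penalty weight 10 follows the paper; `linear_critic` is a made-up stand-in for a network.

```python
import math
import random


def numerical_gradient(critic, x, h=1e-5):
    """Central finite-difference gradient of a scalar critic at point x."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += h
        down[i] -= h
        grad.append((critic(up) - critic(down)) / (2 * h))
    return grad


def gradient_penalty(critic, real, fake, lam=10.0, rng=random):
    """lam * (||grad critic(x_hat)||_2 - 1)^2 at a random interpolate x_hat."""
    eps = rng.random()
    x_hat = [eps * r + (1.0 - eps) * f for r, f in zip(real, fake)]
    grad = numerical_gradient(critic, x_hat)
    norm = math.sqrt(sum(g * g for g in grad))
    return lam * (norm - 1.0) ** 2


def linear_critic(x):
    # Toy critic whose gradient is (3, 4) everywhere, so its gradient norm is 5.
    return 3.0 * x[0] + 4.0 * x[1]


print(gradient_penalty(linear_critic, real=[1.0, 2.0], fake=[0.0, 0.0]))
```

For this toy critic the penalty is close to 10 * (5 - 1)^2 = 160 regardless of the interpolation point, because a linear function has the same gradient everywhere.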
=== VAE-GAN
Autoencoding beyond pixels using a learned similarity metric. +
Anders Boesen Lindbo Larsen, Søren Kaae Sønderby, Hugo Larochelle, Ole Winther. +
ICML 2016. [https://arxiv.org/abs/1512.09300[PDF]]
[cols="4*", options="header"]
|===
^| Dataset
^| MNIST
^| CelebA
^| CIFAR10
^.^| Results
| image:assets/vaegan/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/vaegan/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/vaegan/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===
=== BiGAN/ALI
BiGAN: Adversarial Feature Learning. +
Jeff Donahue, Philipp Krähenbühl, Trevor Darrell. +
ICLR 2017. [https://arxiv.org/abs/1605.09782[PDF]]

ALI: Adversarially Learned Inference. +
Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Olivier Mastropietro, Alex Lamb, Martin Arjovsky, Aaron Courville. +
ICLR 2017. [https://arxiv.org/abs/1606.00704[PDF]]
[cols="4*", options="header"]
|===
^| Dataset
^| MNIST
^| CelebA
^| CIFAR10
^.^| Results
| image:assets/bigan/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/bigan/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/bigan/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===
=== GGAN
Geometric GAN +
Jae Hyun Lim, Jong Chul Ye. +
arXiv 2017. [https://arxiv.org/abs/1705.02894[PDF]]
[cols="4*", options="header"]
|===
^| Dataset
^| MNIST
^| CelebA
^| CIFAR10
^.^| Results
| image:assets/ggan/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/ggan/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/ggan/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===
=== InfoGAN
InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets +
Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, Pieter Abbeel +
NeurIPS 2016. [https://arxiv.org/abs/1606.03657[PDF]]
[cols="5*", options="header"]
|===
^| Manipulated Latent
^| Random samples
^| Discrete Latent (class label)
^| Continuous Latent-1 (rotation)
^| Continuous Latent-2 (thickness)
^.^| Results
| image:assets/infogan/random.jpg[infogan_random, {img-size}, {img-size}]
| image:assets/infogan/class.jpg[infogan_class, {img-size}, {img-size}]
| image:assets/infogan/rotation.jpg[infogan_rotation, {img-size}, {img-size}]
| image:assets/infogan/thickness.jpg[infogan_thickness, {img-size}, {img-size}]
|===
== Variational Autoencoders (VAEs)
=== VAE
Auto-Encoding Variational Bayes. +
Diederik P. Kingma, Max Welling. +
ICLR 2014. [https://arxiv.org/abs/1312.6114[PDF]]
[cols="4*", options="header"]
|===
^| Dataset
^| MNIST
^| CelebA
^| CIFAR10
^.^| Results
| image:assets/vae/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/vae/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/vae/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===
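
Two pieces of the VAE objective are easy to show in isolation: the reparameterisation trick for sampling the latent code, and the closed-form KL divergence between a diagonal Gaussian posterior and a standard normal prior. The sketch below uses plain Python lists and assumes the encoder outputs a mean and a log-variance per latent dimension; it is not the repository's implementation.

```python
import math
import random


def reparameterize(mu, logvar, rng=random):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) for each latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


# Hypothetical encoder outputs for a 2-dimensional latent space.
mu, logvar = [0.5, -1.0], [0.0, -0.5]
print(reparameterize(mu, logvar))
print(kl_to_standard_normal(mu, logvar))
```

The KL term is zero exactly when the posterior equals the prior (zero mean, zero log-variance) and grows as the posterior moves away from it.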
=== cVAE
Learning Structured Output Representation using Deep Conditional Generative Models +
Kihyuk Sohn, Honglak Lee, Xinchen Yan. +
NeurIPS 2015. [https://papers.nips.cc/paper/2015/hash/8d55a249e6baa5c06772297520da2051-Abstract.html[PDF]]
[cols="3*", options="header"]
|===
^| Dataset
^| MNIST
^| CIFAR10
^.^| Results
| image:assets/cvae/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/cvae/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===
=== Beta-VAE
beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework +
Irina Higgins, Loic Matthey, Arka Pal, Christopher Burgess, Xavier Glorot, Matthew Botvinick, Shakir Mohamed, Alexander Lerchner. +
ICLR 2017. [https://openreview.net/forum?id=Sy2fzU9gl[PDF]]
[cols="3*", options="header"]
|===
^| Dataset
^| CelebA
^| dSprites
^.^| Sample
| image:assets/beta_vae/celeba_sample.jpg[celeba, {img-size}, {img-size}]
| image:assets/beta_vae/dsprites_sample.jpg[dsprites, {img-size}, {img-size}]
^.^| Latent Interpolation
| image:assets/beta_vae/celeba_traverse.jpg[celeba, {img-size}, {img-size}]
| image:assets/beta_vae/dsprites_traverse.jpg[dsprites, {img-size}, {img-size}]
|===
=== Factor-VAE
Disentangling by Factorising +
Hyunjik Kim, Andriy Mnih. +
ICML 2018. [https://arxiv.org/abs/1802.05983[PDF]]
[cols="3*", options="header"]
|===
^| Dataset
^| CelebA
^| dSprites
^.^| Sample
| image:assets/factor_vae/fvae_sample_celeba.jpg[celeba, {img-size}, {img-size}]
| image:assets/factor_vae/fvae_dsprites_sample.jpg[dsprites, {img-size}, {img-size}]
^.^| Latent Interpolation
| image:assets/factor_vae/fvae_celeba_traverse.jpg[celeba, {img-size}, {img-size}]
| image:assets/factor_vae/fvae_dsprites_traverse.jpg[dsprites, {img-size}, {img-size}]
|===
=== AAE
Adversarial Autoencoders. +
Alireza Makhzani, Jonathon Shlens, Navdeep Jaitly, Ian Goodfellow, Brendan Frey. +
arXiv 2015. [https://arxiv.org/abs/1511.05644[PDF]]
[cols="4*", options="header"]
|===
^| Dataset
^| MNIST
^| CelebA
^| CIFAR10
^.^| Results
| image:assets/aae/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/aae/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/aae/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===
=== AGE
It Takes (Only) Two: Adversarial Generator-Encoder Networks. +
Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky. +
AAAI 2018. [https://arxiv.org/abs/1704.02304[PDF]]
[cols="4*", options="header"]
|===
^| Dataset
^| MNIST
^| CelebA
^| CIFAR10
^.^| Results
| image:assets/age/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| TODO
| TODO
|===
=== VQ-VAE
Neural Discrete Representation Learning. +
Aaron van den Oord, Oriol Vinyals, Koray Kavukcuoglu +
NeurIPS 2017. [https://arxiv.org/abs/1711.00937[PDF]]
[cols="4*", options="header"]
|===
^| Dataset
^| MNIST
^| CelebA
^| CIFAR10
^.^| Ground truth
| image:assets/vqvae/mnist_real.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/vqvae/celeba_real.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/vqvae/cifar10_real.jpg[cifar10_conv, {img-size}, {img-size}]
^.^| Reconstruction
| image:assets/vqvae/mnist_recon.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/vqvae/celeba_recon.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/vqvae/cifar10_recon.jpg[cifar10_conv, {img-size}, {img-size}]
|===
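
The core of VQ-VAE is replacing each encoder output vector with its nearest entry in a learned codebook. A minimal sketch of that lookup in plain Python, with a made-up three-entry codebook; the straight-through gradient and the codebook and commitment losses from the paper are left out.

```python
def quantize(vector, codebook):
    """Return (index, code) of the codebook entry closest to `vector` in squared L2 distance."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    index = min(range(len(codebook)), key=lambda i: squared_distance(vector, codebook[i]))
    return index, codebook[index]


# Hypothetical codebook and encoder outputs (2-dimensional for readability).
codebook = [[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]]
encoder_outputs = [[0.9, 1.2], [-0.2, 0.1], [-0.8, 1.7]]
indices = [quantize(z, codebook)[0] for z in encoder_outputs]
print(indices)  # [1, 0, 2]
```

The decoder then sees only the selected codebook vectors, so each image is represented by a grid of discrete indices.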
== Autoregressive Models
=== MADE: Masked Autoencoder for Distribution Estimation
MADE: Masked Autoencoder for Distribution Estimation +
Mathieu Germain, Karol Gregor, Iain Murray, Hugo Larochelle +
ICML 2015. [https://arxiv.org/abs/1502.03509[PDF]]
[cols="2*", options="header"]
|===
^| Dataset
^.^| Samples
^.^| MNIST
| image:assets/made/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
|===
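
MADE turns an autoencoder into an autoregressive model by masking weights so that output `d` only depends on inputs that come before `d` in the ordering. The sketch below builds the two masks for a single hidden layer and checks that property. It assumes the natural input ordering and cycles hidden-unit degrees deterministically, whereas the paper samples them; both helper names are illustrative.

```python
def made_masks(n_inputs, n_hidden):
    """Masks for a one-hidden-layer MADE with natural input ordering (n_inputs >= 2)."""
    in_degrees = list(range(1, n_inputs + 1))
    # Hidden degrees cycle through 1..n_inputs-1 (a simplification of random sampling).
    hidden_degrees = [1 + (k % (n_inputs - 1)) for k in range(n_hidden)]
    # Hidden unit k may see input d only if m(k) >= d.
    input_mask = [[1 if m >= d else 0 for d in in_degrees] for m in hidden_degrees]
    # Output d may see hidden unit k only if d > m(k).
    output_mask = [[1 if d > m else 0 for m in hidden_degrees] for d in in_degrees]
    return input_mask, output_mask


def connectivity(input_mask, output_mask):
    """conn[d][j] is 1 when output d has at least one unmasked path from input j."""
    n_inputs = len(output_mask)
    n_hidden = len(input_mask)
    return [
        [
            int(any(output_mask[d][k] and input_mask[k][j] for k in range(n_hidden)))
            for j in range(n_inputs)
        ]
        for d in range(n_inputs)
    ]


input_mask, output_mask = made_masks(n_inputs=4, n_hidden=6)
for row in connectivity(input_mask, output_mask):
    print(row)
```

The printed connectivity matrix is strictly lower triangular: the first output sees no inputs, and no output sees its own input or any later one.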
=== PixelCNN
Conditional Image Generation with PixelCNN Decoders +
Aaron van den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, Koray Kavukcuoglu +
NeurIPS 2016. [https://arxiv.org/abs/1606.05328[PDF]]
[cols="3*", options="header"]
|===
^| Dataset
^.^| Samples
^.^| Class-Conditional Samples
^.^| MNIST
| image:assets/pixelcnn/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/pixelcnn/mnist_cond.jpg[mnist_mlp, {img-size}, {img-size}]
|===
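
PixelCNN-style models keep convolutions causal by zeroing kernel weights that would look at the current or future pixels in raster order. The sketch below builds the simple single-channel mask from the original PixelCNN formulation: type `A` (first layer) hides the centre pixel, type `B` (later layers) keeps it. Note that the gated variant cited above replaces this single mask with separate vertical and horizontal stacks; `causal_mask` is an illustrative helper, not the repository's code.

```python
def causal_mask(kernel_size, mask_type):
    """Return a kernel_size x kernel_size 0/1 mask for a masked convolution."""
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd")
    if mask_type not in ("A", "B"):
        raise ValueError("mask_type must be 'A' or 'B'")
    center = kernel_size // 2
    mask = [[0] * kernel_size for _ in range(kernel_size)]
    for row in range(kernel_size):
        for col in range(kernel_size):
            above = row < center
            left_of_center = row == center and col < center
            is_center = row == center and col == center
            if above or left_of_center or (is_center and mask_type == "B"):
                mask[row][col] = 1
    return mask


for mask_type in ("A", "B"):
    print(mask_type)
    for row in causal_mask(3, mask_type):
        print(row)
```

In a framework implementation the mask would be multiplied element-wise with the convolution weights before every forward pass.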
=== Transformer
Vanilla Transformer-based autoregressive models.
[cols="3*", options="header"]
|===
^| Dataset
^.^| Samples
^.^| Class-Conditional Samples
^.^| MNIST
| image:assets/tar/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/tar/mnist_cond.jpg[mnist_mlp, {img-size}, {img-size}]
|===
[source, bash]
python run.py experiment=tar/mnist
python run.py experiment=tar/mnist_cond
== Diffusion Models
=== DDPM
Denoising Diffusion Probabilistic Models +
Jonathan Ho, Ajay Jain, Pieter Abbeel +
NeurIPS 2020. [https://arxiv.org/abs/2006.11239[PDF]]
[cols="4*", options="header"]
|===
^| Dataset
^| MNIST
^| CelebA
^| CIFAR10
^.^| Results
| image:assets/ddpm/mnist.jpg[mnist_mlp, {img-size}, {img-size}]
| image:assets/ddpm/celeba.jpg[celeba_conv, {img-size}, {img-size}]
| image:assets/ddpm/cifar10.jpg[cifar10_conv, {img-size}, {img-size}]
|===
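
The forward (noising) process of DDPM has a closed form: a sample at step `t` is a weighted mix of the clean image and Gaussian noise, with weights given by the cumulative product of `1 - beta`. The sketch below shows this with the linear schedule from the paper (1000 steps, `beta` from 1e-4 to 0.02) on a plain Python list standing in for an image. The reverse process and the noise-prediction network are not shown, and the function names are illustrative.

```python
import math
import random


def linear_beta_schedule(timesteps=1000, beta_start=1e-4, beta_end=0.02):
    """Evenly spaced noise variances from beta_start to beta_end."""
    step = (beta_end - beta_start) / (timesteps - 1)
    return [beta_start + i * step for i in range(timesteps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product of (1 - beta_s) for s <= t."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng=random):
    """Draw x_t ~ N(sqrt(alpha_bar_t) * x0, (1 - alpha_bar_t) * I)."""
    a = alpha_bars[t]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0) for x in x0]


betas = linear_beta_schedule()
alpha_bars = cumulative_alphas(betas)
x0 = [1.0, -1.0, 0.5]  # hypothetical pixel values scaled to [-1, 1]
print(q_sample(x0, 0, alpha_bars))    # almost the clean input
print(q_sample(x0, 999, alpha_bars))  # almost pure noise
```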
