lixilinx/psgd_torch

PyTorch implementation of preconditioned stochastic gradient descent (Kron and affine preconditioners, low-rank approximation preconditioner, and more)

2 Releases

PSGD 2.0 release · 2.0 · Latest

lixilinx·5mo ago·December 25, 2025

📋 Changes

- [psgd.py](https://github.com/lixilinx/psgd_torch/blob/master/psgd.py): functional APIs that offer full flexibility.
- [wrapped_as_torch_optimizer_for_ddp.py](https://github.com/lixilinx/psgd_torch/blob/master/wrapped_as_torch_optimizer_for_ddp.py): a basic example that wraps momentum whitening as a `torch.optim.Optimizer` for DDP training.
- [wrapped_as_torch_optimizer_for_dtensor.py](https://github.com/lixilinx/psgd_torch/blob/master/wrapped_as_torch_optimizer_for_dtensor.py): a second basic example that wraps momentum whitening as a `torch.optim.Optimizer`, this time for DTensor-based distributed training.

archived code · 1.0

lixilinx·5y ago·December 17, 2020