Quantization Collection
Repositories tagged with "quantization"
LlamaFactory
hiyouga
“Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)”
faster-whisper
SYSTRAN
“Faster Whisper transcription with CTranslate2”
Chinese-LLaMA-Alpaca
ymcui
“Chinese LLaMA & Alpaca large language models with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)”
Qbot
UFund-Me
“[🔥updating ...] AI automated quantitative trading bot (fully local deployment). AI-powered Quantitative Investment Research Platform. Online docs: https://ufund-me.github.io/Qbot ✨ :news: qbot-mini: https://github.com/Charmve/iQuant”
bitsandbytes
bitsandbytes-foundation
“Accessible large language models via k-bit quantization for PyTorch.”
pngquant
kornelski
“Lossy PNG compressor - pngquant command based on libimagequant library”
AutoGPTQ
AutoGPTQ
“An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm.”
CTranslate2
OpenNMT
“Fast inference engine for Transformer models”
turbovec
RyanCodrai
“A vector index built on TurboQuant, written in Rust with Python bindings”
nunchaku
nunchaku-ai
“[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models”
optimum
huggingface
“🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization tools”
SageAttention
thu-ml
“[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.”
llm-compressor
vllm-project
“Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM”
Pretrained-Language-Model
huawei-noah
“Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.”
deepsparse
neuralmagic
“Sparsity-aware deep learning inference runtime for CPUs”
nlp-architect
IntelLabs
“A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks”
ComfyUI-nunchaku
nunchaku-ai
“ComfyUI Plugin of Nunchaku”
ao
pytorch
“PyTorch native quantization and sparsity for training and inference”