Quantization Collection

Repositories tagged with "quantization"

hiyouga

LlamaFactory

“Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)”

SYSTRAN

faster-whisper

Python · deep-learning · inference

“Faster Whisper transcription with CTranslate2”

ymcui

Chinese-LLaMA-Alpaca

Python · alpaca · alpaca-2

“Chinese LLaMA & Alpaca large language models with local CPU/GPU training and deployment”

UFund-Me

Qbot

Jupyter Notebook · backtest · bitcoin

“[🔥updating ...] AI automated quantitative trading bot (fully local deployment). AI-powered Quantitative Investment Research Platform. 📃 online docs: https://ufund-me.github.io/Qbot ✨ :news: qbot-mini: https://github.com/Charmve/iQuant”

bitsandbytes-foundation

bitsandbytes

Python · llm · machine-learning

“Accessible large language models via k-bit quantization for PyTorch.”

kornelski

pngquant

“Lossy PNG compressor — pngquant command based on libimagequant library”

AutoGPTQ

AutoGPTQ

Python · deep-learning · inference

“An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm.”

OpenNMT

CTranslate2

“Fast inference engine for Transformer models”

RyanCodrai

turbovec

Python · ann · avx512

“A vector index built on TurboQuant, written in Rust with Python bindings”

nunchaku-ai

nunchaku

Python · comfyui · diffusion-models

“[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models”

huggingface

optimum

Python · graphcore · habana

“🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy to use hardware optimization tools”

thu-ml

SageAttention

Cuda · attention · cuda

“[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.”

vllm-project

llm-compressor

Python · compression · quantization

“Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM”

huawei-noah

Pretrained-Language-Model

Python · knowledge-distillation · large-scale-distributed

“Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.”

neuralmagic

deepsparse

Python · computer-vision · cpus

“Sparsity-aware deep learning inference runtime for CPUs”

IntelLabs

nlp-architect

Python · bert · deep-learning

“A model library for exploring state-of-the-art deep learning topologies and techniques for optimizing Natural Language Processing neural networks”

nunchaku-ai

ComfyUI-nunchaku

Python · comfyui · diffusion

“ComfyUI Plugin of Nunchaku”

pytorch

ao

“PyTorch native quantization and sparsity for training and inference”
