VLM Collection

Repositories tagged with "vlm"

⭐ 161.1k

transformers

huggingface

Python · audio · deep-learning

“🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.”

UI-TARS-desktop

bytedance

TypeScript · agent · agent-tars

“The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra”

sglang

sgl-project

Python · attention · blackwell

“SGLang is a high-performance serving framework for large language models and multimodal models.”

runanywhere-sdks

RunanywhereAI

C++ · android · apple-intelligence

“Production ready toolkit to run AI locally”

notebooks

roboflow

Jupyter Notebook · automatic-labeling-system · computer-vision

“A collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-edge models like RF-DETR, YOLO11, SAM 3, and Qwen3-VL.”

anomaly-detection-resources

yzhao062

Python · anomaly-detection · awesome

“Anomaly detection related books, papers, videos, and toolboxes. Last update late 2025 for LLM and VLM works!”

nexa-sdk

qualcomm

Kotlin · gemma3 · go

“Run frontier LLMs and VLMs with day-0 model support across GPU, NPU, and CPU, with comprehensive runtime coverage for PC (Python/C++), mobile (Android & iOS), and Linux/IoT (Arm64 & x86 Docker). Supporting OpenAI GPT-OSS, IBM Granite-4, Qwen-3-VL, Gemma-3n, Ministral-3, and more.”

ERNIE

PaddlePaddle

Python · ernie · ernie-45

“The official repository for ERNIE 4.5 and ERNIEKit – its industrial-grade development toolkit based on PaddlePaddle.”

VLM-R1

om-ai-lab

Python · deepseek-r1 · grpo

“Solve Visual Understanding with Reinforced VLMs”

UltraRAG

OpenBMB

Python · deepseek · demo

“A Low-Code MCP Framework for Building Complex and Innovative RAG Pipelines”

LLM-RL-Visualized

changyeyu

Python · ai · algorithm

“🌟 100+ original LLM / RL principle diagrams 📚, a major contribution from the author of 《大模型算法》 (Large Model Algorithms)! 💥 (100+ LLM/RL Algorithm Maps)”

star-vector

joanrod

Python · llm · multimodal-large-language-models

“StarVector is a foundation model for SVG generation that transforms vectorization into a code generation task. Using a vision-language modeling architecture, StarVector processes both visual and textual inputs to produce high-quality SVG code with remarkable precision.”

lmms-eval

EvolvingLMMs-Lab

Python · agi · audio-evaluation

“One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks”

PromptEnhancer

Hunyuan-PromptEnhancer

Python · hunyuan · hunyuan-image

“[CVPR 2026] PromptEnhancer is a prompt-rewriting tool, refining prompts into clearer, structured versions for better image generation.”

MiniMax-01

MiniMax-AI

Python · large-language-models · llm

“The official repo of MiniMax-Text-01 and MiniMax-VL-01, large-language-model & vision-language-model based on Linear Attention”

Local-File-Organizer

QiuYannnn

Python · file-organizer · llama3

“An AI-powered file management tool that ensures privacy by organizing local texts, images. Using Llama3.2 3B and Llava v1.6 models with the Nexa SDK, it intuitively scans, restructures, and organizes files for quick, seamless access and easy retrieval.”