Multi Modal Collection

Repositories tagged with "multi-modal"

agentscope-ai

agentscope

Python · agent · chatbot

“Build and run agents you can see, understand and trust.”

OpenBMB

MiniCPM-V

Python · minicpm · minicpm-o

“A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone”

TEN-framework

ten-framework

Python · ai · multi-modal

“Open-source framework for conversational voice AI agents”

OpenGVLab

InternVL

Python · gpt · gpt-4o

“[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance”

modelscope

modelscope

Python · cv · deep-learning

“ModelScope: bring the notion of Model-as-a-Service to life.”

enricoros

big-AGI

TypeScript · agi · ai-agents

“AI suite powered by state-of-the-art models and providing advanced AI/AGI functions. Includes AI personas, AGI functions, world-class Beam multi-model chats, text-to-image, voice, response streaming, code highlighting and execution, PDF import, presets for developers, much more. Deploy on-prem or in the cloud.”

zai-org

CogVLM

Python · cross-modality · language-model

“a state-of-the-art-level open visual language model | multimodal pre-trained model”

datajuicer

data-juicer

Python · data · data-analysis

“Data processing for and with foundation models! 🍎 🍋 🌽 ➡️ ➡️🍸 🍹 🍷”

OFA-Sys

Chinese-CLIP

Jupyter Notebook · chinese · clip

“Chinese version of CLIP which achieves Chinese cross-modal retrieval and representation generation.”

valhalla

valhalla

C++ · astar · dijkstra

“Open Source Routing Engine for OpenStreetMap”

lucidrains

DALLE-pytorch

Python · artificial-intelligence · attention-mechanism

“Implementation / replication of DALL-E, OpenAI's Text to Image Transformer, in Pytorch”

marqo-ai

marqo

Python · ecommerce · machine-learning

“Ecommerce Search and Discovery - marqo.ai”

zjunlp

DeepKE

Python · attribute-extraction · chinese

“[EMNLP 2022] An Open Toolkit for Knowledge Graph Extraction and Construction”

VectorSpaceLab

OmniGen

Jupyter Notebook · diffusion · image

“OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340”

open-compass

VLMEvalKit

Python · chatgpt · claude

“Open-source evaluation toolkit of large multi-modality models (LMMs), support 220+ LMMs, 80+ benchmarks”

zai-org

VisualGLM-6B

Python · chatglm-6b · gpt

“Chinese and English multimodal conversational language model | multimodal Chinese-English bilingual dialogue language model”

SciSharp

LLamaSharp

“A C#/.NET library to run LLM (🦙LLaMA/LLaVA) on your local device efficiently.”

PKU-YuanGroup

Video-LLaVA

Python · instruction-tuning · large-vision-language-model

“【EMNLP 2024🔥】Video-LLaVA: Learning United Visual Representation by Alignment Before Projection”
