Awesome CLIP

This repo collects research resources based on CLIP (Contrastive Language-Image Pre-Training), proposed by OpenAI. If you would like to contribute, please open an issue.

CLIP

Training

OpenCLIP (3rd-party, PyTorch) [code]
Train-CLIP (3rd-party, PyTorch) [code]
Paddle-CLIP (3rd-party, PaddlePaddle) [code]

Applications

GAN

Object Detection

Roboflow Zero-shot Object Tracking [code]
Zero-Shot Detection via Vision and Language Knowledge Distillation [code]
Crop-CLIP [code]
Detic: Detecting Twenty-thousand Classes using Image-level Supervision [code]
CLIP-TD: CLIP Targeted Distillation for Vision-Language Tasks
SLIP: Self-supervision meets Language-Image Pre-training [code]
ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension [code]

Information Retrieval

Unsplash Image Search [code]
CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval [code]
Less is More: ClipBERT for Video-and-Language Learning via Sparse Sampling [code]
Natural Language YouTube Search [code]
CLIP-as-service: Embed images and sentences into fixed-length vectors with CLIP [code]
clip-retrieval [code]
A CLIP-Hitchhiker’s Guide to Long Video Retrieval [code]
CLIP2Video: Mastering Video-Text Retrieval via Image CLIP [code]
X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval [code]
Extending CLIP for Category-to-image Retrieval in E-commerce [code]

Representation Learning

Text-to-3D Generation

Text-to-Image Generation

Big Sleep: A simple command line tool for text to image generation [code]
Deep Daze: A simple command line tool for text to image generation [code]
CLIP-CLOP: CLIP-Guided Collage and Photomontage [code]
CLIP-GEN: Language-Free Training of a Text-to-Image Generator with CLIP [code]

Prompt Learning

Video Understanding

Image Captioning

CLIP prefix captioning [code]
CLIPScore: A Reference-free Evaluation Metric for Image Captioning [code]
ClipCap: CLIP Prefix for Image Captioning [code]
Text-Only Training for Image Captioning using Noise-Injected CLIP [code]
Fine-grained Image Captioning with CLIP Reward [code]

Image Editing

Image Segmentation

3D Recognition

Audio

Language Tasks

CLIP Models are Few-shot Learners: Empirical Studies on VQA and Visual Entailment [code]

Object Navigation

CLIP on Wheels: Zero-Shot Object Navigation as Object Localization and Exploration [code]
