GitPedia

OmniShotCut

OmniShotCut is a sensitive and more informative SoTA on Shot Boundary Detection task.

From UVA-Computer-Vision-LabΒ·Updated June 18, 2026Β·View on GitHubΒ·

OmniShotCut is a sensitive and more informative SoTA for Shot Boundary Detection. \ OmniShotCut can detect shot changes of the video in diverse sources (anime, vlog, game, shorts, sports, screen recording, etc.), and recognize Sudden Jump and Transitions (dissolve, fade, wipe, etc.) by proposing a Shot-Query-based Video Transformer. The project is written primarily in Python, distributed under the MIT License license, first published in 2026. Key topics include: python, scene-detection, video-processing.

<p align="center"> <img src="__assets__/logo.png" height="100"> </p>

OmniShotCut: Holistic Relational Shot Boundary Detection with Shot-Query Transformer

OmniShotCut is a sensitive and more informative SoTA for Shot Boundary Detection.
OmniShotCut can detect shot changes of the video in diverse sources (anime, vlog, game, shorts, sports, screen recording, etc.), and recognize Sudden Jump and Transitions (dissolve, fade, wipe, etc.) by proposing a Shot-Query-based Video Transformer.

Paper
Website
<a href="https://huggingface.co/spaces/uva-cv-lab/OmniShotCut"><img src="https://img.shields.io/static/v1?label=%F0%9F%A4%97%20HF%20Space&message=Online+Demo&color=orange"></a>
<a href="https://huggingface.co/uva-cv-lab/OmniShotCut"><img src="https://img.shields.io/static/v1?label=%F0%9F%A4%97%20HuggingFace&message=Model+Weight&color=orange"></a>

πŸ”₯ Update | πŸ‘€ Visualization | πŸ”§ Installation | 🐍 Quick Start | ⚑ Inference | πŸ’» OmniShotCut Benchmark

<a name="Update"></a>Update πŸ”₯πŸ”₯πŸ”₯

  • Release ArXiv paper
  • Release the inference weights
  • Release Gradio demo (with online)
  • Release 'pip install omnishotcut' version
  • Release the benchmark
  • Release the training code and curation

:star: If you like OmniShotCut, please help ⭐⭐star⭐⭐ this repo. Thanks! :hugs:

<p align="center"> <img src="__assets__/teaser.png" style="border-radius: 15px"> </p> <p align="center"> <img src="__assets__/model_architecture.png" style="border-radius: 15px"> </p>

<a name="quick_start"></a> Quick Start 🐍

First install PyTorch with CUDA support.
Then install OmniShotCut:

shell
pip install git+https://github.com/UVA-Computer-Vision-Lab/OmniShotCut.git

Once installed, running shot boundary detection is just a few lines:

python
import omnishotcut # Load model β€” accepts a local checkpoint path or HuggingFace repo cut_model = omnishotcut.load("uva-cv-lab/OmniShotCut", filename = "OmniShotCut_ckpt.pth") # Run on a video file ranges = cut_model.inference("video.mp4", mode="clean_shot")

ranges is a list of [start_frame, end_frame] pairs for each detected shot.
By default mode="clean_shot" returns only clean cuts (no transitions).
Use mode="default" to also get dissolves, wipes, and fades with their labels:

python
ranges, intra_labels, inter_labels = cut_model.inference("video.mp4", mode="default")

Besides video file paths, inference() also accepts numpy arrays and torch tensors directly β€” both should be (T, H, W, 3) uint8 RGB:

<a name="Installation"></a> Full Local Installation πŸ”§

shell
conda create -n OmniShotCut python=3.10 conda activate OmniShotCut pip install -r requirements.txt pip install -e .

<a name="fast_inference"></a> Gradio Demo ⚑⚑⚑

Local Gradio can be created by simply running the following:

shell
python app.py

Click "Running on public URL".

<a name="inference"></a> Inference ⚑

This section presents more formal fun and controllable setting in running.

First, let us download the checkpoint

shell
mkdir checkpoints cd checkpoints wget https://huggingface.co/uva-cv-lab/OmniShotCut/resolve/main/OmniShotCut_ckpt.pth

We provide some modes for the inference. 'default' mode will shot the intra and inter label we define.
However, we believe that most users might want the most direct results, which is the general shots without any transitions.
To this end, please use '--mode clean_shot'.

Execute the inference by:

shell
python inference.py --checkpoint_path checkpoints/OmniShotCut_ckpt.pth --input_video_path __assets__/demo_video1.mp4 --overlap_window_length 20 --mode default

Results are saved to results.json. Visualization is saved to demo_video_results/, where vertical bars with the same color indicate the same shot.

πŸ“š Citation

bibtex
@article{wang2026omnishotcut, title={OmniShotCut: Holistic Relational Shot Boundary Detection with Shot-Query Transformer}, author={Wang, Boyang and Xu, Guangyi and Tang, Zhipeng and Zhang, Jiahui and Cheng, Zezhou}, journal={arXiv preprint arXiv:2604.24762}, year={2026} }

Contributors

Showing top 1 contributor by commit count.

View all contributors on GitHub β†’

This article is auto-generated from UVA-Computer-Vision-Lab/OmniShotCut via the GitHub API.Last fetched: 6/18/2026