arxiv:2604.24762

OmniShotCut: Holistic Relational Shot Boundary Detection with Shot-Query Transformer

Published on Apr 27

Submitted by Boyang Wang on Apr 28

Authors:

Boyang Wang

Abstract

OmniShotCut formulates shot boundary detection as structured relational prediction using a shot query-based dense video Transformer, addressing limitations of existing methods through synthetic transition generation and a comprehensive benchmark.

AI-generated summary

Shot Boundary Detection (SBD) aims to automatically identify shot changes and divide a video into coherent shots. Although SBD has been widely studied, existing state-of-the-art methods often produce non-interpretable boundaries on transitions, miss subtle yet harmful discontinuities, and rely on noisy, low-diversity annotations and outdated benchmarks. To alleviate these limitations, we propose OmniShotCut, which formulates SBD as structured relational prediction, jointly estimating shot ranges together with intra-shot and inter-shot relations using a shot-query-based dense video Transformer. To avoid imprecise manual labeling, we adopt a fully synthetic transition generation pipeline that automatically reproduces major transition families with precise boundaries and parameterized variants. We also introduce OmniShotCutBench, a modern wide-domain benchmark enabling holistic and diagnostic evaluation.
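To make the idea of synthetic transitions with precise boundaries concrete, here is a minimal sketch in plain Python. It is not the authors' pipeline: the names `Segment` and `synth_dissolve`, the toy frame representation (a flat list of floats), and the linear alpha ramp are all assumptions chosen for illustration. The point it shows is that when a transition is generated programmatically, its exact start and end frames are known by construction, so no manual labeling is needed.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Toy stand-in for an image: a flat list of pixel intensities (assumption).
Frame = List[float]


@dataclass
class Segment:
    """Hypothetical label record: a half-open frame range and its type."""
    start: int   # first frame index (inclusive)
    end: int     # one past the last frame index (exclusive)
    label: str   # "shot" or a transition family such as "dissolve"


def synth_dissolve(
    clip_a: List[Frame], clip_b: List[Frame], length: int
) -> Tuple[List[Frame], List[Segment]]:
    """Join two clips with a linear cross-dissolve lasting `length` frames.

    The last `length` frames of clip_a are blended with the first `length`
    frames of clip_b. `length == 0` degenerates to a hard cut.
    """
    if not 0 <= length < min(len(clip_a), len(clip_b)):
        raise ValueError("length must be >= 0 and shorter than both clips")

    n_a = len(clip_a) - length              # frames of clip_a left untouched
    frames = [list(f) for f in clip_a[:n_a]]

    for i in range(length):
        # Blend weight rises strictly between 0 and 1 across the transition.
        alpha = (i + 1) / (length + 1)
        fa, fb = clip_a[n_a + i], clip_b[i]
        frames.append([(1 - alpha) * x + alpha * y for x, y in zip(fa, fb)])

    frames.extend(list(f) for f in clip_b[length:])

    # Boundaries are exact because we generated the transition ourselves.
    segments = [Segment(0, n_a, "shot")]
    if length > 0:
        segments.append(Segment(n_a, n_a + length, "dissolve"))
    segments.append(Segment(n_a + length, len(frames), "shot"))
    return frames, segments


if __name__ == "__main__":
    dark = [[0.0, 0.0] for _ in range(5)]
    bright = [[1.0, 1.0] for _ in range(5)]
    video, labels = synth_dissolve(dark, bright, length=3)
    for seg in labels:
        print(seg)
```

Varying `length`, or swapping the blend for a different per-frame operation such as a fade or a wipe mask, would give the kind of parameterized variants the abstract mentions, again with labels that follow directly from the generation parameters.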

Community

HikariDawn

Paper author, paper submitter · about 24 hours ago · edited about 23 hours ago

OmniShotCut is a more sensitive and more informative state-of-the-art method for the Shot Boundary Detection task.
It detects shot changes in videos from diverse sources (anime, vlog, game, shorts, sports, screen recording, etc.) and recognizes both Sudden Jumps and Transitions (dissolve, fade, wipe, etc.) using a Shot-Query-based Video Transformer.