Video Instance Segmentation on OVIS

Leaderboard

Paper	Code	mask AP	AP50	AP75	APho	APmo	AR1	APso	AR10	Model	Release date
DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries	✓ Link	57.1	83.8	62.9						DVIS-DAQ(VIT-L, Offline)	2024-03-29
Context-Aware Video Instance Segmentation	✓ Link	57.1	82.6	63.5			21.2		61.8	CAVIS(VIT-L, Offline)	2024-07-03
DVIS++: Improved Decoupled Framework for Universal Video Segmentation	✓ Link	53.4	78.9	58.5						DVIS++(VIT-L, Offline)	2023-12-20
General Object Foundation Model for Images and Videos at Scale	✓ Link	50.4		55.5						GLEE-Pro	2023-12-14
DVIS: Decoupled Video Instance Segmentation Framework	✓ Link	49.9	75.9	53.0			19.4		55.3	DVIS(Swin-L, Offline)	2023-06-06
DVIS++: Improved Decoupled Framework for Universal Video Segmentation	✓ Link	49.6	72.5	55.0	27.1	56.6	20.8	69.9	54.6	DVIS++(VIT-L, Online)	2023-12-20
Universal Instance Perception as Object Discovery and Retrieval	✓ Link	49.0	72.5	52.2						UNINEXT (ViT-H, Online)	2023-03-12
DVIS: Decoupled Video Instance Segmentation Framework	✓ Link	47.1	71.9	49.2			19.4		52.5	DVIS(Swin-L, Online)	2023-06-06
CTVIS: Consistent Training for Online Video Instance Segmentation	✓ Link	46.9	71.5	47.5	19.1	52.1				CTVIS (Swin-L)	2023-07-24
RefineVIS: Video Instance Segmentation with Temporal Attention Refinement		46.0	70.4	48.4			19.1		51.2	RefineVIS (Swin-L, offline)	2023-06-07
GRAtt-VIS: Gated Residual Attention for Auto Rectifying Video Instance Segmentation	✓ Link	45.7	69.1	47.8			19.2		49.4	GRAtt-VIS (Swin-L)	2023-05-26
A Generalized Framework for Video Instance Segmentation	✓ Link	45.4	69.2	47.8			18.9		49.0	GenVIS (Swin-L)	2022-11-16
NOVIS: A Case for End-to-End Near-Online Video Instance Segmentation		43.5	68.3	43.8			19.4		46.9	NOVIS (Swin-L)	2023-08-29
TarViS: A Unified Approach for Target-based Video Segmentation	✓ Link	43.2	67.8	44.6			18.0		50.4	TarViS (Swin-L)	2023-01-06
In Defense of Online Models for Video Instance Segmentation	✓ Link	42.6	65.7	45.2			17.9		49.6	IDOL (Swin-L)	2022-07-21
Robust Online Video Instance Segmentation with Track Queries	✓ Link	42.6	64.7	42.6			18.4		49.1	ROVIS (Swin-L)	2022-11-16
MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos	✓ Link	42.6	67.8	44.3	21.6	49.3	18.3	65.1	46.5	MDQE(SwinL)	2023-03-25
UniVS: Unified and Universal Video Segmentation with Prompts as Queries	✓ Link	41.7								UniVS(Swin-L)	2024-02-28
DVIS++: Improved Decoupled Framework for Universal Video Segmentation	✓ Link	41.2	68.9	40.9			16.8		47.3	DVIS++(R50, Offline)	2023-12-20
BoxVIS: Video Instance Segmentation with Box Annotations	✓ Link	40.6	68.4	39.9	20.9	45.8		59.4		BoxVIS(Swin-L & Box-sup)	2023-03-26
MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training	✓ Link	39.4	61.5	41.3			18.1		43.3	MinVIS (Swin-L)	2022-08-03
DVIS++: Improved Decoupled Framework for Universal Video Segmentation	✓ Link	37.2	62.8	37.3			15.8		42.9	DVIS++(R50, Online)	2023-12-20
GRAtt-VIS: Gated Residual Attention for Auto Rectifying Video Instance Segmentation	✓ Link	36.2	60.8	36.8			16.8		40.1	GRAtt-VIS (ResNet-50)	2023-05-26
DeVIS: Making Deformable Transformers Work for Video Instance Segmentation	✓ Link	35.5	59.3	38.3			16.6		39.8	DeVIS (Swin-L)	2022-07-22
CTVIS: Consistent Training for Online Video Instance Segmentation	✓ Link	35.5	60.8	34.9	16.1	41.9				CTVIS (ResNet-50)	2023-07-24
TarViS: A Unified Approach for Target-based Video Segmentation	✓ Link	34.0	55.0	34.4			16.1		40.9	TarViS (Swin-T)	2023-01-06
Universal Instance Perception as Object Discovery and Retrieval	✓ Link	34.0	55.5	35.6						UNINEXT (ResNet-50, Online)	2023-03-12
NOVIS: A Case for End-to-End Near-Online Video Instance Segmentation		32.7	56.2	32.6			15.7		37.1	NOVIS (ResNet-50)	2023-08-29
TarViS: A Unified Approach for Target-based Video Segmentation	✓ Link	31.1	52.5	30.4			15.9		39.9	TarViS (ResNet-50)	2023-01-06
In Defense of Online Models for Video Instance Segmentation	✓ Link	30.2	51.3	30.0			15.0		37.5	IDOL (ResNet-50)	2022-07-21
Tube-Link: A Flexible Cross Tube Framework for Universal Video Segmentation	✓ Link	29.5	51.5	30.2			15.5		34.5	Tube-Link(ResNet-50)	2023-03-22
VITA: Video Instance Segmentation via Object Token Association	✓ Link	27.7	51.9	24.9			14.9		33.0	VITA (Swin-L)	2022-06-09
DeVIS: Making Deformable Transformers Work for Video Instance Segmentation	✓ Link	23.7	47.6	20.8			12.0		28.9	DeVIS (ResNet-50)	2022-07-22
InstanceFormer: An Online Video Instance Segmentation Framework	✓ Link	22.8	42.5	21.61			12.9		29.3	InstanceFormer (Swin-L)	2022-08-22
InstanceFormer: An Online Video Instance Segmentation Framework	✓ Link	20.0	40.7	18.1			12.0		27.1	InstanceFormer(ResNet-50)	2022-08-22
Crossover Learning for Fast Online Video Instance Segmentation	✓ Link	18.1	35.5	16.9						CrossVIS (ResNet-50, calibration)	2021-04-13
Temporally Efficient Vision Transformer for Video Instance Segmentation	✓ Link	17.4	34.9	15.0						TeViT (ResNet-50)	2022-04-18
Spatial Feature Calibration and Temporal Fusion for Effective One-stage Video Instance Segmentation	✓ Link	17.3	35.4	15.2	23.7	14.7	8.4	11.1	23.1	STMask(R101-DCN-FPN)	2021-04-06
Mask2Former for Video Instance Segmentation	✓ Link	16.6	36.9	14.1			9.9		24.7	Mask2Former-VIS	2021-12-20
STC: Spatio-Temporal Contrastive Learning for Video Instance Segmentation		15.5	33.5	13.4						STC (ResNet-50)	2022-02-08
Occluded Video Instance Segmentation: A Benchmark	✓ Link	15.4	33.9	13.1	4.1	18.7		28.6		CMaskTrack R-CNN (ResNet-50)	2021-02-02
D2Conv3D: Dynamic Dilated Convolutions for Object Segmentation in Videos	✓ Link	15.2	33.8	13.7						D2Conv3D (ResNet-50)	2021-11-15
Crossover Learning for Fast Online Video Instance Segmentation	✓ Link	14.9	32.7	12.1						CrossVIS (ResNet-50)	2021-04-13
Occluded Video Instance Segmentation: A Benchmark	✓ Link	14.3	29.9	12.5	2.7	12.8		23.0		CSipMask (ResNet-50)	2021-02-02
