video-instance-segmentation-on-youtube-vis-1

Video Instance Segmentation

Leaderboard

Paper	Code	Mask AP	AP50	AP75	AR1	AR10	Model Name	Release Date
Context-Aware Video Instance Segmentation	✓ Link	68.9	89.3	76.2	58.3	73.6	CAVIS(ViT-L, Online)	2024-07-03
DVIS++: Improved Decoupled Framework for Universal Video Segmentation	✓ Link	67.7	88.8	75.3	57.9	73.7	DVIS++(ViT-L, Online)	2023-12-20
DVIS: Decoupled Video Instance Segmentation Framework	✓ Link	64.9	88.0	72.7	56.5	70.3	DVIS	2023-06-06
Tube-Link: A Flexible Cross Tube Framework for Universal Video Segmentation	✓ Link	64.6	86.6	71.3	55.9	69.1	Tube-Link	2023-03-22
MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training	✓ Link	61.6	83.3	68.6	54.8	66.6	MinVIS (Swin-L)	2022-08-03
Mask2Former for Video Instance Segmentation	✓ Link	60.4	84.4	67.0			Mask2Former (Swin-L)	2021-12-20
UniVS: Unified and Universal Video Segmentation with Prompts as Queries	✓ Link	60.0	82.1	65.3	54.7	66.8	UniVS(Swin-L)	2024-02-28
MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos	✓ Link	59.9	84.9	67.3	53.5	65.0	MDQE(Swin-L)	2023-03-25
SeqFormer: Sequential Transformer for Video Instance Segmentation	✓ Link	59.3	82.1	66.4	51.7	64.4	SeqFormer (Swin-L)	2021-12-15
DeVIS: Making Deformable Transformers Work for Video Instance Segmentation	✓ Link	57.1	80.8	66.3	50.8	61.0	DeVIS (Swin-L)	2022-07-22
InstanceFormer: An Online Video Instance Segmentation Framework	✓ Link	56.3	78.0	64.2	50.9	61.6	InstanceFormer(Swin-L)	2022-08-22
1st Place Solution for YouTubeVOS Challenge 2021: Video Instance Segmentation		54.3	76.6	65.6	47.0	57.9	TCIS (Swin-S)	2021-06-12
Video K-Net: A Simple, Strong, and Unified Baseline for Video Segmentation	✓ Link	54.1	79.0	59.6	49.7	59.9	Video K-Net (Swin-Base)	2022-04-10
NOVIS: A Case for End-to-End Near-Online Video Instance Segmentation		52.8	75.7	56.9	50.3	60.6	NOVIS (ResNet-50)	2023-08-29
In Defense of Online Models for Video Instance Segmentation	✓ Link	49.5	74.0	52.9	47.7	58.7	IDOL (ResNet-50)	2022-07-21
Mask2Former for Video Instance Segmentation	✓ Link	49.2	72.8	54.2			Mask2Former (ResNet-101)	2021-12-20
SeqFormer: Sequential Transformer for Video Instance Segmentation	✓ Link	49.0	71.1	55.7	46.8	56.9	SeqFormer (ResNet-101)	2021-12-15
MSN: Efficient Online Mask Selection Network for Video Instance Segmentation	✓ Link	48.8	69.4	54.9	40.1	55.0	MSN	2021-06-19
SeqFormer: Sequential Transformer for Video Instance Segmentation	✓ Link	47.4	69.8	51.8	45.5	54.8	SeqFormer (ResNet-50)	2021-12-15
Mask2Former for Video Instance Segmentation	✓ Link	46.4	68.0	50.0			Mask2Former (ResNet-50)	2021-12-20
InstanceFormer: An Online Video Instance Segmentation Framework	✓ Link	45.6	68.6	49.6	42.1	53.5	InstanceFormer(ResNet-50)	2022-08-22
SeqFormer: Sequential Transformer for Video Instance Segmentation	✓ Link	45.1	66.9	50.5	45.6	54.6	SeqFormer (ResNet-50)	2021-12-15
DeVIS: Making Deformable Transformers Work for Video Instance Segmentation	✓ Link	44.4	66.7	48.6	42.4	51.6	DeVIS (ResNet-50)	2022-07-22
Video Instance Segmentation using Inter-Frame Communication Transformers	✓ Link	42.8	65.8	46.8	43.8	51.2	IFC (ResNet-50)	2021-06-07
End-to-End Video Instance Segmentation with Transformers	✓ Link	40.1	64.0	45.0	38.3	44.9	VisTR(ResNet-101)	2020-11-30
Video Sparse Transformer With Attention-Guided Memory for Video Object Detection	✓ Link	39.0					VSTAM	2022-06-17
Spatial Feature Calibration and Temporal Fusion for Effective One-stage Video Instance Segmentation	✓ Link	36.8	56.8	38.0	34.8	41.8	STMask(R101-DCN-FPN)	2021-04-06
STC: Spatio-Temporal Contrastive Learning for Video Instance Segmentation		36.7	57.2	38.6	36.9	44.5	STC (ResNet-50)	2022-02-08
Crossover Learning for Fast Online Video Instance Segmentation	✓ Link	36.6	57.3	39.7	36.0	42.0	CrossVIS (ResNet-101)	2021-04-13
End-to-End Video Instance Segmentation with Transformers	✓ Link	36.2	59.8	36.9	37.2	42.4	VisTR(ResNet-50)	2020-11-30
Prototypical Cross-Attention Networks for Multiple Object Tracking and Segmentation	✓ Link	36.1	54.9	39.4	36.3	41.6	PCAN(ResNet-50)	2021-06-22
Object Propagation via Inter-Frame Attentions for Temporally Stable Video Instance Segmentation	✓ Link	36.0	59.4	39.2	39.1	47.7	ObjProp (ResNet-50)	2021-11-15
CompFeat: Comprehensive Feature Aggregation for Video Instance Segmentation	✓ Link	35.3	56.0	38.6	33.1	40.3	CompFeat(ResNet-50)	2020-12-07
Occluded Video Instance Segmentation: A Benchmark	✓ Link	35.1	55.6	38.1			CSipMask	2021-02-02
STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in Videos	✓ Link	34.6	55.8	37.9	34.4	41.6	STEm-Seg (ResNet-101)	2020-03-18
SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation	✓ Link	33.7	54.1	35.8	35.4	40.1	SipMask (ResNet-50, ms-train, single-scale test)	2020-07-29
Track to Detect and Segment: An Online Multi-Object Tracker	✓ Link	32.6	52.6	32.8			TraDeS	2021-03-16
SipMask: Spatial Information Preservation for Fast Image and Video Instance Segmentation	✓ Link	32.5	53.0	33.3	33.5	38.9	SipMask (ResNet-50, single-scale test)	2020-07-29
Occluded Video Instance Segmentation: A Benchmark	✓ Link	32.1	52.8	34.9			CMaskTrack R-CNN	2021-02-02
STEm-Seg: Spatio-temporal Embeddings for Instance Segmentation in Videos	✓ Link	30.6	50.7	33.5	31.6	37.1	STEm-Seg (ResNet-50)	2020-03-18
Video Instance Segmentation	✓ Link	30.3	51.1	32.6	31.0	35.5	MaskTrack R-CNN (ResNet-50, single-scale training and test)	2019-05-12
Do Different Tracking Tasks Require Different Appearance Models?	✓ Link	30.1					UniTrack	2021-07-05
Efficient Video Object Segmentation via Network Modulation	✓ Link	29.1	28.6	33.1			OSMN	2018-02-04
Simple Online and Realtime Tracking with a Deep Association Metric	✓ Link	27.8	31.3				DeepSORT	2017-03-21
