OpenCodePapers

video-instance-segmentation-on-youtube-vis-2

Video Instance Segmentation
Leaderboard
| Paper | Code | mask AP | AP50 | AP75 | AR1 | AR10 | Model Name | Release Date |
|---|---|---|---|---|---|---|---|---|
| Context-Aware Video Instance Segmentation | ✓ | 65.3 | 87.3 | 73.2 | 49.7 | 70.3 | CAVIS(VIT-L, Offline) | 2024-07-03 |
| DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries | ✓ | 64.5 | 86.1 | 72.2 | 49.6 | 70.7 | DVIS-DAQ(VIT-L, Offline) | 2024-03-29 |
| DVIS++: Improved Decoupled Framework for Universal Video Segmentation | ✓ | 63.9 | 86.7 | 71.5 | 48.8 | 69.5 | DVIS++(VIT-L, Offline) | 2023-12-20 |
| DVIS++: Improved Decoupled Framework for Universal Video Segmentation | ✓ | 62.3 | 82.7 | 70.2 | 49.5 | 68.0 | DVIS++(VIT-L, Online) | 2023-12-20 |
| RefineVIS: Video Instance Segmentation with Temporal Attention Refinement | | 61.4 | 84.1 | 68.5 | 48.3 | 65.2 | RefineVIS (Swin-L, online) | 2023-06-07 |
| GRAtt-VIS: Gated Residual Attention for Auto Rectifying Video Instance Segmentation | ✓ | 60.3 | 81.3 | 67.1 | 48.8 | 64.5 | GRAtt-VIS (Swin-L) | 2023-05-26 |
| TarViS: A Unified Approach for Target-based Video Segmentation | ✓ | 60.2 | 81.4 | 67.6 | 47.6 | 64.8 | TarViS (Swin-L) | 2023-01-06 |
| DVIS: Decoupled Video Instance Segmentation Framework | ✓ | 60.1 | 83.0 | 68.4 | 47.7 | 65.7 | DVIS(Swin-L) | 2023-06-06 |
| A Generalized Framework for Video Instance Segmentation | ✓ | 60.1 | 80.9 | 66.5 | 49.1 | 64.7 | GenVIS (Swin-L) | 2022-11-16 |
| NOVIS: A Case for End-to-End Near-Online Video Instance Segmentation | | 59.8 | 82.0 | 66.5 | 47.9 | 64.4 | NOVIS (Swin-L) | 2023-08-29 |
| Tube-Link: A Flexible Cross Tube Framework for Universal Video Segmentation | ✓ | 58.4 | 79.4 | 64.3 | 47.5 | 63.6 | Tube-Link(Swin-L) | 2023-03-22 |
| UniVS: Unified and Universal Video Segmentation with Prompts as Queries | ✓ | 57.9 | 79.4 | 63.3 | 46.2 | 63.1 | UniVS(Swin-L) | 2024-02-28 |
| VITA: Video Instance Segmentation via Object Token Association | ✓ | 57.5 | 80.6 | 61.0 | 47.7 | 62.6 | VITA (Swin-L) | 2022-06-09 |
| In Defense of Online Models for Video Instance Segmentation | ✓ | 56.1 | 80.8 | 63.5 | 45 | 60.1 | IDOL (Swin-L) | 2022-07-21 |
| MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos | ✓ | 55.5 | 80.7 | 61.7 | 45.4 | 60.6 | MDQE(Swin-L) | 2023-03-25 |
| MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training | ✓ | 55.3 | 76.6 | 62 | 45.9 | 60.8 | MinVIS (Swin-L) | 2022-08-03 |
| DeVIS: Making Deformable Transformers Work for Video Instance Segmentation | ✓ | 54.4 | 77.7 | 59.8 | 43.8 | 57.8 | DeVIS (Swin-L) | 2022-07-22 |
| BoxVIS: Video Instance Segmentation with Box Annotations | ✓ | 53.9 | 76.4 | 59.6 | 44.8 | 61.0 | BoxVIS(Swin-L & Box-sup) | 2023-03-26 |
| InstanceFormer: An Online Video Instance Segmentation Framework | ✓ | 51.0 | 73.7 | 56.9 | 42.8 | 56.0 | InstanceFormer (Swin-L) | 2022-08-22 |
| TarViS: A Unified Approach for Target-based Video Segmentation | ✓ | 50.9 | 71.6 | 56.6 | 42.2 | 57.2 | TarViS (Swin-T) | 2023-01-06 |
| GRAtt-VIS: Gated Residual Attention for Auto Rectifying Video Instance Segmentation | ✓ | 48.9 | 69.2 | 53.1 | 41.8 | 56.0 | GRAtt-VIS (ResNet-50) | 2023-05-26 |
| TarViS: A Unified Approach for Target-based Video Segmentation | ✓ | 48.3 | 69.6 | 53.2 | 40.5 | 55.9 | TarViS (ResNet-50) | 2023-01-06 |
| NOVIS: A Case for End-to-End Near-Online Video Instance Segmentation | | 47.2 | 69.4 | 50.0 | 41.3 | 54.4 | NOVIS (ResNet-50) | 2023-08-29 |
| DeVIS: Making Deformable Transformers Work for Video Instance Segmentation | ✓ | 43.1 | 66.8 | 46.6 | 38.0 | 50.1 | DeVIS (ResNet-50) | 2022-07-22 |
| InstanceFormer: An Online Video Instance Segmentation Framework | ✓ | 40.8 | 62.4 | 43.7 | 36.1 | 48.1 | InstanceFormer (ResNet-50) | 2022-08-22 |
| Spatial Feature Calibration and Temporal Fusion for Effective One-stage Video Instance Segmentation | ✓ | 34.6 | 54.0 | 38.0 | 29.4 | 39.1 | STMask(R101-DCN-FPN) | 2021-04-06 |