OpenCodePapers

Video Instance Segmentation on OVIS

Video Instance Segmentation
Leaderboard
| Paper | Code | mask AP | AP50 | AP75 | APho | APmo | AR1 | APso | AR10 | Model name | Release date |
|---|---|---|---|---|---|---|---|---|---|---|---|
| DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries | ✓ | 57.1 | 83.8 | 62.9 | | | | | | DVIS-DAQ(VIT-L, Offline) | 2024-03-29 |
| Context-Aware Video Instance Segmentation | ✓ | 57.1 | 82.6 | 63.5 | | | 21.2 | | 61.8 | CAVIS(VIT-L, Offline) | 2024-07-03 |
| DVIS++: Improved Decoupled Framework for Universal Video Segmentation | ✓ | 53.4 | 78.9 | 58.5 | | | | | | DVIS++(VIT-L,Offline) | 2023-12-20 |
| General Object Foundation Model for Images and Videos at Scale | ✓ | 50.4 | | 55.5 | | | | | | GLEE-Pro | 2023-12-14 |
| DVIS: Decoupled Video Instance Segmentation Framework | ✓ | 49.9 | 75.9 | 53.0 | | | 19.4 | | 55.3 | DVIS(Swin-L, Offline) | 2023-06-06 |
| DVIS++: Improved Decoupled Framework for Universal Video Segmentation | ✓ | 49.6 | 72.5 | 55.0 | 27.1 | 56.6 | 20.8 | 69.9 | 54.6 | DVIS++(VIT-L, Online) | 2023-12-20 |
| Universal Instance Perception as Object Discovery and Retrieval | ✓ | 49.0 | 72.5 | 52.2 | | | | | | UNINEXT (ViT-H, Online) | 2023-03-12 |
| DVIS: Decoupled Video Instance Segmentation Framework | ✓ | 47.1 | 71.9 | 49.2 | | | 19.4 | | 52.5 | DVIS(Swin-L, Online) | 2023-06-06 |
| CTVIS: Consistent Training for Online Video Instance Segmentation | ✓ | 46.9 | 71.5 | 47.5 | | | 19.1 | | 52.1 | CTVIS (Swin-L) | 2023-07-24 |
| RefineVIS: Video Instance Segmentation with Temporal Attention Refinement | | 46 | 70.4 | 48.4 | | | 19.1 | | 51.2 | RefineVIS (Swin-L, offline) | 2023-06-07 |
| GRAtt-VIS: Gated Residual Attention for Auto Rectifying Video Instance Segmentation | ✓ | 45.7 | 69.1 | 47.8 | | | 19.2 | | 49.4 | GRAtt-VIS (Swin-L) | 2023-05-26 |
| A Generalized Framework for Video Instance Segmentation | ✓ | 45.4 | 69.2 | 47.8 | | | 18.9 | | 49.0 | GenVIS (Swin-L) | 2022-11-16 |
| NOVIS: A Case for End-to-End Near-Online Video Instance Segmentation | | 43.5 | 68.3 | 43.8 | | | 19.4 | | 46.9 | NOVIS (Swin-L) | 2023-08-29 |
| TarViS: A Unified Approach for Target-based Video Segmentation | ✓ | 43.2 | 67.8 | 44.6 | | | 18.0 | | 50.4 | TarViS (Swin-L) | 2023-01-06 |
| MDQE: Mining Discriminative Query Embeddings to Segment Occluded Instances on Challenging Videos | ✓ | 42.6 | 67.8 | 44.3 | 21.6 | 49.3 | 18.3 | 65.1 | 46.5 | MDQE(SwinL) | 2023-03-25 |
| In Defense of Online Models for Video Instance Segmentation | ✓ | 42.6 | 65.7 | 45.2 | | | 17.9 | | 49.6 | IDOL (Swin-L) | 2022-07-21 |
| Robust Online Video Instance Segmentation with Track Queries | ✓ | 42.6 | 64.7 | 42.6 | | | 18.4 | | 49.1 | ROVIS (Swin-L) | 2022-11-16 |
| UniVS: Unified and Universal Video Segmentation with Prompts as Queries | ✓ | 41.7 | | | | | | | | UniVS(Swin-L) | 2024-02-28 |
| DVIS++: Improved Decoupled Framework for Universal Video Segmentation | ✓ | 41.2 | 68.9 | 40.9 | | | 16.8 | | 47.3 | DVIS++(R50, Offline) | 2023-12-20 |
| BoxVIS: Video Instance Segmentation with Box Annotations | ✓ | 40.6 | 68.4 | 39.9 | 20.9 | 45.8 | | 59.4 | | BoxVIS(Swin-L & Box-sup) | 2023-03-26 |
| MinVIS: A Minimal Video Instance Segmentation Framework without Video-based Training | ✓ | 39.4 | 61.5 | 41.3 | | | 18.1 | | 43.3 | MinVIS (Swin-L) | 2022-08-03 |
| DVIS++: Improved Decoupled Framework for Universal Video Segmentation | ✓ | 37.2 | 62.8 | 37.3 | | | 15.8 | | 42.9 | DVIS++(R50, Online) | 2023-12-20 |
| GRAtt-VIS: Gated Residual Attention for Auto Rectifying Video Instance Segmentation | ✓ | 36.2 | 60.8 | 36.8 | | | 16.8 | | 40.1 | GRAtt-VIS (ResNet-50) | 2023-05-26 |
| CTVIS: Consistent Training for Online Video Instance Segmentation | ✓ | 35.5 | 60.8 | 34.9 | | | 16.1 | | 41.9 | CTVIS (ResNet-50) | 2023-07-24 |
| DeVIS: Making Deformable Transformers Work for Video Instance Segmentation | ✓ | 35.5 | 59.3 | 38.3 | | | 16.6 | | 39.8 | DeVIS (Swin-L) | 2022-07-22 |
| Universal Instance Perception as Object Discovery and Retrieval | ✓ | 34.0 | 55.5 | 35.6 | | | | | | UNINEXT (ResNet-50, Online) | 2023-03-12 |
| TarViS: A Unified Approach for Target-based Video Segmentation | ✓ | 34.0 | 55.0 | 34.4 | | | 16.1 | | 40.9 | TarViS (Swin-T) | 2023-01-06 |
| NOVIS: A Case for End-to-End Near-Online Video Instance Segmentation | | 32.7 | 56.2 | 32.6 | | | 15.7 | | 37.1 | NOVIS (ResNet-50) | 2023-08-29 |
| TarViS: A Unified Approach for Target-based Video Segmentation | ✓ | 31.1 | 52.5 | 30.4 | | | 15.9 | | 39.9 | TarViS (ResNet-50) | 2023-01-06 |
| In Defense of Online Models for Video Instance Segmentation | ✓ | 30.2 | 51.3 | 30 | | | 15 | | 37.5 | IDOL (ResNet-50) | 2022-07-21 |
| Tube-Link: A Flexible Cross Tube Framework for Universal Video Segmentation | ✓ | 29.5 | 51.5 | 30.2 | | | 15.5 | | 34.5 | Tube-Link(ResNet-50) | 2023-03-22 |
| VITA: Video Instance Segmentation via Object Token Association | ✓ | 27.7 | 51.9 | 24.9 | | | 14.9 | | 33.0 | VITA (Swin-L) | 2022-06-09 |
| DeVIS: Making Deformable Transformers Work for Video Instance Segmentation | ✓ | 23.7 | 47.6 | 20.8 | | | 12.0 | | 28.9 | DeVIS (ResNet-50) | 2022-07-22 |
| InstanceFormer: An Online Video Instance Segmentation Framework | ✓ | 22.8 | 42.5 | 21.61 | | | 12.9 | | 29.3 | InstanceFormer (Swin-L) | 2022-08-22 |
| InstanceFormer: An Online Video Instance Segmentation Framework | ✓ | 20.0 | 40.7 | 18.1 | | | 12 | | 27.1 | InstanceFormer(ResNet-50) | 2022-08-22 |
| Crossover Learning for Fast Online Video Instance Segmentation | ✓ | 18.1 | 35.5 | 16.9 | | | | | | CrossVIS (ResNet-50, calibration) | 2021-04-13 |
| Temporally Efficient Vision Transformer for Video Instance Segmentation | ✓ | 17.4 | 34.9 | 15.0 | | | | | | TeViT (ResNet-50) | 2022-04-18 |
| Spatial Feature Calibration and Temporal Fusion for Effective One-stage Video Instance Segmentation | ✓ | 17.3 | 35.4 | 15.2 | 23.7 | 14.7 | 8.4 | 11.1 | 23.1 | STMask(R101-DCN-FPN) | 2021-04-06 |
| Mask2Former for Video Instance Segmentation | ✓ | 16.6 | 36.9 | 14.1 | | | 9.9 | | 24.7 | Mask2Former-VIS | 2021-12-20 |
| STC: Spatio-Temporal Contrastive Learning for Video Instance Segmentation | | 15.5 | 33.5 | 13.4 | | | | | | STC (ResNet-50) | 2022-02-08 |
| Occluded Video Instance Segmentation: A Benchmark | ✓ | 15.4 | 33.9 | 13.1 | 4.1 | 18.7 | | 28.6 | | CMaskTrack R-CNN (ResNet-50) | 2021-02-02 |
| D2Conv3D: Dynamic Dilated Convolutions for Object Segmentation in Videos | ✓ | 15.2 | 33.8 | 13.7 | | | | | | D2Conv3D (ResNet-50) | 2021-11-15 |
| Crossover Learning for Fast Online Video Instance Segmentation | ✓ | 14.9 | 32.7 | 12.1 | | | | | | CrossVIS (ResNet-50) | 2021-04-13 |
| Occluded Video Instance Segmentation: A Benchmark | ✓ | 14.3 | 29.9 | 12.5 | 2.7 | 12.8 | | 23 | | CSipMask (ResNet-50) | 2021-02-02 |