Look Before You Match: Instance Understanding Matters in Video Object Segmentation | | 93.4 | 94.2 | 92.5 | | | ISVOS (BL30K, MS) | 2022-12-13 |
XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model | ✓ Link | 93.3 | 94.4 | 92.2 | | | XMem (BL30K, MS) | 2022-07-14 |
BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation | | 92.5 | 94.2 | 90.7 | | | BATMAN (val) | 2022-08-01 |
BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation | | 91.6 | 92.5 | 90.8 | | | STCN (val) | 2022-08-01 |
XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model | ✓ Link | 91.5 | 92.7 | 90.4 | | | XMem | 2022-07-14 |
MobileVOS: Real-Time Video Object Segmentation Contrastive Learning meets Knowledge Distillation | | 91.4 | 92.6 | 90.3 | | | MobileVOS (val) | 2023-03-14 |
BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation | | 91.1 | 92.1 | 90.1 | | | AOT (val) | 2022-08-01 |
BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation | | 90.7 | 91.4 | 89.9 | | | LCM (val) | 2022-08-01 |
BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation | | 90.6 | 94.0 | 87.1 | | | RPCMVOS (val) | 2022-08-01 |
BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation | | 90.5 | 91.5 | 89.5 | | | KMN (val) | 2022-08-01 |
BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation | | 90.5 | 91.2 | 89.8 | | | TransVOS (val) | 2022-08-01 |
BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation | | 89.9 | 91.1 | 88.7 | | | CFBI+ (val) | 2022-08-01 |
ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias | ✓ Link | 89.8 | 90.4 | 89.2 | | | ViTAE-T-Stage | 2021-06-07 |
BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation | | 89.4 | 90.5 | 88.3 | | | CFBI (val) | 2022-08-01 |
BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation | | 88.8 | 88.7 | 88.9 | | | RMN (val) | 2022-08-01 |
Towards Robust Video Object Segmentation with Adaptive Object Calibration | ✓ Link | | 94.7 | 88.5 | | | AOC-MF (val) | 2022-07-02 |
BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation | | | 89.9 | 88.7 | | | STM (val) | 2022-08-01 |
Making a Case for 3D Convolutions for Object Segmentation in Videos | ✓ Link | | 84.7 | 84.3 | | | 3DC-Seg | 2020-08-26 |
Learning Discriminative Feature with CRF for Unsupervised Video Object Segmentation | | | 81.8 | | | | DFNet | 2020-08-04 |
Learning Discriminative Feature with CRF for Unsupervised Video Object Segmentation | | | | 83.4 | | | Ours | 2020-08-04 |
Video Object Segmentation with Language Referring Expressions | | | | | 84.5 | | VOSwL (Mask+Language) | 2018-03-21 |
Video Object Segmentation with Language Referring Expressions | | | | | 82.8 | | VOSwL (Language) | 2018-03-21 |
LOCATE: Self-supervised Object Discovery via Flow-guided Graph-cut and Bootstrapped Self-training | ✓ Link | | | | 80.9 | 88.7 | LOCATE | 2023-08-22 |
FODVid: Flow-guided Object Discovery in Videos | | | | | 78.71 | | FODVid | 2023-07-10 |