Referring Expression Segmentation on J-HMDB

Referring Expression Segmentation

Leaderboard

Paper	Code	AP	IoU overall	IoU mean	Precision@0.5	Precision@0.6	Precision@0.7	Precision@0.8	Precision@0.9	ModelName	ReleaseDate
Spectrum-guided Multi-granularity Referring Video Object Segmentation	✓ Link	0.450	0.737	0.725	0.972	0.917	0.714	0.225	0.003	SgMg (Video-Swin-B)	2023-07-25
SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation	✓ Link	0.446	0.736	0.723	0.969	0.914	0.711	0.213	0.001	SOC (Video-Swin-B)	2023-05-26
Deeply Interleaved Two-Stream Encoder for Referring Video Segmentation		0.441	0.68	0.666	0.874	0.791	0.586	0.182	0.003	VLIDE	2022-03-30
SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation	✓ Link	0.397	0.707	0.701	0.947	0.864	0.627	0.179	0.001	SOC (Video-Swin-T)	2023-05-26
End-to-End Referring Video Object Segmentation with Multimodal Transformers	✓ Link	0.392	0.701	0.698	0.939	0.852	0.616	0.166	0.001	MTTR (w=10)	2021-11-29
End-to-End Referring Video Object Segmentation with Multimodal Transformers	✓ Link	0.366	0.674	0.679	0.91	0.815	0.57	0.144	0.001	MTTR (w=8)	2021-11-29
Cross-Modal Progressive Comprehension for Referring Segmentation	✓ Link	0.342	0.616	0.617	0.813	0.657	0.371	0.07	0.000	CMPC-V	2021-05-15
Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation		0.335	0.598	0.604	0.783	0.639	0.378	0.076	0.000	Hui et al.	2021-05-14
Actor and Action Modular Network for Text-based Video Segmentation		0.321	0.583	0.576	0.773	0.627	0.360	0.044	0.000	AAMN	2020-11-02
Context Modulated Dynamic Networks for Actor and Action Video Segmentation with Language Queries		0.301	0.554	0.576	0.742	0.587	0.316	0.047	0.000	CMDy	2020-04-03
Polar Relative Positional Encoding for Video-Language Segmentation		0.294			0.572	0.690	0.319	0.06	0.001	PRPE	2020-07-20
Asymmetric Cross-Guided Attention Network for Actor and Action Video Segmentation From Natural Language Query	✓ Link	0.289	0.576	0.584	0.756	0.564	0.287	0.034	0.000	ACGA	2019-10-01
Actor and Action Video Segmentation from a Sentence	✓ Link	0.267	0.555	0.570	0.712	0.518	0.264	0.030	0.000	Gavrilyuk et al. (Optical flow)	2018-03-20
Visual-Textual Capsule Routing for Text-Based Video Segmentation		0.261	0.535	0.550	0.677	0.513	0.283	0.051	0.000	VT-Capsule	2020-06-01
Actor and Action Video Segmentation from a Sentence	✓ Link	0.233	0.541	0.542	0.699	0.460	0.173	0.014	0.000	Gavrilyuk et al.	2018-03-20
Segmentation from Natural Language Expressions	✓ Link	0.178	0.546	0.528	0.633	0.350	0.085	0.002	0.000	Hu et al.	2016-03-20
Tracking by Natural Language Specification		0.173	0.529	0.491	0.578	0.335	0.103	0.060	0.000	Li et al.	2017-07-01
Hierarchical interaction network for video object segmentation from referring expressions			0.652	0.627	0.819	0.736	0.542	0.168	0.004	HINet	2021-11-22
ClawCraneNet: Leveraging Object-level Relation for Text-based Video Segmentation			0.644	0.655	0.880	0.796	0.566	0.147	0.002	ClawCraneNet	2021-03-19
Referring Segmentation in Images and Videos with Cross-Modal Self-Attention Network			0.628	0.581	0.764	0.625	0.389	0.09	0.001	CMSA+CFSA	2021-02-09
Hierarchical interaction network for video object segmentation from referring expressions			0.606	0.568	0.731	0.62	0.392	0.088	0.0	RefVOS	2021-11-22
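
The metric columns above are not defined on this page. As a rough sketch, assuming the conventions commonly used for this kind of benchmark (overall IoU as total intersection over total union across all samples, mean IoU as the average of per-sample IoUs, and Precision@K as the fraction of samples whose IoU exceeds K), they could be computed as follows. The function names and the toy masks are hypothetical and are not taken from any of the listed papers; AP is omitted because its exact threshold range is not stated here.

```python
from typing import Dict, List, Sequence, Tuple

Mask = Sequence[Sequence[int]]  # binary mask given as rows of 0/1 values


def intersection_and_union(pred: Mask, gt: Mask) -> Tuple[int, int]:
    """Count pixels in the intersection and union of two equally sized masks."""
    inter = union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    return inter, union


def evaluate(
    preds: List[Mask],
    gts: List[Mask],
    thresholds: Sequence[float] = (0.5, 0.6, 0.7, 0.8, 0.9),
) -> Dict[str, object]:
    """Compute overall IoU, mean IoU and Precision@K under the assumed definitions."""
    if not preds or len(preds) != len(gts):
        raise ValueError("preds and gts must be non-empty and of equal length")

    pairs = [intersection_and_union(p, g) for p, g in zip(preds, gts)]
    # Per-sample IoU; an empty union is scored as 0.0 here (an assumption).
    ious = [i / u if u else 0.0 for i, u in pairs]

    total_inter = sum(i for i, _ in pairs)
    total_union = sum(u for _, u in pairs)

    return {
        # Accumulated over the dataset, so large objects weigh more.
        "iou_overall": total_inter / total_union if total_union else 0.0,
        # Every sample contributes equally.
        "iou_mean": sum(ious) / len(ious),
        # Fraction of samples whose IoU is strictly above each threshold.
        "precision": {t: sum(iou > t for iou in ious) / len(ious) for t in thresholds},
    }


# Toy example: one perfect prediction and one that covers a quarter of the target.
toy_preds = [[[1, 1], [0, 0]], [[1, 0], [0, 0]]]
toy_gts = [[[1, 1], [0, 0]], [[1, 1], [1, 1]]]
result = evaluate(toy_preds, toy_gts)
print(result)
```

Under these definitions Precision@K can only stay equal or decrease as K grows, which is a quick sanity check for any row of the table.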
