| Paper | Code | PQ | PQst | PQth | RQ | SQ | RQst | RQth | SQst | SQth | AP | mIoU | Model | Date |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HyperSeg: Towards Universal Visual Segmentation with Large Language Model | ✓ Link | 61.2 | | | | | | | | | | | HyperSeg (Swin-B) | 2024-11-26 |
| OneFormer: One Transformer to Rule Universal Image Segmentation | ✓ Link | 60.0 | 49.2 | 67.1 | | | | | | | 52.0 | 68.8 | OneFormer (InternImage-H, single-scale) | 2022-11-10 |
| A Simple Framework for Open-Vocabulary Segmentation and Detection | ✓ Link | 59.5 | | | | | | | | | 53.2 | | OpenSeeD (SwinL, single-scale) | 2023-03-14 |
| UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding | ✓ Link | 59.5 | | | | | | | | | 50.7 | 69.7 | UMG-CLIP-E/14 | 2024-01-12 |
| Mask DINO: Towards A Unified Transformer-based Framework for Object Detection and Segmentation | ✓ Link | 59.4 | | | | | | | | | 50.9 | | Mask DINO (SwinL, single-scale) | 2022-06-06 |
| Your ViT is Secretly an Image Segmentation Model | ✓ Link | 59.2 | | | | | | | | | | | EoMT (DINOv2-g, single-scale, 1280x1280) | 2025-03-24 |
| UMG-CLIP: A Unified Multi-Granularity Vision Generalist for Open-World Understanding | ✓ Link | 58.9 | | | | | | | | | 49.7 | 68.9 | UMG-CLIP-L/14 | 2024-01-12 |
| Dilated Neighborhood Attention Transformer | ✓ Link | 58.5 | 48.8 | 64.9 | | | | | | | 49.2 | 68.3 | DiNAT-L (single-scale, Mask2Former) | 2022-09-29 |
| Vision Transformer Adapter for Dense Predictions | ✓ Link | 58.4 | 48.4 | 65.0 | | | | | | | 48.9 | | ViT-Adapter-L (single-scale, BEiTv2 pretrain, Mask2Former) | 2022-05-17 |
| Visual Attention Network | ✓ Link | 58.2 | 48.2 | 64.8 | | | | | | | | | Visual Attention Network (VAN-B6 + Mask2Former) | 2022-02-20 |
| kMaX-DeepLab: k-means Mask Transformer | ✓ Link | 58.1 | 48.8 | 64.3 | | | | | | | | | kMaX-DeepLab (single-scale, pseudo-labels) | 2022-07-08 |
| Hierarchical Open-vocabulary Universal Image Segmentation | ✓ Link | 58.1 | | | | | | | | | | 66.8 | HIPIE (ViT-H, single-scale) | 2023-07-03 |
| kMaX-DeepLab: k-means Mask Transformer | ✓ Link | 58.0 | 48.6 | 64.2 | | | | | | | | | kMaX-DeepLab (single-scale, drop query with 256 queries) | 2022-07-08 |
| OneFormer: One Transformer to Rule Universal Image Segmentation | ✓ Link | 58.0 | 48.4 | 64.3 | | | | | | | 49.2 | 68.1 | OneFormer (DiNAT-L, single-scale) | 2022-11-10 |
| kMaX-DeepLab: k-means Mask Transformer | ✓ Link | 57.9 | 48.6 | 64.0 | | | | | | | | | kMaX-DeepLab (single-scale) | 2022-07-08 |
| OneFormer: One Transformer to Rule Universal Image Segmentation | ✓ Link | 57.9 | 48.0 | 64.4 | | | | | | | 49.0 | 67.4 | OneFormer (Swin-L, single-scale) | 2022-11-10 |
| Focal Modulation Networks | ✓ Link | 57.9 | | | | | | | | | 48.4 | | FocalNet-L (Mask2Former, 200 queries) | 2022-03-22 |
| Masked-attention Mask Transformer for Universal Image Segmentation | ✓ Link | 57.8 | 48.1 | 64.2 | | | | | | | 48.6 | | Mask2Former (single-scale) | 2021-12-02 |
| Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers | ✓ Link | 55.8 | 46.9 | 61.7 | | | | | | | | | Panoptic SegFormer (single-scale) | 2021-09-08 |
| CMT-DeepLab: Clustering Mask Transformers for Panoptic Segmentation | ✓ Link | 55.3 | 46.6 | 61.0 | | | | | | | | | CMT-DeepLab (single-scale) | 2022-06-17 |
| Per-Pixel Classification is Not All You Need for Semantic Segmentation | ✓ Link | 52.7 | 44.0 | 58.5 | 63.5 | 81.8 | | | | | | | MaskFormer (single-scale) | 2021-07-13 |
| MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers | ✓ Link | 51.1 | 42.2 | 57.0 | | | | | | | | | MaX-DeepLab-L (single-scale) | 2020-12-01 |
| Panoptic SegFormer: Delving Deeper into Panoptic Segmentation with Transformers | ✓ Link | 50.6 | 43.2 | 55.5 | | | | | | | | | Panoptic SegFormer (ResNet-101) | 2021-09-08 |
| ResNeSt: Split-Attention Networks | ✓ Link | 47.9 | 37.0 | 55.1 | | | | | | | | | PanopticFPN + ResNeSt (single-scale) | 2020-04-19 |
| End-to-End Object Detection with Transformers | ✓ Link | 45.1 | 37.0 | 50.5 | 55.5 | 79.9 | 46.0 | 61.7 | 78.5 | 80.9 | 33.0 | | DETR-R101 (ResNet-101) | 2020-05-26 |
| Fully Convolutional Networks for Panoptic Segmentation | ✓ Link | 44.3 | 35.6 | 50.0 | 53.0 | 80.7 | 43.5 | 59.3 | 76.7 | 83.4 | | | Panoptic FCN* (ResNet-50-FPN) | 2020-12-01 |
| End-to-End Object Detection with Transformers | ✓ Link | 44.1 | 33.6 | 51.0 | 53.3 | 79.5 | 42.1 | 60.6 | 74.0 | 83.2 | 39.7 | | PanopticFPN++ | 2020-05-26 |
| Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation | ✓ Link | 43.9 | 36.8 | 48.6 | | | | | | | | | Axial-DeepLab-L (multi-scale) | 2020-03-17 |
| Axial-DeepLab: Stand-Alone Axial-Attention for Panoptic Segmentation | ✓ Link | 43.4 | 35.6 | 48.5 | | | | | | | | | Axial-DeepLab-L (single-scale) | 2020-03-17 |
| Fully Convolutional Networks for Panoptic Segmentation | ✓ Link | | | 58.5 | 61.6 | 83.2 | 51.1 | 68.6 | 81.1 | 84.6 | | | Panoptic FCN* (Swin-L, single-scale) | 2020-12-01 |
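
For reference, the PQ/SQ/RQ column families above follow the standard panoptic quality metrics of Kirillov et al. ("Panoptic Segmentation", CVPR 2019): predicted and ground-truth segments of a category are matched at IoU > 0.5, and PQ factors into segmentation quality (SQ) and recognition quality (RQ); the `st` and `th` suffixes denote averages over stuff classes and thing classes only. A minimal statement of the per-category definition:

```latex
% Panoptic quality for one category. TP is the set of matched
% (prediction, ground truth) segment pairs, FP the unmatched
% predictions, FN the unmatched ground-truth segments; a pair
% matches when IoU(p, g) > 0.5.
\[
\mathrm{PQ}
  = \frac{\sum_{(p,g) \in \mathit{TP}} \mathrm{IoU}(p, g)}
         {|\mathit{TP}| + \tfrac{1}{2}|\mathit{FP}| + \tfrac{1}{2}|\mathit{FN}|}
  = \underbrace{\frac{\sum_{(p,g) \in \mathit{TP}} \mathrm{IoU}(p, g)}
                     {|\mathit{TP}|}}_{\text{SQ}}
    \times
    \underbrace{\frac{|\mathit{TP}|}
                     {|\mathit{TP}| + \tfrac{1}{2}|\mathit{FP}| + \tfrac{1}{2}|\mathit{FN}|}}_{\text{RQ}}
\]
```

Because the table reports averages over categories, the listed PQ is generally not the product of the listed SQ and RQ (e.g., for DETR-R101, 79.9 × 55.5 / 100 ≈ 44.3, while its averaged PQ is 45.1); the identity PQ = SQ × RQ holds exactly only within a single category.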