OpenCodePapers

Semantic Segmentation on SUN-RGBD

Semantic Segmentation
Leaderboard
Paper | Code | Mean IoU / Mean IoU (test) | Model Name | Release Date
GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer | ✓ | 54.6 | GeminiFusion (Swin-Large) | 2024-06-03
Diffusion-based RGB-D Semantic Segmentation with Deformable Attention Transformer | - | 54.0 | DiffusionMMS | 2024-09-23
HDBFormer: Efficient RGB-D Semantic Segmentation with A Heterogeneous Dual-Branch Framework | ✓ | 53.9% | HDBFormer | 2025-04-18
GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer | ✓ | 53.3 | GeminiFusion (MiT-B5) | 2024-06-03
DFormerv2: Geometry Self-Attention for RGBD Semantic Segmentation | ✓ | 53.3 | DFormerv2-L | 2025-04-07
Multimodal Token Fusion for Vision Transformers | ✓ | 53.0% | TokenFusion (S) | 2022-04-19
Efficient Multimodal Semantic Segmentation via Dual-Prompt Learning | ✓ | 52.8% | DPLNet | 2023-12-01
DFormerv2: Geometry Self-Attention for RGBD Semantic Segmentation | ✓ | 52.8% | DFormerv2-B | 2025-04-07
GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer | ✓ | 52.7 | GeminiFusion (MiT-B3) | 2024-06-03
DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation | ✓ | 52.5% | DFormer-L | 2023-09-18
CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers | ✓ | 52.4% | CMX (B5) | 2022-03-09
CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers | ✓ | 52.1% | CMX (B4) | 2022-03-09
DFormerv2: Geometry Self-Attention for RGBD Semantic Segmentation | ✓ | 51.5% | DFormerv2-S | 2025-04-07
Multimodal Token Fusion for Vision Transformers | ✓ | 51.4% | TokenFusion (Ti) | 2022-04-19
DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation | ✓ | 51.2% | DFormer-B | 2023-09-18
PanopticNDT: Efficient and Robust Panoptic Mapping | ✓ | 50.86% | EMSANet (2x ResNet-34 NBt1D, PanopticNDT version, finetuned) | 2023-09-24
Deep feature selection-and-fusion for RGB-D semantic segmentation | - | 50.6% | FSFNet | 2021-05-10
Pattern-Structure Diffusion for Multi-Task Learning | - | 50.6% | PSD-ResNet50 | 2020-06-01
DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation | ✓ | 50.0% | TokenFusion (S) | 2023-09-18
CMX: Cross-Modal Fusion for RGB-X Semantic Segmentation with Transformers | ✓ | 49.7% | DPLNet | 2022-03-09
Attention-based Dual Supervised Decoder for RGBD Semantic Segmentation | - | 49.6% | DFormer-L | 2022-01-05
DCANet: Differential Convolution Attention Network for RGB-D Semantic Segmentation | - | 49.6% | CMX (B5) | 2022-10-13
Pixel Difference Convolutional Network for RGB-D Semantic Segmentation | - | 49.6% | CMX (B4) | 2023-02-23
Bi-directional Cross-Modality Feature Propagation with Separation-and-Aggregation Gate for RGB-D Semantic Segmentation | ✓ | 49.4% | TokenFusion (Ti) | 2020-07-17
AsymFormer: Asymmetrical Cross-Modal Representation Learning for Mobile Platform Real-Time RGB-D Semantic Segmentation | ✓ | 49.1% | DFormer-B | 2023-09-25
Efficient Multi-Task Scene Analysis with RGB-D Transformers | ✓ | 48.82% | EMSANet (2x ResNet-34 NBt1D, PanopticNDT version, finetuned) | 2023-06-08
DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation | ✓ | 48.8% | FSFNet | 2023-09-18
ShapeConv: Shape-aware Convolutional Layer for Indoor RGB-D Semantic Segmentation | ✓ | 48.6% | PSD-ResNet50 | 2021-08-24
Spatial Information Guided Convolution for Real-Time RGBD Semantic Segmentation | ✓ | 48.6% | TokenFusion (S) | 2020-04-09
Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments | ✓ | 48.47% | DPLNet | 2022-07-10
Attention-guided Chained Context Aggregation for Semantic Segmentation | ✓ | 48.3% | DFormer-L | 2020-02-27
Efficient RGB-D Semantic Segmentation for Indoor Scene Analysis | ✓ | 48.17 | CMX (B5) | 2020-11-13
ACNet: Attention Based Network to Exploit Complementary Features for RGBD Semantic Segmentation | ✓ | 48.1% | CMX (B4) | 2019-05-24
RedNet: Residual Encoder-Decoder Network for indoor RGB-D Semantic Segmentation | ✓ | 47.8% | TokenFusion (Ti) | 2018-06-04
RDFNet: RGB-D Multi-Level Residual Feature Fusion for Indoor Semantic Segmentation | - | 47.7% | DFormer-B | 2017-10-01
Context Contrasted Feature and Gated Multi-Scale Aggregation for Scene Segmentation | ✓ | 47.1% | EMSANet (2x ResNet-34 NBt1D, PanopticNDT version, finetuned) | 2018-06-01
Multi-Modal Attention-based Fusion Model for Semantic Segmentation of RGB-Depth Images | - | 47.0% | FSFNet | 2019-12-25
3D Graph Neural Networks for RGBD Semantic Segmentation | ✓ | 45.9% | PSD-ResNet50 | 2017-10-01
Self-Supervised Model Adaptation for Multimodal Semantic Segmentation | ✓ | 45.73 | TokenFusion (S) | 2018-08-11
Recurrent Scene Parsing with Perspective Understanding in the Loop | ✓ | 45.1% | DPLNet | 2017-05-20
CI-Net: Contextual Information for Joint Semantic Segmentation and Depth Estimation | - | 44.3% | DFormer-L | 2021-07-29
Depth-aware CNN for RGB-D Segmentation | ✓ | 42.0% | TokenFusion (S) | 2018-03-19
Self-Supervised Model Adaptation for Multimodal Semantic Segmentation | ✓ | 38.4 | DPLNet | 2018-08-11
Missing Modality Robustness in Semi-Supervised Multi-Modal Semantic Segmentation | ✓ | 48.17 | DFormer-L | 2023-04-21
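
The leaderboard ranks entries by Mean IoU. For readers unfamiliar with the metric, the following is a minimal sketch of one common way to compute it from per-pixel class ids. The function name `mean_iou`, the `ignore_label` parameter, and the choice to average only over classes that occur in the prediction or the ground truth are assumptions made for illustration; the listed papers may use different evaluation protocols.

```python
def mean_iou(pred, target, num_classes, ignore_label=None):
    """Mean intersection-over-union (illustrative sketch, not a reference implementation).

    pred, target: flat sequences of integer class ids, one per pixel.
    ignore_label: hypothetical id for unannotated pixels; such pixels are skipped.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes   # pixels where pred == target == c
    pred_count = [0] * num_classes     # pixels predicted as c
    target_count = [0] * num_classes   # pixels labeled as c

    for p, t in zip(pred, target):
        if ignore_label is not None and t == ignore_label:
            continue
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # assumption: classes absent from both are left out of the mean
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 0]
    # Per-class IoU: 1/3, 2/3, 1/2, so the mean is 0.5 (50.0 when shown as a percentage).
    print(mean_iou(pred, target, num_classes=3))
```

Multiplying the result by 100 gives a value on the same scale as the percentages in the table.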