OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning | | 97.2 | | | | | | OmniVec2 | 2024-01-01 |
Parameter-Efficient Fine-Tuning in Spectral Domain for Point Cloud Learning | ✓ Link | 96.18 | | 99.48 | 97.76 | | | PointGST | 2024-10-10 |
OmniVec: Learning robust representations with cross modal sharing | | 96.1 | | | | | | OmniVec | 2023-11-07 |
GPSFormer: A Global Perception and Local Structure Fitting-based Transformer for Point Cloud Understanding | ✓ Link | 95.4 | 93.8 | | | 0.7G | 2.36M | GPSFormer | 2024-07-18 |
ShapeLLM: Universal 3D Object Understanding for Embodied Interaction | ✓ Link | 95.25 | | 98.80 | 97.59 | | | ReCon++ | 2024-02-27 |
Asymmetric Dual Self-Distillation for 3D Self-Supervised Representation Learning | ✓ Link | 93.72 | | 96.73 | 94.32 | | | AsymDSD-B* (no voting) | 2025-06-26 |
PointGPT: Auto-regressively Generative Pre-training from Point Clouds | ✓ Link | 93.4 | | 97.2 | 96.6 | | | PointGPT | 2023-05-19 |
GPSFormer: A Global Perception and Local Structure Fitting-based Transformer for Point Cloud Understanding | ✓ Link | 93.30 | 92.51 | | | | 0.68M | GPSFormer-elite | 2024-07-18 |
Mamba3D: Enhancing Local Features for 3D Point Cloud Analysis via State Space Model | ✓ Link | 92.64 | | 94.49 | 92.43 | 3.9G | 16.9M | Mamba3D | 2024-04-23 |
Mamba3D: Enhancing Local Features for 3D Point Cloud Analysis via State Space Model | ✓ Link | 91.81 | | 92.94 | 92.08 | 3.9G | 16.9M | Mamba3D (no voting) | 2024-04-23 |
ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding | ✓ Link | 91.5 | 91.2 | | | | 1.4M | ULIP-2 + PointNeXt | 2023-05-14 |
Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining | ✓ Link | 91.26 | | 95.35 | 93.80 | | | ReCon | 2023-02-05 |
ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding | ✓ Link | 90.8 | 90.3 | | | | 1.4M | ULIP-2 + PointNeXt (no voting) | 2023-05-14 |
Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining | ✓ Link | 90.63 | | 95.18 | 93.29 | | | ReCon (no voting) | 2023-02-05 |
Asymmetric Dual Self-Distillation for 3D Self-Supervised Representation Learning | ✓ Link | 90.53 | | 94.32 | 91.91 | | | AsymDSD-S (no voting) | 2025-06-26 |
Decoupled Local Aggregation for Point Cloud Learning | ✓ Link | 90.4 | 89.3 | | | 1.5G | 5.3M | DeLA | 2023-08-31 |
PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders | ✓ Link | 90.35 | | 95.52 | 94.32 | | | PCP-MAE | 2024-08-16 |
Point Cloud Classification Using Content-based Transformer via Clustering in Feature Space | ✓ Link | 90.3 | 88.5 | | | | | PointConT | 2023-03-08 |
Regress Before Construct: Regress Autoencoder for Point Cloud Self-supervised Learning | ✓ Link | 90.28 | | 95.53 | 93.63 | | | Point-RAE (no voting) | 2023-09-25 |
Towards Compact 3D Representations via Point Feature Enhancement Masked Autoencoders | ✓ Link | 90.22 | | 95.18 | 93.29 | | | Point-FEMAE | 2023-12-17 |
Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders | ✓ Link | 90.11 | | 94.15 | 91.57 | | | I2P-MAE (no voting) | 2022-12-13 |
ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding | ✓ Link | 89.7 | 88.6 | | | | 1.4M | ULIP + PointNeXt | 2022-12-10 |
Positional Prompt Tuning for Efficient 3D Representation Learning | ✓ Link | 89.52 | | 95.01 | 93.28 | | | ReCon+PPT | 2024-08-21 |
3D-JEPA: A Joint Embedding Predictive Architecture for 3D Self-Supervised Representation Learning | | 89.52 | | 93.63 | 94.49 | | | 3D-JEPA | 2024-09-24 |
Beyond First Impressions: Integrating Joint Multi-modal Cues for Comprehensive 3D Representation | ✓ Link | 89.5 | 88.7 | | | | | PointMLP∗ + JM3D | 2023-08-06 |
ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding | ✓ Link | 89.4 | 88.5 | | | | | ULIP + PointMLP | 2022-12-10 |
KPConvX: Modernizing Kernel Point Convolution with Kernel Attention | ✓ Link | 89.3 | 88.1 | | | | | KPConvX-L | 2024-05-21 |
P2P: Tuning Pre-trained Image Models for Point Cloud Analysis with Point-to-Pixel Prompting | ✓ Link | 89.3 | | | | | 195.8M | P2P | 2022-08-04 |
Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning? | ✓ Link | 89.17 | | | | | | ACT | 2022-12-16 |
Beyond local patches: Preserving global–local interactions by enhancing self-attention via 3D point cloud tokenization | | 89.0 | 87.2 | | | | | Ours | 2024-11-01 |
Rethinking Masked Representation Learning for 3D Point Cloud Understanding | ✓ Link | 89.0 | | 92.9 | 92.3 | 6.29 | | OTMae3D | 2024-12-26 |
ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding | ✓ Link | 89.0 | | | | | | ULIP-2 + Point-BERT | 2023-05-14 |
Local Neighborhood Features for 3D Classification | ✓ Link | 88.6 | 87.4 | | | | | PointNeXt+Local | 2022-12-09 |
Self-positioning Point-based Transformer for Point Cloud Understanding | ✓ Link | 88.6 | 86.8 | | | | | SPoTr | 2023-03-29 |
Instance-aware Dynamic Prompt Tuning for Pre-trained Point Cloud Models | ✓ Link | 88.51 | | 93.12 | | | | IDPT | 2023-04-14 |
Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models | ✓ Link | 88.5 | | | | | | PointMLP+TAP | 2023-07-27 |
$(0, 4)$ dualities | | 88.4 | | | | 1.64G | 1.4M | PointNeXt+GAM | 2015-12-14 |
Rethinking the compositionality of point clouds through regularization in the hyperbolic space | ✓ Link | 88.3 | 87.0 | | | | | PointNeXt+HyCoRe | 2022-09-21 |
Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning? | ✓ Link | 88.21 | | 93.29 | 91.91 | | | ACT (no voting) | 2022-12-16 |
PointNeXt: Revisiting PointNet++ with Improved Training and Scaling Strategies | ✓ Link | 88.2 | 86.8 | | | 1.64G | 1.4M | PointNeXt | 2022-06-09 |
Point Cloud Classification Using Content-based Transformer via Clustering in Feature Space | ✓ Link | 88.0 | 86.0 | | | | | PointConT (no voting) | 2023-03-08 |
PointVector: A Vector Representation In Point Cloud Analysis | ✓ Link | 87.8 | 86.2 | | | | | PointVector-S | 2022-05-21 |
Point2Vec for Self-Supervised Representation Learning on Point Clouds | ✓ Link | 87.5 | 86.0 | 91.2 | 90.4 | | | point2vec | 2023-03-29 |
Advanced Feature Learning on Point Clouds using Multi-resolution Features and Learnable Pooling | ✓ Link | 87.2 | 86.2 | | | | | PointStack | 2022-05-20 |
Parameter is Not All You Need: Starting from Non-Parametric Networks for 3D Point Cloud Analysis | ✓ Link | 87.1 | | | | | 0.8M | Point-PN | 2023-03-14 |
Let Images Give You More: Point Cloud Cross-Modal Training for Shape Analysis | ✓ Link | 86.7 | 84.8 | | | | 12.6M | PointCMT | 2022-10-09 |
Point-JEPA: A Joint Embedding Predictive Architecture for Self-Supervised Learning on Point Cloud | ✓ Link | 86.6 | | 92.9±0.4 | | | | Point-JEPA | 2024-04-25 |
ModelNet-O: A Large-Scale Synthetic Dataset for Occlusion-Aware Point Cloud Classification | ✓ Link | 86.6 | | | | | | PointMLS | 2024-01-16 |
Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training | ✓ Link | 86.43 | | 91.22 | 88.81 | | | Point-M2AE | 2022-05-28 |
ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding | ✓ Link | 86.4 | | | | | | ULIP + PointBERT | 2022-12-10 |
DualMLP: a two-stream fusion model for 3D point cloud classification | ✓ Link | 86.4 | | | | | | DualMLP | 2023-10-10 |
Surface Representation for Point Clouds | ✓ Link | 86.0 | | | | 2.43G | 6.80M | RepSurf-U (2x) | 2022-05-11 |
Rethinking Network Design and Local Geometry in Point Cloud: A Simple Residual MLP Framework | ✓ Link | 85.7 | 84.4 | | | | | PointMLP | 2022-02-15 |
Point-LGMask: Local and Global Contexts Embedding for Point Cloud Pre-training with Multi-Ratio Masking | ✓ Link | 85.3 | | 89.8 | 89.3 | | | Point-LGMask | 2023-06-08 |
Masked Autoencoders for Point Cloud Self-supervised Learning | ✓ Link | 85.2 | | 90.02 | 88.29 | | | Point-MAE | 2022-03-13 |
SimpleView++: Neighborhood Views for Point Cloud Classification | ✓ Link | 84.8 | | | | | | MVTN+SimpleView++ | 2022-09-08 |
DeltaConv: Anisotropic Operators for Geometric Deep Learning on Point Clouds | ✓ Link | 84.7 | | | | | | DeltaConv | 2021-11-16 |
Surface Representation for Point Clouds | ✓ Link | 84.6 | | | | 0.81G | 1.48M | RepSurf-U | 2022-05-11 |
APP-Net: Auxiliary-point-based Push and Pull Operations for Efficient Point Cloud Classification | ✓ Link | 84.1 | | | | | | APP-Net | 2022-05-02 |
Rethinking Network Design and Local Geometry in Point Cloud: A Simple Residual MLP Framework | ✓ Link | 83.8 | 81.8 | | | | | PointMLP-elite | 2022-02-15 |
SageMix: Saliency-Guided Mixup for Point Clouds | ✓ Link | 83.7 | | | | | | PointNet++ + SageMix | 2022-10-13 |
SageMix: Saliency-Guided Mixup for Point Clouds | ✓ Link | 83.6 | | | | | | DGCNN + SageMix | 2022-10-13 |
Points to Patches: Enabling the Use of Self-Attention for 3D Shape Recognition | ✓ Link | 83.5 | 81.0 | | | 1.19G | 3.9M | Point-TnT | 2022-04-08 |
Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling | ✓ Link | 83.1 | | 87.43 | 88.12 | | | Point-BERT | 2021-11-29 |
MVTN: Multi-View Transformation Network for 3D Shape Recognition | ✓ Link | 82.8 | | 92.6 | 92.3 | | | MVTN | 2020-11-26 |
PRA-Net: Point Relation-Aware Network for 3D Point Cloud Analysis | ✓ Link | 82.1 | 79.1 | | | | | PRA-Net | 2021-12-09 |
Dynamic Local Geometry Capture in 3D PointCloud Classification | ✓ Link | 82.0 | | | | | | DynamicScale | 2021-10-19 |
PatchAugment: Local Neighborhood Augmentation in Point Cloud Classification | ✓ Link | 81.0 | 79.7 | | | | | PatchAugment | 2021-10-19 |
Geometric Back-projection Network for Point Cloud Classification | ✓ Link | 80.5 | 77.8 | | | | | GBNet | 2019-11-28 |
Revisiting Point Cloud Classification with a Simple and Effective Baseline | ✓ Link | 80.5 | | | | | | SimpleView | 2021-01-01 |
Dense-Resolution Network for Point Cloud Classification and Segmentation | ✓ Link | 80.3 | 78.0 | | | | | DRNet | 2020-05-14 |
PointCNN: Convolution On $\mathcal{X}$-Transformed Points | ✓ Link | 78.5 | 75.1 | 86.1 | 85.5 | | | PointCNN | 2018-01-23 |
Dynamic Graph CNN for Learning on Point Clouds | ✓ Link | 78.1 | 73.6 | 82.8 | 86.2 | | | DGCNN | 2018-01-24 |
PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space | ✓ Link | 77.9 | 75.4 | 82.3 | 84.3 | | | PointNet++ | 2017-06-07 |
SpiderCNN: Deep Learning on Point Sets with Parameterized Convolutional Filters | ✓ Link | 73.7 | 69.8 | | | | | SpiderCNN | 2018-03-30 |
PointNet: Deep Learning on Point Sets for 3D Classification and Segmentation | ✓ Link | 68.2 | 63.4 | | | | | PointNet | 2016-12-02 |
ExpPoint-MAE: Better interpretability and performance for self-supervised point cloud transformers | ✓ Link | | | 90.88 | 90.02 | | | ExpPoint-MAE | 2023-06-19 |