| CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | ✓ Link | 99.1 | CLIP4STR-H (DFN-5B) | 2023-05-23 |
| DTrOCR: Decoder-only Transformer for Optical Character Recognition | ✓ Link | 98.9 | DTrOCR 105M | 2023-08-30 |
| An Empirical Study of Scaling Law for OCR | ✓ Link | 98.76 | CLIP4STR-B* | 2023-12-29 |
| Multi-Granularity Prediction for Scene Text Recognition | ✓ Link | 98.6 | MGP-STR | 2022-09-08 |
| CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | ✓ Link | 98.6 | CLIP4STR-L (DataComp-1B) | 2023-05-23 |
| Context Perception Parallel Decoder for Scene Text Recognition | ✓ Link | 98.5 | CPPD | 2023-07-23 |
| CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | ✓ Link | 98.5 | CLIP4STR-L | 2023-05-23 |
| CLIP4STR: A Simple Baseline for Scene Text Recognition with Pre-trained Vision-Language Model | ✓ Link | 98.3 | CLIP4STR-B | 2023-05-23 |
| Scene Text Recognition with Permuted Autoregressive Sequence Models | ✓ Link | 97.9±0.2 | PARSeq | 2022-07-14 |
| Self-supervised Character-to-Character Distillation for Text Recognition | ✓ Link | 97.8 | CCD-ViT-Base(ARD_2.8M) | 2022-11-01 |
| Self-supervised Character-to-Character Distillation for Text Recognition | ✓ Link | 96.4 | CCD-ViT-Small(ARD_2.8M) | 2022-11-01 |
| Self-supervised Character-to-Character Distillation for Text Recognition | ✓ Link | 96.0 | CCD-ViT-Tiny(ARD_2.8M) | 2022-11-01 |
| Visual Semantics Allow for Textual Reasoning Better in Scene Text Recognition | ✓ Link | 95.8 | S-GTR | 2021-12-24 |
| Self-supervised Implicit Glyph Attention for Text Recognition | ✓ Link | 95.1 | SIGA_T | 2022-03-07 |
| Multi-modal Text Recognition Networks: Interactive Enhancements between Visual and Semantic Features | ✓ Link | 95.0 | MATRN | 2021-11-30 |
| Why You Should Try the Real Data for the Scene Text Recognition | ✓ Link | 94.7 | Yet Another Text Recognizer | 2021-07-29 |
| TPS++: Attention-Enhanced Thin-Plate Spline for Scene Text Recognition | ✓ Link | 94.6 | NRTR+TPS++ | 2023-05-09 |
| Look Back Again: Dual Parallel Attention Network for Accurate and Robust Scene Text Recognition | ✓ Link | 93.9 | DPAN | 2021-08-01 |
| CDistNet: Perceiving Multi-Domain Character Distance for Robust Text Recognition | ✓ Link | 93.82 | CDistNet | 2021-11-22 |
| DiffusionSTR: Diffusion Model for Scene Text Recognition | | 93.6 | DiffusionSTR | 2023-06-29 |
| Representation and Correlation Enhanced Encoder-Decoder Framework for Scene Text Recognition | ✓ Link | 91.8 | RCEED | 2021-06-13 |
| Towards Accurate Scene Text Recognition with Semantic Reasoning Networks | ✓ Link | 91.5 | SRN | 2020-03-27 |
| On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention | ✓ Link | 91.3 | SATRN | 2019-10-10 |
| Revisiting Classification Perspective on Scene Text Recognition | ✓ Link | 90.6 | CSTR | 2021-02-22 |
| TextScanner: Reading Characters in Order for Robust Scene Text Recognition | | 90.1 | TextScanner | 2019-12-28 |
| SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition | ✓ Link | 89.6 | SEED | 2020-05-22 |
| ASTER: An Attentional Scene Text Recognizer with Flexible Rectification | ✓ Link | 89.5 | ASTER | 2018-06-25 |
| Decoupled Attention Network for Text Recognition | ✓ Link | 89.2 | DAN | 2019-12-21 |
| SAFL: A Self-Attention Scene Text Recognizer with Focal Loss | ✓ Link | 88.6 | SAFL | 2022-01-01 |
| Vision Transformer for Fast and Efficient Scene Text Recognition | ✓ Link | 87.7 | ViTSTR | 2021-05-18 |
| What Is Wrong With Scene Text Recognition Model Comparisons? Dataset and Model Analysis | ✓ Link | 87.5 | Baek et al. | 2019-04-03 |
| Scene Text Recognition from Two-Dimensional Perspective | | 86.4 | CA-FCN | 2018-09-18 |
| Show, Attend and Read: A Simple and Strong Baseline for Irregular Text Recognition | ✓ Link | 84.5 | SAR | 2018-11-02 |
| STAR-Net: A Spatial Attention Residue Network for Scene Text Recognition | ✓ Link | 83.6 | STAR-Net | 2016-09-20 |
| Robust Scene Text Recognition with Automatic Rectification | ✓ Link | 81.9 | RARE | 2016-03-12 |
| An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition | ✓ Link | 80.8 | CRNN | 2015-07-21 |
| Synthetic Data and Artificial Neural Networks for Natural Scene Text Recognition | ✓ Link | 68.0 | CHAR | 2014-06-09 |