Paper | Code | Average | GovReport (R-1/2/L) | SummScreenFD (R-1/2/L) | QMSum (R-1/2/L) | Qasper (F1) | NarrativeQA (F1) | QuALITY (EM-T/H) | ContractNLI (EM) | Model | Date |
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
CoLT5: Faster Long-Range Transformers with Conditional Computation | | 43.51 | 61.3 / 32.2 / 33.8 | 36.4 / 10.2 / 21.7 | 36.2 / 12.9 / 24.3 | 53.9 | 31.1 | 48.1 / 43.8 | 88.4 | CoLT5 XL | 2023-03-17 |
LongT5: Efficient Text-To-Text Transformer for Long Sequences | ✓ Link | 42.53 | 61.1 / 32.3 / 33.7 | 35.8 / 9.6 / 21.1 | 34.9 / 11.8 / 23.5 | 53.1 | 29.3 | 46.0 / 42.1 | 88.2 | LongT5 XL | 2021-12-15 |
LongT5: Efficient Text-To-Text Transformer for Long Sequences | ✓ Link | 41.03 | 60.3 / 31.1 / 32.8 | | 35.1 / 12.0 / 23.3 | 52.3 | 27.2 | 40.6 / 38.6 | 87.3 | LongT5 Large | 2021-12-15 |
Adapting Pretrained Text-to-Text Models for Long Text Sequences | ✓ Link | 39.76 | 59.4 / 29.8 / 30.8 | 37.7 / 10.2 / 21.5 | 35.1 / 11.0 / 22.0 | 48.7 | 26.2 | 37.8 / 34.0 | 87.1 | BART-LS | 2022-09-21 |
LongT5: Efficient Text-To-Text Transformer for Long Sequences | ✓ Link | 38.6 | 57.7 / 30.0 / 31.4 | 34.8 / 9.6 / 21.1 | 33.9 / 11.0 / 22.8 | 46.6 | 23.0 | 37.9 / 36.6 | 85.6 | LongT5 Base | 2021-12-15 |
Efficient Long-Text Understanding with Short-Text Models | ✓ Link | 37.99 | 57.5 / 26.3 / 27.4 | 35.2 / 8.7 / 19.4 | 34.2 / 11.0 / 22.0 | 46.9 | 24.1 | 34.8 / 34.8 | 87.3 | BART-large SLED | 2022-08-01 |
UL2: Unifying Language Learning Paradigms | ✓ Link | 37.87 | 53.6 / 26.1 / 28.8 | 32.9 / 7.8 / 19.4 | 31.1 / 8.5 / 20.4 | 37.6 | 24.2 | 45.8 / 40.7 | | UL2 | 2022-05-10 |
SCROLLS: Standardized CompaRison Over Long Language Sequences | ✓ Link | 29.16 | 56.2 / 26.6 / 28.8 | 24.2 / 4.5 / 15.4 | 25.1 / 6.7 / 18.8 | 26.6 | 18.5 | 25.8 / 25.4 | 71.5 | LED Base | 2022-01-10 |
SCROLLS: Standardized CompaRison Over Long Language Sequences | ✓ Link | 29.01 | 47.9 / 18.6 / 22.7 | 27.2 / 4.9 / 16.7 | 30.2 / 8.7 / 20.7 | 26.3 | 15.4 | 26.0 / 25.9 | 77.4 | BART Base | 2022-01-10 |
SCROLLS: Standardized CompaRison Over Long Language Sequences | ✓ Link | 19.35 | 45.3 / 17.9 / 20.8 | 19.6 / 1.8 / 11.0 | 14.2 / 2.0 / 9.3 | 3.4 | 1.5 | 25.2 / 26.1 | 66 | Naive | 2022-01-10 |
Investigating Efficiently Extending Transformers for Long Input Summarization | ✓ Link | | 60.3 / 30.0 / 31.5 | 35.7 / 9.1 / 20.6 | 33.2 / 9.6 / 21.6 | | | | | PEGASUS-X | 2022-08-08 |
Investigating Efficiently Extending Transformers for Long Input Summarization | ✓ Link | | 59.3 / 29.3 / 30.9 | 35.0 / 8.9 / 20.4 | 32.9 / 9.8 / 21.4 | | | | | PEGASUS-X-Base | 2022-08-08 |
UL2: Unifying Language Learning Paradigms | ✓ Link | | | | | | | | 88.7 | UL2 20B | 2022-05-10 |
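
For reference, the Average column can be reproduced from the per-task numbers. Below is a minimal sketch, assuming the standard SCROLLS aggregation: each summarization task (GovReport, SummScreenFD, QMSum) is scored by the geometric mean of its ROUGE-1/2/L values, QuALITY contributes its EM on the full test set (the EM-T figure), and the overall score is the arithmetic mean over the seven task scores. The function name `scrolls_average` is illustrative, not an official API.

```python
from statistics import geometric_mean

def scrolls_average(gov_rep, sum_scr, qmsum, qasper_f1, narrative_f1,
                    quality_em_t, contract_nli_em):
    """Aggregate SCROLLS score; gov_rep, sum_scr, qmsum are (R-1, R-2, R-L) triples."""
    task_scores = [
        geometric_mean(gov_rep),   # summarization tasks: geometric mean of ROUGE-1/2/L
        geometric_mean(sum_scr),
        geometric_mean(qmsum),
        qasper_f1,                 # QA tasks: F1
        narrative_f1,
        quality_em_t,              # QuALITY: EM on the full test set
        contract_nli_em,           # ContractNLI: EM
    ]
    return sum(task_scores) / len(task_scores)

# Example: the LongT5 XL row above.
# Prints 42.54, matching the reported 42.53 up to rounding of the per-task metrics.
print(round(scrolls_average((61.1, 32.3, 33.7), (35.8, 9.6, 21.1),
                            (34.9, 11.8, 23.5), 53.1, 29.3, 46.0, 88.2), 2))
```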