Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 16.1 | 20.1 | | 57.0 | 29.4 | 84.4 | 73.9 | AudioShake v3 | 2024-07-30 |
Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark | ✓ Link | 26.0 | | 3.4 | 50.5 | 29.4 | 82.3 | 72.1 | AudioShake v1 | 2023-11-23 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 27.9 | 32.6 | | 45.0 | | 70.4 | 3.7 | Whisper v2 +lang | 2024-07-30 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 32.6 | 37.2 | | 43.7 | | 73.9 | 0.6 | Whisper v3 +lang | 2024-07-30 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 33.5 | 39.3 | | 39.4 | | 60.6 | | Whisper v2 +demucs +lang | 2024-07-30 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 35.5 | 39.7 | | 43.0 | | 73.5 | 1.0 | Whisper v3 | 2024-07-30 |
Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark | ✓ Link | 35.5 | | 4.3 | 41.6 | | 73.5 | 1.0 | Whisper v3 | 2023-11-23 |
Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark | ✓ Link | 35.7 | | 4.5 | 41.7 | | 69.3 | 3.3 | Whisper v2 | 2023-11-23 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 37.8 | 42.1 | | 44.2 | | 69.3 | 3.3 | Whisper v2 | 2024-07-30 |
Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark | ✓ Link | 44.0 | | 5.3 | 28.0 | | 61.2 | | Whisper v2 +demucs | 2023-11-23 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 44.5 | 49.8 | | 41.6 | | 61.2 | | Whisper v2 +demucs | 2024-07-30 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 46.6 | 50.4 | | 33.7 | | 65.8 | | Whisper v3 +demucs +lang | 2024-07-30 |
Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark | ✓ Link | 47.9 | | 3.8 | 29.0 | | 65.7 | | Whisper v3 +demucs | 2023-11-23 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 48.0 | 51.6 | | 33.0 | | 65.7 | | Whisper v3 +demucs | 2024-07-30 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 66.5 | 72.6 | | 20.0 | 0.0 | 41.1 | | OWSM v3.1 +demucs +lang | 2024-07-30 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 69.3 | 75.0 | | 22.5 | 0.6 | 37.8 | | OWSM v3.1 +lang | 2024-07-30 |