Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 12.6 | 17.7 | | 56.7 | 4.2 | 81.5 | 66.4 | AudioShake v3 | 2024-07-30 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 21.9 | 27.7 | | 52.5 | | 71.5 | 3.1 | Whisper v2 +lang | 2024-07-30 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 22.4 | 28.0 | | 44.5 | | 74.5 | 0.0 | Whisper v3 +lang | 2024-07-30 |
Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark | ✓ Link | 22.5 | | 4.1 | 47.8 | 38.0 | 82.7 | 69.6 | AudioShake v1 | 2023-11-23 |
Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark | ✓ Link | 25.7 | | 6.5 | 50.0 | | 71.7 | 3.1 | Whisper v2 | 2023-11-23 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 25.8 | 31.5 | | 52.8 | | 71.7 | 3.1 | Whisper v2 | 2024-07-30 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 28.6 | 33.6 | | 42.5 | | 73.7 | | Whisper v3 | 2024-07-30 |
Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark | ✓ Link | 28.6 | | 5.0 | 41.9 | | 73.7 | | Whisper v3 | 2023-11-23 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 34.9 | 42.2 | | 34.3 | | 52.6 | | Whisper v2 +demucs +lang | 2024-07-30 |
Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark | ✓ Link | 38.8 | | 7.1 | 17.2 | | 56.4 | | Whisper v2 +demucs | 2023-11-23 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 39.6 | 46.5 | | 40.4 | | 56.6 | | Whisper v2 +demucs | 2024-07-30 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 58.6 | 62.1 | | 34.4 | | 54.7 | | Whisper v3 +demucs +lang | 2024-07-30 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 61.5 | 64.9 | | 32.4 | | 52.3 | | Whisper v3 +demucs | 2024-07-30 |
Jam-ALT: A Formatting-Aware Lyrics Transcription Benchmark | ✓ Link | 61.5 | | 3.6 | 28.7 | | 52.4 | | Whisper v3 +demucs | 2023-11-23 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 70.8 | 76.0 | | 9.0 | | 33.5 | | OWSM v3.1 +demucs +lang | 2024-07-30 |
Lyrics Transcription for Humans: A Readability-Aware Benchmark | ✓ Link | 73.3 | 78.5 | | 8.8 | 0.0 | 30.2 | | OWSM v3.1 +lang | 2024-07-30 |