OpenCodePapers

Audio-Visual Active Speaker Detection on AVA

Leaderboard
| Paper | Code | Validation mean average precision | Model name | Release date |
|---|---|---|---|---|
| TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning | ✓ | 95.5% | LoCoNet+TalkNCE | 2023-09-21 |
| LASER: Lip Landmark Assisted Speaker Detection for Robustness | ✓ | 95.3% | LoCoNet + Laser | 2025-01-21 |
| LoCoNet: Long-Short Context Network for Active Speaker Detection | ✓ | 95.2% | LoCoNet | 2023-01-19 |
| Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection | ✓ | 94.9% | SPELL+ | 2022-07-15 |
| UniCon+: ICTCAS-UCAS Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2022 | - | 94.5% | UniCon+ | 2022-06-22 |
| Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection | ✓ | 94.2% | SPELL | 2022-07-15 |
| End-to-End Active Speaker Detection | ✓ | 94.1% | EASEE-50 | 2022-03-27 |
| A Light Weight Model for Active Speaker Detection | ✓ | 94.1% | Light-ASD | 2023-03-08 |
| ICTCAS-UCAS-TAL Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2021 | - | 93.6% | Extended UniCon | 2021-06-01 |
| How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild | ✓ | 93.5% | ASDNet | 2021-06-07 |
| Audio-Visual Activity Guided Cross-Modal Identity Association for Active Speaker Detection | ✓ | 92.86% | GSCMIA | 2022-12-01 |
| NUS-HLT Report for ActivityNet Challenge 2021 AVA (Speaker) | ✓ | 92.3% | TalkNet | 2021-06-01 |
| UniCon: Unified Context Network for Robust Active Speaker Detection | - | 92.0% | UniCon | 2021-08-05 |
| Active Speaker Detection as a Multi-Objective Optimization with Uncertainty-based Multimodal Fusion | - | 91.9% | SA-uncertainty Fusion | 2021-06-07 |
| Sub-word Level Lip Reading With Visual Attention | - | 89.2% | VTP (visual only) | 2021-10-14 |
| MAAS: Multi-modal Assignation for Active Speaker Detection | ✓ | 88.8% | MAAS-TAN | 2021-01-11 |
| Naver at ActivityNet Challenge 2019 -- Task B Active Speaker Detection (AVA) | - | 87.8% | VGG-{LSTM+TCN} (ensemble) | 2019-06-25 |
| Active Speakers in Context | ✓ | 87.1% | Active Speakers in Context | 2020-05-20 |
| MAAS: Multi-modal Assignation for Active Speaker Detection | ✓ | 85.1% | MAAS-LAN | 2021-01-11 |
| Multi-Task Learning for Audio Visual Active Speaker Detection | - | 84.0% | 3D-ResNet-GRU | 2019-06-01 |
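
The leaderboard ranks models by validation mean average precision. As a rough illustration of what that metric measures, the sketch below computes non-interpolated average precision for a single binary class (speaking vs. not speaking). It is a generic textbook formulation, not the official AVA-ActiveSpeaker evaluation script, which may handle ties or interpolation differently. The function name `average_precision` and the example scores and labels are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def average_precision(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Non-interpolated average precision for binary labels (1 = speaking)."""
    # Rank predictions from most to least confident.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            # Accumulate precision at the rank of each true positive.
            precision_sum += hits / rank
    return precision_sum / total_positives


# Hypothetical per-face scores and ground-truth labels (assumed data).
example_scores = [0.9, 0.8, 0.7, 0.6, 0.5]
example_labels = [1, 0, 1, 1, 0]
example_ap = average_precision(example_scores, example_labels)
print(f"AP = {example_ap:.4f}")
```

With a single positive class, the mean over classes reduces to this one average-precision value. A model that ranks every speaking face above every non-speaking face scores 1.0 under this definition.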