OpenCodePapers
Audio-Visual Active Speaker Detection on AVA
Leaderboard
| Paper | Code | Validation mean average precision | Model name | Release date |
|---|---|---|---|---|
| TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning | ✓ | 95.5% | LoCoNet+TalkNCE | 2023-09-21 |
| LASER: Lip Landmark Assisted Speaker Detection for Robustness | ✓ | 95.3% | LoCoNet + Laser | 2025-01-21 |
| LoCoNet: Long-Short Context Network for Active Speaker Detection | ✓ | 95.2% | LoCoNet | 2023-01-19 |
| Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection | ✓ | 94.9% | SPELL+ | 2022-07-15 |
| UniCon+: ICTCAS-UCAS Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2022 | | 94.5% | UniCon+ | 2022-06-22 |
| Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection | ✓ | 94.2% | SPELL | 2022-07-15 |
| End-to-End Active Speaker Detection | ✓ | 94.1% | EASEE-50 | 2022-03-27 |
| A Light Weight Model for Active Speaker Detection | ✓ | 94.1% | Light-ASD | 2023-03-08 |
| ICTCAS-UCAS-TAL Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2021 | | 93.6% | Extended UniCon | 2021-06-01 |
| How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild | ✓ | 93.5% | ASDNet | 2021-06-07 |
| Audio-Visual Activity Guided Cross-Modal Identity Association for Active Speaker Detection | ✓ | 92.86% | GSCMIA | 2022-12-01 |
| NUS-HLT Report for ActivityNet Challenge 2021 AVA (Speaker) | ✓ | 92.3% | TalkNet | 2021-06-01 |
| UniCon: Unified Context Network for Robust Active Speaker Detection | | 92.0% | UniCon | 2021-08-05 |
| Active Speaker Detection as a Multi-Objective Optimization with Uncertainty-based Multimodal Fusion | | 91.9% | SA-uncertainty Fusion | 2021-06-07 |
| Sub-word Level Lip Reading With Visual Attention | | 89.2% | VTP (visual only) | 2021-10-14 |
| MAAS: Multi-modal Assignation for Active Speaker Detection | ✓ | 88.8% | MAAS-TAN | 2021-01-11 |
| Naver at ActivityNet Challenge 2019 -- Task B Active Speaker Detection (AVA) | | 87.8% | VGG-{LSTM+TCN} (ensemble) | 2019-06-25 |
| Active Speakers in Context | ✓ | 87.1% | Active Speakers in Context | 2020-05-20 |
| MAAS: Multi-modal Assignation for Active Speaker Detection | ✓ | 85.1% | MAAS-LAN | 2021-01-11 |
| Multi-Task Learning for Audio Visual Active Speaker Detection | | 84.0% | 3D-ResNet-GRU | 2019-06-01 |