Audio-Visual Active Speaker Detection on AVA

Leaderboard

Paper	Code	Validation mAP	Model	Release date
TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning	✓ Link	95.5%	LoCoNet+TalkNCE	2023-09-21
LASER: Lip Landmark Assisted Speaker Detection for Robustness	✓ Link	95.3%	LoCoNet + Laser	2025-01-21
LoCoNet: Long-Short Context Network for Active Speaker Detection	✓ Link	95.2%	LoCoNet	2023-01-19
Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection	✓ Link	94.9%	SPELL+	2022-07-15
UniCon+: ICTCAS-UCAS Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2022		94.5%	UniCon+	2022-06-22
Learning Long-Term Spatial-Temporal Graphs for Active Speaker Detection	✓ Link	94.2%	SPELL	2022-07-15
End-to-End Active Speaker Detection	✓ Link	94.1%	EASEE-50	2022-03-27
A Light Weight Model for Active Speaker Detection	✓ Link	94.1%	Light-ASD	2023-03-08
ICTCAS-UCAS-TAL Submission to the AVA-ActiveSpeaker Task at ActivityNet Challenge 2021		93.6%	Extended UniCon	2021-06-01
How to Design a Three-Stage Architecture for Audio-Visual Active Speaker Detection in the Wild	✓ Link	93.5%	ASDNet	2021-06-07
Audio-Visual Activity Guided Cross-Modal Identity Association for Active Speaker Detection	✓ Link	92.86%	GSCMIA	2022-12-01
NUS-HLT Report for ActivityNet Challenge 2021 AVA (Speaker)	✓ Link	92.3%	TalkNet	2021-06-01
UniCon: Unified Context Network for Robust Active Speaker Detection		92.0%	UniCon	2021-08-05
Active Speaker Detection as a Multi-Objective Optimization with Uncertainty-based Multimodal Fusion		91.9%	SA-uncertainty Fusion	2021-06-07
Sub-word Level Lip Reading With Visual Attention		89.2%	VTP (visual only)	2021-10-14
MAAS: Multi-modal Assignation for Active Speaker Detection	✓ Link	88.8%	MAAS-TAN	2021-01-11
Naver at ActivityNet Challenge 2019 -- Task B Active Speaker Detection (AVA)		87.8%	VGG-{LSTM+TCN} (ensemble)	2019-06-25
Active Speakers in Context	✓ Link	87.1%	Active Speakers in Context	2020-05-20
MAAS: Multi-modal Assignation for Active Speaker Detection	✓ Link	85.1%	MAAS-LAN	2021-01-11
Multi-Task Learning for Audio Visual Active Speaker Detection		84.0%	3D-ResNet-GRU	2019-06-01
