Paper | Code | Average Accuracy | Mean Class Accuracy | Model Name | Release Date |
---|---|---|---|---|---|
Learning Video Representations from Large Language Models | ✓ Link | 81.75 | 76 | LaViLa (Finetuned, TimeSformer-L) | 2022-12-08 |
Integrating Human Gaze into Attention for Egocentric Activity Recognition | ✓ Link | 69.58 | 62.84 | Min et al. | 2020-11-08 |
Group Contextualization for Video Recognition | ✓ Link | 65.1 | - | GC-TSM | 2022-03-18 |
Symbiotic Attention with Privileged Information for Egocentric Action Recognition | - | 62.7 | - | SAP | 2020-02-08 |
LSTA: Long Short-Term Attention for Egocentric Action Recognition | ✓ Link | 61.9 | - | LSTA | 2018-11-26 |
Attention is All We Need: Nailing Down Object-centric Attention for Egocentric Activity Recognition | ✓ Link | 60.8 | - | Ego-RNN | 2018-07-31 |