Paper | Code | Accuracy (%) | Model Name | Release Date |
---|---|---|---|---|
HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics | ✓ Link | 93.5 | HERMES | 2024-08-30 |
MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding | ✓ Link | 93.2 | MA-LMM | 2024-04-08 |
Selective Structured State-Spaces for Long-Form Video Understanding | | 90.8 | S5 | 2023-03-25 |
Learning To Recognize Procedural Activities with Distant Supervision | ✓ Link | 90.0 | D-Sprv. | 2022-01-26 |
Efficient Movie Scene Detection using State-Space Transformers | ✓ Link | 89.3 | TranS4mer | 2022-12-29 |
Long Movie Clip Classification with State-Space Video Models | ✓ Link | 88.4 | ViS4mer | 2022-04-04 |
Temporal Segment Networks for Action Recognition in Videos | ✓ Link | 73.4 | TSN | 2017-05-08 |