The Arcade Learning Environment: An Evaluation Platform for General Agents | ✓ Link | 0 | Best Learner | 2012-07-19 |
The Arcade Learning Environment: An Evaluation Platform for General Agents | ✓ Link | 0 | Full Tree | 2012-07-19 |
Implicit Quantile Networks for Distributional Reinforcement Learning | ✓ Link | -9289 | IQN | 2018-06-14 |
Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model | ✓ Link | -29968.36 | MuZero | 2019-11-19 |
Recurrent Experience Replay in Distributed Reinforcement Learning | ✓ Link | -30021.7 | R2D2 | 2019-05-01 |
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures | ✓ Link | -10180.38 | IMPALA (deep) | 2018-02-05 |
Fully Parameterized Quantile Function for Distributional Reinforcement Learning | ✓ Link | -9085.3 | FQF | 2019-11-05 |
Noisy Networks for Exploration | ✓ Link | -7550 | NoisyNet-Dueling | 2017-06-30 |
Evolving simple programs for playing Atari games | ✓ Link | -9011 | CGP | 2018-06-14 |
Distributional Reinforcement Learning with Quantile Regression | ✓ Link | -9324 | QR-DQN-1 | 2017-10-27 |
Distributed Prioritized Experience Replay | ✓ Link | -10789.9 | Ape-X | 2018-03-02 |
Increasing the Action Gap: New Operators for Reinforcement Learning | ✓ Link | -13264.51 | Advantage Learning | 2015-12-15 |
First return, then explore | ✓ Link | -3660 | Go-Explore | 2020-04-27 |
Agent57: Outperforming the Atari Human Benchmark | ✓ Link | -4202.6 | Agent57 | 2020-03-30 |
Mastering Atari with Discrete World Models | ✓ Link | -9299 | DreamerV2 | 2020-10-05 |
Adaptive Rational Activations to Boost Deep Reinforcement Learning | ✓ Link | -23582 | Recurrent Rational DQN Average | 2021-02-18 |
Adaptive Rational Activations to Boost Deep Reinforcement Learning | ✓ Link | -23487 | Rational DQN Average | 2021-02-18 |
Online and Offline Reinforcement Learning by Planning with a Learned Model | ✓ Link | -30000 | MuZero (Res2 Adam) | 2021-04-13 |
GDI: Rethinking What Makes Reinforcement Learning Different From Supervised Learning | | -6774 | GDI-I3 | 2021-06-11 |
Generalized Data Distribution Iteration | | -6774 | GDI-I3 | 2022-06-07 |
Generalized Data Distribution Iteration | | -6025 | GDI-H3 | 2022-06-07 |
DNA: Proximal Policy Optimization with a Dual Network Architecture | ✓ Link | -29974 | DNA | 2022-06-20 |
Train a Real-world Local Path Planner in One Hour via Partially Decoupled Reinforcement Learning and Vectorized Diversity | ✓ Link | -8295.4 | ASL DDQN | 2023-05-07 |