| ✅ Proximal Policy Optimization (PPO) | ppo.py, docs |
| | ppo_atari.py, docs |
| | ppo_continuous_action.py, docs |
| | ppo_atari_lstm.py |
| | ppo_procgen.py |
| ✅ Deep Q-Learning (DQN) | dqn.py |
| | dqn_atari.py |
| ✅ Categorical DQN (C51) | c51.py |
| | c51_atari.py |
| ✅ Apex Deep Q-Learning (Apex-DQN) | apex_dqn_atari.py |
| ✅ Soft Actor-Critic (SAC) | sac_continuous_action.py |
| ✅ Deep Deterministic Policy Gradient (DDPG) | ddpg_continuous_action.py |
| ✅ Twin Delayed Deep Deterministic Policy Gradient (TD3) | td3_continuous_action.py |