Run 01
This is a run of an agent following a random policy
Run 02
This is a run of an on-policy agent using PPO (Midterm)
Run 03
This is a run of an off-policy agent using SAC (Midterm)
PPO | Full Dim
PPO with Full Dim and New Rewards
PPO | Reduced Dim
PPO with Reduced Dim and New Rewards
SAC | Full Dim
SAC with Full Dim and New Rewards
SAC | Reduced Dim
SAC with Reduced Dim and New Rewards
## Sid: fill these in once you have the videos for the model-based RL runs.