Intro to Robot Learning

16-831 • Spring 2026 • Carnegie Mellon University • Alexa Noto, Avi Dube, Sid Qian

Run 01

This is a run of a random agent with a random policy

Run 02

This is a run of an on-policy agent using PPO (Midterm)

Run 03

This is a run of an off-policy agent using SAC (Midterm)

PPO | Full Dim

PPO with Full Dim and New Rewards

PPO | Reduced Dim

PPO with Reduced Dim and New Rewards

SAC | Full Dim

SAC with Full Dim and New Rewards

SAC | Reduced Dim

SAC with Reduced Dim and New Rewards

## Sid, fill these in once you have the videos for the model-based RL runs.