| Date | Topic | Slides | Readings | Assignment |
|---|---|---|---|---|
| Jan 6 | Class Intro | Slides | Optional: Python Notes (Alan Kuntz) | Optional: Python Tutorial (Berkeley AI Class) |
| Jan 8 | Behavioral Cloning | Slides | Optional: Behavioral Cloning from Observation, DAgger, ThriftyDAgger | Behavior Cloning in PyTorch (due Friday Jan 17) |
| Jan 13 | Intro to Advanced Behavior Cloning | Slides | Choose one and submit reading report before class: Implicit Behavioral Cloning, Action Chunking Transformer, Diffusion Policy | |
| Jan 15 | More Advanced Behavior Cloning | Slides | ||
| Jan 22 | Multi-Armed Bandits and Evaluative Feedback | Slides | Sutton and Barto 2.1-2.5 | Multi-Armed Bandits (due Fri Jan 31) |
| Jan 27 | More Bandits | Slides | ||
| Jan 29 | Intro to Markov Decision Processes | Slides | ||
| Feb 3 | Solving MDPs | Slides | Exact Solution Methods for MDPs | Homework 3 (due Feb 10) |
| Feb 5 | Value-Based RL and Temporal Difference Learning | Slides | Intro to RL, TD methods | |
| Feb 10 | Q-Learning and DQN | Slides | Q-Learning, Nature DQN | |
| Feb 12 | Intro to Policy-Gradients for RL | Slides | Read Intro Parts 1-3 | Homework 4: Q-Learning and DQN (due Feb 25) |
| Feb 19 | Policy Gradients and REINFORCE | |||
| Feb 24 | Alpha Go | Slides | Alpha Go | |
| Feb 26 | No Class: Do Reading Assignment Instead | | | Submit a reading report on Alpha Go Zero via Canvas by midnight Feb 26. |
| Mar 3 | Special Topics: Shared Control, Early Failure Detection for Robot Surgery | | VOSA, Early Failure Detection | Pick one paper from the readings for today and submit reading report before class. |
| Mar 5 | Special Topics: RLHF for Robot Surgery, Explainable Reward Learning, Adversarial Attacks on Behavioral Cloning | | RLHF for Surgery, Reward DDTs, Adversarial Attacks | Pick one paper from the readings for today and submit reading report before class. |
| Mar 10-14 | Spring Break | | | Final project team selection due. |
| Mar 17 | Actor Critic Algorithms | Slides | A3C, PPO | Final project pitch due on Canvas. Instructions here. |
| Mar 19 | PPO | Slides | A3C, PPO | |
| Mar 24 | DDPG, TD3, and SAC | Slides | DDPG, TD3, SAC | |
| Mar 26 | Multi-Agent RL | Slides | MARL book (chapter 5), VDN, QMIX, MAPPO | Homework 5 due March 28 |
| Mar 31 | Model Based RL | Slides | World Models, PlaNet Paper or PlaNet Blog, Dreamer | Read one of the above papers/posts and submit a reading report before class. |
| Apr 2 | Inverse RL and Reward Learning | Slides | | Final Project Lit Review and Full Proposal due April 4. |
| Apr 7 | RLHF | Slides | ||
| Apr 9 | LLM Agents and RL | Slides | ||
| Apr 14 | Final project presentations | Schedule | Apr 14 Shared Slide Deck | |
| Apr 16 | Final project presentations | Schedule | Apr 16 Shared Slide Deck | |
| Apr 21 | Final project presentations | Schedule | Apr 21 Shared Slide Deck | |
| Apr 30 | Final project reports due | | Instructions | Use this template (simply go to menu and select copy project). |

Here you can find supplementary materials, links, etc.

- PyTorch Tutorials
- RL Code Resources