| Date | Topic | Slides | Readings | Assignment |
|---|---|---|---|---|
| Jan 5 | Class Intro | Slides | ||
| Jan 7 | Behavioral Cloning | Slides | Sections 1-2 of this survey, Behavioral Cloning from Observation | HW1: Behavior Cloning in PyTorch (due Jan 14) |
| Jan 12 | Interactive Behavioral Cloning | Slides | DAgger (you can skim/skip the math), ThriftyDAgger | |
| Jan 14 | Advanced Behavior Cloning 1 | Slides | Implicit Behavioral Cloning, Action Chunking Transformer (just abstract and intro), Diffusion Policy (just abstract and intro) | HW2: Advanced BC Homework (due Jan 26) |
| Jan 19 | Holiday | |||
| Jan 21 | Advanced Behavior Cloning 2 | | Action Chunking Transformer (full paper), Diffusion Policy (full paper) | |
| Jan 26 | Multi-Armed Bandits: Epsilon greedy and UCB1 | | Sutton and Barto 2.1-2.5, UCB1 | HW3: Multi-Armed Bandits (due Feb 2) |
| Jan 28 | Contextual Bandits | | Overview, EXP4 | |
| Feb 2 | Q-Learning, SARSA, and DQN | Slides | Q-Learning, SARSA, Nature DQN | Homework 4: Q-Learning and DQN (due Feb 9) |
| Feb 4 | Policy Gradients and REINFORCE | Slides | Read Intro Parts 1-3 | |
| Feb 9 | Actor Critic RL and PPO | | A3C, PPO | |
| Feb 11 | Inverse RL Intro | Slides | Ng & Russell, Abbeel and Ng | HW5: Policy Gradients (due Feb 20) |
| Feb 16 | Holiday | |||
| Feb 18 | Interactive RL Intro | Slides | TAMER, DeepTAMER | |
| Feb 23 | RLHF Intro | Slides | Christiano, T-REX | |
| Feb 25 | RLHF 2 | Slides | Learning to Summarize, InstructGPT | |
| Mar 2 | Reward Learning from Multiple Feedback Types | Slides | RRIC, Losey Unifying | |
| Mar 4 | Catch up and Project Ideas | Slides | | Final project team selection due March 6. |
| Mar 7-15 | Spring Break | |||
| Mar 16 | Inverse RL 2 | Slides | MaxEnt, GuidedCost | Final project pitch due on Canvas. Instructions here. |
| Mar 18 | Inverse RL 3 | Slides | Bayesian IRL, Bayesian Pref Learning | |
| Mar 23 | Interactive RL 2 | Slides | Bayesian IRL, Bayesian Pref Learning | |
| Feb 25 | No Class: Do Reading Assignment Instead | | | Final Project Lit Review and Full Proposal due April 4. |
| Mar 2 | Special Topics: Shared Control, Early Failure Detection for Robot Surgery | | VOSA, Early Failure Detection | Pick one paper from the readings for today and submit a reading report before class. |
| Mar 4 | Special Topics: RLHF for Robot Surgery, Explainable Reward Learning, Adversarial Attacks on Behavioral Cloning | | RLHF for Surgery, Reward DDTs, Adversarial Attacks | Pick one paper from the readings for today and submit a reading report before class. |
| Mar 18 | PPO | Slides | A3C, PPO | |
| Mar 23 | DDPG, TD3, and SAC | Slides | DDPG, TD3, SAC | |
| Mar 25 | Multi-Agent RL | Slides | MARL book (chapter 5), VDN, QMIX, MAPPO | Homework 5 due March 28 |
| Mar 30 | Model Based RL | Slides | World Models, PlaNet Paper or PlaNet Blog, Dreamer | Read one of the above papers/posts and submit a reading report before class. |
| Apr 1 | Inverse RL and Reward Learning | Slides | | Final Project Lit Review and Full Proposal due April 4. |
| Apr 8 | LLM Agents and RL | Slides | ||
| Apr 13 | Final project presentations | Schedule | Apr 14 Shared Slide Deck | |
| Apr 15 | Final project presentations | Schedule | Apr 16 Shared Slide Deck | |
| Apr 20 | Final project presentations | Schedule | Apr 21 Shared Slide Deck | |
| Apr 29 | Final project reports due | Instructions | Use this template (simply go to the menu and select "Copy Project"). | |

Here you can find supplementary materials, links, etc.

- PyTorch Tutorials
- RL Code Resources