Date | Topic | Slides | Readings | Assignment |
---|---|---|---|---|
Jan 6 | Class Intro | Slides | Optional: Python Notes (Alan Kuntz) | Optional: Python Tutorial (Berkeley AI Class) |
Jan 8 | Behavioral Cloning | Slides | Optional: Behavioral Cloning from Observation, DAgger, ThriftyDAgger | Behavior Cloning in PyTorch (due Friday Jan 17) |
Jan 13 | Intro to Advanced Behavior Cloning | Slides | Choose one and submit reading report before class: Implicit Behavioral Cloning, Action Chunking Transformer, Diffusion Policy | |
Jan 15 | More Advanced Behavior Cloning | Slides | ||
Jan 22 | Multi-Armed Bandits and Evaluative Feedback | Slides | Sutton and Barto 2.1-2.5 | Multi-Armed Bandits (due Fri Jan 31) |
Jan 27 | More Bandits | Slides | ||
Jan 29 | Intro to Markov Decision Processes | Slides | ||
Feb 3 | Solving MDPs | Slides | Exact Solution Methods for MDPs | Homework 3 (due Feb 10) |
Feb 5 | Value-Based RL and Temporal Difference Learning | Slides | Intro to RL, TD methods | |
Feb 10 | Q-Learning and DQN | Slides | Q-Learning, Nature DQN | |
Feb 12 | Intro to Policy-Gradients for RL | Slides | Read Intro Parts 1-3 | Homework 4: Q-Learning and DQN (due Feb 25) |
Feb 19 | Policy Gradients and REINFORCE | |||
Feb 24 | AlphaGo | Slides | AlphaGo | |
Feb 26 | No Class: Do Reading Assignment Instead | | | Submit a reading report on AlphaGo Zero via Canvas by midnight Feb 26. |
Mar 3 | Special Topics: Shared Control, Early Failure Detection for Robot Surgery | | VOSA, Early Failure Detection | Pick one paper from the readings for today and submit reading report before class. |
Mar 5 | Special Topics: RLHF for Robot Surgery, Explainable Reward Learning, Adversarial Attacks on Behavioral Cloning | | RLHF for Surgery, Reward DDTs, Adversarial Attacks | Pick one paper from the readings for today and submit reading report before class. |
Mar 10-14 | Spring Break | | | Final project team selection due. |
Mar 17 | Actor Critic Algorithms | Slides | A3C, PPO | Final project pitch due on Canvas. Instructions here. |
Mar 19 | PPO | Slides | A3C, PPO | |
Mar 24 | DDPG, TD3, and SAC | Slides | DDPG, TD3, SAC | |
Mar 26 | Multi-Agent RL | Slides | MARL book (chapter 5), VDN, QMIX, MAPPO | Homework 5 due March 28 |
Mar 31 | Model Based RL | Slides | World Models, PlaNet Paper or PlaNet Blog, Dreamer | Read one of the above papers/posts and submit a reading report before class. |
Apr 2 | Inverse RL and Reward Learning | Slides | | Final Project Lit Review and Full Proposal due April 4. |
Apr 7 | RLHF | Slides | ||
Apr 9 | LLM Agents and RL | |||
Apr 14 | Final project presentations | Schedule | ||
Apr 16 | Final project presentations | Schedule | ||
Apr 21 | Final project presentations | Schedule | ||
Apr 30 | Final project reports due | | | Use this template (simply go to menu and select copy project). |
Here you can find supplementary materials, links, etc.
PyTorch Tutorials
RL Code Resources