CMPSCI 687

Reinforcement Learning

Fall 2003


Course Schedule

(subject to minor revision; click on a lecture number to see the viewgraphs)

| Lecture | Date | Class topic | Reading | Homework assigned | Homework Due |
| Lecture 1 | Th Sept 4 | Introduction and course overview | Chapter 1 | Exercise Set 1: ex. 1.1-1.5 (PDF) | |
| Lecture 2 | Tu Sept 9 | Evaluative Feedback | Chapter 2 | Exercise Set 2: ex. 2.3, 2.4, 2.6, 2.8, 2.13, 2.16 | Exercise Set 1 |
| Lecture 3 | Th Sept 11 | Evaluative feedback continued, policy gradient methods (PS, PDF) | Chapter 2, REINFORCE | Programming Exercise 1 | |
| Lecture 4 | Tu Sept 16 | The RL Problem | Chapter 3 | Exercise Set 3: ex. 3.2-3.5, modified 3.6, 3.7-3.17 | Exercise Set 2 |
| Lecture 5 | Th Sept 18 | The RL problem continued | Chapter 3 | | |
| Lecture 6 | Tu Sept 23 | The RL problem continued | Chapter 3, modified slides: PDF, PS | Exercise Set 4: ex. 4.1, 4.2, 4.3, 4.5, 4.6, 4.7, 4.9 | Exercise Set 3 |
| Lecture 7 | Th Sept 25 | Dynamic Programming | Chapter 4 | Programming Exercise 2 | Programming Exercise 1 |
| Lecture 8 | Tu Sept 30 | Dynamic Programming continued | Chapter 4 | | Exercise Set 4 |
| Lecture 9 | Th Oct 2 | Monte Carlo Methods, importance sampling | Chapter 5, Importance Sampling | Exercise Set 5: ex. 5.1, 5.2, 5.3, 5.5, 5.6, 5.7 | |
| Lecture 10 | Tu Oct 7 | Monte Carlo methods continued, rollouts, policy gradient algorithms | Chapter 5, Policy Gradient, GPOMDP | | |
| Lecture 11 | Th Oct 9 | Temporal-Difference learning | Chapter 6, Convergence of Sarsa(0) | Exercise Set 6: ex. 6.1, 6.2, 6.4, 6.5, 6.8, 6.9, 6.10, 6.12 | Exercise Set 5 |
| Lecture 12 | Tu Oct 14 | TD learning continued | Chapter 6 | | |
| Lecture 13 | Th Oct 16 | TD learning continued, actor-critic methods and policy gradient algorithms | Chapter 6 | | Programming Exercise 2, Exercise Set 6 |
| Lecture 14 | Tu Oct 21 | Eligibility traces | Chapter 7 | Exercise Set 7: ex. 7.2, 7.4, 7.5, 7.6, 7.8, 7.9, 7.10 | |
| Lecture 15 | Th Oct 23 | Midterm Review | | | |
| Lecture 16 | Tu Oct 28 | In-class midterm exam | Chapters 1-6 | | |
| Lecture 17 | Th Oct 30 | Eligibility traces continued, Generalization and function approximation | Chapter 7, Chapter 8 | | |
| Lecture 18 | Tu Nov 4 | Generalization and function approximation continued | Chapter 8 | Programming Exercise 3; Exercise Set 8: ex. 8.1-8.7 | Exercise Set 7 |
| Lecture 19 | Th Nov 6 | Generalization and function approximation continued, structured representations, neural networks, convergence results | Chapter 8 | | |
| | Tu Nov 11 | Holiday: Veterans Day | | | |
| Lecture 20 | Th Nov 13 | Generalization and function approximation continued | Chapter 8 | | Exercise Set 8 |
| Lecture 21 | Tu Nov 18 | Function approximation, Policy gradient, Actor-critic | Chapter 8, Actor-Critic: Algorithm, Papers: 1, 2 | Exercise Set 9: ex. 9.1-9.3, 9.5; Programming Exercise 4 | Programming Exercise 3 |
| Lecture 22 | Th Nov 20 | Planning and Learning, model-based methods, E^3 algorithm | Chapter 9 | | |
| Lecture 23 | Tu Nov 25 | Hierarchical reinforcement learning, Options framework | Recent Advances in Hierarchical Reinforcement Learning (read only sections 1-4) | Exercise Set 10 | Checkpoint for Programming Exercise 4, Exercise Set 9 |
| | | Thanksgiving Recess | | | |
| Lecture 24 | Tu Dec 2 | Hierarchical reinforcement learning continued, MAXQ value function decomposition, HAM framework | | | |
| Lecture 25 | Th Dec 4 | Case studies | Chapter 11 | | Exercise Set 10 |
| Lecture 26 | Tu Dec 9 | Case studies continued, Open problems in RL | Chapter 11 | | Programming Exercise 4 |
| Lecture 27 | Th Dec 11 | Review for final exam | | | |
| | TBA | Final Exam | | | |
| | Sa Dec 20 | Last day of exams | | | |