BGDIA704 | Moodle

Enrollment options

Reinforcement Learning

This course presents the main concepts and algorithms of reinforcement learning:

Markov decision processes
Dynamic programming (policy iteration, value iteration)
Online control (Q-learning, Monte-Carlo Tree Search)
Bandit algorithms

Teacher: Thomas Bonald

References:

Course by Olivier Sigaud
Course by David Silver
Book by Richard Sutton and Andrew Barto
