2017-2018 / INFO8003-1

Optimal decision making for complex problems

Duration

25h Th, 10h Pr, 45h Proj.

Number of credits

 Master in data science (120 ECTS): 5 credits
 Master of science in computer science and engineering (120 ECTS): 5 credits
 Master in data science and engineering (120 ECTS): 5 credits

Lecturer

Damien Ernst

Language(s) of instruction

English language

Organisation and examination

Teaching in the second semester

Prerequisite and corequisite units

Prerequisite or corequisite units are presented within each program

Learning unit contents

Many decision-making problems can be formalised as maximising a numerical reward (or, equivalently, minimising a cost) while interacting with an environment that is stochastic or (partially) unknown, exhibits little structure (e.g., it is not linear or convex), has a sequential nature (e.g., a sequence of decisions must be taken to reach an objective) and/or is adversarial (e.g., an opponent takes its decisions so as to minimise your payoff, as is the case when you play poker).
Typical examples of such problems are:

  • The design of artificial intelligences able to learn to play computer games,
  • The placement of advertisements on webpages to maximize the number of clicks,
  • Controlling a rocket so as to safely reach a target with minimum fuel costs,
  • The synthesis of winning strategies for trading on the stock market,
  • The design of artificial intelligences for autonomous robots,
  • The design of clinical trials.
The goal of this class is to teach techniques for taking optimal decisions in such complex problems. These techniques draw on results from system theory, probability theory, information theory, supervised learning, and linear and convex optimisation.
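As an illustration of the advertisement-placement example, the following is a minimal sketch of the UCB1 rule for a multi-armed bandit. The click probabilities, the horizon and the function name `ucb1` are hypothetical choices made for this example only; they are not taken from the course material.

```python
import math
import random


def ucb1(click_probs, horizon, seed=0):
    """Run UCB1 on a Bernoulli bandit (illustrative sketch).

    click_probs: assumed click probability of each advertisement (hypothetical).
    horizon: number of page views to simulate.
    Returns the number of times each ad was shown and the clicks it collected.
    """
    rng = random.Random(seed)
    n_arms = len(click_probs)
    counts = [0] * n_arms   # how often each ad was shown
    clicks = [0.0] * n_arms  # total reward collected by each ad

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Show every ad once before trusting any estimate.
            arm = t - 1
        else:
            # Empirical mean plus an exploration bonus that shrinks
            # as an ad is shown more often.
            arm = max(
                range(n_arms),
                key=lambda a: clicks[a] / counts[a]
                + math.sqrt(2.0 * math.log(t) / counts[a]),
            )
        reward = 1.0 if rng.random() < click_probs[arm] else 0.0
        counts[arm] += 1
        clicks[arm] += reward

    return counts, clicks


# Three hypothetical ads; the third one is clearly the best.
counts, clicks = ucb1([0.1, 0.3, 0.9], horizon=5000)
print(counts)
```

The exploration bonus makes rarely shown ads look temporarily attractive, so the rule keeps testing them while concentrating most page views on the ad with the highest estimated click rate.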

Learning outcomes of the learning unit

At the end of the class, the student should (i) be familiar with a broad class of techniques for solving optimal control problems, (ii) be able to use these techniques to solve optimal control problems and understand their main characteristics, and (iii) be able to read and understand a significant share of the scientific papers dedicated to this field of research, in particular those that relate to reinforcement learning based approaches (also known as sampling-based approaches) for solving optimal sequential decision-making problems.
Among the different techniques that will be covered by this class, we can mention:
a. Dynamic programming and policy search techniques for Markov Decision Processes (MDPs).
b. Reinforcement learning techniques for MDPs.
c. Techniques for solving the Exploration/Exploitation tradeoff, with a special focus on those that apply to multi-armed bandit problems.
d. Monte-Carlo Tree Search techniques for single-player and multi-player environments.
e. Multi-stage stochastic programming techniques for problems with large action spaces.
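To give a flavour of item (a), here is a minimal sketch of value iteration, a standard dynamic programming method for MDPs. The two-state MDP, its rewards, the discount factor and the function name `value_iteration` are assumptions made up for this example, not material from the class.

```python
def value_iteration(transitions, rewards, gamma=0.9, tol=1e-8):
    """Value iteration on a small finite MDP (illustrative sketch).

    transitions[s][a]: list of (probability, next_state) pairs.
    rewards[s][a]: immediate reward for taking action a in state s.
    Returns the approximate optimal values and a greedy policy.
    """
    n_states = len(transitions)

    def q_values(s, values):
        # One-step lookahead: reward plus discounted expected next value.
        return [
            rewards[s][a]
            + gamma * sum(p * values[s2] for p, s2 in transitions[s][a])
            for a in range(len(transitions[s]))
        ]

    values = [0.0] * n_states
    while True:
        new_values = [max(q_values(s, values)) for s in range(n_states)]
        delta = max(abs(n - o) for n, o in zip(new_values, values))
        values = new_values
        if delta < tol:
            break

    # Greedy policy with respect to the converged values.
    policy = []
    for s in range(n_states):
        q = q_values(s, values)
        policy.append(q.index(max(q)))
    return values, policy


# Hypothetical MDP: state 1 pays a reward of 1 for staying (action 0).
# From state 0, action 1 reaches state 1 with probability 0.8.
transitions = [
    [[(1.0, 0)], [(0.8, 1), (0.2, 0)]],  # state 0: stay, try to move
    [[(1.0, 1)], [(1.0, 0)]],            # state 1: stay, go back
]
rewards = [
    [0.0, 0.0],
    [1.0, 0.0],
]
values, policy = value_iteration(transitions, rewards)
print(values, policy)
```

In this toy problem the greedy policy moves from state 0 towards state 1 and then stays there, and the value of state 1 approaches 1 / (1 - 0.9) = 10.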
 

Prerequisite knowledge and skills

Basic knowledge of system theory, statistics, optimisation and machine learning.
Good coding skills are required.

Planned learning activities and teaching methods

The classes will include different parts: theoretical lectures, analyses of scientific articles, and exercises. The theoretical material will be taught mainly through flipped-classroom teaching.
Students will also have to work throughout the year on projects designed to implement the methodologies learned during the year on fairly simple examples.

Mode of delivery (face-to-face ; distance-learning)

Face-to-face learning

Recommended or required readings

The teaching material will be accessible on the class website, see: http://blogs.ulg.ac.be/damien-ernst/teaching/
 

Assessment methods and criteria

The evaluation consists of two parts: a continuous assessment during the year, which will count for 50% of the points, and an oral examination at the end of the year.

Work placement(s)

Motivated students may do a research internship in this exciting field of artificial intelligence.

Organizational remarks

Contacts

See: http://blogs.ulg.ac.be/damien-ernst/contact/