INFO8003-1

Duration

25h Th, 10h Pr, 45h Proj.

Number of credits

	Master of Science (MSc) in Data Science	5 credits
	Master of Science (MSc) in Electrical Engineering	5 credits
	Master of Science (MSc) in Computer Science and Engineering	5 credits
	Master of Science (MSc) in Computer Science and Engineering (double degree programme with HEC)	5 credits
	Master of Science (MSc) in Data Science and Engineering	5 credits
	Master of Science (MSc) in Computer Science	5 credits
	Master of Science (MSc) in Computer Science (joint-degree programme with HEC)	5 credits

Lecturer

Damien Ernst

Language(s) of instruction

English language

Organisation and examination

Teaching in the second semester

Schedule

Schedule online

Prerequisite and corequisite units

Prerequisite or corequisite units are presented within each program

Learning unit contents

Many decision-making problems can be formalised as problems in which one needs to maximize a numerical reward (or, equivalently, minimize a cost) while interacting with an environment that is stochastic or (partially) unknown, exhibits little structure (e.g., it is neither linear nor convex), has a sequential nature (e.g., a sequence of decisions must be taken to reach an objective) and/or is adversarial (e.g., an opponent takes decisions so as to minimize your payoff, as is the case when you play poker).
Typical examples of such problems are:

The design of artificial intelligences able to learn to play computer games,
The placement of advertisements on webpages to maximize the number of clicks,
Controlling a rocket so as to safely reach a target with minimum fuel costs,
The synthesis of winning strategies for trading on the stock market,
The design of artificial intelligences for autonomous robots,
The design of clinical trials.

The goal of this class is to teach techniques for taking optimal decisions in such complex problems. These techniques draw on results from system theory, probability theory, information theory, supervised learning, and linear and convex optimisation.

Learning outcomes of the learning unit

At the end of the class, the student should (i) be familiar with a broad class of techniques for solving optimal control problems, (ii) be able to use these techniques to solve optimal control problems and understand their main characteristics, and (iii) be able to read and understand a significant share of the scientific papers dedicated to this field of research, in particular those that relate to reinforcement-learning-based approaches (also known as sampling-based approaches) for solving optimal sequential decision-making problems.

Among the different techniques that will be covered by this class, we can mention:

a. Dynamic programming and policy search techniques for Markov Decision Processes (MDPs).

b. Reinforcement learning techniques for MDPs.

c. Techniques for addressing the exploration/exploitation trade-off, with a special focus on those that apply to multi-armed bandit problems.

d. Monte-Carlo Tree Search techniques for single-player and multi-player environments.

e. Multi-stage stochastic programming techniques for problems with large action spaces.
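
As an illustration of the first family of techniques (item a), here is a minimal sketch of value iteration, a standard dynamic programming method for MDPs. The two-state model, its transition probabilities, its rewards and the discount factor are hypothetical and chosen only for this example. They are not taken from the course material.

```python
# Hypothetical 2-state, 2-action MDP (illustrative numbers only).
# P[s][a] lists (probability, next_state, reward) outcomes of taking a in s.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(0.8, 1, 1.0), (0.2, 0, 0.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
GAMMA = 0.9  # assumed discount factor


def q_value(P, V, s, a, gamma):
    """Expected discounted return of taking a in s, then acting according to V."""
    return sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])


def value_iteration(P, gamma=GAMMA, tol=1e-8):
    """Apply the Bellman optimality backup until the values stop changing."""
    V = {s: 0.0 for s in P}
    while True:
        new_V = {s: max(q_value(P, V, s, a, gamma) for a in P[s]) for s in P}
        delta = max(abs(new_V[s] - V[s]) for s in P)
        V = new_V
        if delta < tol:
            break
    # Greedy policy with respect to the converged value function.
    policy = {s: max(P[s], key=lambda a: q_value(P, V, s, a, gamma)) for s in P}
    return V, policy


V, policy = value_iteration(P)
print(policy)
```

On this toy model the greedy policy selects action 1 in both states. The sketch assumes the transition model is fully known. Reinforcement learning techniques (item b) address the case where it must be estimated from samples.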

This course contributes to the learning outcomes I.2, II.1, II.2, II.3, III.1, IV.1, IV.3, VI.1, VI.2, VI.3, VII.2, VII.5 of the MSc in electrical engineering.

This course contributes to the learning outcomes I.2, II.1, II.2, II.3, III.1, IV.1, VI.1, VI.2, VI.3, VII.2, VII.5 of the MSc in computer science and engineering.

Prerequisite knowledge and skills

Basic knowledge of system theory, statistics, optimisation and machine learning.
Good coding skills are required.

Planned learning activities and teaching methods

The classes will include several components: theoretical lectures, analyses of scientific articles and exercises. The theoretical material will be taught mainly through a flipped-classroom format.
Students will also work throughout the year on projects in which they apply the methodologies learned to fairly simple examples.

Mode of delivery (face to face, distance learning, hybrid learning)

Face-to-face learning

Work placement(s)

Possibility for motivated students to do a research internship in this exciting field of artificial intelligence.

Organisational remarks and main changes to the course

Contacts

See: http://blogs.ulg.ac.be/damien-ernst/contact/

Optimal decision making for complex problems