2022-2023 / MATH2021-1

High-dimensional statistics

Duration

30h Th, 15h Pr, 30h Proj.

Number of credits

 Master of Science (MSc) in Data Science: 5 credits
 Master of Science (MSc) in Data Science and Engineering: 5 credits

Lecturer

Gentiane Haesbroeck

Language(s) of instruction

English language

Organisation and examination

Teaching in the first semester, review in January

Schedule

Schedule online

Prerequisite and corequisite units

Prerequisite or corequisite units are presented within each program

Learning unit contents

The course is devoted to the following themes:

- Exploratory data analysis

- Dimension reduction techniques: Principal Component Analysis, Multidimensional Scaling, t-SNE
- Multivariate estimation, with a particular emphasis on the estimation of the covariance matrix (classical technique under normality, penalized version and robust version)
- Multiple regression and generalized linear models
- Supervised classification: discriminant analysis and logistic regression
- Independent Component Analysis

Learning outcomes of the learning unit

The student will gain sufficient knowledge to select the appropriate multivariate statistical technique for a given problem, for example to reduce its dimension or to construct classification rules.

Prerequisite knowledge and skills

A strong background in univariate statistics is required. Moreover, even though the mathematical justifications are not developed in detail, the students must be familiar with the basic notions of linear algebra (vectors, matrices, determinants, eigenvalues and eigenvectors, etc.).

Planned learning activities and teaching methods

The theory is presented in ex cathedra lectures. During the practicals, the students work on their own before a general discussion of the results and approaches. The statistical software R must be used in this course.

Mode of delivery (face to face, distance learning, hybrid learning)

Blended learning


Additional information:

The course is mainly taught face-to-face, but two lectures will be given via videos (see the online calendar of the lectures as well as the online schedule on eCampus).


Recommended or required readings

There are no lecture notes. The slides will be available on eCampus. Moreover, for each theme, a reference book will be indicated as suggested additional reading.
 

Exam(s) in session

Any session

- In-person

Written exam (open-ended questions)

Written work / report


Additional information:

The final grade is a weighted mean of the grades obtained for:

- the personal projects given during the semester: the statement of the first project will be available on 21/09 (deadline for submission: 26/10); the statement of the second project will be available on 9/11 (deadline: 14/12)

- the written exam, consisting of a data analysis and the detailed analysis/use of a specific technique taught in the lectures.

Work placement(s)

Organizational remarks

The lectures are taught in English.

The lecture room does not provide podcast equipment by default, so the lectures given face-to-face will not be available in any other form.

Following feedback in the EVALENS survey about the duration of the exam, the professor wishes to emphasize that all students are expected to have used every technique taught during the semester at least once (this is the goal of the practicals) before coming to the exam. On the day of the exam, all the R commands must be readily available, so that they can be slightly adapted to new data or to a new situation.

Contacts

Lecturer: Gentiane HAESBROECK, Institute of Mathematics (B37), g.haesbroeck@ulg.ac.be

Association of one or more MOOCs