2021-2022 / SDOC0030-1

Multivariate statistics

Duration

20h Th

Number of credits

Doctoral training in sciences (BMCB): 3 credits

Lecturer

Gentiane Haesbroeck

Language(s) of instruction

French language

Organisation and examination

Teaching in the second semester

Schedule

Schedule online

Prerequisite and corequisite units

Prerequisite or corequisite units are presented within each program

Learning unit contents

The following four themes of multivariate statistics are developed in the course:
Theme 1: Mean vectors, covariance matrices, normal distribution and classical hypothesis tests (tests on the mean vectors, homoscedasticity test, normality test)
Theme 2: Dimension reduction techniques (Principal Component Analysis and t-SNE)
Theme 3: Clustering (unsupervised classification)
Theme 4: Multiple regression and some of its generalizations
 
The techniques are explained without dwelling on the mathematical justifications.

Learning outcomes of the learning unit

At the end of the course, the PhD candidates are expected to be able to
- determine whether one of the taught methods is appropriate for analysing a multivariate data set in their own research field.
- apply the appropriate technique.
- interpret the results of the analyses.
The PhD candidates should also be able to detect situations in which the techniques cannot be applied because their assumptions are violated (for example, lack of normality or lack of independence).

Prerequisite knowledge and skills

The students must have attended a basic course on descriptive and inferential statistics. The concepts of summary statistics, normal distribution and hypothesis tests are assumed to be known and will be used without further explanation.
The methods are presented without emphasizing the mathematical justifications. Nevertheless, the students must have some background in basic linear algebra (vectors, matrices, orthogonal projections, determinants and inverses).
Finally, as far as the software R is concerned, the basics are briefly introduced in the provided lecture materials.

Planned learning activities and teaching methods

12h of ex-cathedra lectures (face-to-face or by means of videos) and about 8h of self-study with the software R.

Mode of delivery (face to face, distance learning, hybrid learning)

Blended learning


Additional information:

This year, the course will be given face-to-face at the Institute of Mathematics (B37, Sart-Tilman) from Monday 24 January to Thursday 27 January (from 9:00 to 10:30 and from 11:00 to 12:30). If face-to-face teaching is not permitted, the lectures will be replaced by videos explaining the content of the provided slides, together with discussions in virtual classes.
In addition to the lectures (or the videos and virtual classes), written materials will be provided online on eCampus to help the participants apply the taught techniques with the statistical software R through self-study. The scripts will be provided, along with detailed explanations of the inputs and outputs of the procedures.
The course is taught in English.

Recommended or required readings

There are no lecture notes, but the slides used during the lectures will be made available online on eCampus in January.
The participants might also find the following references useful:
Applied Multivariate Statistical Analysis, R. A. Johnson and D. W. Wichern, 6th edition, 2014.
Applied Multivariate Statistics with R, D. Zelterman, Springer.
 

Assessment methods and criteria

Exam(s) in session

Any session

- Remote

written exam

Other: Attendance form


Additional information:

Most students registered for this third-cycle course will take it as part of their PhD training. No evaluation will be organised (despite the "default" information given above), but an official attendance form will be sent to the participants by the research administration.

 

Work placement(s)

Organizational remarks

The course is included in the PhD training folder made by the Administration of Research and Development.
The students who wish to follow the course must register via the Administration of Research and Development. 
Officially registered students will then be given access to the doctoral course SDOC0030 on eCampus, where they will find all the necessary documents (slides, scripts, ...) and will have access to the virtual classes should face-to-face teaching not be possible.
The course is aimed primarily at PhD candidates who are in their first two years of research. It is a "generalist" course, with the objective of showing how the most basic techniques work, without covering the refinements used in common application fields (medicine, agronomy, ...).
To set expectations for participation in the course, it is important to note that the teaching cannot be combined with a kind of "consultancy service" consisting of analysing the concrete statistical problems encountered by the participants in their own research. There are too many participants and too many different problems to do that efficiently!
NB: An attendance form will be signed each half-day of the course in order to monitor attendance and to help the professor write the attendance forms, where needed.

Contacts

G. HAESBROECK, Institute of Mathematics, Building B37, room 0/60, tel: 04/366-95-94, email: G.Haesbroeck@uliege.be