Duration
20h Th
Number of credits
| Doctoral training in sciences (BBMC) | 3 credits |
Lecturer
Language(s) of instruction
French language
Organisation and examination
Teaching in the second semester
Schedule
Prerequisite and corequisite units
Prerequisite or corequisite units are presented within each program
Learning unit contents
The following four themes of multivariate statistics are developed in the course:
Theme 1: Mean vectors, covariance matrices, normal distribution and classical hypothesis tests (tests on the mean vectors, homoscedasticity test, normality test)
Theme 2: Exploratory multivariate analysis by means of Principal Component Analysis and clustering
Theme 3: Discrimination
Theme 4: Multiple regression and some of its generalizations
The techniques are explained without dwelling on the mathematical justifications.
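As a minimal illustration of the first objects named in Theme 1, the sketch below computes a mean vector and an unbiased sample covariance matrix. It is not taken from the course materials: the course itself uses R, Python is used here only for illustration, and the toy data and function names are hypothetical.

```python
from statistics import fmean


def mean_vector(rows):
    """Column-wise means of a data set given as a list of observations."""
    return [fmean(col) for col in zip(*rows)]


def covariance_matrix(rows):
    """Unbiased sample covariance matrix (divisor n - 1)."""
    n = len(rows)
    means = mean_vector(rows)
    # Centre every observation on the mean vector.
    centered = [[x - m for x, m in zip(row, means)] for row in rows]
    p = len(means)
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


# Hypothetical toy data: 4 observations of 2 variables.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
xbar = mean_vector(data)
S = covariance_matrix(data)
print(xbar)
print(S)
```

The diagonal of S holds the variances of the individual variables and the off-diagonal entries hold their covariances, so S is symmetric by construction.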
Learning outcomes of the learning unit
At the end of the course, the PhD candidates are expected to be able to:
- determine whether one of the methods taught would be appropriate for analysing a multivariate data set in their own research field.
- apply the appropriate technique.
- interpret the results of the analyses.
The PhD candidates should also be able to detect situations in which the techniques cannot be applied because their assumptions are violated (for example, lack of normality or lack of independence).
Prerequisite knowledge and skills
The students must have attended a basic course on descriptive and inferential statistics. The concepts of summary statistics, normal distribution and hypothesis tests are assumed to be known and will be used without further explanation.
The methods are presented without emphasizing the mathematical justifications. Nevertheless, the students must have some background in basic linear algebra (vectors, matrices, orthogonal projection, determinants and inverses).
Finally, as far as the R software is concerned, the basics are briefly introduced in the provided lecture materials.
Planned learning activities and teaching methods
12 h of ex-cathedra lectures and about 8 h of self-study with the R software.
Mode of delivery (face to face, distance learning, hybrid learning)
This year, the course will be given face-to-face at the Institute of Mathematics (B37, Sart-Tilman) from Monday 25 January to Thursday 28 January (from 9:00 to 10:30 and from 11:00 to 12:30). If face-to-face teaching is not permitted, the lectures will be replaced by videos explaining the content of the provided slides and by discussions in virtual classes.
In addition to the ex-cathedra lectures (or the videos and virtual classes), written materials will be provided online on eCampus to help the participants apply the techniques taught with the statistical software R, in self-study. The scripts will be provided, as well as detailed explanations of the inputs and outputs of the procedures.
The course is taught in English by default, but the professor will switch to French if all participants are French speakers.
Organisational adjustments related to the current health context
Recommended or required readings
There are no lecture notes, but the slides used during the lectures will be made available online on eCampus in January.
The participants might also find the following references useful:
Applied Multivariate Statistical Analysis, R. A. Johnson and D. W. Wichern, 6th edition, 2014.
Applied Multivariate Statistics with R, D. Zelterman, Springer.
Assessment methods and criteria
Below you will find information on the evaluation methods planned for in-person and remote exams as well as those planned for hybrid sessions. Depending on how the health crisis evolves, the chosen method will be communicated to you no later than one month before the start of the exam session.
Most students registered for this third-cycle course take it as part of their PhD training. Depending on the constraints imposed by the corresponding PhD colleges, the following possibilities are offered to the students:
- Simple attendance form
- Evaluation based on an individual data analysis assignment that applies the techniques taught during the course, using the software R or any other statistical software. It is not the use of R that will be evaluated but the correct application of the techniques and the quality of the interpretation of the results.
Work placement(s)
Organizational remarks
The course is included in the PhD training folder compiled by the Administration of Research and Development.
The students who wish to follow the course must register via the Administration of Research and Development.
Officially registered students will then be given access to the doctoral course SDOC0030 on eCampus, where they will find all the necessary documents (slides, scripts, ...), and will have access to the virtual classes should face-to-face teaching prove problematic.
The course is aimed primarily at PhD candidates who are in their first two years of research. It is a "generalist" course, with the objective of showing how the most basic techniques work, without covering the refinements used in common application fields (medicine, agronomy...).
To set expectations for participation in the course, it is important to note that the teaching cannot be combined with a kind of "consultancy service" consisting of analysing the concrete statistical problems encountered by the participants in their own research. There are too many participants and too many different problems to do that in an efficient way!
NB: An attendance form will be signed each half-day of the course in order to monitor attendance and to help the professor write any attendance forms required.
Contacts
G. HAESBROECK, Institute of Mathematics, Building B37, room 0/60, tel: 04/366-95-94, email: G.Haesbroeck@uliege.be