2020-2021 / STAT0750-1

Multivariate statistical analysis (software R)

Duration

10h Th, 10h Pr

Number of credits

Bachelor in biology: 3 credits
Bachelor in geography: general, 2 credits

Lecturer

Arnout Van Messem

Language(s) of instruction

French language

Organisation and examination

Teaching in the second semester

Schedule

Schedule online

Prerequisite and corequisite units

Prerequisite or corequisite units are presented within each program

Learning unit contents

The course is a general introduction to the most commonly used methods of multivariate statistics (i.e. the simultaneous study of several variables) in biology. The course covers the following chapters:
- Graphical display and statistical summary of multivariate data
- Multivariate exploratory techniques: principal component analysis, clustering, principal coordinates analysis
- Multiple regression and generalized linear models
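
As an illustration of the first two chapters, here is a minimal sketch in R of a statistical summary followed by a principal component analysis. It assumes the built-in iris data set as a stand-in, since the course data are not specified here. The use of prcomp on standardised variables is likewise an assumption and not necessarily the approach taken in the course scripts.

```r
# Illustrative only: iris is a built-in R data set, not course material.
X <- iris[, 1:4]                      # four numeric measurements per flower

# Graphical display and statistical summary of the multivariate data
summary(X)
pairs(X, col = iris$Species)          # scatterplot matrix, coloured by species

# Principal component analysis on centred and scaled variables
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Share of the total variance carried by each principal component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))

biplot(pca)                           # observations and variables on PC1/PC2
```

Scaling the variables before the analysis gives each of them the same weight; without it, variables measured on larger scales would dominate the first components.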

Learning outcomes of the learning unit

The methods of multivariate data analysis are taught using a pragmatic approach. At the end of the course, the student should be capable of
- defining a multivariate problem,
- analysing the data,
- interpreting the results.
The student should also be aware of the limitations of the methods.

Prerequisite knowledge and skills

Students must have attended a basic course on descriptive and inferential statistics. The concepts of normal distribution, confidence interval and hypothesis test are assumed to be known. Basic knowledge of the software R is also expected.
The methods are presented without emphasis on their mathematical justification. Nevertheless, students must have the following background in mathematics: basic linear algebra (vectors and matrices, including the notions of determinant and inverse) and linear, exponential and logarithmic functions.

Planned learning activities and teaching methods

In addition to the ex-cathedra lectures, which focus on theory, the students will be asked to apply the techniques following the learning process described below:
- Personal preparation at home in order to become familiar with the script prepared by the professor and the assistants;
- Discussion of the script and the interpretation of the results;
- Group discussion of data analyses.

Mode of delivery (face to face, distance learning, hybrid learning)

The course comprises 20 hours of face-to-face teaching, 10 of which are devoted to ex-cathedra lectures on the theory. During the 10 hours of practicals, the students will first be invited to ask any questions they have about the scripts and the interpretation of the results. They will then work in small groups to analyse some data. A brief solution will be presented at the end of the practical, before being posted online.

Organisational adjustments related to the current health context

Distance-learning through prerecorded videos and regular Q&A sessions

Recommended or required readings

There are no lecture notes, but the slides used for the lectures will be available on eCampus in advance. The R scripts and the statements of the data analyses (and their solutions) will also be posted online.
The following textbook (available online from the website of the ULiège libraries) will be used for most parts of the course (PCA, association measures and principal coordinates analysis, multiple regression and generalized regression):
A.F. Zuur, E.N. Ieno and G.M. Smith, Analysing Ecological Data, Springer, Statistics for Biology and Health series

Assessment methods and criteria

Below you will find information on the evaluation methods planned for in-person and remote exams as well as those planned for hybrid sessions. Depending on how the health crisis evolves, the chosen method will be communicated to you no later than one month before the start of the exam session.

Any session:

- In-person

written exam

- Remote

written exam AND written work

- Hybrid evaluation

in-person preferred


Additional information:

The examination consists of analysing some data with the software R. Marking will focus on the interpretation of the results and the appropriate use of the techniques, but some attention will also be given to the use of the software R.
During the exam, the students may use either their own laptop or a laptop from the computer room of the Maths Department.

Work placement(s)

Organizational remarks

The course is held in the time slots indicated on Celcat. Two groups will be formed for the practical sessions. Students in the group that has laptops available will have the practical session in a regular classroom, while the others will work in the computer room of the Maths Department.

Contacts

Professor: Arnout Van Messem
 
Assistants: Carole Baum, Jimmy Keydener