2018-2019 / STAT0750-1

Multivariate statistical analysis (software R)

Duration

10h Th, 10h Pr

Number of credits

 Master in bio-informatics and modelling (120 ECTS): 2 credits
 Master in biology of organisms and ecology (120 ECTS): 2 credits

Lecturer

Gentiane Haesbroeck

Language(s) of instruction

French language

Organisation and examination

Teaching in the first semester, examination in January

Schedule

Schedule online

Prerequisite and corequisite units

Prerequisite or corequisite units are presented within each program

Learning unit contents

The course is a general introduction to the multivariate statistical methods most commonly used in biology (i.e. methods for studying several variables simultaneously). The course covers the following chapters:
- Graphical display of multivariate data
- Multivariate tests
- Multivariate exploratory techniques: principal component analysis, clustering, principal coordinates analysis
- Multiple regression and generalized linear models

Learning outcomes of the learning unit

The methods of multivariate data analysis are taught using a pragmatic approach. At the end of the course, the student should be capable of
- defining a multivariate problem,
- analysing the data,
- interpreting the results.
The student should also be aware of the limits of applicability of the methods.

Prerequisite knowledge and skills

Students must have attended a basic course on descriptive and inferential statistics. The concepts of normal distribution, confidence interval and hypothesis test are assumed to be known. Basic knowledge of the software R is also expected.
The methods are presented without emphasizing their mathematical justification. Nevertheless, students must have the following background in mathematics: basic linear algebra (vectors and matrices, including the notions of determinant and inverse) and linear, exponential and logarithmic functions.

Planned learning activities and teaching methods

Alongside the ex-cathedra lectures, which focus on theory, students will be asked to apply the techniques through the learning process described below:
- Personal preparation at home in order to become familiar with the script written by the professor and her assistants;
- Discussion of the script and the interpretation of the results in the computer room of the mathematics department;
- Personal homework (data analyses to perform on one's own; solutions are provided afterwards to allow self-assessment).

Mode of delivery (face-to-face; distance-learning)

The course comprises 20 hours of face-to-face teaching. The lectures are divided into two parts: theory (ex-cathedra) and practicals in a computer room. The lecture schedule will be posted online on eCampus at the beginning of the academic year. Data analyses that are not completed during the practicals must be completed at home by the student.

Recommended or required readings

There are no lecture notes, but the slides used for the lectures will be available on eCampus in advance. The R scripts and the exercise sheets (with their solutions) will also be posted online as the lectures progress.
The following textbook (available online from the website of the ULiège libraries) will be used for most parts of the course (PCA, association measures and principal coordinates analysis, multiple regression and generalized regression):
A.F. Zuur, E.N. Ieno and G.M. Smith, Analysing Ecological Data, Springer (series: Statistics for Biology and Health)

Assessment methods and criteria

Written examination in the computer room.

Work placement(s)

Organizational remarks

The course will take place on Tuesday mornings of the first quadrimester in Building B37.

Contacts

Lecturer
Gentiane Haesbroeck, Département de Mathématique (B37, office 0/60), Tel.: 04/366.95.94, Email: G.Haesbroeck@ulg.ac.be
Assistant
Marie Ernst, Tel.: 04/366.94.02, Email: m.ernst@ulg.ac.be