2023-2024 / SPAT0086-1

Advanced data analysis in python and introduction to machine learning

Duration

15h Th, 25h Pr

Number of credits

Master in space sciences (120 ECTS): 4 credits

Lecturer

Valentin Christiaens, Maxime Fays, Guy Munhoven, Dominique Sluse

Coordinator

Maxime Fays

Language(s) of instruction

English language

Organisation and examination

Teaching in the second semester

Schedule

Schedule online

Prerequisite and corequisite units

Prerequisite or corequisite units are presented within each program

Learning unit contents

This course builds on, and expands, the topics covered in SPAT0002. With this course, the student will sharpen their Python programming skills and become familiar with some of the most popular techniques used for data analysis in space sciences. Specifically, the focus will be on techniques related to machine learning, time series and image processing. The course will be supported by slides and Jupyter notebooks which will provide a concise explanation of the scrutinized technique(s), as well as one (or several) concrete exercises generally inspired by real scientific problems.

The lectures will be divided into three main parts:

  • Introduction to Machine Learning: this section covers important machine learning concepts (e.g., bias, underfitting/overfitting, cross-validation, confusion matrices, ...) and provides an overview of important methods used for supervised and unsupervised learning. In particular, some of the most popular algorithms for dimensionality reduction, classification, clustering and regression will be presented and experimented with.
  • Advanced data analysis: this section extends the analysis of time series introduced in SPAT0002 (analysis of periodic and non-periodic signals), and covers some basic concepts of image processing (denoising, filtering, deconvolution).
  • Advanced usage of Python: this includes the use of classes and getting familiar with development tools.
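
As an illustration of two of the machine learning concepts listed above (confusion matrices and cross-validation), here is a minimal sketch in plain Python. It is not taken from the course notebooks: the function names `confusion_matrix` and `k_fold_indices` and the toy star/galaxy labels are assumptions made up for this example. In practice, libraries such as scikit-learn provide equivalent, more complete tools.

```python
from collections import Counter


def confusion_matrix(y_true, y_pred, labels):
    """Return a nested list m where m[i][j] counts the samples whose true
    class is labels[i] and whose predicted class is labels[j]."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous (train, validation) index pairs.

    Hypothetical helper: no shuffling is done, so the data are assumed to be
    in random order already.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        stop = start + size
        validation = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        folds.append((train, validation))
        start = stop
    return folds


# Toy labels, invented for illustration only.
y_true = ["star", "star", "galaxy", "galaxy", "galaxy", "star"]
y_pred = ["star", "galaxy", "galaxy", "galaxy", "star", "star"]

cm = confusion_matrix(y_true, y_pred, labels=["star", "galaxy"])
# Accuracy is the fraction of samples on the diagonal of the matrix.
accuracy = sum(cm[i][i] for i in range(len(cm))) / len(y_true)
# Each sample appears in exactly one validation fold.
folds = k_fold_indices(len(y_true), k=3)
```

Here the rows of `cm` correspond to the true classes and the columns to the predicted classes, so off-diagonal entries count misclassifications. In a cross-validation loop, a model would be fitted on each `train` index set and scored on the matching `validation` set.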

Learning outcomes of the learning unit

  • Proficiency with the important concepts and "jargon" used in machine learning (bias, overfitting, confusion matrix, ...).
  • Use and apply standard machine learning techniques to data sets, and identify which technique is best suited to a given data set.
  • Master the tools needed to manipulate, clean and vet large data sets.
  • Use advanced techniques for manipulating and interpreting scientific signals present in images and in time series.
  • Build solid know-how and understanding of Python programming to explore, identify, and understand numerical solutions to problems not taught during the lecture.
  • Use adequate tools for project development and Python programming.

Prerequisite knowledge and skills

SPAT0002-1 or similar course.

Planned learning activities and teaching methods

The course materials include Jupyter notebooks (http://jupyter.org) that contain, in addition to the theory, examples and small interactive exercises that give the students direct experience of the methods and concepts presented during the lecture. The practical classes are dedicated to the study of more advanced problems with the help of Python libraries that implement several of the algorithms taught during the lecture.

Mode of delivery (face to face, distance learning, hybrid learning)

Face-to-face course

Recommended or required readings

The lecture is based on material from various sources whose references will be given at the end of each notebook.

In addition, the following books will also be used

 

Exam(s) in session

Any session

- In-person

Written exam (open-ended questions) AND oral exam

- Remote

Written exam (open-ended questions) AND oral exam


Additional information:

Exams will, whenever possible, be organized face-to-face, on site.

Work placement(s)

There is no internship related to this course.

Organisational remarks and main changes to the course

Lectures will be given in weekly 4h blocks during the spring term.

Contacts

Dominique Sluse
University of Liège
Institute of Astrophysics and Geophysics (B5c build.)
17, allée du Six-Août
B-4000 Liège
Phone: (+32) (4) 366 9797  (D. Sluse)

Association of one or more MOOCs

There is no MOOC associated with this course.

Items online

GitHub repository where the notebooks are posted:
https://github.com/SPAT0086

The "Ongoing" repository contains the notebooks of the ongoing academic year.