10h theory, 180h project.
Number of credits
|Master in data science (120 ECTS)||7 credits|
|Master in data science and engineering (120 ECTS)||7 credits|
Language(s) of instruction
Organisation and examination
All year long
Prerequisite and corequisite units
Prerequisite or corequisite units are presented within each program
Learning unit contents
The purpose of this course/project is for the students to apply knowledge acquired in the Data
Science and Engineering program to a project involving actual data in a realistic setting.
During the project, the students will engage in the entire process of solving a real-world data
science problem: formalizing the problem, collecting and processing data, applying appropriate
analytical methods and algorithms, deploying a solution and presenting the results of their study.
The course will offer a number of seminars given by industry experts and covering specific topics relevant for big data solutions: large-scale data storage systems, distributed computing frameworks, data science software libraries, specialized machine learning and statistics topics.
The students will work in groups to carry out a practical project over a big dataset, aiming at using the available software and hardware systems for retrieving a specific kind of information from the dataset. The project will be carried out within modern distributed computing and storage environments, using state-of-the-art analytical methods.
Learning outcomes of the learning unit
The project aims at developing the students' ability to carry out a realistic, complex and incompletely defined data science project from the conceptual to the operational phase.
The students will also actively learn and practice project management, including project and team leadership, reporting, oral presentations and defence, thereby improving their autonomy, their ability to work efficiently in teams, and their communication and writing skills.
Prerequisite knowledge and skills
Planned learning activities and teaching methods
- Seminars by local and external speakers;
- Monthly project reviews, including oral presentations and short reports;
- Feedback on technical progress and project management;
- Writing of a final report;
- Defence of the project.
Mode of delivery (face-to-face; distance learning)
- Face-to-face seminars;
- Monthly review meetings;
- The project is mainly carried out remotely.
Recommended or required readings
Slides used during the seminars.
Assessment methods and criteria
The evaluation will be based on:
- the intermediate review meetings (progress achieved, quality of project management) [30%],
- the quality of the final report [15%],
- the quality of the final oral defence [15%],
- the overall solution [40%], where the originality, methodology, clarity, reproducibility and technological choices of the solution will be mainly assessed.
Typically, grades are assigned to the whole group. However, in particular cases (e.g., when there is evidence that a group member has not participated enough in the project), grades may be assigned individually, reflecting the personal involvement of each group member. Finally, as the course is essentially a single team project, no resit will be provided. This means that students who fail in June will not be given a second chance to improve their grade.
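As an illustration of how the four weights listed above combine, the following minimal sketch computes a weighted group grade. The weights are those stated in the criteria; the 0-20 scale, the component names and the example scores are assumptions made for illustration only and do not describe an official grading procedure.

```python
# Weights taken from the assessment criteria listed above.
WEIGHTS = {
    "reviews": 0.30,   # intermediate review meetings
    "report": 0.15,    # final report
    "defence": 0.15,   # final oral defence
    "solution": 0.40,  # overall solution
}


def final_grade(scores: dict[str, float]) -> float:
    """Return the weighted average of the component scores.

    All scores are assumed to be on the same scale (here 0-20, an assumption).
    """
    missing = WEIGHTS.keys() - scores.keys()
    if missing:
        raise ValueError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical component scores for one group, on an assumed 0-20 scale.
example = {"reviews": 14.0, "report": 12.0, "defence": 16.0, "solution": 15.0}
print(round(final_grade(example), 2))  # prints 14.4
```

With these hypothetical scores, the overall solution contributes 6.0 of the 14.4 points, which reflects its 40% weight.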
- Teams of 3 students.
- Presence at the seminars and intermediate reviews is mandatory.
- The final report must be submitted by mid-May.
- The defences will be scheduled mid-May.
- Intermediate deadlines will be announced throughout the year.
- Gilles Louppe, firstname.lastname@example.org
- Pierre Geurts, email@example.com
- Bertrand Cornélusse, firstname.lastname@example.org