2021-2022 / PROJ0016-1

Big data project


25h Th, 180h Proj.

Number of credits

 Master in data science, with a focus: 10 credits
 Master: civil engineer in data science, with a focus: 10 credits


Bertrand Cornélusse, Pierre Geurts, Gilles Louppe

Language(s) of instruction

English

Organisation and assessment

Taught throughout the full academic year


Schedule available online

Prerequisite and corequisite units

Prerequisite and corequisite units are listed within each programme

Course contents

The purpose of this course/project is for the students to apply the knowledge acquired in the Data Science and Engineering program to a project involving actual data in a realistic setting. During the project, the students will engage in the entire process of solving a real-world data science problem: formalizing the problem, collecting and processing data, applying appropriate analytical methods and algorithms, deploying a solution and presenting the results of their study.
The students will work in groups on a practical project involving a large dataset, using the available software and hardware systems to extract a specific kind of information from it. The project will be carried out within modern distributed computing and storage environments, using state-of-the-art analytical methods.

Learning outcomes of the course

The project aims at developing the students' ability to carry out a realistic, complex and incompletely defined data science project from the conceptual to the operational phase.

The students will also actively learn and practise project management, including project and team leadership, reporting, oral presentations and defence, thereby improving their autonomy, their ability to work efficiently in teams, and their communication and writing skills.

This course contributes to learning outcomes I.1, I.2, I.3, II.1, II.2, II.3, III.1, III.2, III.3, III.4, IV.1, IV.2, IV.3, IV.4, VI.1, VI.2, VI.3, VI.4, VII.1, VII.2, VII.3, VII.4, VII.5, VII.6 of the civil engineering programme in data science.

Prerequisite knowledge and skills

Planned learning activities and teaching methods

  • Regular project reviews, including oral presentations and short reports;
  • Feedback on technical progress and project management;
  • Writing of a final report;
  • Defence of the project.

Mode of delivery (face-to-face, distance learning, hybrid)

A combination of face-to-face and remote learning activities

Additional information:

  • Monthly review meetings;
  • The project is mainly carried out remotely.

Recommended or required readings and course notes

Assessment methods and criteria

Exam(s) in session

All sessions

- In person

oral assessment

Written work / report

Continuous assessment

Additional information:

The evaluation will be based on:

  • the intermediate review meetings (progress achieved, quality of project management);
  • the quality of the final report, the quality of the final oral defence, and the overall solution, which will be assessed mainly on its originality, methodology, clarity, reproducibility and technological choices.
The project defence consists of an oral presentation followed by a question-and-answer session. The final grade takes into account the amount and quality of the work achieved, the quality of the written report and of the oral presentation, and the relevance of the answers provided.
Typically, grades are assigned to the whole group. However, in particular cases (e.g., when there is evidence that a group member has not participated sufficiently in the project), grades may be assigned individually to reflect the personal involvement of each group member. Finally, as the course is essentially a single team project, no resit will be offered: students who fail in June will not be given a second chance to improve their grade.


Remarques organisationnelles

  • Teams of up to 4 students.
  • Presence at the intermediate reviews is mandatory.
  • The final report must be submitted by mid-May.
  • The defences will be scheduled in mid-May.
  • Intermediate deadlines will be announced throughout the year.



Teaching assistants: Arnaud Delaunoy, Selmane Dakir.
Materials can be found on eCampus.