18/06/2026 - 9h00 - Sala calcolo - CU033
02/07/2026 - 9h00 - Sala calcolo - CU033
16/07/2026 - 9h00 - Sala calcolo - CU033
28/09/2026 - 9h00 - Aula Informatizzata CU010
17/11/2026 (S) - 9h00 - Aula Informatizzata CU010
21/01/2027 - 9h00 - Aula Informatizzata CU010
Achievement of the course learning outcomes (LOs) will be evaluated through two components.
The first component is a data science project discussion. Students who attend the course are expected to discuss their project progressively during the course and, where possible, before the written exam. Students whose project does not meet the required standard will discuss it immediately after the written exam. The project discussion is designed to assess the achievement of the Machine Learning LOs, as well as students' critical thinking in the choice of a dataset, its pre-processing, and the use of data across all phases of a machine learning protocol.
The second component is a written exam, which assesses the achievement of the Python programming learning outcomes.
Each component is described in detail below.
You will carry out a data science project in which you choose a dataset, analyse it, and apply a complete machine learning (ML) workflow. The project must be developed in a Colab or Jupyter notebook and uploaded to a dedicated Google Drive folder before the discussion with the teacher.
1. Dataset selection
Choose a dataset you are genuinely interested in exploring. You may use Kaggle or any other suitable resource. Before starting your analysis, share your choice with the teacher to confirm that the dataset is appropriate in terms of complexity and scope.
2. Notebook structure and content
Your notebook must include the following sections, in order:
Objective — clearly state the aim of your study and explain why you chose this particular dataset
Dataset description — provide a thorough analytical description of the data (variables, size, source, known limitations, etc.)
Pre-processing — describe and justify all steps taken to clean, transform, and organise the data
Exploratory data analysis — visualise and describe the main features and patterns in the data
ML workflow — apply a complete ML protocol (a reference workflow is available here); for each step, from model selection to performance evaluation, explain your choices and the strategy adopted
Results and critical discussion — present the outcomes of your analysis and critically discuss your findings, including limitations and possible improvements
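The pre-processing and ML workflow steps above can be sketched in code. The following is a minimal illustration only, assuming scikit-learn; the dataset, model, and metric are placeholders, not requirements of the project — your own choices must be made and justified for your chosen dataset.

```python
# Minimal sketch of a complete ML workflow (illustrative only:
# the dataset, model, and metric here are placeholders).
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Load an example dataset (replace with your own chosen dataset)
X, y = load_wine(return_X_y=True)

# Hold out a test set for the final performance evaluation
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y)

# Bundle pre-processing and model in a pipeline, so that scaling is
# fitted on training data only (avoids data leakage into the test set)
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))

# Model validation via cross-validation on the training set
cv_scores = cross_val_score(model, X_train, y_train, cv=5)
print(f"CV accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")

# Final evaluation on the held-out test set
model.fit(X_train, y_train)
print(f"Test accuracy: {accuracy_score(y_test, model.predict(X_test)):.3f}")
```

Note the design choice: using a pipeline keeps every pre-processing step inside the cross-validation loop, which is exactly the kind of decision the "critical discussion" section should explain.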
3. Documentation
Every step of your analysis must be clearly described and explained. Every choice must be justified. The notebook should read as a coherent, self-contained report, understandable to someone who was not present at the discussion.
The written exam consists of two exercises assessing your Python programming skills and your ability to work critically with AI-generated code.
Exercise 1 — From pseudocode to Python
You will be given a piece of pseudocode — a sequence of instructions written in plain English — and asked to translate it into working Python code. The exercise has two stages:
Stage 1 (pen and paper): Write your Python translation by hand, without running it.
Stage 2 (notebook): Transcribe your handwritten code into a notebook and run it. Document all errors produced by the Python interpreter directly in the notebook, explaining what caused each error and how you resolved it. This stage also assesses your debugging skills.
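To make the format concrete, here is a hypothetical example of the kind of translation involved; the actual exam pseudocode will be different.

```python
# Hypothetical pseudocode you might receive (not the actual exam text):
#
#   set total to 0
#   for each number from 1 to 10:
#       if the number is even, add it to total
#   print the total
#
# One possible Python translation:

total = 0
for number in range(1, 11):   # range(1, 11) covers 1 through 10
    if number % 2 == 0:       # even-number check
        total += number
print(total)  # prints 30 (2 + 4 + 6 + 8 + 10)
```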
Exercise 2 — Prompt engineering and critical code comparison
You will be given a printed Python script written by the teacher. Your task is to:
Write a ChatGPT prompt designed to reproduce that script as precisely as possible
Submit your prompt to ChatGPT and compare its output with the original script
Annotate and explain any discrepancies between the two versions, discussing why they may have occurred
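The discrepancy analysis in this exercise is done by hand, but a line-by-line comparison can also be sketched with Python's standard difflib module. The two scripts below are made-up placeholders, assumed only for illustration.

```python
# Sketch: comparing two versions of a script line by line with difflib.
# Both scripts here are invented placeholders, not exam material.
import difflib

original = """def mean(values):
    return sum(values) / len(values)
"""

generated = """def mean(values):
    total = sum(values)
    return total / len(values)
"""

# unified_diff marks removed lines with '-' and added lines with '+'
diff_lines = list(difflib.unified_diff(
    original.splitlines(), generated.splitlines(),
    fromfile="original.py", tofile="chatgpt.py", lineterm=""))

for line in diff_lines:
    print(line)
```

Each `+`/`-` pair in the output is a candidate discrepancy to annotate and explain.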
This exercise assesses both your ability to communicate programming tasks effectively to an AI tool and your capacity to critically evaluate AI-generated code.