30 Days After Introducing Programming: Which of My Students Will Fail?
At the end of this webpage, you can find the script, data, and materials we use in our research.
We now describe each file:
- Script.r:
- The R script we use to fetch the data from the database, execute the strategy, and save the results as HTML files
- This script also has a function to run the hypothesis tests
- Clusters.zip:
- This zip file contains all files describing the User, Submissions, and Correct Submissions data
- 2 Groups.zip / 3 Groups.zip:
- These zip files contain all files generated by our script. Everything is summarized in HTML files
- The HTML files contain data regarding our two metrics for each student
- The k-means results are summarized in the files
- They also contain mappings between each student and the cluster group
All tables in the HTML files have the same structure. For each student, we have:
- ID: an internal number used by our database
- Number of Correct Submissions and Number of Submissions: our two metrics, normalized to values between 0 and 1
- Students: student names are omitted; each student is identified by a letter (A to Z) instead
- Cluster Number: the group to which k-means assigned the student
- Possible numbers:
- For two groups: 1 or 2
- For three groups: 1, 2, or 3
- Reproved: whether the student failed the course
- Possible values: true or false (0 or 1)
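To make the table columns concrete, the following is a minimal Python sketch of how two metrics can be scaled to the 0 to 1 range and then grouped with k-means. It is not our R script: the student values are made-up toy data, min-max scaling is only an assumed way to obtain values between 0 and 1, and the helper names (min_max, kmeans) are hypothetical.

```python
# Illustrative sketch only: toy data, assumed min-max scaling, plain Lloyd's k-means.


def min_max(values):
    """Scale a list of numbers to [0, 1]; a constant column maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iterations=100):
    """Plain Lloyd's algorithm; returns 1-based cluster numbers, one per point."""
    # Simplification: the first k points are used as the initial centroids.
    centroids = [points[i] for i in range(k)]
    labels = None
    for _ in range(max_iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(pt, centroids[c]))
            for pt in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return [lab + 1 for lab in labels]


# Hypothetical toy data: six anonymized students (letters instead of names).
students = ["A", "B", "C", "D", "E", "F"]
submissions = [40, 35, 5, 8, 38, 3]
correct_submissions = [30, 28, 1, 2, 25, 0]

norm_submissions = min_max(submissions)
norm_correct = min_max(correct_submissions)
points = list(zip(norm_submissions, norm_correct))

labels = kmeans(points, k=2)
for student, point, cluster in zip(students, points, labels):
    print(student, round(point[0], 3), round(point[1], 3), cluster)
```

The cluster numbers themselves carry no meaning: which group is called 1 and which is called 2 depends on the starting centroids, so only the grouping of students should be interpreted.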
We have IDs to represent each course. Here is the mapping between the IDs and the semesters: