Overview of Premise
Children will learn about decision trees and the Random Forest machine learning algorithm by constructing multiple decision trees of their own. As they move through different modules, they will learn about different aspects of Random Forest (such as tree depth, number of trees, and bagging) and apply what they learn to improve the accuracy of their Random Forest.
What Ideas in AI will be Taught
What machine learning is
Supervised learning
Random Forest Classifier
How decision trees work
Some hyperparameters (max_depth, n_estimators, max_samples, and max_features)
max_depth (the maximum depth of each decision tree) and n_estimators (the number of decision trees) seem to be the easiest to include/teach
whether we include max_samples and max_features will depend on whether we find a simplified way to present them
Bagging
Majority ("max") voting
Hyperparameter tuning
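To make the ideas in this list concrete for ourselves, the core mechanics (bagging, a per-tree n_estimators count, and majority voting) can be sketched in plain Python. This is a minimal illustration, not the tool's implementation: the "trees" here are depth-1 stumps rather than full decision trees, and a real classifier (e.g. scikit-learn's RandomForestClassifier) would expose all of the hyperparameters named above.

```python
# Minimal sketch of the Random Forest mechanics listed above,
# standard library only. Trees are simplified to depth-1 stumps.
import random
from collections import Counter

def bootstrap_sample(X, y, rng):
    """Bagging: draw len(X) rows with replacement."""
    idx = [rng.randrange(len(X)) for _ in range(len(X))]
    return [X[i] for i in idx], [y[i] for i in idx]

def train_stump(X, y):
    """A depth-1 'tree': brute-force the feature/threshold split
    that misclassifies the fewest training rows."""
    best = None  # (errors, feature, threshold, left_label, right_label)
    for f in range(len(X[0])):
        for t in {row[f] for row in X}:
            left = [y[i] for i, row in enumerate(X) if row[f] <= t]
            right = [y[i] for i, row in enumerate(X) if row[f] > t]
            ll = Counter(left).most_common(1)[0][0]
            rl = Counter(right).most_common(1)[0][0] if right else ll
            errors = sum(lab != ll for lab in left) + sum(lab != rl for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, ll, rl)
    _, f, t, ll, rl = best
    return lambda row: ll if row[f] <= t else rl

def train_forest(X, y, n_estimators=7, seed=0):
    """n_estimators trees, each trained on its own bootstrap sample."""
    rng = random.Random(seed)
    return [train_stump(*bootstrap_sample(X, y, rng)) for _ in range(n_estimators)]

def predict(forest, row):
    """Majority ("max") voting across the trees."""
    votes = Counter(tree(row) for tree in forest)
    return votes.most_common(1)[0][0]

# Toy demo: one feature, class 0 for small values, class 1 for large.
X = [[i] for i in range(8)]
y = [0] * 4 + [1] * 4
forest = train_forest(X, y, n_estimators=7, seed=0)
```

max_samples and max_features would slot into this sketch as, respectively, drawing fewer rows in bootstrap_sample and restricting which features train_stump may split on, which may help us judge whether a simplified version of them is teachable.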
Drawings Depicting Project Interface and User Interaction
How Children Interact with the System
The children will play through a series of modules where they see a sample of data and create decision trees. They can then test the accuracy of their own Random Forest in each module. As they progress through the modules, the decision trees they construct will have to adhere to certain restrictions (such as tree depth or number of trees) while a character explains each aspect and what increasing or decreasing it does.
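The "test accuracy" step described above could be computed as the fraction of held-out examples where the majority vote of the child's trees matches the true label. This is a hypothetical sketch (the function name and the representation of trees as callables are assumptions, not the tool's design):

```python
# Hypothetical sketch of what the "test accuracy" button might compute.
# `forest` is any list of callables mapping a feature row to a label.
from collections import Counter

def forest_accuracy(forest, rows, labels):
    correct = 0
    for row, label in zip(rows, labels):
        votes = Counter(tree(row) for tree in forest)
        if votes.most_common(1)[0][0] == label:
            correct += 1
    return correct / len(rows)

# Toy demo: three identical stumps that split at feature value 2.
stump = lambda row: int(row[0] >= 2)
acc = forest_accuracy([stump, stump, stump], [[0], [1], [2], [3]], [0, 0, 1, 1])
# acc → 1.0
```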
Discussion on Data Collection
Pretest/posttest on machine learning, decision trees, and Random Forest
Interviews
The tool we make could capture a screenshot of the decision trees the child has constructed whenever they click "test accuracy"
we could process the screenshots to determine characteristics of the decision trees the children make (depth, number of trees, accuracy, etc.)
each screenshot would also be timestamped and tagged with the child's player ID
so we can see whether a player's accuracy improves over time and whether they are incorporating more of the concepts they've learned
it would also record any restrictions the decision trees had to adhere to
so we can see how accuracy compares to the number of restrictions
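The record produced from each screenshot might look something like the sketch below. All field names here are illustrative assumptions, not a logging spec; the point is that timestamp, player ID, tree characteristics, restrictions, and accuracy end up in one queryable record per "test accuracy" click.

```python
# Hypothetical log record for one "test accuracy" click.
# Field names are illustrative only.
import json
import time

def make_log_record(player_id, trees, restrictions, accuracy):
    return {
        "player_id": player_id,                                  # per-child ID
        "timestamp": time.time(),                                # when clicked
        "n_trees": len(trees),
        "max_depth_built": max(t["depth"] for t in trees),       # from screenshot
        "restrictions": restrictions,                            # module's limits
        "accuracy": accuracy,
    }

record = make_log_record(
    player_id="p-01",
    trees=[{"depth": 2}, {"depth": 3}],
    restrictions={"max_depth": 3, "n_trees": 5},
    accuracy=0.85,
)
print(json.dumps(record))  # flat JSON, easy to aggregate per player over time
```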