Kaggle In Class Prediction Competition
View current scoreboard rankings
Due Saturday, February 27 at 11:59pm.
Datasets (click here)
Submission Guidelines
1) Submit your predictions to the Kaggle In Class CME 250 Prediction Competition.
When signing up for Kaggle In Class, use your Stanford email.
If you already have a Kaggle account, you can change your email to your Stanford address in your Kaggle profile settings after signing in.
If you are an auditor without a Stanford email, you can email the instructors for a private competition invite code.
You may only submit once per day. (Restriction removed on the final day.)
You may only submit a total of three times. Your best submission will determine your final ranking.
Extra Credit: I encourage you to make your first Kaggle In Class competition submission by midnight Monday, February 15. This will let you see how you are performing relative to your peers as you further refine your method. Students who have made at least one online submission to the competition by February 15 will be given extra credit worth one homework problem.
2) Turn in a printed hard copy report by noon Monday February 29 containing:
Your Kaggle username for this competition (and your full name and SUNetID)
One paragraph describing your prediction method(s).
If you used techniques such as feature selection, feature engineering, cross-validation, or imputation, please include them in your description.
One to two paragraphs describing why you chose the prediction method you settled on. (Describe what other methods you tried.)
A sentence or two describing something interesting you noticed about the data and/or your method.
An appendix with your code.
Ranking
Your grade for the mini-project will be based on the content and quality of your write-up, not on your rank in the Kaggle competition.
We are using mean absolute error (MAE) loss to score your prediction performance.
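For reference, mean absolute error is the average of the absolute differences between the actual and predicted values. The following is a minimal Python sketch of that definition, not the scoring code Kaggle actually runs. The function name and the sample values are made up for illustration.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |actual - predicted| over all samples."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("need at least one sample")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up values for illustration only (not competition data).
actual = [3.0, 5.0, 2.0, 8.0]
predicted = [2.5, 5.0, 4.0, 7.0]

# Absolute errors are 0.5, 0.0, 2.0, 1.0, so the mean is 3.5 / 4.
mae = mean_absolute_error(actual, predicted)
print(mae)
```

Lower is better. Because the errors are not squared, one badly missed prediction affects the score less than it would under a squared-error loss.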
Collaboration policy
You are *encouraged* to discuss strategies on the CME 250 Competition Forum.
Every student must make their own submissions to the Kaggle competition (multiple students cannot submit together under a single username), and independently write up their methods and observations to receive credit.
As on the homework, students should include the names of their collaborators in the project report.
Hint for Missing Values
The training set has some missing values. You can omit these samples using the na.omit command in R, but it would be even better to impute them or to use a method such as CART that can handle missing values.
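To make the difference between omitting and imputing concrete, here is a small Python sketch using only the standard library. It assumes a purely numeric table in which missing entries appear as None or NaN. Column-mean imputation is just one simple choice made here for illustration; the hint above does not prescribe a particular imputation method. All function names and the toy data are hypothetical.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def drop_incomplete(rows):
    """Keep only rows with no missing entries (similar in spirit to na.omit in R)."""
    return [row for row in rows if not any(is_missing(v) for v in row)]


def impute_column_means(rows):
    """Replace each missing entry with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if not is_missing(row[j])]
        if not observed:
            raise ValueError(f"column {j} has no observed values")
        means.append(sum(observed) / len(observed))
    return [[means[j] if is_missing(v) else v for j, v in enumerate(row)]
            for row in rows]


# Made-up toy table for illustration only (not the competition data).
train = [
    [1.0, 10.0],
    [None, 20.0],
    [3.0, float("nan")],
    [5.0, 30.0],
]

complete = drop_incomplete(train)     # two of the four rows survive
imputed = impute_column_means(train)  # all four rows kept, gaps filled
print(len(complete), len(imputed))
```

Dropping rows discards half of this toy table, while imputing keeps every sample. If you impute, compute the fill values from the training set and reuse the same values on the test set.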