Data Mining Contest (Dec 2012)

This BADM data mining contest is run in collaboration with, and hosted on, Kaggle.

General Guidelines:
  • To upload a submission, each student should create a free account on Kaggle (or log in using a Facebook, Google or Yahoo account).
  • Go to the contest page.
  • Read the competition details, then click Get the Data. Download files training.csv, test.csv and test-submission.csv.
  • Create a submission file as described below (see "Submission format").
  • When you are ready to submit, click Make a Submission.
  • Now is the time to create your Kaggle team. Add each of your group members to the team on Kaggle.
  • Please use "BADM Group xx" in your team name, where xx is your Group number, so that we can identify submissions by course members.
  • Check the leaderboard to see how well you've done!
  • For any queries, please post to the competition's forum on Kaggle.
Submission Deadline
Dec 23, 2012 at 11 pm IST
We strongly recommend trying a simple submission very early, to make sure you understand the process.
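One possible "simple submission" is a constant baseline that predicts the same probability for every test row, namely the share of Y=1 in the training data. The sketch below is only an illustration and not a recommended model; the function name `constant_baseline` and the example labels are hypothetical, and the labels are assumed to have already been read from training.csv.

```python
def constant_baseline(train_labels, n_test_rows):
    """Return one probability per test row, all equal to the training base rate.

    train_labels: iterable of 0/1 values of Y from the training data.
    n_test_rows:  number of rows in the test data.
    """
    labels = list(train_labels)
    if not labels:
        raise ValueError("training labels are empty")
    if any(y not in (0, 1) for y in labels):
        raise ValueError("labels must be 0 or 1")
    # Share of training rows where Y = 1 (hypothetical data below).
    base_rate = sum(labels) / len(labels)
    return [base_rate] * n_test_rows


# Made-up labels, purely for illustration.
predictions = constant_baseline([1, 0, 0, 1], n_test_rows=3)
print(predictions)
```

A baseline like this mainly checks that the upload process works end to end; any real model should be compared against it on the leaderboard.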

The contest goal is to develop a data mining solution that addresses the supervised task of classifying whether the price of an electronic gadget will go up or down on an online website. Contestants will upload the predicted probabilities generated by their approach to the platform. The leaderboard will reflect their performance compared to other submissions.

Submission format
You will be given three files: a training dataset (training.csv) that includes both input (X) and output (Y) variables, a test dataset (test.csv) that includes only the input variables (X), and an example of a valid submission (test-submission.csv). In this contest, Y is a binary (0/1) output. Your submission will be a CSV file with
  • the RowID column from test.csv, 
  • a column with value "1" in each cell (name this column "group") 
  • and a column of predicted probabilities (P(Y=1)) for the test set (name this column "PriceUp").
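The three-column layout above can be sketched in Python using only the standard library. This is a minimal illustration under assumptions: the helper name `write_submission` and the example row IDs and probabilities are hypothetical, and the column order (RowID, group, PriceUp) is assumed to follow the order of the list above. Check test-submission.csv for the authoritative layout.

```python
import csv
import io


def write_submission(row_ids, probabilities, out):
    """Write a submission with columns RowID, group, PriceUp to a file-like object."""
    row_ids = list(row_ids)
    probabilities = list(probabilities)
    if len(row_ids) != len(probabilities):
        raise ValueError("need exactly one probability per RowID")
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["RowID", "group", "PriceUp"])
    for row_id, p in zip(row_ids, probabilities):
        # PriceUp is P(Y=1), so it must lie between 0 and 1.
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range for RowID {row_id}: {p}")
        # The "group" column holds the value 1 in every row.
        writer.writerow([row_id, 1, p])


# Made-up row IDs and probabilities, purely for illustration.
buffer = io.StringIO()
write_submission([101, 102, 103], [0.82, 0.35, 0.5], buffer)
print(buffer.getvalue())
```

To write to disk instead of an in-memory buffer, pass a file opened with `open("submission.csv", "w", newline="")` as `out`.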
The dataset, contributed by, includes information on various electronics items that were available on online Indian websites for several months in 2011-12. The larger dataset will be available for download from A sample of 30 records and the data dictionary are available below:

Electronics Online Shopping: Sample