In PBL-1, we used three methods (Perceptron, logistic regression, and SVM) to build machine learning models that classify handwritten digits in the MNIST dataset. In our experience, SVM showed good classification performance, but training an SVM took much longer than training the other methods.
Company M, which asked us to build the classifier in PBL-1, is now interested in using an SVM classifier for the digit classification problem. However, they wonder whether one can implement a faster algorithm that handles a large number of training examples better than the code in scikit-learn. The company has provided us with a list of requirements for the algorithm they want:
To evaluate the prediction performance of the new algorithms, the company prepared a new dataset of 50,000 handwritten digits from the MNIST data. We are now talking about three datasets:
The company will evaluate your code by its prediction accuracy on the D3 dataset.
1|
0|
3|
4|
....
where | stands for the newline character '\n', as in C/C++.
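The label-per-line layout above can be produced with a few lines of Python. The following is only a minimal sketch: the helper names `write_predictions` and `read_predictions` are hypothetical, and it assumes the predicted digits are already available as a sequence of integers.

```python
def write_predictions(labels, path):
    """Write one predicted digit per line, each line ending with '\n'."""
    # newline="\n" disables platform-specific translation, so the file
    # uses a plain '\n' terminator on every operating system.
    with open(path, "w", newline="\n") as f:
        for label in labels:
            f.write(f"{int(label)}\n")


def read_predictions(path):
    """Read the file back into a list of integer labels (blank lines skipped)."""
    with open(path) as f:
        return [int(line) for line in f if line.strip()]
```

For example, `write_predictions([1, 0, 3, 4], "prediction.txt")` would write the four lines shown above; the file name `prediction.txt` is an assumption, not something the company has specified here.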