In this project, you will design two classifiers: a perceptron classifier for digit recognition and a slightly modified perceptron classifier for behavioral cloning. You will test the first classifier on a set of scanned handwritten digit images, and the second on sets of recorded Pacman games from various agents. Even with simple features, your classifiers will be able to do quite well on these tasks when given enough training data.
Optical character recognition (OCR) is the task of extracting text from image sources. The data set on which you will run your classifiers is a collection of handwritten numerical digits (0-9). This is a very commercially useful technology, similar to the technique used by the US post office to route mail by zip codes. There are systems that can perform with over 99% classification accuracy (see LeNet-5 for a classic example system in action, and more modern CNN implementations).
Behavioral cloning is the task of learning to copy a behavior simply by observing examples of that behavior. In this project, you will be using this idea to mimic various Pacman agents by using recorded games as training examples. Your agent will then run the classifier at each action in order to try and determine which action would be taken by the observed agent.
The code for this project includes the following files and data, available as a zip file.
Data file
data.zip Data file, including the digit and face data.
Files you will edit
perceptron.py The location where you will write your perceptron classifier.
perceptron_pacman.py The location where you will write your pacman perceptron classifier.
dataClassifier.py The wrapper code that will call your classifiers. You will also write your enhanced feature extractor here. You will also use this code to analyze the behavior of your classifier.
answers.py Answer to Question 2 goes here.
Files you should read but NOT edit
classificationMethod.py Abstract super class for the classifiers you will write.
(You should read this file carefully to see how the infrastructure is set up.)
samples.py I/O code to read in the classification data.
util.py Code defining some useful tools. You may be familiar with some of these by now, and they will save you a lot of time.
mostFrequent.py A simple baseline classifier that just labels every instance as the most frequent class.
Files to Edit and Submit: You will fill in portions of perceptron.py, answers.py, perceptron_pacman.py, and dataClassifier.py (only) during the assignment, and submit them. You should submit these files with your code and comments. Please do not change the other files in this distribution or submit any of our original files other than these files.
Evaluation: Your code will be autograded for technical correctness. Please do not change the names of any provided functions or classes within the code, or you will wreak havoc on the autograder.
Academic Dishonesty: We will be checking your code against other submissions in the class for logical redundancy. If you copy someone else's code and submit it with minor changes, we will know. These cheat detectors are quite hard to fool, so please don't try. We trust you all to submit your own work only; please don't let us down. If you do, we will pursue the strongest consequences available to us.
Getting Help: You are not alone! If you find yourself stuck on something, contact the course staff for help. Office hours, section, and the discussion forum are there for your support; please use them. If you can't make our office hours, let us know and we will schedule more. We want these projects to be rewarding and instructional, not frustrating and demoralizing. But, we don't know when or how to help unless you ask.
Discussion: Please be careful not to post spoilers.
A skeleton implementation of a perceptron classifier is provided for you in perceptron.py. In this part, you will fill in the train function.
Unlike the naive Bayes classifier, a perceptron does not use probabilities to make its decisions. Instead, it keeps a weight vector w^y for each class/label y (y is an identifier, not an exponent). Given a feature list f, the perceptron computes the class y whose weight vector is most similar to the input vector f. Formally, given a feature vector f (in our case, a map from pixel locations to indicators of whether they are on), we score each class with:
$$\mathrm{score}(f, y) = \sum_i f_i \, w^y_i$$
Each feature vector represents one data point. Then we choose the class with the highest score as the predicted label for that data instance. In the code, we will represent each w^y as a Counter (Counter is defined in util.py). The code stores the weights for each label in self.weights.
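The scoring and prediction step described above can be sketched as follows. This is a minimal illustration, not the project's code: plain dicts stand in for the Counter class from util.py, and the names score and predict are hypothetical. Ties are broken here by dictionary order, which is an assumption of this sketch.

```python
def score(features, weights):
    """Dot product of a sparse feature map and a sparse weight map."""
    return sum(value * weights.get(key, 0.0) for key, value in features.items())


def predict(features, weights_by_label):
    """Return the label whose weight vector gives the highest score."""
    return max(weights_by_label,
               key=lambda label: score(features, weights_by_label[label]))


# Toy example: two pixel features and two labels (all values are made up).
features = {(0, 0): 1, (0, 1): 0}
weights_by_label = {
    "a": {(0, 0): 2.0, (0, 1): -1.0},
    "b": {(0, 0): 0.5, (0, 1): 3.0},
}
best = predict(features, weights_by_label)
```

Only the pixel at (0, 0) is on, so label "a" scores 2.0 and label "b" scores 0.5.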
Learning weights
In the basic multi-class perceptron, we scan over the data, one instance at a time. When we come to an instance (f, y), we find the label with highest score:
$$y' = \arg\max_{y''} \mathrm{score}(f, y'')$$
We compare y' (the label that your classifier predicts) to the true label y. (During training, you will receive the training data along with their true labels.) If y' = y, we've gotten the instance correct, and we do nothing. Otherwise, we guessed y' but we should have guessed y. That means that w^y should have scored f higher, and w^{y'} should have scored f lower, in order to prevent this error in the future. We update these two weight vectors accordingly:
$$w^y = w^y + f$$
$$w^{y'} = w^{y'} - f$$
Using the addition, subtraction, and multiplication functionality of the Counter class in util.py, the perceptron updates should be relatively easy to code. Certain implementation issues have been taken care of for you in perceptron.py, such as handling iterations over the training data and ordering the update trials. Furthermore, the code sets up the weights data structure for you. Each legal label needs its own Counter full of weights.
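As a rough illustration of the learning rule described in this section, here is a self-contained sketch of a multi-class perceptron training loop. It uses plain dicts rather than util.Counter, and the function name train_perceptron and its signature are assumptions made for this example; the skeleton in perceptron.py is organized differently.

```python
def score(features, weights):
    """Dot product of a sparse feature map and a sparse weight map."""
    return sum(value * weights.get(key, 0.0) for key, value in features.items())


def train_perceptron(training_data, training_labels, legal_labels, max_iterations=3):
    """Hypothetical helper: returns one weight dict per label."""
    weights = {label: {} for label in legal_labels}
    for _ in range(max_iterations):
        for features, true_label in zip(training_data, training_labels):
            # Predict with the current weights (ties go to the earliest label).
            guess = max(legal_labels, key=lambda label: score(features, weights[label]))
            if guess == true_label:
                continue  # Correct prediction: leave the weights alone.
            # Mistake: pull the true label toward f, push the guess away from f.
            for key, value in features.items():
                weights[true_label][key] = weights[true_label].get(key, 0.0) + value
                weights[guess][key] = weights[guess].get(key, 0.0) - value
    return weights


# Toy data: feature "x" indicates label "a", feature "y" indicates label "b".
data = [{"x": 1, "y": 0}, {"x": 0, "y": 1}]
labels = ["a", "b"]
learned = train_perceptron(data, labels, legal_labels=["a", "b"])
```

With the Counter class, the two inner update lines collapse into a single addition and a single subtraction of whole vectors.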
Question
Fill in the train method in perceptron.py. Run your code with:
python dataClassifier.py -c perceptron
Hints and observations:
The command above should yield validation and test accuracies between 40% and 70% (with the default 3 iterations). These ranges are wide because the perceptron is a lot more sensitive to the specific choice of tie-breaking than naive Bayes.
One of the problems with the perceptron is that its performance is sensitive to several practical details, such as how many iterations you train it for, and the order you use for the training examples (in practice, using a randomized order works better than a fixed order). The current code uses a default value of 3 training iterations. You can change the number of iterations for the perceptron with the -i iterations option. Try different numbers of iterations and see how it influences the performance. In practice, you would use the performance on the validation set to figure out when to stop training, but you don't need to implement this stopping criterion for this assignment.
Though not intended to be directly called by you in your train() method, the classify method may be helpful in showing you how to implement the first equation above (i.e. you don't explicitly have to write out the sum because the counter takes care of it for you). You need to implement the other equations (y' = argmax ..., and the weight updates) yourself.
Visualizing weights
Perceptron classifiers, and other discriminative methods, are often criticized because the parameters they learn are hard to interpret. To see a demonstration of this issue, we can write a function to find features that are characteristic of one class. (Note that, because of the way perceptrons are trained, it is not as crucial to find odds ratios.)
Question
Fill in findHighWeightFeatures(self, label) in perceptron.py. It should return a list of the 100 features with highest weight for that label. You can display the 100 pixels with the largest weights using the command:
python dataClassifier.py -c perceptron -w
Use this command to look at the weights, and answer the following question. Which of the following sequences of weights is most representative of the perceptron?
Answer the question in answers.py in the method q2, returning either 'a' or 'b'.
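The "100 features with highest weight" selection can be sketched in a few lines. This is an illustration with a plain dict in place of the per-label Counter; the helper name top_weight_features is hypothetical and is not the method you are asked to fill in.

```python
import heapq


def top_weight_features(weights, k=100):
    """Return the k feature keys with the largest weights, highest first."""
    return heapq.nlargest(k, weights, key=weights.get)


# Toy weight vector for a single label (values are made up).
label_weights = {(0, 0): 0.5, (1, 1): 2.0, (2, 2): -1.0}
top_two = top_weight_features(label_weights, k=2)
```

If there are fewer than k features, heapq.nlargest simply returns all of them in descending order of weight.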
You have just built a perceptron classifier. You will now use a modified version of the perceptron in order to learn from Pacman agents. In this question, you will fill in the classify and train methods in perceptron_pacman.py. This code should be similar to the methods you've written in perceptron.py.
For this application of classifiers, the data will be states, and the labels for a state will be all legal actions possible from that state. Unlike perceptron for digits, all of the labels share a single weight vector w, and the features extracted are a function of both the state and possible label (here, the label is an action).
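To make the shared-weight idea concrete, here is a hedged sketch. It assumes each state is represented as a dict mapping every legal action to its feature map, which may not match the exact layout of the project's data, and it uses the usual perceptron update adapted to a single weight vector (add the features of the correct action, subtract the features of the wrongly chosen action). The text above does not spell out that update, so treat it as an assumption. The names classify_state and update are hypothetical.

```python
def score(features, weights):
    """Dot product of a sparse feature map and the shared weight map."""
    return sum(value * weights.get(key, 0.0) for key, value in features.items())


def classify_state(features_by_action, weights):
    """Pick the legal action whose features score highest under the shared weights."""
    return max(features_by_action,
               key=lambda action: score(features_by_action[action], weights))


def update(weights, features_by_action, true_action):
    """Apply one perceptron step in place; return the action that was guessed."""
    guess = classify_state(features_by_action, weights)
    if guess != true_action:
        for key, value in features_by_action[true_action].items():
            weights[key] = weights.get(key, 0.0) + value
        for key, value in features_by_action[guess].items():
            weights[key] = weights.get(key, 0.0) - value
    return guess


# Toy state with two legal actions and one made-up feature.
shared_weights = {}
state = {"North": {"foodCount": 3.0}, "Stop": {"foodCount": 4.0}}
first_guess = update(shared_weights, state, true_action="Stop")
second_guess = classify_state(state, shared_weights)
```

Because the weights are shared, the only thing that distinguishes one action from another is its feature map, which is why the features must depend on both the state and the action.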
Question
Fill in the train method in perceptron_pacman.py. Run your code with:
python dataClassifier.py -c perceptron -d pacman
This command should yield validation and test accuracy of over 70%.
HINT 1: To get started, try printing out self.features and trainingData[i] to orient yourself.
In this part you will write your own features in order to allow the classifier agent to clone the behavior of observed agents. We have provided several agents for you to try to copy behavior from:
StopAgent: An agent that only stops
FoodAgent: An agent that only aims to eat the food, not caring about anything else in the environment.
DestructAgent: An agent that only moves towards the closest ghost, regardless of whether it is scared or not scared.
ContestAgent: A staff agent from p2 that smartly avoids ghosts, eats power capsules and food.
We've placed files containing multiple recorded games for each agent in the data/pacmandata directory. Each agent has 15 games recorded and saved for training data, and 10 games each for validation and testing.
Question
Add new features for behavioral cloning in the EnhancedPacmanFeatures function in dataClassifier.py.
Upon completing your features, you should get at least 85% accuracy on each of the provided agents. You can directly test this using the --agentToClone <Agent name>, -g <Agent name> option for dataClassifier.py:
python dataClassifier.py -c perceptron -d pacman -f -g ContestAgent -t 1000 -s 1000
HINT 1: This question is very similar to one in the multiagent project. There are many different ways to solve it, and you need to think intuitively about which features are relevant.
HINT 2: After adding each feature you think might help, test to see if it actually does! Adding complex features or features that aren't especially useful may decrease the accuracy.
HINT 3: Try printing out the features. If some have a huge magnitude or a very small magnitude, it may be more difficult for your perceptron to learn from them.
HINT 4: Some functions that may be useful (among others) are generateSuccessor, util.manhattanDistance, getGhostPositions, getCapsules, getPacmanPosition, and getFood().asList().
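One way to keep feature magnitudes comparable, as HINT 3 suggests, is to turn raw distances into bounded "closeness" values. The sketch below works on plain coordinate tuples so that it runs on its own; in the project you would obtain positions from the game state instead. The helper names, the feature names, and the 1 / (1 + d) scaling are illustrative assumptions, not a recommended feature set.

```python
def manhattan_distance(a, b):
    """Grid distance between two (x, y) positions."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def closeness(position, targets):
    """Return 1 / (1 + d) for the nearest target, or 0.0 if there are none.

    The result always lies in [0, 1], so no single feature dominates the score.
    """
    if not targets:
        return 0.0
    nearest = min(manhattan_distance(position, target) for target in targets)
    return 1.0 / (1.0 + nearest)


def sketch_features(pacman_pos, ghost_positions, food_positions):
    """Hypothetical feature map computed for one successor position."""
    return {
        "ghostCloseness": closeness(pacman_pos, ghost_positions),
        "foodCloseness": closeness(pacman_pos, food_positions),
    }


example = sketch_features((1, 1), ghost_positions=[(1, 1)], food_positions=[(1, 4)])
```

Here the ghost is on Pacman's square, giving a closeness of 1.0, while the food is three steps away, giving 0.25.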
Other helpful options:
We have also provided a new ClassifierAgent, in pacmanAgents.py, that uses your implementation of perceptron_pacman. This agent takes in training data, and optionally validation data, and performs the training step of the classifier upon initialization. Then, each time it chooses an action, it runs the trained classifier on the state and performs the returned action. You can run this agent with the following command:
python pacman.py -p ClassifierAgent --agentArgs trainingData=<path to training data>
You can also use the --agentToClone <Agent Name> option to use one of the four agents specified above to train on:
python pacman.py -p ClassifierAgent --agentArgs agentToClone=<Agent Name>
Congratulations! You're finished with the CSE 415 projects.