Summary: This project was part of our coursework for the subject Machine Learning. The aim of this project was to build a system that can identify colors using machine learning. The same task can also be done with a simple threshold-based decision program. The scope of this project is to get introduced to machine learning, i.e. at a beginner level. The knowledge shared here is limited to this project; I am not a pro in machine learning.
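To make the threshold-based alternative mentioned above concrete, here is a minimal sketch. It assumes the raw TCS3200 readings are pulse widths, so the smallest of the three values marks the strongest color component. The function name and the dark_limit value of 200 are made up for illustration and would need tuning against real readings.

```python
def classify_by_threshold(r, g, b, dark_limit=200):
    """Guess the dominant color from raw pulse-width readings.

    Smaller reading = more light through that filter (assumption based on
    how the sensor output is measured). dark_limit is a hypothetical cutoff.
    """
    readings = {"red": r, "green": g, "blue": b}
    # Pick the channel with the shortest pulse, i.e. the strongest signal
    name, value = min(readings.items(), key=lambda item: item[1])
    if value > dark_limit:
        # Every channel is weak, so do not guess a color
        return "unknown"
    return name


print(classify_by_threshold(30, 80, 75))
```

A rule like this only separates the three primary colors; the machine learning approach is meant to cope with mixed colors without hand-tuned limits.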
HW: Arduino Nano - link - This microcontroller reads the R, G and B values from the color sensor and sends them over the USB serial port.
Color sensor - TCS3200 - link - This color sensor is used to generate the training data and the test data for the machine learning model.
What is machine learning, to my understanding? Machine learning is a branch of computer science. It can be used to train a computer to do some task based on what it has learned from data; in this case, to identify colors.
How does the system work ?
For training - The Arduino sends the color data (Red, Green and Blue values) via the USB serial port to the Ubuntu Linux system. The Ubuntu Linux system stores all this training data in a .csv file.
For testing - A new color that is not in the training set is shown to the system as the input. The trained model then gives the predicted color as the output.
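The training step above (serial data stored in a .csv file) could be sketched roughly as follows. This is not the script used in the project. It assumes the third-party pyserial package is installed, that the board shows up as /dev/ttyUSB0, and that each line looks like "R,G,B,label"; the output file name training_data.csv and both function names are hypothetical.

```python
import csv


def parse_reading(line):
    """Turn one serial line such as '130,210,170,2' into a list of ints.

    Returns None for blank or malformed lines so they can be skipped.
    """
    parts = line.strip().split(",")
    if len(parts) != 4:
        return None
    try:
        return [int(p) for p in parts]
    except ValueError:
        return None


def log_serial_to_csv(port="/dev/ttyUSB0", out_path="training_data.csv", n_rows=100):
    """Read n_rows valid readings from the serial port and save them as CSV."""
    import serial  # pyserial (assumption: installed separately)

    with serial.Serial(port, 9600, timeout=2) as link, \
            open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        rows = 0
        # Loops until enough valid rows arrive; stop with Ctrl+C if the board is silent
        while rows < n_rows:
            raw = link.readline().decode("ascii", errors="ignore")
            reading = parse_reading(raw)
            if reading is not None:
                writer.writerow(reading)
                rows += 1
```

Calling log_serial_to_csv() with the Arduino plugged in would collect 100 readings; the parsing is kept in its own function so it can be checked without any hardware.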
In the following steps I will take you through my journey in detail to develop this system.
1> Arduino program to send RGB values via serial port to the PC: Setting up Arduino can be a pain on Linux. Arduino for Ubuntu can be downloaded from here. Instructions to install it can be found here. Once installed, plug in the Arduino Nano board. To check whether the Arduino Nano is detected by Linux, use the command 'lsusb' in the Ubuntu terminal. Try unplugging and replugging the board to see the list change. The device should also appear in the /dev folder; on my PC it shows up as /dev/ttyUSB0. The port needs to be made readable and writable once the Arduino is plugged in. This can be done using the command: sudo chmod a+rw /dev/ttyUSB0. Because /dev/ttyUSB0 is an absolute path, the command works from any directory.
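The port checks described above can also be done from Python. This is a small sketch using only the standard library; the function name port_status is hypothetical, and /dev/ttyUSB0 is simply the name the board got on my PC.

```python
import os


def port_status(path="/dev/ttyUSB0"):
    """Report whether the serial device exists and is readable and writable."""
    if not os.path.exists(path):
        return "missing"        # board not plugged in, or it got another name
    if os.access(path, os.R_OK | os.W_OK):
        return "ready"          # current user can open the port
    return "no permission"      # exists, but chmod (or group access) is still needed


print(port_status())
```

If this prints "no permission", running the chmod command from the text should change the result to "ready".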
Now open the Arduino IDE and select the board Arduino Nano from the Tools menu. Upload the following program to the Arduino. Once the program is loaded, it keeps sending comma-separated R, G, B values, followed by the color label, to the serial port at 9600 baud. Use this program to generate the learning data.
Once this learning data is generated, separate it into two files which will be used for learning. One file, features.csv, holds the input matrix: the comma-separated R, G and B values, one reading per line. The other file holds the corresponding outputs (the color labels). Sample files are available here.
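The split into a features file and an outputs file can be done by hand in a spreadsheet, or with a few lines of Python. The sketch below assumes every logged line has four comma-separated fields (R, G, B, label); the function name and any file names passed to it are placeholders.

```python
import csv


def split_features_labels(log_path, features_path, labels_path):
    """Split 'R,G,B,label' rows into a features file and a labels file.

    Rows that do not have exactly four fields are skipped.
    Returns the number of rows written.
    """
    with open(log_path, newline="") as src, \
            open(features_path, "w", newline="") as feat, \
            open(labels_path, "w", newline="") as lab:
        feat_writer = csv.writer(feat)
        lab_writer = csv.writer(lab)
        count = 0
        for row in csv.reader(src):
            if len(row) != 4:
                continue
            feat_writer.writerow(row[:3])  # R, G, B
            lab_writer.writerow(row[3:])   # color label
            count += 1
    return count
```

For example, split_features_labels("training_data.csv", "feature_1.csv", "label_1.csv") would produce the two files with matching row order, which matters because row N of the labels must describe row N of the features.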
We have used scikit-learn for machine learning. I referred to this site for installation. To download all dependencies for scikit-learn, run the following three commands in the terminal:
$ wget https://repo.continuum.io/archive/Anaconda3-4.2.0-Linux-x86_64.sh
#Run the script,
$ bash Anaconda3-4.2.0-Linux-x86_64.sh
#Finally, install Graphviz for visualizing the flow of the program
$ sudo apt-get install graphviz
I could not install graphviz due to some errors.
Once the learning data set is ready, I ran the following python code. This code opens the features file and the output file and generates the model. For a given input (RGB values) the algorithm predicts the output. This output is printed on the terminal.
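The original script is attached separately, so the following is only a minimal sketch of the same idea: load features and labels, fit a scikit-learn decision tree, and predict the label for one new reading. The sample readings are made-up placeholders, not real sensor data, and the helper names are hypothetical.

```python
import csv

from sklearn.tree import DecisionTreeClassifier


def load_rows(path):
    """Read a comma-separated file into a list of integer rows, skipping blanks."""
    with open(path, newline="") as f:
        return [[int(v) for v in row] for row in csv.reader(f) if row]


def train_color_tree(features, labels):
    """Fit a decision tree that maps [R, G, B] readings to a color label."""
    classifier = DecisionTreeClassifier(random_state=0)
    classifier.fit(features, labels)
    return classifier


# Made-up placeholder readings (NOT real sensor data), two per color label
features = [[30, 80, 75], [32, 85, 70],
            [90, 40, 85], [88, 42, 80],
            [85, 80, 35], [80, 78, 38]]
labels = [1, 1, 2, 2, 3, 3]
# With real data one could instead use:
#   features = load_rows("feature_1.csv")
#   labels = [row[0] for row in load_rows("label_1.csv")]

classifier = train_color_tree(features, labels)
# predict() expects a list of samples, hence the double brackets
prediction = classifier.predict([[31, 82, 72]])
print(prediction[0])
```

The printed number is the color label, i.e. the same number that was sent as the fourth field while recording the training data for that color card.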
Hope you enjoyed the project. You can try various algorithms to generate accurate models.
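One way to try other algorithms is to compare them with cross-validation. The sketch below pits a decision tree against a k-nearest-neighbours classifier from scikit-learn. The data is synthetic (random jitter around three made-up color readings) purely so the example runs on its own; real scores depend on your recorded data.

```python
import random

from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def compare_models(features, labels, folds=3):
    """Return the mean cross-validated accuracy of a few classifiers."""
    models = {
        "decision tree": DecisionTreeClassifier(random_state=0),
        "k-nearest neighbours": KNeighborsClassifier(n_neighbors=3),
    }
    return {
        name: cross_val_score(model, features, labels, cv=folds).mean()
        for name, model in models.items()
    }


# Synthetic placeholder data: 10 jittered readings around each made-up center
rng = random.Random(0)
centers = {1: (30, 80, 75), 2: (90, 40, 85), 3: (85, 80, 35)}
features, labels = [], []
for label, center in centers.items():
    for _ in range(10):
        features.append([c + rng.randint(-5, 5) for c in center])
        labels.append(label)

scores = compare_models(features, labels)
for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

Cross-validation holds back part of the data in turn, so the scores reflect how well each model handles readings it was not trained on.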
The exact procedure is as follows:
#define S0 4
#define S1 5
#define S2 6
#define S3 7
#define sensorOut 8
int frequency = 0;
void setup() {
  pinMode(S0, OUTPUT);
  pinMode(S1, OUTPUT);
  pinMode(S2, OUTPUT);
  pinMode(S3, OUTPUT);
  pinMode(sensorOut, INPUT);
  // Setting frequency-scaling to 20%
  digitalWrite(S0, HIGH);
  digitalWrite(S1, LOW);
  Serial.begin(9600);
}
void loop() {
  // Setting red filtered photodiodes to be read
  digitalWrite(S2, LOW);
  digitalWrite(S3, LOW);
  // Reading the output pulse width in microseconds (inversely related to frequency)
  frequency = pulseIn(sensorOut, LOW);
  // Optional: remapping the reading to the RGB model of 0 to 255
  //frequency = map(frequency, 25,72,255,0);
  // Printing the value on the serial monitor
  //Serial.print("R= ");//printing name
  Serial.print(frequency);//printing RED color reading
  Serial.print(",");
  //Serial.print(" ");
  delay(100);
  // Setting green filtered photodiodes to be read
  digitalWrite(S2, HIGH);
  digitalWrite(S3, HIGH);
  // Reading the output pulse width in microseconds
  frequency = pulseIn(sensorOut, LOW);
  // Optional: remapping the reading to the RGB model of 0 to 255
  //frequency = map(frequency, 30,90,255,0);
  // Printing the value on the serial monitor
  //Serial.print("G= ");//printing name
  Serial.print(frequency);//printing GREEN color reading
  Serial.print(",");
  //Serial.print(" ");
  delay(100);
  // Setting blue filtered photodiodes to be read
  digitalWrite(S2, LOW);
  digitalWrite(S3, HIGH);
  // Reading the output pulse width in microseconds
  frequency = pulseIn(sensorOut, LOW);
  // Optional: remapping the reading to the RGB model of 0 to 255
  //frequency = map(frequency, 25,70,255,0);
  // Printing the value on the serial monitor
  //Serial.print("B= ");//printing name
  Serial.print(frequency);//printing BLUE color reading
  Serial.print(",");
  // Color label: while training, change this number to match the color card shown (see the video).
  // While predicting, this value is ignored.
  Serial.println("2");
  delay(100);
}
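The sketch contains commented-out map() calls that would rescale each raw reading to the 0 to 255 range. If you prefer to log the raw values and rescale them on the PC, the same arithmetic can be written in Python. This is a sketch that mirrors the integer formula of Arduino's map(); the calibration limits 25 and 72 are the ones from the commented-out red channel line, and the clamping helper remap_channel is my own addition because map() itself does not limit its output.

```python
def arduino_map(x, in_min, in_max, out_min, out_max):
    """Integer linear remap using the same formula as Arduino's map()."""
    num = (x - in_min) * (out_max - out_min)
    den = in_max - in_min
    # C integer division truncates toward zero, unlike Python's // operator
    quotient = abs(num) // abs(den)
    if (num < 0) != (den < 0):
        quotient = -quotient
    return quotient + out_min


def remap_channel(x, low, high):
    """Map a raw reading to 0..255 (short pulse = bright) and clamp the result."""
    value = arduino_map(x, low, high, 255, 0)
    return max(0, min(255, value))


print(remap_channel(48, 25, 72))
```

Readings outside the calibration limits end up at 0 or 255 instead of wrapping to negative or oversized values.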
The Arduino sketch used to collect RGB data can be found here. Upload it to the Arduino to collect RGB data. Follow the comments for wiring it to the TCS3200 sensor.
The RGB data is logged from the serial port to create a file like the one here.
The file above is used as the training set. The same set is split into two files, 'features' and 'labels': the inputs (the Red, Green and Blue data) go in one file and the expected outputs in the other.
The features file can be found here. Name it 'feature_1.csv' and place it in your Linux/Ubuntu home directory.
The labels file can be found here. Name it 'label_1.csv' and place it in your Linux/Ubuntu home directory.
Once both the features and labels files are ready, show the sensor a different color that was not included in the training set and note down its RGB data. Now put this RGB data into the following line of the attached sampleML_5.py:
prediction = classifier.predict([130,210,170])
Then run the Python script. The script reads the 'feature_1.csv' and 'label_1.csv' files, generates the decision tree model and prints the predicted output, i.e. the output color. Note that newer scikit-learn versions reject a 1D input here and expect a list of samples, i.e. classifier.predict([[130,210,170]]).
The decision tree diagram can be generated from the .dot file that the script writes to your Linux/Ubuntu home directory.
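For reference, this is roughly how a .dot file can be produced from a trained scikit-learn tree. It is a sketch with made-up placeholder readings, and the file name color_tree.dot is hypothetical. It also prints a plain-text version of the tree with export_text (available in newer scikit-learn releases), which needs no Graphviz at all.

```python
from sklearn.tree import DecisionTreeClassifier, export_graphviz, export_text

# Made-up placeholder readings (NOT real sensor data)
features = [[30, 80, 75], [32, 85, 70],
            [90, 40, 85], [88, 42, 80],
            [85, 80, 35], [80, 78, 38]]
labels = [1, 1, 2, 2, 3, 3]
classifier = DecisionTreeClassifier(random_state=0).fit(features, labels)


def save_tree_dot(model, path):
    """Write the tree in Graphviz .dot format and return the .dot source."""
    dot_source = export_graphviz(model, out_file=None,
                                 feature_names=["R", "G", "B"])
    with open(path, "w") as f:
        f.write(dot_source)
    return dot_source


# Text rendering of the same tree, readable straight in the terminal
tree_text = export_text(classifier, feature_names=["R", "G", "B"])
print(tree_text)
```

After calling save_tree_dot(classifier, "color_tree.dot"), the picture can be rendered with Graphviz, if it is installed, using: dot -Tpng color_tree.dot -o color_tree.png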
The video showing the demo is as follows:
The decision tree model generated from the .dot file is as follows