Give your Arduino a brain

Predict the value of the leftmost sensor and show it as the brightness of an LED:

Four photocells act as sensors, so you can easily influence the values by either covering them with your finger or pointing a torch at them.

This article focuses on letting technology interact with the physical world, using a simple example. You can experience an "intelligent" machine with your own body and get immediate feedback. I will not go into what machine learning, artificial intelligence, deep learning etc. are all about (although I love to discuss them).

The idea is to collect data from sensors attached to an Arduino micro-controller in order to predict the future state of one of the sensors. Strictly speaking, this does not predict the future; it estimates what is likely to happen next, based on the patterns learned in the past.

Basically, a smart piece of software looks at the data the sensors record and tries to find recurring patterns over space and time.

So what do we need?
  • A machine running the Ubuntu 14.04 operating system (others might work, but you will have to figure that out yourself)
  • An Arduino UNO micro-controller, including a USB cable (others might work, but you will have to figure that out yourself)
  • About 10 bucks for some photocells, LEDs and resistors (details below)
  • Some time (depending on your skills and your doggedness to finish a project)

By the way, my written English is not the best, so please excuse any mistakes.

First of all, you need to install nupic on your computer; I highly recommend this video. Depending on your Unix/Linux skills, this will take about 1 to 5 hours. If you want to play around with the framework, just visit the wiki (this can take you weeks). Also consider installing a virtual machine if you do not have a physical Ubuntu machine available.

Second, prepare your physical device. Attach 4 photocells to the analog inputs A0 to A3 of your UNO and a light-emitting diode (LED) to digital output 9. Instructions on how to attach a photocell to your Arduino can be found here. Instructions on how to attach an LED can be found here.

Those were the standard bits and pieces; now we need to tackle the custom ones.
Basically, we need 4 steps to make our "smart" device work:
  • Recording, get some sample data
  • Swarming, get a configuration for your "smart" engine
  • Teaching, get a model of your recorded data
  • Getting online, get immediate feedback on what you are feeding into the "smart" engine
Recording sample data means that you record what your sensors are sensing, translate it into some representation (e.g. a number) and then make it persistent for later use (e.g. in a file). Ideally, you record sample data that you suspect contains some interesting patterns that could be "learned" by the smart engine.

Swarming is needed because your "smart" engine needs to be configured in order to get the best results for a given problem (more on that later). The "swarming" process helps you find this configuration based on sample data, which is why you need the sample data explained above.

Teaching is the process of feeding your configured engine with sample data. The outcome is a "model" that the engine builds up from the sample data.

Finally, getting online lets you get nearly immediate predictions based on the activity recorded by the sensors and the trained "brain" of the engine (in this case it's called a "hierarchical temporal memory").

All four steps can be called from a script called "" which, along with the other components, can be downloaded here.
These scripts run on the Ubuntu machine, where they accept sensor data from the Arduino over a serial interface and, depending on the active step, send data back to the Arduino.
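
The exchange over the serial line can be sketched in Python as follows. This is only an illustration, not the downloadable script: it assumes that the Arduino sends one text line per reading with the four values separated by commas, and that it expects a single brightness value back for the LED. The function names are made up for this example. The value ranges are the usual ones for an UNO: analog inputs read 0 to 1023, and the PWM output on pin 9 takes 0 to 255.

```python
def parse_reading(line, n_sensors=4):
    """Turn a line such as b'512,498,130,1001\\r\\n' into a list of ints.

    Returns None for incomplete or garbled lines, which can show up
    right after the serial port has been opened.
    """
    if isinstance(line, bytes):
        line = line.decode("ascii", "ignore")
    parts = line.strip().split(",")
    if len(parts) != n_sensors:
        return None
    try:
        return [int(p) for p in parts]
    except ValueError:
        return None


def to_pwm_byte(predicted, adc_max=1023):
    """Scale a predicted sensor value (0..adc_max) to an LED level (0..255)."""
    clamped = max(0.0, min(float(adc_max), float(predicted)))
    return int(round(clamped * 255.0 / adc_max))
```

With the pyserial package, the line would typically come from readline() on a serial.Serial object, and the LED level would go back through write().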

A script is also needed on the Arduino; it can be downloaded here.

Technical Notes:
  • In order to keep the amount of code as small as possible, I rely on the so-called OPF framework of nupic. This API is supposed to hide the details of how the whole engine works (other versions using another API might follow)
  • The scripts are optimized for having as little code as possible and showing essential calls only, for the sake of simplicity and being self-explanatory
  • I'm not a professional programmer
  • I have other versions of this experiment with much more sophisticated data processing

Congratulations if you made it this far. Here is some complementary information on the four steps:

The recording script basically collects values from the sensors and saves them to an out.csv file, which will be used for swarming and teaching. Try to repeat some patterns so that nupic has something to find.
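
A minimal sketch of such a recording step, not the downloadable script itself. It assumes that each reading arrives as a text line of four comma-separated numbers, and that out.csv starts with the three header rows (field names, field types, special flags) used by NuPIC's sample CSV files. The field names sensor0 to sensor3 and the function name are placeholders.

```python
import csv

FIELDS = ["sensor0", "sensor1", "sensor2", "sensor3"]  # hypothetical names


def record(lines, out_file, max_rows=None):
    """Write sensor readings from an iterable of text lines to out_file.

    Returns the number of data rows written.
    """
    writer = csv.writer(out_file, lineterminator="\n")
    writer.writerow(FIELDS)                  # row 1: field names
    writer.writerow(["int"] * len(FIELDS))   # row 2: field types (assumed)
    writer.writerow([""] * len(FIELDS))      # row 3: special flags (none)
    written = 0
    for line in lines:
        parts = line.strip().split(",")
        if len(parts) != len(FIELDS):
            continue                         # skip incomplete lines
        try:
            row = [int(p) for p in parts]
        except ValueError:
            continue                         # skip garbled lines
        writer.writerow(row)
        written += 1
        if max_rows is not None and written >= max_rows:
            break
    return written
```

With real hardware, lines could be a loop over the serial port and out_file the result of open("out.csv", "w").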

You can use Excel or LibreOffice Calc to visualize the values as shown in the image above. It shows an excerpt of a recording session, where each colour represents a different photocell. I covered the photocells with my finger in a repeating sequence: first the green one, then the yellow one, then the red one and finally the blue one. Then, after a short break, I started over again.
Repeat it at least a few dozen times or, even better, a few hundred times. You can copy sequences in the csv file itself rather than touching the cells for hours! Alternatively, you can download a sample out.csv here.
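
Copying sequences can also be scripted. The sketch below repeats the data rows of a recording a given number of times while keeping the header rows only once. The number of header rows is a parameter because it depends on how your file is laid out (three are assumed here), and the function name is made up for this example.

```python
def repeat_rows(lines, times, header_rows=3):
    """Return the header lines once, followed by the data lines `times` times."""
    lines = list(lines)
    header = lines[:header_rows]
    data = lines[header_rows:]
    return header + data * times


# Possible usage (file name as in the article):
#   with open("out.csv") as f:
#       lines = f.readlines()
#   with open("out.csv", "w") as f:
#       f.writelines(repeat_rows(lines, 50))
```

This works best if the recording starts and ends at the same point of the pattern, so that the joins between the copies do not introduce a jump.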
The swarming process picks up the search_def.json file and runs a so-called swarm on your out.csv. The search_def.json file contains some basic configuration about your data and what you want to do with it. The swarming process refines this configuration in order to better prepare your learning engine (you will see the additional files that are produced while swarming runs).
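
The exact contents of search_def.json come with the download. The sketch below only shows the general shape of such a swarm description, following the layout of NuPIC's own examples. The field names, the field type, the stream source, the inference type and the swarm size are assumptions; the value range 0 to 1023 is what the analog inputs of an UNO deliver, and predicting 4 steps ahead matches what is done in the teaching step.

```python
import json

# Hypothetical field names; they must match the header of out.csv.
SENSORS = ["sensor0", "sensor1", "sensor2", "sensor3"]

search_def = {
    "includedFields": [
        {"fieldName": name, "fieldType": "int", "minValue": 0, "maxValue": 1023}
        for name in SENSORS
    ],
    "streamDef": {
        "info": "photocells",
        "version": 1,
        "streams": [
            {"info": "out.csv", "source": "file://out.csv", "columns": ["*"]}
        ],
    },
    "inferenceType": "TemporalMultiStep",
    "inferenceArgs": {
        "predictionSteps": [4],          # predict 4 steps ahead
        "predictedField": "sensor0",     # the leftmost sensor (assumed name)
    },
    "swarmSize": "medium",
}

# JSON text that could be written to search_def.json
text = json.dumps(search_def, indent=2)
```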
The teaching script simply takes the refined configuration and the sample data (out.csv) and builds a "memory" of what it is looking at. This memory is saved in the folder /model_save for later use. In addition, the script continuously writes its learning progress to teach.csv.

This chart shows how good the prediction is after approx. 800 learning steps. The blue line shows the leftmost sensor and the brown line shows what nupic is predicting (4 steps ahead). In fact, teaching is a way of catching up with the past: because you saved your data, you don't have to wait for new data to come in. Instead, you feed in all your raw data in one go.
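
How the two lines in such a chart can be compared is sketched below. The function feeds the recorded rows to a model one by one and pairs each prediction with the value that actually showed up 4 steps later. It assumes an OPF-style model whose run() method returns a result with an inferences dictionary containing "multiStepBestPredictions". The field name sensor0 and both function names are placeholders, not taken from the downloadable scripts.

```python
def teach(model, rows, field="sensor0", steps=4):
    """Feed rows (dicts) to the model; return (predicted, actual) pairs.

    A prediction made at row i is about row i + steps, so it is paired
    with the actual value found there.
    """
    rows = list(rows)
    predictions = []
    for row in rows:
        result = model.run(row)
        best = result.inferences["multiStepBestPredictions"]
        predictions.append(best[steps])
    pairs = []
    for i in range(len(rows) - steps):
        pairs.append((predictions[i], rows[i + steps][field]))
    return pairs


def mean_abs_error(pairs):
    """Average absolute difference between predicted and actual values."""
    usable = [(p, a) for p, a in pairs if p is not None]
    if not usable:
        return None
    return sum(abs(p - a) for p, a in usable) / float(len(usable))
```

With NuPIC, model would be the object returned by ModelFactory.create(...) after enableInference(...) has been called with the predicted field; a falling error over time indicates that the engine is picking up the pattern.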
Online learning is basically the same as teaching, except that you do it online. This means you pick up the latest input more or less in real time and immediately try to predict the future based on the past. If you are at t=0, this might give you the impression that one can predict the future!

oye at swissonline dot ch