Base B

Various efforts to build an automatic base-B for F5B et al.
See the discussions around it.

Zipped FxxTrack.exe for Windows XP/Vista/Win7 is attached at the end of this page.
  • Minimum requirement is a machine that supports the SSE2 instruction set, so any Core 2-based Intel processor or later should work fine.
  • I recommend a decent webcam. Known working ones are the Logitech 9000 and the Microsoft LifeCam Cinema (chosen for their ability to do at least 640x480 at 25 frames per second or better in uncompressed modes).
  • The program uses little memory; 100MB or less. If you save the video, it will eat disk at an amazing rate, writing losslessly compressed video at around 6 - 8 megabytes per second. (I encourage people to save videos and tell me where I can download them from :)

2009-12-27 Update

Program is now looking like it's in fairly reasonable shape. Random brain dump of assorted changes:
  • Runs on Windows Vista without immediately crashing. Hint: remember to catch the exceptions you throw.
  • Revamped probability model. It's now MUCH better at dealing with noise frames. Out of the box, it will cope with 300 - 600 noise blobs per frame (and a single real blob) and happily build a useful trace from it.
  • The image is displayed in true color by default (it can be switched to false YUV, which uses less CPU).
  • Supports rotating baseline by 90 degrees (to handle cameras mounted on their side).
  • Image processing generally runs much faster. On my 3 year old desktop, stripped of the debugging, it handles better than 60 frames per second @ 640 x 480, including decompressing recorded video, and drawing it on the screen.
  • Custom compression format. It previously used HuffYUV, but that's a pain in the butt to get operating cleanly on Win32, so I hacked together a trivial huffman compressor that compresses simple frame difference. This is obviously much faster and does better compression than HuffYUV (which only does intra-frame compression). I looked around for a simple format to standardise on, but failed to find one, so custom it is. :(
  • Fit a 3rd order polynomial to current trace. By default, it uses the last 7 points of the trace and fits the polynomial to predict where the next point should show up.
  • Internally uses probability instead of log probability everywhere. This makes handling under/overflow conditions a mild pain, but it's much faster without all the log() calls everywhere.
  • Configurable detector. By default it feeds a 31x31 pixel patch around each blob into a neural network to guess whether the blob is a plane. There's a haar-like feature detector available as an alternative (the haar-like detector is much faster per-blob, but has higher per-frame setup overhead).
  • Switched from using the median of the differences between previous frames to the minimum of the differences between previous frames in an effort to reduce noise bleed-through.
After all this, it's now fairly reliable at tracking planes when they're 3 - 5 pixels in size in the image, and very reliable when they're above 5 pixels in size.

A couple of sample videos: the first shows it in debug mode, mostly single-stepping through the frames. Each little blue cross is a point of detected motion. Yellow ones are points that the neural network guesses are likely to be a plane. The 'difference' mode displays the current image with the previous image subtracted from it (to show only changes), without which it is basically impossible to see the plane being tracked.

Fxxtrack sample

A longer run, showing it building traces as model planes come into the field of view.

Fxxtrack sample #2

2009-10-30 Update

Version 1.04 release (attached at end of page).
1. Some of the crasher bugs fixed, but apparently not all.
2. Somewhat faster.
3. Has the Ding/Dong sounds back instead of the beeps (louder, longer sounds).

2009-10-04 Update

Version 1.00 release (see attachments at the end of this page).

This is working reasonably well; certainly useful for practice sessions. The biggest remaining problems are more organizational than software (i.e. how to arrange the camera so it can see the top of a 6-leg run while still picking up the last turn).

The algorithm has changed a fair bit. The stack is now (approximately):
1. Compute the pixel-by-pixel median of this frame and the previous 2 frames.
2. Subtract the median from this frame and take the absolute value.
3. Sample the absolute differences to find the 99th centile value for Y, U and V.
4. Build a mask image by subtracting the 99th centile values from the absolute difference image.
5. Build blobs on the mask image, joining blobs that are separated by less than 4 pixels (configurable).
6. For each blob, extract the 31 x 31 image section surrounding it and feed it to a neural network detector. The neural network has 2883 inputs, 10 hidden neurons, and 1 output.
6.1 Build a blob probability model by remembering where in the image blobs 'normally' appear.
7. Use bayesian particle filtering to build traces (trajectories) from blobs that the neural network has said are plausible planes.
8. If adding a blob to an existing trace would reduce the probability, clone the trace and consider both possibilities separately.
9. If two traces are extended with the same blob, then consider merging them.
10. If a trace has a log probability greater than -4 and it crosses the center line from left to right (or vice versa), then play a sound.
11. Done!

Step 7 has a fair bit of complexity around the physics model for plane motion, but it is running in fairly simplified form at the moment until I get enough data to build a more realistic model.

The execution time is now dominated by the neural network, with about half the total CPU time being spent there.

2009-09-04 Update

Added (see attachments).

Bad bits: It's huge (10MB). It's a debug build, because I'm lost in linker hell trying to build a release build. argh!

Good bits:
1. It has the new bayesian tracker and copes better with noisy frames.
2. Mostly doesn't crash.
3. The really good bit: it has the analysis code exposed. So if you run it with a HuffYUV .avi file as an argument (i.e. what it produces when saving), it will start replaying the file. Left-click to single-step frame-by-frame, right-click to fast forward (left-click stops the fast forward). No rewind yet; I just restart it.
4. It's ported to the Qt toolkit, so OSX support is a tiny bit more likely.
(Oh, and the File menu doesn't work; I haven't hooked it up yet).

2009-08-27 Update

Added (see attachments).
Beware! This was compiled off a branch that's a couple of days old and is utterly untested. I just wanted to get a couple of new features out:
1. Higher frame rate. Runs 800x600 @ 15 fps.
2. Slightly lower latency on sound.
3. Different sounds for right-to-left v left-to-right crossings.

2009-08-20 Update

See the attachments at the end of this page for version 2 and version 3.

This is basically working. With the emphasis on 'basically'.

Key change: It plays a sound when a moving object crosses the middle of the image. :) The sound is currently a doorbell for no good reason. Anyone want to dig up a couple of suitable .wav files? (one for crossing left-to-right and one for crossing right-to-left?)

That was all the trivial bit. For some mad reason I decided that it needed to be able to save lossless compressed video, which was a MUCH harder proposition. If one clicks on the window, 'Capturing: true' will appear in the status bar and it will start writing HuffYUV-compressed video out to c:\fxxvid-*.avi at about 3 - 5 megabytes per second, i.e. you'd better have a LOT of disk space if you want lots of video time. The next step is to only write the frames with something in them, which should make it more reasonable. Clicking again will stop it.

The .avi files are viewable as long as you have a HuffYUV codec installed (most ffmpeg codec packs will have this).

Under the hood, it's now operating as a DirectShow filter graph, with the window being painted by the window manager, instead of directly from the filter graph.

CPU usage is fairly reasonable: about 40%-ish on my laptop, climbing to about 55%-ish when compressing to disk as well.

2009-08-16 Update

There is a very small chance that the attached (attached at the end of the page) may actually do something useful.
It's in an extremely primitive state as befits my lack of windows programming skills.

In particular:
  • You need a webcam that does YUY2. Most UVC webcams will do this, unless they load a custom driver that buggers things up. In particular, Logitech buggers up the Logitech Pro 9000 camera; you need to go into the Device Manager and change the driver to be the Microsoft UVC driver.
  • Can't usefully change the window size.
  • No sound yet. It tracks objects, but does nothing with them.
  • It's specialized to small flying things in the sky. If you wave at the camera, it will get horribly confused and spit out all sorts of junk tracks.
  • A semi-reasonable way to test it is point it at a still scene, and roll a ball through the area. Or point it at the sky and wait for birds/airliners to pass overhead.
  • You must have a CPU with SSE2 instruction set support (this is basically Core 2 or better).
My test camera is a Logitech Pro 9000 with the Microsoft UVC driver attached to it. It works well. I make no promises for anything else!

It's entirely possible the EXE relies on some .DLL that's only on my computer and breaks immediately. If so, feel free to keep all the pieces :)

2009-07-29 Update

I had been out on Sunday and got some more video, this time with the brightness turned way down. It was interesting, but more interesting was the huge set of videos done by AlanF. Who rocks!

The videos are again compressed, so the noise floor is extreme. Given how small and faint the planes are in the videos, it's a HUGE effort to get rid of most of the compression noise without getting rid of the planes. We're working very, very close to the noise floor here. In a finished bit of software this won't be nearly so much trouble, as we can get uncompressed frames from the camera, which have much less noise.

Taking some of his videos at random and chewing on them a bit produced frames like this:

The stippled purple line is the trace the plane has made. The calculated trace is drawn one frame behind the actual video, so the plane is always one step off the end of the trace. The white circle is the 'search area', being where the program was expecting to see the plane. Red crosses are 'moving object candidates', i.e. potential planes.

This is also a flying object; just in this case, I think it's an insect. Either that, or there were some _serious_ drugs floating around that F5B competition! Note that the white circle is huge in this case. That's because it lost track of the bug for a couple of frames, so it's searching a wider area, reflecting the increased uncertainty of position.

This is the good bit. It's tracking two objects, the one above is a plane, and the one below is an insect (I think).

The short summary of how I'm getting to this point:
  1. Take the absolute diff of this frame against the previous one.
  2. Add 5% of that diff to the noise estimator.
  3. Subtract 6 times (!!) the noise estimator from the diff.
  4. Run a 5x5 median filter over the diff
  5. Threshold the diff at 4%
  6. Compare the thresholded diff with the previous 4 frames to kill areas that are just noise.
  7. Build the list of blobs in the diff, filtering blobs that are too big, too small, too wide or too thin.
  8. For each new blob, compare with the previous 2 frames to see if they roughly match up into an equidistant line. If so, convert them into a trace.
  9. For each trace, calculate the velocity, acceleration, and jerk.
  10. For each trace, use the model parameters to build a search area.
  11. Search for a blob in the new frame that falls into the search area.
  12. If there are multiple blobs, take the one closest to the predicted position.
  13. If there are multiple plausible blobs, split the trace and track all of them in parallel.
  14. If a trace hasn't had a matching blob in 5 frames, discard it.
It's pretty much that simple. Less than 700 lines of (very slow) C++ to do the whole lot. See the attachments at the end for the x-13.avi, x-14.avi and x-15.avi videos.

On the whole I'm rather encouraged. I had some nasty surprises with the acceleration F5B planes do (Seriously! wtf!? I had to dial the max acceleration values _way_ up to let it keep tracking the planes) but the physics model is straightforwardly tractable, and the speed is stunning given that it's running O(n^3) algorithms at the moment. So I think that getting a practise-suitable system is very easily doable. Competition-suitable may be a little harder.

(I'm assuming that for practise, having a false positive/negative rate of a few percent is fine).

Note that x-13.avi has the very special treat of having a truck drive through the field of view, sending the poor program absolutely berserk! Note, though, that it still keeps track of the F5B plane in the background :)


Went out on Sunday and took some video of F5B with a random webcam.
Some things worked well, some didn't. This is a sample image after running through a hacked-up image filter:

All the green are pixels that are being ignored by the plane detection (really, motion detection). The non-green are pixels that are 'plane candidates'. This image has the plane positions from the previous frames superimposed on it.

So this bit works well. Here's a few frames later:

Wait a minute! Why are there only 3 plane positions on this frame? Because the video was being compressed to MPEG as it recorded, and the compression algorithm helpfully decided that no-one could possibly see the faint image of the white plane on the white sky, and discarded it. So the plane positions following the last one aren't in the data. Doh!

So I need to take some more video with less compression :)

On the other hand, it's far far better than me at finding the plane in the sky. A few frames later gives:

That's the top of the loop from before. Even when I know it's there, and going frame by frame, I can't find that in the video.

The software is basically doing:

For each frame in turn:
 reference image = referenceimage * (15/16) + newimage * (1/16)
 deltaimage = newimage - referenceimage
 deltaimage = deltaimage * deltaimage
 3x3 median filter over delta image.
 noisefilter = noisefilter * (15/16) + deltaimage * (1/16)

 if (deltaimage < noisefilter * 2)
    newimage = green

 show newimage

Roughly. There are some extra bits in there to do brightness compensation, and the noisefilter decays quicker than it grows, etc.
