Various efforts to build an automatic base-B for F5B et al.
The green pixels are the ones being ignored by the plane detection (really, motion detection). The non-green pixels are 'plane candidates'. This image has the plane positions from the previous frames superimposed on it.
See http://www.rcgroups.com/forums/showthread.php?t=1070146 for the discussions around it.
Zipped FxxTrack.exe for WindowsXP/Vista/Win7 is attached at the end of this page.
A couple of sample videos: The first shows it in debug mode, mostly single-stepping through the frames. Each little blue cross-type thing is a point of detected motion. Yellow ones are points that the neural network guesses are likely to be a plane. The 'difference' mode displays the current image with the previous image subtracted from it (to only show changes), without which it is basically impossible to see the plane being tracked.
A longer run, showing it building traces as model planes come into the field of view.
1. Some of the crasher bugs fixed, but apparently not all.
2. Somewhat faster.
3. Has the Ding/Dong sounds back instead of the beeps (louder, longer sounds).
This is working reasonably well; certainly useful for practice sessions. The biggest remaining problems are more organizational than software (i.e. how to arrange the camera so it can see the top of a 6-leg run while still picking up the last turn).
The algorithm has changed a fair bit. The stack is now (approximately):
1. Compute the pixel-by-pixel median of this frame and the previous 2 frames.
2. Subtract the median from this frame and take the absolute value.
3. Sample the absolute differences to find the 99th centile value for Y, U and V.
4. Build a mask image by subtracting the 99th centile values from the absolute difference image.
5. Build blobs on the mask image, joining blobs that are separated by less than 4 pixels (configurable).
6. For each blob, extract the 31 x 31 image section surrounding it and feed it to a neural network detector. The neural network has 2883 inputs, 10 hidden neurons, and 1 output.
6.1 Build a blob probability model by remembering where in the image blobs 'normally' appear.
7. Use Bayesian particle filtering to build traces (trajectories) from blobs that the neural network has said are plausible planes.
8. If adding a blob to an existing trace would reduce the probability, clone the trace and consider both possibilities separately.
9. If the traces are extended with the same blob, then consider merging them.
10. If a trace has a log probability greater than -4 and it crosses the center line from left to right (or vice versa), then play a sound.
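Steps 1 to 4 above can be sketched roughly as follows. This is a minimal NumPy illustration, not the FxxTrack code: the function name `motion_mask`, the H x W x 3 frame layout, using every pixel rather than a sample for the centile, and combining the three channels with a maximum are all assumptions made for the example.

```python
import numpy as np

def motion_mask(prev2, prev1, current, centile=99.0):
    """Sketch of steps 1-4: temporal median, absolute difference, threshold.

    Frames are H x W x 3 arrays holding the Y, U and V planes (assumed layout).
    Returns a boolean H x W mask that is True for candidate pixels.
    """
    frames = np.stack([prev2, prev1, current]).astype(np.float32)
    median = np.median(frames, axis=0)                        # step 1
    abs_diff = np.abs(current.astype(np.float32) - median)    # step 2
    # Step 3: 99th centile per channel. The text samples the differences;
    # this sketch simply uses every pixel.
    thresholds = np.percentile(abs_diff.reshape(-1, 3), centile, axis=0)
    # Step 4: subtract the centile values; whatever stays positive is kept.
    excess = np.clip(abs_diff - thresholds, 0.0, None)
    # Assumption: a pixel is a candidate if any channel exceeds its threshold.
    return excess.max(axis=2) > 0
```

The resulting mask is what step 5 would then group into blobs.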
Step 7. has a fair bit of complexity around the physics model for plane motion, but this is running fairly simplified at the moment until I get enough data to build a more realistic model.
The execution time is now dominated by the neural network, with about half the total CPU time being spent there.
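A quick back-of-the-envelope check of the network size from step 6 shows where the time goes. The assumption here is that the 2883 inputs are the 31 x 31 patch times three channels (Y, U, V); the function name is made up for the illustration and biases are ignored.

```python
PATCH = 31      # patch side in pixels (from step 6)
CHANNELS = 3    # assumed: Y, U and V, since 31 * 31 * 3 = 2883
HIDDEN = 10     # hidden neurons (from step 6)
OUTPUTS = 1     # output neurons (from step 6)

inputs = PATCH * PATCH * CHANNELS

def macs_per_blob(n_in=inputs, n_hidden=HIDDEN, n_out=OUTPUTS):
    """Multiply-accumulates for one forward pass of a fully connected net."""
    return n_in * n_hidden + n_hidden * n_out

print(inputs)           # 2883
print(macs_per_blob())  # 28840
```

So every candidate blob costs roughly 29,000 multiply-accumulates, and a noisy frame with many blobs multiplies that accordingly.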
Bad bits: It's huge (10MB). It's a debug build, because I'm lost in linker hell trying to build a release build. argh!
1. It has the new Bayesian tracker and copes better with noisy frames.
2. Mostly doesn't crash.
3. The really good bit. It has the analysis code exposed. So if you run it with a HuffYUV .avi file as an argument (i.e. what it produces when saving), it will start replaying the file. Left-click to single-step frame-by-frame, right-click to fast forward (left-click stops the fast forward). No rewind yet. I just restart it.
4. It's ported to the Qt toolkit, so OSX support is a tiny bit more likely.
(Oh, and the File menu doesn't work; I haven't hooked it up yet).
Added fxxtrack-0.5.zip (see attachments).
Beware! This was compiled off a branch that's a couple of days old and is utterly untested. I just wanted to get a couple of new features out:
1. Higher frame rate. Runs 800x600 @ 15 fps.
2. Slightly lower latency on sound.
3. Different sounds for right-to-left v left-to-right crossings.
This is basically working. With the emphasis on 'basically'.
Key change: It plays a sound when a moving object crosses the middle of the image. :) The sound is currently a doorbell for no good reason. Anyone want to dig up a couple of suitable .wav files? (one for crossing left-to-right and one for crossing right-to-left?)
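The crossing test itself is tiny. A minimal sketch, assuming the tracker hands over an object's x position in the previous and current frame (the function and argument names are hypothetical, not taken from FxxTrack):

```python
def crossing_direction(prev_x, curr_x, centre_x):
    """Return the direction in which an object crossed the centre line.

    Gives 'left-to-right', 'right-to-left', or None if it did not cross
    between the two frames.
    """
    if prev_x < centre_x <= curr_x:
        return "left-to-right"
    if curr_x < centre_x <= prev_x:
        return "right-to-left"
    return None

# For an 800-pixel-wide image the centre line is at x = 400.
print(crossing_direction(390, 410, 400))  # left-to-right
print(crossing_direction(410, 390, 400))  # right-to-left
print(crossing_direction(100, 120, 400))  # None
```

Returning the direction rather than a plain yes/no is what would let a different .wav file be played for each direction.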
Which was all the trivial bit. For some mad reason I decided that it needed to be able to save lossless compressed video, which was a MUCH harder proposition. If one clicks on the window, 'Capturing: true' will appear in the status and it will start writing HuffYUV compressed video out to c:\fxxvid-*.avi at about 3 - 5 megabytes per second. I.e. You'd better have a LOT of disk space if you want lots of video time. Next step is to only write the frames with something in them which should make it more reasonable. Clicking again will stop it.
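To put the 3 - 5 megabytes per second in perspective, here is the arithmetic as a small sketch (the helper name is made up, and 1 GB is taken as 1000 MB):

```python
def capture_size_gb(minutes, mb_per_second):
    """Approximate size on disk of a capture, in gigabytes (1 GB = 1000 MB)."""
    return minutes * 60 * mb_per_second / 1000.0

for rate in (3, 5):
    print(rate, "MB/s for an hour:", capture_size_gb(60, rate), "GB")
```

An hour of continuous capture comes out at roughly 11 to 18 GB.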
The .avi files are viewable as long as you have a HuffYUV codec installed (most ffmpeg codec packs will have this).
Under the hood, it's now operating as a DirectShow filter graph, with the window being painted by the window manager, instead of directly from the filter graph.
CPU usage is fairly reasonable: about 40% ish on my laptop, climbing to about 55% ish when compressing to disk as well.
There is a very small chance that FxxTrack.zip (attached at the end of the page) may actually do something useful.
It's in an extremely primitive state, as befits my lack of Windows programming skills.
It's entirely possible the EXE relies on some .DLL that's only on my computer and breaks immediately. If so, feel free to keep all the pieces :)
The videos are again compressed, so the noise floor is extreme. Given how small and faint planes are in the videos, it's a HUGE effort to get rid of most of the compression noise without getting rid of the planes. We're working very, very close to the noise floor here. In a finished bit of software this won't be nearly so much trouble, as we can get uncompressed frames from the camera with much less noise.
Taking some of his videos at random and chewing on them a bit produced frames like this:
The stippled purple line is the trace the plane has made. The calculated trace is drawn one frame behind the actual video, so the plane is always one step off the end of the trace. The white circle is the 'search area', being where the program was expecting to see the plane. The red crosses are 'moving object candidates', i.e. potential planes.
This is also a flying object. Just in this case, I think it's an insect. Either that, or there were some _serious_ drugs floating around that F5B competition! Note that the white circle is huge in this case. That's because it's lost track of the bug for a couple of frames, so it's searching a wider area reflecting the increased uncertainty of position.
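One simple way to get a search circle that grows like this is to bound how far an object could have strayed from its predicted position given a maximum acceleration. The sketch below is only an illustration of that idea under a constant-acceleration bound; the formula, the function name and the numbers are assumptions, not the physics model the program actually uses.

```python
def search_radius(base_radius, max_accel, frames_missed, dt):
    """Radius (pixels) to search around the predicted position.

    base_radius   -- radius when the object was seen in the previous frame
    max_accel     -- assumed upper bound on acceleration, pixels / s^2
    frames_missed -- consecutive frames in which the object was not found
    dt            -- time between frames, seconds
    """
    t = (frames_missed + 1) * dt          # time since the last sighting
    return base_radius + 0.5 * max_accel * t * t

# The radius grows quadratically while the object stays lost.
for missed in range(4):
    print(missed, search_radius(5.0, 1000.0, missed, 0.1))
```

Raising `max_accel` widens the circle on every frame, which is the trade-off behind dialling the maximum acceleration up: the tracker keeps hold of hard-turning planes but has more candidates to consider.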
This is the good bit. It's tracking two objects, the one above is a plane, and the one below is an insect (I think).
The short summary of how I'm getting to this point:
On the whole I'm rather encouraged. I had some nasty surprises with the acceleration F5B planes do (Seriously! wtf!? I had to dial the max acceleration values _way_ up to let it keep tracking the planes) but the physics model is straight tractable, and the speed is stunning given that it's running O(n^3) algorithms at the moment. So I think that getting a practise-suitable system is very easily doable. Competition-suitable may be a little harder.
(I'm assuming that for practise, having a false positive/negative rate of a few percent is fine).
Note that x-13.avi has the very special treat of having a truck drive through the field of view, sending the poor program absolutely berserk! Note though, that it keeps track of the F5B plane in the background still :)
Some things worked well, some didn't. This is a sample image after running through a hacked-up image filter:
So this bit works well. Here's a few frames later:
Wait a minute! Why are there only 3 plane positions on this frame? Because the video was being compressed to MPEG as it recorded. And the compression algorithm helpfully decided that no-one can possibly see the faint image of the white plane on the white sky, and discarded it. So the plane positions following the last one aren't in the data. Doh!
So I need to take some more video with less compression :)
On the other hand, it's far far better than me at finding the plane in the sky. A few frames later gives:
That's the top of the previous loop. Even when I know it's there, and going frame by frame, I can't find that in the video.
The software is basically doing:
For each frame in turn:
    referenceimage = referenceimage * (15/16) + newimage * (1/16)
    deltaimage = newimage - referenceimage
    deltaimage = deltaimage * deltaimage
    deltaimage = 3x3 median filter over deltaimage
    noisefilter = noisefilter * (15/16) + deltaimage * (1/16)
    if (deltaimage < noisefilter * 2)
        newimage = green
Roughly. There's some extra bits in there to do brightness compensation and the noisefilter decays quicker than it grows etc.
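For anyone who wants to play with the idea, the loop above translates fairly directly into NumPy. This is a sketch only: it assumes single-channel (greyscale) frames, leaves out the brightness compensation and the asymmetric decay mentioned above, and the class and function names are invented for the example.

```python
import numpy as np

def median3x3(img):
    """3x3 median filter built from nine shifted copies (edges replicated)."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    shifts = [padded[dy:dy + h, dx:dx + w]
              for dy in range(3) for dx in range(3)]
    return np.median(np.stack(shifts), axis=0)

class MotionFilter:
    """Running-average background plus a per-pixel noise estimate."""

    def __init__(self, first_frame, rate=1.0 / 16.0, factor=2.0):
        self.reference = first_frame.astype(np.float32)
        self.noise = np.zeros_like(self.reference)
        self.rate = rate        # the 1/16 blend weight from the pseudocode
        self.factor = factor    # the "* 2" from the pseudocode

    def update(self, frame):
        """Return a boolean mask; True marks pixels to ignore (paint green)."""
        frame = frame.astype(np.float32)
        # ref * (15/16) + new * (1/16), written as an incremental update.
        self.reference += (frame - self.reference) * self.rate
        delta = median3x3((frame - self.reference) ** 2)
        self.noise += (delta - self.noise) * self.rate
        return delta < self.noise * self.factor
```

Note that the 3x3 median throws away single-pixel differences, so in this sketch an object needs to cover a few pixels before it survives as a candidate.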