Imagine a room full of perfectly matte white objects. If you put a projector and a camera in the room, and project a pattern such that every column (or row) has a unique color, then you can create a correspondence between what the projector "sees", and what the camera sees. This correspondence allows you to triangulate the position of every projected pixel and determine its depth. If you project one frame for every frame the camera captures, you can extrapolate 3D information at your camera's framerate.
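As a rough illustration of the triangulation step, here is a minimal sketch for an idealized setup. It assumes the camera and projector are rectified (parallel optical axes, projector displaced sideways from the camera), share one focal length, and measure columns relative to their principal points. The names focal_px and baseline are made up for this example; a real setup needs a full calibration instead.

```python
def depth_from_correspondence(x_cam, x_proj, focal_px, baseline):
    """Depth of one point along the optical axis.

    Assumptions (not from the text above): rectified pinhole camera and
    projector with the same focal length in pixels, projector displaced by
    `baseline` along +x, columns measured from each principal point.
    """
    disparity = x_cam - x_proj
    if disparity == 0:
        return float("inf")  # no parallax: the point is infinitely far away
    return focal_px * baseline / disparity


# A point seen at camera column 120 that was lit by projector column 100.
print(depth_from_correspondence(120, 100, focal_px=800, baseline=0.2))
```

The depth comes out in the same units as the baseline.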

Unfortunately, rooms are not generally filled with matte white objects. What's more, camera-projector synchronization can be difficult outside of hardware, and most cameras distort the scene to some degree. But there are lots of techniques for projecting more information onto the scene in order to overcome these limitations.

Introduction: Binary Codes

Consider the problem of determining which column (the x coordinate) the projector sees a given point in. If we project a pattern with black on the left and white on the right, we can rule out half of the x coordinates by checking whether the point appears black or white. If we do this again within the region where we expect the point to be, we can narrow the position down further. This process mirrors binary search, so it takes log2(width) subdivisions to find the x position exactly. For example, a 1024x768 projector takes log2(1024) = 10 subdivisions. If you do this for an entire scene and look at each pixel over the 10 frames, it will have a 10-bit binary code such as 1001101001 (binary) = 617 (decimal).
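The subdivision scheme can be sketched in a few lines. This is only an illustration: the function names are invented here, and the observed bits are assumed to be already thresholded to 0 (black) or 1 (white), which is the hard part in practice.

```python
def binary_patterns(width):
    """One list per projected frame, coarsest subdivision first.

    Each list holds 0 (black) or 1 (white) for every projector column.
    """
    bits = max(1, (width - 1).bit_length())
    return [[(x >> (bits - 1 - frame)) & 1 for x in range(width)]
            for frame in range(bits)]


def decode_binary(observed_bits):
    """Combine the bits seen at one camera pixel into a column index."""
    column = 0
    for bit in observed_bits:
        column = (column << 1) | bit
    return column


patterns = binary_patterns(1024)
print(len(patterns))                           # number of frames needed
observed = [frame[617] for frame in patterns]  # what a pixel lit by column 617 sees
print(observed, decode_binary(observed))
```

The first frame is black on the left half and white on the right half, and each later frame halves the stripe width.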

Common Techniques

Gray Codes

Another enumeration of binary numbers, called gray codes, is better suited to structured light applications. The first step of the subdivision similarly divides left and right into black and white, but the next step is different. For an 8-pixel wide projector, the three patterns would be: 00001111, 00111100, 01100110. The first three columns are thus 000, 001, and 011. Gray codes have the advantage that adjacent columns differ by only one bit, which means there will be fewer errors in boundary cases. Wikipedia has a simple recursive algorithm for translating gray ordered binary to regular binary.
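For reference, here is a small sketch of the conversion in both directions. It uses the common shift-and-xor formulation rather than the recursive algorithm mentioned above, and the function names are invented for this example.

```python
def binary_to_gray(n):
    """Gray code of the plain binary number n."""
    return n ^ (n >> 1)


def gray_to_binary(g):
    """Plain binary number whose gray code is g (iterative xor of all shifts)."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def gray_patterns(width):
    """One string per projected frame, coarsest first, one character per column."""
    bits = max(1, (width - 1).bit_length())
    codes = [binary_to_gray(x) for x in range(width)]
    return ["".join(str((code >> (bits - 1 - frame)) & 1) for code in codes)
            for frame in range(bits)]


print(gray_patterns(8))
print([gray_to_binary(binary_to_gray(x)) for x in range(8)])
```

For an 8-column projector this reproduces the three patterns listed above, and decoding each column's code returns the column index.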

Three Phase-Shifting

The disadvantage of gray code scanning is how much time it takes (10 frames). One method that overcomes this is called three phase-shift scanning. It relies on three sine wave patterns that are 120 degrees out of phase with each other to determine the column. Theoretically, you could use a single sine wave, but you would have poor resolution at the peaks and valleys of the wave. Given three intensity values at a given point, you can compute the phase within one period by looking at the order of the three values (described here on page 4, implemented here as PhaseUnwrap.pde). Once you have this wrapped phase, you have to propagate it across the 2π discontinuities (implemented here as PropagatePhases.pde). This technique is further described in "Fast three-step phase-shifting algorithm" by Zhang, et al. One disadvantage of phase unwrapping techniques is that large depth discontinuities can cause inaccuracies when the sine wave frequency is high, because a jump of more than one period cannot be told apart from a smaller one.
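To make the two steps concrete, here is a minimal sketch. Note that it uses the textbook arctangent formula rather than the faster value-ordering method from the paper, and it is not the code in the .pde files. It assumes the three patterns are shifted by -120, 0 and +120 degrees, and it only unwraps along a single row.

```python
import math


def wrapped_phase(i1, i2, i3):
    """Phase in (-pi, pi] from three intensities.

    Assumes i_k = a + b * cos(phase + shift_k) with shifts of
    -120, 0 and +120 degrees (an assumption of this sketch).
    """
    return math.atan2(math.sqrt(3.0) * (i1 - i3), 2.0 * i2 - i1 - i3)


def unwrap_row(phases):
    """Remove the 2*pi jumps along one row of wrapped phases."""
    if not phases:
        return []
    unwrapped = [phases[0]]
    offset = 0.0
    for previous, current in zip(phases, phases[1:]):
        delta = current - previous
        if delta > math.pi:        # wrapped downward
            offset -= 2.0 * math.pi
        elif delta < -math.pi:     # wrapped upward
            offset += 2.0 * math.pi
        unwrapped.append(current + offset)
    return unwrapped


# Simulate a smooth surface: the true phase rises steadily across the row.
truth = [0.3 * k for k in range(60)]
shift = 2.0 * math.pi / 3.0
row = [wrapped_phase(128 + 100 * math.cos(p - shift),
                     128 + 100 * math.cos(p),
                     128 + 100 * math.cos(p + shift)) for p in truth]
print(max(abs(u - t) for u, t in zip(unwrap_row(row), truth)))
```

The unwrapping step only works while neighboring pixels differ by less than half a period, which is where depth discontinuities cause trouble.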

Two+One Phase-Shifting

This technique is an improvement on three phase-shifting. The first two frames consist of sine waves that are 90 degrees out of phase, followed by a third flat image. The flat image can be used to provide local illumination information that smooths out the phase unwrapping step. This technique is further described in "High-speed three-dimensional shape measurement system using a modified two-plus-one phase-shifting algorithm" by Zhang, et al.
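A minimal sketch of the phase computation, assuming the first frame follows a sine, the second a cosine, and the flat frame sits at the average intensity of the other two. The function name and this exact intensity model are assumptions for illustration.

```python
import math


def two_plus_one_phase(i_sin, i_cos, i_flat):
    """Wrapped phase in (-pi, pi] from two fringe frames and one flat frame.

    Assumed model: i_sin = a + b*sin(phase), i_cos = a + b*cos(phase),
    i_flat = a, where a is the local average intensity.
    """
    return math.atan2(i_sin - i_flat, i_cos - i_flat)


phase = 2.0
a, b = 120.0, 80.0
print(two_plus_one_phase(a + b * math.sin(phase), a + b * math.cos(phase), a))
```

Subtracting the flat frame removes the local brightness offset, so only the fringe contribution enters the arctangent.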

Experimental Techniques

Projector Sliding

Sliding a fixed sine wave pattern across a scene gives each pixel a fundamental frequency that is proportional to its depth. This has the advantage of requiring only one image and thus being constrained by camera specifications rather than projector specifications. The disadvantage is that it is difficult to implement in real time.

Two-Dimensional Gray Codes

Gray codes don't necessarily have to be binary/black and white. Increasing the base of the gray codes corresponds to increasing the number of colors being projected. This has the advantage of reducing the number of frames required (if captured in both directions, 10 instead of 20 for 1024x768). It's a good idea to correct for tinting in the scene with an extra flat frame.
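One way to build such a code is the modular construction sketched below, where each digit is the difference between neighboring digits of the plain base-N number. The choice of base 4 (four colors) is an assumption made here because it reproduces the 10-versus-20 frame count; the function names are invented for this example.

```python
def digits_needed(size, base):
    """Number of base-`base` digits (frames) needed to label `size` positions."""
    digits = 1
    while base ** digits < size:
        digits += 1
    return digits


def to_nary_gray(value, base, digits):
    """Digits of a modular base-`base` gray code, most significant first."""
    plain = []
    for _ in range(digits):
        plain.append(value % base)
        value //= base
    plain.reverse()  # most significant digit first
    gray = [plain[0]]
    for higher, lower in zip(plain, plain[1:]):
        gray.append((lower - higher) % base)
    return gray


# Frames for columns plus rows of a 1024x768 projector.
print(digits_needed(1024, 2) + digits_needed(768, 2))  # binary
print(digits_needed(1024, 4) + digits_needed(768, 4))  # four colors
print([to_nary_gray(x, 3, 2) for x in range(9)])
```

As with binary gray codes, neighboring positions differ in exactly one digit, so a pixel on a stripe boundary is wrong in at most one frame.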

Gradient Gray Codes and Frequency Modulation

Instead of projecting binary gray codes, consider projecting a gradient from black to white wherever you would generally have a subdivision. This could allow for a higher accuracy within the same number of frames. You need to correct for scene illumination by taking an extra flat frame. This basic idea can be extended to sine waves that are frequency modulated from low frequency to a higher frequency. This method acts as a hybrid between the phase-shift algorithms and gray codes, overcoming the depth discontinuities of phase-shift while preserving its resolution.

Adaptive Algorithms

A single frequency for phase-shifting will be optimized for a certain amount of movement. Consider varying the frequency with respect to how much movement is observed: larger movements demand lower frequencies, and vice versa. Assuming that there are not large changes from moment to moment, gray codes could be adapted the same way. Only the last few bits of the gray codes would have to be updated each frame, maintaining a fast framerate, while the low frequency gray codes, which overcome depth discontinuities, would be updated less often.

High Speed Scanning

Structured light scanning can be done at up to 10,000 fps using the dithering effect produced by DLP projectors.

Related Issues

Lens Distortion

Camera lens distortion gives the image an overall pincushion effect. OpenCV has some tools for undistorting camera images that rely on tracking a grid of a known size through space. Also consider projecting gray codes onto a flat surface from different angles.
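To show what undistorting involves, here is a deliberately simplified sketch with a single radial coefficient. It is not OpenCV's API, and OpenCV's own model uses more coefficients. Coordinates are assumed to be normalized so that the image center is (0, 0), and k1 is a made-up value rather than a calibrated one.

```python
def distort(x, y, k1):
    """Apply one-coefficient radial distortion to a normalized point."""
    scale = 1.0 + k1 * (x * x + y * y)
    return x * scale, y * scale


def undistort(xd, yd, k1, iterations=20):
    """Invert distort() by fixed-point iteration (fine for mild distortion)."""
    x, y = xd, yd
    for _ in range(iterations):
        scale = 1.0 + k1 * (x * x + y * y)
        x, y = xd / scale, yd / scale
    return x, y


k1 = -0.1                      # hypothetical coefficient
seen = distort(0.5, 0.3, k1)   # where the lens puts the point
print(seen, undistort(seen[0], seen[1], k1))
```

In a real pipeline the coefficient would come from the grid-based calibration described above.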

Brightness and Gamma Distortion

When a projector displays 0 (black), 255 (white) and 128 (gray), a camera might see 15, 240, and 130. Both the end boundaries (brightness, black and white) and the curve (gamma) within those boundaries need to be corrected for. Consider determining the projector-camera characteristics as a complete system rather than trying to determine them independently. A calibration step that accounts for lens distortion might project a grayscale pattern afterward to do brightness and gamma correction.
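Using the example numbers above, the correction might be sketched as follows. The sketch assumes the whole projector-camera system follows a single power-law curve between the observed black and white levels, which is a simplification; a real system may need a measured lookup table. The function names are invented for this example.

```python
import math


def fit_response(black_seen, white_seen, gray_seen, gray_sent=128, max_level=255):
    """Fit seen = black + (white - black) * (sent / max_level) ** gamma.

    Returns (black, white, gamma). The power-law model is an assumption.
    """
    sent = gray_sent / max_level
    seen = (gray_seen - black_seen) / (white_seen - black_seen)
    gamma = math.log(seen) / math.log(sent)
    return black_seen, white_seen, gamma


def linearize(value, black, white, gamma, max_level=255):
    """Map a camera value back to the level the projector sent."""
    t = (value - black) / (white - black)
    t = min(1.0, max(0.0, t))  # clamp values outside the calibrated range
    return max_level * t ** (1.0 / gamma)


black, white, gamma = fit_response(15, 240, 130)
print(gamma)
print([round(linearize(v, black, white, gamma)) for v in (15, 130, 240)])
```

Stretching between the black and white levels handles brightness, and the exponent handles the curve in between.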
May 13, 2009, 10:46 AM