radon

Implementation

OpenCL

[output, xp] = radon_ATI(input, theta)

Both output and input are of type float (single precision). Both xp and theta are of type double.

The Radon transform projects an image onto lines at a given set of angles: for each angle in theta, one line of projection values is generated. A naive parallel program would launch one thread per angle, which does not take advantage of the massive thread-level parallelism supported by the GPU. Another challenge is that, because of the projection geometry, global memory accesses may not be coalesced. Our implementation uses two key optimizations. The first is block tiling of the input and output in shared memory, which helps to generate coalesced global memory writes. The second is the use of atomic operations to expose extra thread-level parallelism.
