radon
Implementation
OpenCL
Usage
[output, xp] = radon_ATI(input, theta)
Class Support
output and input are of type float. xp and theta are of type double.
Algorithm
The Radon transform projects an image onto lines at a given set of angles; each angle produces one projection line.
A naive parallel program would launch one thread per angle, which does not take advantage of the massive thread-level parallelism
that the GPU supports. A second challenge is that, because of the projection, memory accesses may not be coalesced. In our implementation,
one key optimization is block tiling of the input and output in shared memory, which helps to produce coalesced global
memory writes. Another is the use of atomic operations to expose extra thread-level parallelism.
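
The role of the atomic operations can be illustrated with a small CPU sketch. This is not the OpenCL kernel behind radon_ATI: the function name radon_pixel_driven, the bin layout, the linear weighting, and the sign convention are all assumptions made for illustration, and no tiling is shown. The sketch loops over pixels instead of over angles, so every (angle, pixel) pair is an independent unit of work, and the only shared state is the accumulation into the projection bins.

```python
import math


def radon_pixel_driven(image, theta_deg):
    """Pixel-driven Radon projection (illustrative CPU sketch).

    image     : rectangular list of rows of numbers
    theta_deg : list of projection angles in degrees
    Returns (output, xp), where output[k][j] is the projection at
    theta_deg[k] in bin j and xp[j] is that bin's offset from the center.
    This layout is an assumption, not necessarily that of radon_ATI.
    """
    rows, cols = len(image), len(image[0])
    cy, cx = (rows - 1) / 2.0, (cols - 1) / 2.0

    # Enough bins to cover the image diagonal at any angle, plus a margin.
    half = int(math.ceil(math.hypot(rows, cols) / 2.0)) + 1
    nbins = 2 * half + 1
    xp = [float(j - half) for j in range(nbins)]
    output = [[0.0] * nbins for _ in theta_deg]

    for k, t in enumerate(theta_deg):
        c = math.cos(math.radians(t))
        s = math.sin(math.radians(t))
        # On a GPU, each (y, x) iteration could be its own work-item.
        for y in range(rows):
            for x in range(cols):
                # Fractional bin index of this pixel's projection.
                pos = (x - cx) * c + (y - cy) * s + half
                j = int(math.floor(pos))
                w = pos - j
                # Split the pixel value between the two nearest bins.
                # Many pixels hit the same bin, so in a parallel version
                # these two updates would have to be atomic additions.
                output[k][j] += image[y][x] * (1.0 - w)
                output[k][j + 1] += image[y][x] * w
    return output, xp
```

In this formulation the amount of parallel work scales with the number of pixels times the number of angles, not with the number of angles alone. The cost is that many pixels accumulate into the same bin, so a parallel version needs atomic updates.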