imregionalmax
Implementation
OpenCL
Usage
output=imregionalmax_ATI(input)
Class Support
Both input and output support the float class.
Algorithm
The algorithm computes the maximum value over each 3x3 neighborhood of input pixels. The data unit is float4, and the x and w components of each float4 are placed in shared memory (OpenCL local memory). Each thread therefore reads one float4 from global memory and takes the left neighbor's w and the right neighbor's x from shared memory, which reduces global memory accesses. Each thread also does the work of several neighboring threads in the same column, which reduces global memory accesses further.
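The access pattern described above can be illustrated with a small CPU reference in C. This is only a sketch under stated assumptions, not the actual OpenCL kernel: the names float4_t, row_max3, load_row_max and neighborhood_max_3x3 are hypothetical, pixels outside the image are assumed to be ignored (treated as negative infinity), and the image width is assumed to be a multiple of 4. The left_w and right_x arguments stand in for the values a thread would take from shared memory, and the loop down each column stands in for one thread handling several rows.

```c
#include <assert.h>
#include <math.h>    /* INFINITY macro only */
#include <stddef.h>

/* Four horizontally adjacent pixels, mirroring OpenCL's float4 (hypothetical type). */
typedef struct { float x, y, z, w; } float4_t;

static float max3(float a, float b, float c) {
    float m = a > b ? a : b;
    return m > c ? m : c;
}

/* Horizontal 3-tap max for one float4. left_w and right_x are the edge
 * components of the neighboring float4s, i.e. the values the kernel is
 * described as reading from shared memory. */
static float4_t row_max3(float4_t v, float left_w, float right_x) {
    float4_t r;
    r.x = max3(left_w, v.x, v.y);
    r.y = max3(v.x, v.y, v.z);
    r.z = max3(v.y, v.z, v.w);
    r.w = max3(v.z, v.w, right_x);
    return r;
}

/* Row maxima for the float4 at column gx of row y.
 * Assumption: pixels outside the image do not contribute (-INFINITY). */
static float4_t load_row_max(const float4_t *in, size_t width4,
                             size_t gx, size_t y) {
    const float4_t *row = in + y * width4;
    float left_w  = (gx > 0)          ? row[gx - 1].w : -INFINITY;
    float right_x = (gx + 1 < width4) ? row[gx + 1].x : -INFINITY;
    return row_max3(row[gx], left_w, right_x);
}

/* 3x3 neighborhood max over an image stored as height rows of width4 float4s.
 * Each column of float4s is walked top to bottom, so the row maxima of one
 * row are computed once and reused for the outputs above and below it. */
void neighborhood_max_3x3(const float4_t *in, float4_t *out,
                          size_t width4, size_t height) {
    const float4_t none = { -INFINITY, -INFINITY, -INFINITY, -INFINITY };
    assert(in != out);  /* not an in-place operation */
    for (size_t gx = 0; gx < width4; ++gx) {
        float4_t prev = none;
        float4_t cur  = (height > 0) ? load_row_max(in, width4, gx, 0) : none;
        for (size_t y = 0; y < height; ++y) {
            float4_t next = (y + 1 < height)
                          ? load_row_max(in, width4, gx, y + 1) : none;
            float4_t *o = &out[y * width4 + gx];
            o->x = max3(prev.x, cur.x, next.x);
            o->y = max3(prev.y, cur.y, next.y);
            o->z = max3(prev.z, cur.z, next.z);
            o->w = max3(prev.w, cur.w, next.w);
            prev = cur;   /* slide the three-row window down by one row */
            cur  = next;
        }
    }
}
```

Splitting the 3x3 maximum into a horizontal pass followed by a vertical pass is a design choice of this sketch; it gives the same result as taking the maximum of all nine pixels directly.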