Home‎ > ‎Brahma.FSharp‎ > ‎

Example of usage

Matrix multiplication example is described below. Full code you can found in FsMatrixMultiply.fs.

More examples:

Prerequirements

Fist, you should add next references into your project:
  1. Brahma
  2. Brahma.OpenCL
  3. OpenCL.NET
  4. Brahma.FSharp.OpenCL.Extensions
  5. Brahma.FSharp.OpenCL.Core
  6. Brahma.FSharp.OpenCL.Translator

Configure provider

You should configure queue for executing commands on GPU. First step: device provider creating.
let platformName = "*"
let deviceType = Cl.DeviceType.Default

let provider =
        try  ComputeProvider.Create(platformName, deviceType)
        with 
        | ex -> failwith ex.Message

Next: command queue creating:
let mutable commandQueue = new CommandQueue(provider, provider.Devices |> Seq.head)

Write quotation

Well, now all are ready for  specifying kernel code. It should be a quoted function with all required parameters passed explicitly. Using of bindings that declared out of quotation is not supported. This function will be translated and executed on GPU.

let command = 
        <@
            fun (r:_2D) columns (a:array<_>) (b:array<_>) (c:array<_>) -> 
                let tx = r.GlobalID0
                let ty = r.GlobalID1
                let mutable buf = c.[ty * columns + tx]
                for k in 0 .. columns - 1 do
                    buf <- buf + (a.[ty * columns + k] * b.[k * columns + tx])
                c.[ty * columns + tx] <- buf
        @>

Compile&Run

First step is kernel function compilation.
let kernel, kernelPrepare, kernelRun = provider.Compile command
  • kernel is an object which contains command-dependent informaion: autoconfigured buffers for data transferring, target OpenCL code and other.
  • kernelPrepare is a function, which based on your command and should be applied to actual parameters for processing.
  • kernelRun is just return ready for executing command, which you can to add into queue.
Next is computation grid configuration: 2D grid with size = rows*columns and local block with size=localWorkSize*localWorkSize (e.g. 10*10)
let d =(new _2D(rows, columns, localWorkSize, localWorkSize))
Prepare kernel. Pass actual parameters for computation:
kernelPrepare d columns aValues bValues cParallel
Add command into queue and finish it.
let _ = commandQueue.Add(kernelRun()).Finish()

Note, that now all command queue methods are return command queue (it's the same object, not clone), so we should use "let _ =" ore "|> ignore".  


Get result

For getting calculated result back you should call ToHost method of array which you want to read. You can read result into the same array, or call ToHost(provider, resultArray) for read result into resultArray.
let _ = commandQueue.Add(cParallel.ToHost(kernel)).Finish()

Releasing of resources

Do not forget to release all resources.
commandQueue.Dispose()
provider.Dispose()

If you want only to clean buffers, you can use CloseAllBuffers method.
provider.CloseAllBuffers()

Comments