Jacket (www.accelereyes.com) is a full runtime system that brings GPU speed and visual computing capability to MATLAB programs by introducing GPU data types such as GDOUBLE and GSINGLE into MATLAB. It transparently overloads CPU-based functions with GPU-based functions.
We are pleased to announce that HPCC has Jacket licenses available for nodes quad06 and quad07 (gpu queue) and gpu001-gpu008 (gpufermi queue). There are 8 network license tokens and two node-locked licenses. Please check out as few licenses as possible, out of consideration for other users. To check the number of licenses currently checked out, run the command below:
In extensive testing, we found the current versions, cuda-4.1 and Jacket-2.1, to be more stable than the previous versions.
Get a CUDA-enabled node by requesting a gpu or gpufermi node:
With the gpu queue, you will be assigned either quad06 or quad07. With the gpufermi queue, you will be assigned one of gpu001-gpu024.
You can request one or two GPUs (gpus=1 or gpus=2) depending on your needs. Please request only the minimum you need, out of consideration for other GPU users. If you request gpus=2, make sure that your script (job) actually uses both GPUs in the node.
Set the environment by typing:
Open a MATLAB window:
Testing - Example (GUI - Interactive)
Copy the content below into a new M-file and name it float_pi.m:
Run it in MATLAB, and the value of pi will be printed in the command window:
Testing - Example (Batch Job - using PBS Script)
Copy the following PBS script (jacket.pbs) into your home directory:
Copy the file float_pi.m into your home directory (its content is given in the interactive example above).
Submit your job:
Find your output in the file Jacket_test.o<jobid>.
If two jobs, each requesting one GPU, are assigned to the same node at nearly the same time (less than 10 seconds apart), the second job may be terminated with the following error without affecting the first job:
Running a Jacket Job on Multiple GPUs
You can use the working template script provided on the Jacket website (http://wiki.accelereyes.com/wiki/index.php/Jacket_MGL), which generates random values and performs FFTs in parallel across all available devices.
Testing CPU vs GPU:
The test was run on quad07 using the example code from this website: http://ircs.seas.harvard.edu/display/USERDOCS/How+to+use+Jacket+(GPU+based+Matlab+accelerator)
The scripts runjack.m and runmatGpu.m perform matrix multiplication over several data set sizes specified in the code, calling the main functions jacket and matGpu, respectively. The mean and standard deviation (SD) are taken over several trials, and the results are plotted. The two functions differ in that jacket uses Jacket (v 1.4) routines for matrix multiplication, while matGpu uses the native CUDA-specific routines of MATLAB R2010b.
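The mean-and-SD timing scheme described above can be sketched as follows. This is a CPU-only Python illustration using only the standard library; it is not the code in runjack.m or runmatGpu.m. The names `matmul` and `time_trials`, the matrix sizes, and the trial count are all assumptions chosen for illustration.

```python
import random
import statistics
import time


def matmul(a, b):
    """Naive dense matrix product of two matrices given as lists of rows."""
    cols_b = list(zip(*b))  # columns of b, so each entry is a row-by-column dot product
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]


def time_trials(n, trials=5):
    """Time an n-by-n multiplication `trials` times; return (mean, SD) in seconds."""
    if trials < 2:
        raise ValueError("at least two trials are needed for a standard deviation")
    # Variable initialization happens outside the timed region.
    a = [[random.random() for _ in range(n)] for _ in range(n)]
    b = [[random.random() for _ in range(n)] for _ in range(n)]
    elapsed = []
    for _ in range(trials):
        start = time.perf_counter()
        matmul(a, b)
        elapsed.append(time.perf_counter() - start)
    return statistics.mean(elapsed), statistics.stdev(elapsed)


if __name__ == "__main__":
    for n in (16, 32, 64):  # hypothetical sizes, not those used in the original test
        mean_s, sd_s = time_trials(n)
        print(f"n={n:3d}  mean={mean_s:.6f} s  SD={sd_s:.6f} s")
```

Repeating each size several times and reporting the spread alongside the mean makes it easier to judge whether a difference between two implementations exceeds run-to-run noise.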
While in general Jacket seems to scale better with larger datasets, it slows down the CPU (non-GPU) component, whereas the native MATLAB GPU routines, as expected, perform comparably with its GPU counterpart.
The FLOPs are computed from an estimate of the number of floating point operations performed by each of the routines and the time taken for the routine to complete [a].
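As a worked illustration of such an estimate: each element of an n-by-n matrix product needs n multiplications and n - 1 additions, which is commonly rounded to 2n^3 operations for the whole product. The Python sketch below turns that count and a measured time into a rate. Whether the original scripts use exactly this operation count is an assumption, and the function names and the sample timing are hypothetical.

```python
def matmul_flop_count(n):
    """Approximate floating point operations in an n-by-n by n-by-n product."""
    # n multiplications and n - 1 additions per output element, rounded to 2n each.
    return 2 * n ** 3


def gflops(n, seconds):
    """Rate in GFLOP/s, given the matrix size and the measured compute time."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return matmul_flop_count(n) / seconds / 1e9


if __name__ == "__main__":
    # Hypothetical timing, for illustration only: a 1000-by-1000 product in 0.5 s.
    print(f"{gflops(1000, 0.5):.1f} GFLOP/s")
```

Because the rate depends directly on the measured time, what the timer covers (for example, whether data transfer is included) changes the reported figure.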
To reproduce the results, copy runjack.m, jacket.m, matGpu.m, and runmatGpu.m to a folder and cd to that folder.
[a] The computation time does not include the variable initialization or the GPU data transfer time.