Using the Intel Xeon Phi Server
We have a new server equipped with two Xeon Phi 5110 coprocessors (MIC = Many Integrated Core).
You are welcome to try out this node and see whether it can accelerate your code.
Information about the Intel Xeon Phi coprocessor is given here:
In essence, the coprocessor can accelerate your code by running it across a large number of threads, much like a GPU, but on an architecture similar to other Intel processors. Instead of writing separate code for a GPU, you can keep your existing code, insert MIC directives around the regions you want to accelerate, and have those regions run on the coprocessor.
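As a rough illustration of what inserting a MIC directive looks like, here is a minimal sketch in C. It assumes the Intel compiler's offload pragma and its predefined __MIC__ macro; the function name scaled_sum and the ran_on_mic flag are hypothetical and are not taken from the example codes on this server.

```c
/* Hypothetical example: sum i * scale for i in [0, n).
   With the Intel compiler, the offload pragma asks for the block to run on
   coprocessor 0. Compilers that do not recognize the pragma ignore it, so
   the same source then runs entirely on the host. */
double scaled_sum(int n, double scale, int *ran_on_mic)
{
    double total = 0.0;
    int on_mic = 0;
    int i;

    /* Scalars listed in in() are copied to the coprocessor before the block
       runs; those in inout() are also copied back afterwards. */
    #pragma offload target(mic:0) in(n, scale) inout(total, on_mic)
    {
#ifdef __MIC__
        /* __MIC__ is defined only in the coprocessor build of this block. */
        on_mic = 1;
#endif
        /* Spread the loop over the available threads when OpenMP is enabled. */
        #pragma omp parallel for reduction(+:total)
        for (i = 0; i < n; i++)
            total += i * scale;
    }

    *ran_on_mic = on_mic;
    return total;
}
```

Calling scaled_sum(1000, 0.5, &flag) should return 249750.0 wherever it runs; flag is set to 1 only if the block actually executed on the coprocessor, which makes it easy to check that the offload happened.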
Here is a book about Xeon Phi programming that may help you get started:
Intel Xeon Phi Coprocessor High Performance Programming, Jeffers and Reinders, 2012
Another difference from a GPU is that users can ssh directly to the MICs (mic0 and mic1) and, while logged in there, run very basic BusyBox Linux commands.
EXAMPLES
Request the node using qsub:
qsub -I -q phi -l nodes=1:ppn=12:mics=2
There is a directory of code examples that you can copy, compile, and run for your testing.
Copy the directory /usr/local/doc/phi/uncompiled into a directory under your home (e.g. /home/hxd58/test/phi/):
mkdir -p ~/test/phi
cp -r /usr/local/doc/phi/uncompiled ~/test/phi/
Compile the codes (for example, in the directory ~/test/phi/uncompiled/Ch2):
cd ~/test/phi/uncompiled/Ch2
source /usr/local/intel/2013/composer_xe_2013/bin/compilervars.sh intel64
make all
If you are using the C shell, source compilervars.csh instead of compilervars.sh.
Options for running the compiled code:
Run the code directly on the MIC (Xeon Phi)
ssh mic0
cd ~/test/phi/uncompiled/Ch2
./helloflops1_xphi
Initializing
Starting Compute
GFlops = 25.600, Secs = 1.530, GFlops per sec = 16.730
Offload the code from the node and let it run on the MIC
module load micset
cd ~/test/phi/uncompiled/Ch2
./helloflops3o_xeon
Initializing
Starting Compute on 236 threads
GFlops = 6041.600, Secs = 3.124, GFlops per sec = 1933.704
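To put the numbers above in context, the following sketch recomputes the reported rate and compares it with a theoretical peak. The function names are hypothetical. The hardware figures used in the comments (60 cores, 1.053 GHz, 16 single-precision SIMD lanes, 2 flops per fused multiply-add) are assumptions based on the published specification of the Xeon Phi 5110P, and the comparison also assumes that the test runs in single precision.

```c
/* Achieved rate: total work in GFlop divided by elapsed seconds. */
double gflops_per_sec(double gflop, double secs)
{
    return gflop / secs;
}

/* Theoretical single-precision peak in GFlop/s:
   cores x clock (GHz) x SIMD lanes x 2 flops per fused multiply-add.
   Assumed 5110P values: cores = 60, ghz = 1.053, lanes = 16. */
double peak_sp_gflops(int cores, double ghz, int lanes)
{
    return cores * ghz * lanes * 2.0;
}

/* Fraction of the theoretical peak that a measured rate reaches. */
double fraction_of_peak(double measured, double peak)
{
    return measured / peak;
}
```

With the assumed values, peak_sp_gflops(60, 1.053, 16) gives about 2021.76 GFlop/s, so the offload run's reported 1933.704 GFlop/s would be roughly 96% of peak. Recomputing a rate from the printed values (for example 25.6 / 1.53) can differ slightly from the printed rate, presumably because the printed time is rounded.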
You should also be able to compile and run the codes in Ch3 and Ch4, which run directly on the MIC (Xeon Phi).
Results of these test codes are presented in this Google Spreadsheet:
https://docs.google.com/a/case.edu/spreadsheet/ccc?key=0Ai0GTaYFOIhddGNRdmYtVy10Z05ObTJYOG5ZUFpEV1E&usp=drive_web#gid=0