http://openmp.org/wp/
Current release 4.5 (Nov 2015)
Compiler testing:
Write a program to test your compiler:
#include <omp.h>
#include <stdio.h>
int main()
{
    #pragma omp parallel
    printf("Hello from thread %d, nthreads %d\n", omp_get_thread_num(), omp_get_num_threads());
}
Compilation should proceed with no errors or warnings. Execute the output file, variously called a.out, a.exe, or hello.exe.
Check your compiler support:
http://openmp.org/wp/openmp-compilers/
=======
1. It is a shared-memory model.
2. The execution model is fork-join: a master thread and a team of worker threads.
3. Data can be private or shared.
4. Usage: compile with -fopenmp (GCC).
5. Useful constructs/functions:
#pragma omp
parallel (data parallelism)
for, sections, parallel for, task
Functions:
omp_get_thread_num(), omp_get_num_threads(), omp_set_num_threads()
Synchronization:
barrier, critical, atomic, flush
Environment:
OMP_NUM_THREADS, OMP_SCHEDULE, ..
Also: about the target and map constructs ..
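The constructs listed above can be sketched in a few lines. This is only an illustration, not part of the assignment; the function names sum_to and count_even are made up for the example. It shows a parallel for with a reduction, a shared counter updated under atomic, and a variable that is private because it is declared inside the loop body.

```c
#include <stdio.h>

/* Sum 1..n with a parallel for. Each thread accumulates into its own
   private copy of `sum`; the reduction clause adds the copies together
   at the end. The loop index is private to each thread. */
long sum_to(int n)
{
    long sum = 0;
    #pragma omp parallel for reduction(+:sum) schedule(static)
    for (int i = 1; i <= n; i++)
        sum += i;
    return sum;
}

/* Count the even numbers in [0, n) with a shared counter.
   `count` is shared by all threads, so each update is made atomic. */
int count_even(int n)
{
    int count = 0;                      /* shared */
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        int is_even = (i % 2 == 0);     /* private: declared inside the loop */
        if (is_even) {
            #pragma omp atomic
            count++;
        }
    }
    return count;
}
```

Without -fopenmp the pragmas are ignored and both functions run sequentially with the same result, which is a convenient way to check correctness.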
Tasks:
1. Matrix multiplication
- Create a sequential version.
Generate random matrices of size N = 2K, i.e. with dimensions 2Kx2K.
- Create an OpenMP version using parallel for.
Set #numthreads = #cores.
- Use three schedules: static (chunk size N/#numthreads), dynamic (chunk size 1), and guided.
Measure the time for each schedule.
Create a table:
        seq time   static           dynamic          guided
                   time / speedup   time / speedup   time / speedup
1Kx1K
2Kx2K
Also consider numthreads = 4, 8, 16, 32.
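One possible starting point for the measurements is sketched below. It is a sketch under assumptions, not a reference solution: the helper names now_seconds, fill_random, matmul and time_matmul are made up here, matrices are stored row-major in flat arrays, and schedule(runtime) is used so that the schedule can be switched through OMP_SCHEDULE without recompiling.

```c
#include <stdlib.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock seconds. Uses omp_get_wtime() when built with OpenMP and
   falls back to clock() (CPU time) otherwise. */
static double now_seconds(void)
{
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Fill an n x n row-major matrix with random values in [0, 1]. */
void fill_random(double *m, int n)
{
    for (long i = 0; i < (long)n * n; i++)
        m[i] = (double)rand() / RAND_MAX;
}

/* C = A * B for n x n row-major matrices. The rows of C are divided
   among the threads; the schedule is taken from OMP_SCHEDULE. */
void matmul(const double *a, const double *b, double *c, int n)
{
    #pragma omp parallel for schedule(runtime)
    for (int i = 0; i < n; i++) {
        for (int j = 0; j < n; j++) {
            double sum = 0.0;
            for (int k = 0; k < n; k++)
                sum += a[(long)i * n + k] * b[(long)k * n + j];
            c[(long)i * n + j] = sum;
        }
    }
}

/* Run one multiplication and return the elapsed time in seconds. */
double time_matmul(const double *a, const double *b, double *c, int n)
{
    double t0 = now_seconds();
    matmul(a, b, c, n);
    return now_seconds() - t0;
}
```

With a small main that allocates the matrices and prints time_matmul, one table cell could be produced by running, for example, OMP_NUM_THREADS=4 OMP_SCHEDULE="dynamic,1" ./matmul (the executable name is an assumption). Speedup is the sequential time divided by the parallel time.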
2. K Means
Consider the code at
https://github.com/andreaferretti/kmeans/tree/master/openmp
Here we compare k-means implementations and try to tune their performance.
Find the number of threads that gives the largest speedup relative to the run time with numthreads = 1. Also combine this with the three possible schedules.
Find the number of threads and the schedule that together yield the best speedup.
Show a graph or table to support your answer.
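For orientation, the part of k-means that usually parallelizes most directly is the step that assigns each point to its nearest centroid. The sketch below is not the code from the repository above; it assumes 2-D points, and the names Point and assign_clusters are made up for this example. The schedule is again left to OMP_SCHEDULE so that static, dynamic and guided can be compared without rebuilding.

```c
#include <float.h>

typedef struct { double x, y; } Point;   /* hypothetical 2-D point type */

/* Assign each of the n points to the nearest of the k centroids.
   Writes the centroid index into label[i] and returns how many labels
   changed. Each iteration touches only label[i], so iterations are
   independent; the change counter is combined with a reduction. */
int assign_clusters(const Point *pts, int n, const Point *cent, int k, int *label)
{
    int changed = 0;
    #pragma omp parallel for schedule(runtime) reduction(+:changed)
    for (int i = 0; i < n; i++) {
        int best = 0;
        double best_d = DBL_MAX;
        for (int c = 0; c < k; c++) {
            double dx = pts[i].x - cent[c].x;
            double dy = pts[i].y - cent[c].y;
            double d = dx * dx + dy * dy;   /* squared distance is enough */
            if (d < best_d) {
                best_d = d;
                best = c;
            }
        }
        if (label[i] != best) {
            label[i] = best;
            changed++;
        }
    }
    return changed;
}
```

Every point costs the same amount of work here, so the schedules may differ only slightly for this step; the measurements should show whether that holds for the real code.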
3. Consider the code
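#include <stdio.h>
#include <string.h>
#define N 1000000
char array[N];  /* text to search; fill it from the data set before searching */
int main(void)
{
    char str[] = "The";
    int len = (int)strlen(str);  /* pattern length, computed once */
    int found;
    int i, j;
    for (i = 0; i <= N - len; i++) {
        found = 0;
        /* compare the pattern with the text starting at position i */
        for (j = 0; j < len; j++) {
            if (str[j] != array[i + j]) {
                break;
            }
        }
        if (j == len) {
            found = 1;
            printf("%d\n", i);  /* full match at position i */
        }
    }
    return 0;
}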
Change the code to use OpenMP.
Consider the data set text read into the array from ...
Find the best speedup of your code by varying numthreads, schedules, and other settings.
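A possible shape for the parallel search is sketched below, under some assumptions: it counts matches instead of printing positions (output printed from several threads would appear in arbitrary order), the function name count_matches is made up, and the text is passed in as a buffer rather than read from the data set.

```c
#include <string.h>

/* Count how many positions in text[0..n) start an occurrence of pat.
   Each start position is checked independently, so the loop can be
   divided among threads; the per-thread counts are summed by the
   reduction. The schedule is taken from OMP_SCHEDULE. */
long count_matches(const char *text, size_t n, const char *pat)
{
    size_t m = strlen(pat);
    if (m == 0 || m > n)
        return 0;

    long count = 0;
    long last = (long)(n - m);          /* last valid start position */
    #pragma omp parallel for schedule(runtime) reduction(+:count)
    for (long i = 0; i <= last; i++) {
        if (memcmp(text + i, pat, m) == 0)
            count++;
    }
    return count;
}
```

If the positions themselves are needed, one option is to store them in a per-position flag array inside the loop and print them sequentially afterwards.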
4. Graph500
https://github.com/graph500/graph500
Run the sequential version, the OpenMP version, and the MPI version on the Jetson cluster, and measure the time of each version.
Try to optimize the OpenMP version by adding different schedules.
Record the improved time for each case you try.