OpenMP in gfortran

Toward Faster Development Time, Better Application Performance and Promised Software Reliability

(A) Definition of OpenMP

OpenMP stands for Open Multi Processing and is a standard Application Programming Interface (API) for distributing program tasks across threads of a shared memory computer.

OpenMP supports multi-platform shared-memory multiprocessing programming in C, C++ and Fortran, on platforms including Linux/Unix and Microsoft Windows. It consists of a set of compiler directives, library routines, and run-time environment variables.

The OpenMP standards are jointly defined by a group of major computer hardware and software vendors so as to give shared-memory parallel programmers a simple and flexible interface for developing parallel applications.

The advantages of using OpenMP are simple, incremental parallelism and unified code for both serial and parallel applications.

The disadvantages of using OpenMP are that it currently runs efficiently only on shared-memory multiprocessor platforms, that it requires a compiler that supports OpenMP, and that parallel efficiency can be low.

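The performance expectation of OpenMP is that, in the ideal case, a program parallelized using OpenMP runs in 1/N of the wall clock execution time (an N times speedup) on an N processor platform. In practice the speedup is usually lower, because serial portions of the code, synchronization overhead and memory bandwidth limit scaling.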

As a reminder, OpenMP can often be used to improve performance on symmetric multi-processor (SMP) machines by simply adding a few compiler directives to the program code.
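
As a minimal sketch of what "adding a few compiler directives" can look like, the following free-form Fortran program parallelizes a summation loop with one PARALLEL DO directive and a REDUCTION clause. The program and variable names are illustrative and do not come from any existing code.

```fortran
program sum_reduction
  implicit none
  integer, parameter :: n = 1000
  integer :: i, total

  total = 0
  ! The two directive lines are the only OpenMP-specific additions.
  ! Without -fopenmp they are plain comments and the loop runs serially.
  !$omp parallel do reduction(+:total)
  do i = 1, n
     total = total + i
  end do
  !$omp end parallel do

  print "(a, i0)", "sum = ", total   ! 1 + 2 + ... + 1000 = 500500
end program sum_reduction
```

Assuming the file is saved as sum_reduction.f90, it can be built with "gfortran -fopenmp sum_reduction.f90". The printed sum does not depend on the number of threads, because integer addition is exact.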

______________________________________________________________________

(B) History of gfortran with OpenMP

1. Objective

gfortran attempts to be compatible with the OpenMP Application Program Interface v3.0 when invoked with the -fopenmp option. gfortran then generates parallelized code according to the OpenMP directives used in the source. The OpenMP Fortran runtime library routines are provided both as two Fortran 90 modules, named omp_lib and omp_lib_kinds, and as a Fortran include file named omp_lib.h.
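
A short sketch of how the omp_lib module might be used is shown below. The program name is illustrative. The named constant openmp_version and the function omp_get_max_threads are both provided by omp_lib; the value printed for openmp_version depends on the compiler release, so no particular output is claimed here.

```fortran
program show_omp_lib
  use omp_lib            ! interfaces for the OpenMP runtime routines
  implicit none

  ! openmp_version is a named constant in omp_lib holding the yyyymm
  ! date of the OpenMP specification that the compiler implements.
  print "(a, i0)", "openmp_version = ", openmp_version
  print "(a, i0)", "max threads    = ", omp_get_max_threads()
end program show_omp_lib
```

This source must be compiled with -fopenmp, since the module is only found in that case.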

The GOMP project was GCC's OpenMP implementation project. The code was merged into mainline to become part of GCC 4.2.

The GOMP release will include a support library, libgomp, and extensions to target language parsers. The initial focus is on implementing the basic syntax of GOMP in the C, C++, and Fortran 95 frontends, to be followed by specific implementations for different platforms. A long-term goal is the generation of efficient and small code for OpenMP applications.

2. History

May 2, 2017 Version 4.5 of the OpenMP specification is now partially supported in the Fortran compiler in the GCC 7.1 release; the largest missing item is structure element mapping.

November 27, 2015 The final OpenMP v4.5 specification has been released.
 
July 16, 2014 An implementation of the OpenMP v4.0 parallel programming interface for Fortran has been added in the GCC 4.9.1 release.
 
July 23, 2013 The final OpenMP v4.0.0 specification has been released.
 
August 2, 2011 An implementation of the OpenMP v3.1 parallel programming interface for C, C++ and Fortran has been added. OpenMP 3.1 has been supported since GCC 4.7. Code was contributed by Jakub Jelinek of Red Hat, Inc. and Tobias Burnus.
 
July 9, 2011 The final OpenMP v3.1 specification has been released.
 
June 6, 2008 The gomp-3_0-branch has been merged into SVN mainline, so GCC 4.4 and later will feature OpenMP v3.0 support. An implementation of the OpenMP v3.0 parallel programming interface for C, C++ and Fortran has been added. Code was contributed by Jakub Jelinek, Richard Henderson and Ulrich Drepper of Red Hat, Inc.

May 12, 2008 The final OpenMP v3.0 specification has been released.
 
October 22, 2007 Draft of the OpenMP v3.0 specification has been released for public review, the gomp-3_0-branch branch has been created in SVN and work began on implementing v3.0 support.

March 9, 2006 Richard Henderson, Jakub Jelinek and Diego Novillo of Red Hat Inc, and Dmitry Kurochkin have contributed an implementation of the OpenMP v2.5 parallel programming interface for C, C++ and Fortran.

February 14, 2006 Jakub Jelinek committed the front end support for OpenMP.

November 18, 2005 The branch is ready to be merged into mainline. All three front ends are functional and there should not be many corners of the standard left to implement. There are 5 main modules to merge into mainline: (1) runtime library, (2) code generation, (3) C front end, (4) C++ front end, and, (5) Fortran front end.

October 20, 2005 The runtime library is functionally complete. The syntax parsers for C, C++ and Fortran are complete, though there are still dusty corners with respect to semantic translation to be resolved. Adventurous users who don't mind the compiler crashing on every other source file are encouraged to begin filing bugs.

3. Requirements of OpenMP in gfortran (v2.0)

The foremost goal is correctness; programs compiled with OpenMP must operate as expected according to programming language and OpenMP standards.

For operation with a compiler such as gfortran, the implementation is expected to meet the requirements below.

  • OpenMP directives will act as comments (F95) or unimplemented pragmas (C/C++) unless the -fopenmp option is specified on the compiler command line.
  • When -fopenmp is used, the compiler will generate parallel code based on the OpenMP directives encountered.
  • By default, GOMP will use the threading model specified by the --enable-threads=xxx configuration option.
  • When -fopenmp=stubs is used, the compiler will generate OpenMP code that links with a minimal stub library.
  • The design will consider, but might not implement, link-time selection of a threading model, via an -fopenmp=model syntax.
  • By default, the compiler will behave as if the option -fno-openmp were specified.
  • If -fopenmp is not specified, the compiler will generate a serial program that operates as if the OpenMP directives were non-existent.
  • Linking to the OpenMP support library will be automatic if -fopenmp is specified.
  • Array assignments and WHERE constructs are now run in parallel when OpenMP's WORKSHARE is used.
  • -fopenmp implies -frecursive, i.e., all local arrays will be allocated on the stack. When porting existing code to OpenMP, this may lead to surprising results, especially to segmentation faults if the stack size is limited.
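
The requirements above imply that one source file can be built either serially or in parallel. A minimal sketch of this, assuming free-form source and illustrative names, combines the "!$" conditional compilation sentinel with a WORKSHARE array assignment:

```fortran
program serial_or_parallel
  !$ use omp_lib              ! seen only when -fopenmp is given
  implicit none
  integer, parameter :: n = 1000
  real :: a(n), b(n)
  integer :: nthreads

  nthreads = 1                ! value kept by the serial build
  !$ nthreads = omp_get_max_threads()

  b = 1.0
  ! With -fopenmp the array assignment is divided among the threads;
  ! without it the directives are comments and the assignment is serial.
  !$omp parallel workshare
  a = 2.0 * b
  !$omp end parallel workshare

  print "(a, i0)", "threads available: ", nthreads
  print "(a, i0)", "sum(a) = ", nint(sum(a))
end program serial_or_parallel
```

Built without -fopenmp, the program reports one thread; built with -fopenmp, it reports the maximum team size. The sum is 2000 in both cases.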

4. Future of OpenMP in gfortran

gfortran will probably support coarrays first using GCC's OpenMP infrastructure, i.e. coarrays will initially be implemented with threads and work only on shared-memory systems. [Later, distributed-memory support is likely to follow.] Some experts mention that coarrays replicating everything have certain limitations on memory-sharing computers.

5. Some common known issues

The compilers support OpenMP in both 32-bit and 64-bit operating system environments.

  • On Windows platform, OpenMP does not support THREADPRIVATE.
  • In some cases, the 'default(shared)' clause causes compilation errors; it is good practice to replace it with an explicit 'shared(a,b,c...)' list.
  • GOMP_CPU_AFFINITY, GOMP_DEBUG, GOMP_SPINCOUNT, GOMP_RTEMS_THREAD_POOLS and GOMP_STACKSIZE are GNU extensions to the OpenMP environment variables.
  • On RTEMS, OpenMP support now uses self-contained objects provided by Newlib <sys/lock.h> and offers significantly better performance compared to the POSIX configuration of libgomp. It is possible to configure thread pools for each scheduler instance via the environment variable GOMP_RTEMS_THREAD_POOLS.
  • Final and/or Mergeable Clauses are not supported.
  • On glibc-based systems, OpenMP-enabled applications cannot be statically linked due to limitations of the underlying pthreads implementation.
  • The OpenMP keywords should start one space to the right of !$OMP.
  • If a Windows application does not run, link it with the option -mwindows.
  • Coarrays: full single-image support except for polymorphic coarrays. Initial support for multiple images via an MPI-based coarray communication library. Note: Remote coarray access is not yet possible.
  • The environment variable setting OMP_PLACES=cores is not supported by gfortran on Windows.
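
Two of the points above, listing shared variables explicitly and placing the keyword one space after the !$OMP sentinel, can be sketched as follows. The program and array names are illustrative only.

```fortran
program explicit_sharing
  implicit none
  integer, parameter :: n = 8
  integer :: i
  real :: a(n), b(n), c(n)

  b = 1.0
  c = 2.0
  ! The arrays are listed explicitly as shared and the loop index as
  ! private, instead of relying on DEFAULT(SHARED).
  ! Note the single space between the !$OMP sentinel and the keyword.
  !$OMP PARALLEL DO SHARED(a, b, c) PRIVATE(i)
  do i = 1, n
     a(i) = b(i) + c(i)
  end do
  !$OMP END PARALLEL DO

  print "(a, i0)", "sum(a) = ", nint(sum(a))   ! 8 elements of 3.0
end program explicit_sharing
```

Listing each variable makes the intended data sharing visible at the directive, which also helps when reviewing code for race conditions.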

______________________________________________________________________

(C) Compiling and running sample gfortran OpenMP programs (to be verified by contributor soon)

1. Fortran sample program

The program file hello.f contains the following program statements:

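     PROGRAM HELLO
       ! omp_lib already declares the OpenMP runtime functions, so they
       ! must not be redeclared with INTEGER statements here.
       USE omp_lib
       IMPLICIT NONE
       ! Upper bound on the team size, queried outside any parallel region
       write(6,"(a, i3)") " OpenMP max threads: ", OMP_GET_MAX_THREADS()
     !$OMP PARALLEL
       ! Each thread reports the team size and its own thread number
       write(6,"(2(a,i3))") " OpenMP: N_threads = ", &
            &   OMP_GET_NUM_THREADS()," thread = ", OMP_GET_THREAD_NUM()
     !$OMP END PARALLEL
     END PROGRAM HELLO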

Note: A call to OMP_GET_NUM_THREADS within a parallel region returns the number of threads executing that parallel region.
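
The difference between calling OMP_GET_NUM_THREADS inside and outside a parallel region can be seen in the following sketch. The program name and the requested team size of 2 are assumptions made for illustration.

```fortran
program team_size
  use omp_lib
  implicit none

  ! Outside a parallel region only the initial thread runs, so the
  ! team size reported here is 1.
  print "(a, i0)", "outside: ", omp_get_num_threads()

  !$omp parallel num_threads(2)
  !$omp single
  ! Inside the region the call returns the size of the current team
  ! (2 was requested; the runtime may provide fewer).
  print "(a, i0)", "inside:  ", omp_get_num_threads()
  !$omp end single
  !$omp end parallel
end program team_size
```

The SINGLE construct is used so that only one thread of the team prints the inside value.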
 
2. Compile the gfortran OpenMP program
 
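Type the following command (the -ffree-form option is needed because hello.f is written in free-form source but has the fixed-form .f extension; alternatively, rename the file to hello.f90):
 
   gfortran -ffree-form hello.f -o hello.exe -fopenmp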
 
3. Run the gfortran OpenMP program

Type the following commands (csh/tcsh syntax):

 setenv OMP_NUM_THREADS 2
 ./hello.exe

[Note: In sh or bash, use "export OMP_NUM_THREADS=2" instead. If you are using the Windows platform, use "set" (i.e. set OMP_NUM_THREADS=2) instead of "setenv" and run hello.exe directly.]

The expected output (the order of the two thread lines may vary between runs):

   OpenMP max threads:   2
   OpenMP: N_threads =   2 thread =   0
   OpenMP: N_threads =   2 thread =   1

For more sample OpenMP programs, please refer to the book "Using OpenMP: Portable Shared Memory Parallel Programming" by Barbara Chapman, Gabriele Jost and Ruud van der Pas.

4. Download the gfortran OpenMP under Windows platform from this web site

If you have any problems setting up the Windows environment, please refer to GCC Wiki gfortran.

Alternatively, you can also download the 32-bit and 64-bit binaries for gfortran under Windows and Linux environment including OpenMP 4.5 features with manuals and debugger from this web site.

5. OpenMP Tutorial by Blaise Barney, Lawrence Livermore National Laboratory.

The OpenMP 4.5 API Fortran Syntax Quick Reference Card is also a good quick reference guide.

______________________________________________________________________

(D) How to avoid the common mistakes in writing OpenMP programs and to think parallel

The "think-parallel" article cites a presentation by James Reinders, Intel's director and chief evangelist, at the SD West 2009 conference in Santa Clara, California. Reinders suggested eight rules for developers:

a. Think parallel
b. Program using abstraction
c. Program tasks, not threads
d. Design with the option of turning off concurrency
e. Avoid locks when possible
f. Use tools and libraries designed to help with concurrency
g. Use scalable memory  
h. Design to scale through increased workloads
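
One possible reading of rule (d), sketched here under the assumption that OpenMP's IF clause is an acceptable way to switch concurrency off, is to guard the parallel construct with a run-time flag. The flag run_parallel and the other names are hypothetical.

```fortran
program optional_concurrency
  use omp_lib
  implicit none
  integer, parameter :: n = 100
  logical :: run_parallel, was_parallel
  integer :: i, total

  run_parallel = .false.   ! hypothetical switch; set to .true. to use threads
  was_parallel = .false.
  total = 0

  ! When the IF expression is false the region runs on a single thread,
  ! so the same code path serves both the serial and the parallel case.
  !$omp parallel do if(run_parallel) reduction(+:total) reduction(.or.:was_parallel)
  do i = 1, n
     total = total + i
     was_parallel = was_parallel .or. omp_in_parallel()
  end do
  !$omp end parallel do

  print "(a, i0, a, l1)", "total = ", total, ", ran in parallel = ", was_parallel
end program optional_concurrency
```

The reduction instead of a lock or critical section also follows rule (e), since each thread accumulates into its own private copy.
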
______________________________________________________________________

(E) Runtime library status with OpenMP features

The Input/Output library is not thread-safe. This will have to be fixed because OpenMP allows threaded IO.
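
Where concurrent output from several threads cannot be relied on, one common workaround is to serialize the I/O statements. The following is a sketch of that idea using a named CRITICAL section; the section name stdout_guard and the team size of 4 are assumptions made for illustration.

```fortran
program guarded_output
  use omp_lib
  implicit none

  !$omp parallel num_threads(4)
  ! Only one thread at a time may execute the named critical section,
  ! so the WRITE statements never run concurrently.
  !$omp critical (stdout_guard)
  write(6, "(a, i0)") "hello from thread ", omp_get_thread_num()
  !$omp end critical (stdout_guard)
  !$omp end parallel
end program guarded_output
```

The lines appear in whatever order the threads reach the critical section, so the order of the output is not fixed.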

______________________________________________________________________

(F) A simple Hybrid OpenMP/MPI testing program (to be verified by contributor soon)

gfortran can link with MPICH, MPICH2 and LAM/MPI.

The program file hybridhello.f contains the following program statements:

     PROGRAM HYBRIDHELLO
       ! omp_lib already declares the OpenMP runtime functions, so they
       ! are not redeclared here.
       USE omp_lib
       IMPLICIT NONE
       INCLUDE 'mpif.h'
       INTEGER size, rank, ierr
       ! Start MPI, then query the process count and this process's rank
       CALL mpi_init(ierr)
       CALL mpi_comm_size(MPI_COMM_WORLD, size, ierr)
       CALL mpi_comm_rank(MPI_COMM_WORLD, rank, ierr)
     !$OMP PARALLEL
       ! Every thread of every MPI process prints one line
       write(6, "(4(a,i3))") " MPI: size = ", size, " rank = ", rank, &
            &    " OpenMP: N_threads = ", OMP_GET_NUM_THREADS(), &
            &    " thread = ", OMP_GET_THREAD_NUM()
     !$OMP END PARALLEL
       CALL mpi_finalize(ierr)
     END PROGRAM HYBRIDHELLO

The expected output (2 MPI processes, 2 threads each; the order of the lines may vary between runs):

   MPI: size =   2 rank =   0 OpenMP: N_threads =   2 thread =   0
   MPI: size =   2 rank =   0 OpenMP: N_threads =   2 thread =   1
   MPI: size =   2 rank =   1 OpenMP: N_threads =   2 thread =   0
   MPI: size =   2 rank =   1 OpenMP: N_threads =   2 thread =   1

As a reminder, you may need to include the MPI linker library during compilation, for example, "-llam -lmpi" for LAM/MPI.

______________________________________________________________________
 
Remark: This page contains information on GCC's OpenMP support and related functionality like the auto parallelizer (-ftree-parallelize-loops).
 
Contributor : Henry Kar Ming Chan, High Performance Computing Specialist in Scientific Computation.

My research interest is Parallel Optimization in Engineering using OpenMP and gfortran.

All comments are welcome. Please feel free to contact the contributor at karminghenry@gmail.com