OpenMP is an application programming interface (API) for parallel programming in C, C++, and Fortran on shared-memory machines. It is maintained by the OpenMP Architecture Review Board (ARB) and supported on a wide array of system architectures and operating systems. It is generally easier to use but less versatile than MPI, though many programmers use a hybrid approach in which shared-memory parallelism is achieved via OpenMP and distributed-memory communication is achieved through MPI.
OpenMP programs use "fork-join" parallelism, where the program executes sequentially until directed to create parallel ("slave") threads. When a parallel section of the program is encountered, such as a large do or for loop, the iterations of the loop can be split up and executed in parallel. The parallel section can then be exited, after which sequential execution resumes until the next parallel section is encountered. The user can change the number of parallel threads for each parallel section. Note that this is fundamentally different from MPI, where all processes execute from the beginning of the program to the end.
OpenMP is implemented primarily through compiler directives, which can be easily added to an existing serial C or Fortran program. The directives look like comment statements, and the compiler can be told to ignore or activate them. This means that the conversion of a sequential code to one that takes full advantage of OpenMP can be done in stages, and at every step the user has a working code with improved performance. OpenMP also provides library functions and environment variables that can be used in a program.
OpenMP directives are specially formatted comments (in Fortran) or pragmas (in C and C++) that specify parallelism. In C or C++, they begin with the #pragma omp sentinel; in Fortran, they begin with the !$OMP, C$OMP, or *$OMP sentinel. A variety of directives are available, each producing different thread behavior.
Directive behavior can be adjusted through the use of clauses, such as (but not limited to) the following:
OpenMP provides a few simple functions that can be called in a program to guide execution:
|Function|Description|
|---|---|
|omp_get_num_threads()|Returns the number of threads in the current team|
|omp_get_thread_num()|Returns the ID (0 to n-1) of the calling thread|
|omp_get_num_procs()|Returns the number of processors available on the machine|
|omp_in_parallel()|Returns true if called inside a parallel region in which multiple threads are executing|
|omp_set_num_threads(n)|Sets the number of threads for subsequent parallel regions to n|
Shared memory can be difficult to manage when multiple threads need to edit the same shared variable. For example, assume that a program uses two threads (labeled T1 and T2) and each is supposed to increment (add 1 to) the variable x. Then each thread needs to read the current value of x, calculate its new value, and write that value to x. However, the order in which these steps occur - which cannot be controlled by the program - affects the resulting value of x:
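Ordering 1 (the threads run one after the other; the final value is x=2):
1. T1 reads x=0
2. T1 calculates x=0+1=1
3. T1 writes x=1
4. T2 reads x=1
5. T2 calculates x=1+1=2
6. T2 writes x=2

Ordering 2 (the threads interleave; the final value is x=1, so one increment is lost):
1. T1 reads x=0
2. T2 reads x=0
3. T1 calculates x=0+1=1
4. T2 calculates x=0+1=1
5. T1 writes x=1
6. T2 writes x=1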
This is an example of a "race condition" - where two threads are "racing" to complete a task and the order in which they finish affects the result of the program. OpenMP has some mechanisms to avoid these situations, as described below. Note, however, that these mechanisms serialize portions of the program, which reduces the performance benefit of multithreading, so their use should be minimized when possible.
To compile an OpenMP program, you need to add flags to your compiler command:
The OpenMP Architecture Review Board (ARB) provides instructions for a wide array of other compilers.
To compile these examples, see the Compiling section, above. Note that you will have to load the compiler module in order for the compiler command to work; for instructions, see the Software and Compilers section, or the Examples section, of the documentation for the system that you want to compile on.