OpenMP Overview
This document is intended as a true introduction. In other words,
it is a short presentation of enough information to allow you to decide
whether to pursue an interest in OpenMP.
OpenMP enables parallel programming: OpenMP is a system for setting up
a computation to be carried out in parallel. OpenMP is designed
to take advantage of the current trend in chip design, which places
multiple cores on a single chip. Each core can rapidly execute a part of a common
program, working on data held in memory that all the cores share.
OpenMP works on shared memory systems: In a shared memory system,
a number of cores on a single chip cooperate to process a common set of data.
Execution can be fast, because communication and data access are very local.
But the number of parallel threads that can usefully cooperate is limited by
the number of cores on the chip. Currently, dual and quad core chips are common.
Chips with more cores are coming soon; there are also tightly coupled
multichip systems which OpenMP can treat as a single system.
OpenMP is implemented through compiler directives: the necessary
directives can be added to an existing C, C++, or Fortran program.
In Fortran the directives look like comment statements, and in C and C++
they are #pragma lines; in either case the compiler can be
told to ignore or activate them. Thus, a single file can embody both
the original sequential and the parallel versions of a program.
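As a minimal sketch of this idea in C, the file below compiles either way. The function names scale_and_add and openmp_enabled are made up for this illustration. With OpenMP switched on (for example, the -fopenmp option of gcc), the loop is shared among threads; without it, the #pragma line is ignored and the same loop runs sequentially.

```c
#include <assert.h>

/* Returns 1 if this file was compiled with OpenMP enabled, else 0.
   _OPENMP is a macro that OpenMP-aware compilers define when the
   directives are active. */
int openmp_enabled(void)
{
#ifdef _OPENMP
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: computes y[i] = a * x[i] + y[i] for i = 0..n-1.
   Each iteration touches a different y[i], so iterations are independent. */
void scale_and_add(int n, double a, const double *x, double *y)
{
    assert(n >= 0);

    /* Active only when OpenMP is enabled; otherwise treated as an
       unknown pragma and skipped. */
#pragma omp parallel for
    for (int i = 0; i < n; i++) {
        y[i] = a * x[i] + y[i];
    }
}
```

Because every iteration writes to its own element of y, the sequential and parallel builds should produce the same values.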
The main thread "gets help" from other threads: In the OpenMP system,
the program begins executing sequentially, that is, with a single thread.
When a parallel region of the program is encountered, such as a large
do or for loop marked with a directive, the iterations of the loop are split up
and executed in parallel by multiple threads. Sequential execution by a
single thread then resumes until the next parallel region is encountered.
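The pattern just described might be sketched as follows. The function fill_squares is a hypothetical name chosen for this example. It starts on one thread, opens a parallel region in which the loop iterations are divided among the team, and returns on one thread again. The team size it reports is 1 when the file is compiled without OpenMP.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: stores i*i in out[i] for i = 0..n-1 and returns
   the number of threads that shared the work. */
int fill_squares(int n, double *out)
{
    int team_size = 1;   /* sequential part: a single thread is running */

    assert(n >= 0);

#pragma omp parallel
    {
#ifdef _OPENMP
        /* One thread records the team size; the others wait at the
           implicit barrier that ends the single construct. */
#pragma omp single
        team_size = omp_get_num_threads();
#endif

        /* The iterations are split among the threads of the team. */
#pragma omp for
        for (int i = 0; i < n; i++) {
            out[i] = (double)i * (double)i;
        }
    }

    /* The parallel region has ended; only the main thread continues. */
    return team_size;
}
```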
Data is shared by default: the program data is
available in a single location, which means there is very little
communication overhead. Each thread is free to read from and write
to the common data area.
The potential for delays or conflicts must be handled by the
programmer, who can remove some data from the common area, or
control the manner in which it is accessed.
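Both remedies can be illustrated with standard OpenMP clauses. In this sketch (mean_square is a hypothetical function name), the scratch variable t is removed from the common area with private, so each thread gets its own copy, and access to the running total sum is controlled with reduction, so each thread accumulates a partial sum that is combined at the end of the loop.

```c
#include <assert.h>

/* Hypothetical helper: returns the mean of x[i]*x[i] over i = 0..n-1. */
double mean_square(int n, const double *x)
{
    double sum = 0.0;   /* shared total, updated through a reduction */
    double t;           /* scratch value, made private to each thread */

    assert(n > 0);

#pragma omp parallel for private(t) reduction(+ : sum)
    for (int i = 0; i < n; i++) {
        t = x[i] * x[i];
        sum += t;
    }

    return sum / n;
}
```

Without the two clauses, threads could overwrite each other's t or lose updates to sum. Note that a reduction may add the terms in a different order than the sequential loop does, so results can differ slightly in the last digits for general floating-point data.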
Converting an Existing Code Can Be Done in Steps: The OpenMP
compiler directives are most often applied to loops. The user can add these
directives to a single loop at a time and experiment with the settings.
This means that the conversion of a sequential code to one that
takes full advantage of OpenMP can be done in stages, and at every
step the user has a working code, and, one hopes, a code with
improved performance. In some cases, a loop may need to be rewritten
to remove data dependencies that are hampering OpenMP. If this work
is necessary, the revised code should still execute correctly in both
sequential and parallel versions. Thus, OpenMP is a very friendly
environment for experimenting with parallel programming.
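One common kind of rewrite is sketched here, using hypothetical function names. In the first version, each iteration needs the value of x left behind by the previous one, so the iterations cannot safely be split among threads. In the second, x is computed directly from the loop index, which removes the dependency, and the directive can be added.

```c
#include <assert.h>

/* Original form: x is carried from one iteration to the next,
   so the loop must run in order. */
void sample_sequential(int n, double a, double dx, double *y)
{
    double x = a;

    assert(n >= 0);

    for (int i = 0; i < n; i++) {
        y[i] = x * x;
        x = x + dx;
    }
}

/* Rewritten form: x depends only on i, so the iterations are
   independent and may run in any order. */
void sample_parallel(int n, double a, double dx, double *y)
{
    assert(n >= 0);

#pragma omp parallel for
    for (int i = 0; i < n; i++) {
        double x = a + i * dx;   /* declared in the loop, so private */
        y[i] = x * x;
    }
}
```

The rewritten loop still runs correctly when the directive is ignored. The two forms can differ in the last digits for some values of dx, since repeated addition and a single multiplication round differently.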
MPI is an Alternative: Another attractive approach to parallel programming
is called MPI (the Message Passing Interface). MPI is used when a
distributed memory system is available. Such a system usually involves a
cluster of computers connected by a moderately fast communication network.
These systems can involve thousands of processors. MPI programs are typically
very different from the corresponding sequential versions. They often require that the
problem data be distributed across the processors, and that processors
communicate using "messages", which involve calls to send and
receive functions.
For information on using OpenMP on the VT-ARC SGI Systems, see
Parallel Programming Using OpenMP.