Using Auto-Parallelization Compiler Options On SGI Systems
On the VT SGI systems, the easiest approach for parallelizing an existing
application is to use the automatic parallelization
option included with the Intel Fortran or C compilers.
For example, to compile the code in the file "my_prog.f" with the Intel Fortran compiler and have it
generate parallel code for the parallelizable loops in the program, enter:
ifort my_prog.f -fpp -parallel -par_report3
Note: The -fpp option, which invokes the Fortran preprocessor "fpp", is required in addition to the
-parallel option. The -par_report3 option generates diagnostics that indicate which loops were
parallelized and, for those that were not, the reason why.
In a similar fashion you could compile a C program for parallelization:
icc my_prog.c -parallel -par_report3
Note: No separate preprocessor option is required here, because
the icc command runs the preprocessor by default.
Here are some example outputs resulting from using the "-par_report3" option with a test code:
serial loop: line 27: not a parallel candidate due to insufficent work
serial loop: line 35: not a parallel candidate due to statement at line 36
serial loop: line 50: not a parallel candidate due to the loop trip being uncountable
/tmp/ifortfhBt3t.f(18) : (col. 6) remark: LOOP WAS AUTO-PARALLELIZED.
parallel loop: line 18
The above messages indicate that the loop at line 18 was
parallelized, while the loops at lines 27, 35, and 50 were not.
The loop at line 27 did not involve enough computation (typically the
result of a small loop size) to justify parallelization. The loop at
line 35 was not parallelized because the statement at line 36 could not
be parallelized (e.g., a function call or I/O operation). The loop at
line 50 had a trip count that the compiler could not determine in
advance, and thus the compiler could not determine whether the loop was
large enough to justify parallelization.
Note: The line numbers displayed by the "-par_report3"
option will not necessarily match the line numbers in your original file,
because they refer to the output of the Fortran
preprocessor. You can save a copy of the preprocessor output by including the -E option
when you compile your code.
For example, to write the code generated by the preprocessor to the file "my_prog.fpp", you could use the
following command:
ifort my_prog.f -fpp -E >my_prog.fpp
To run the compiled code, first define the OMP_NUM_THREADS environment variable and then
run the executable file. Because the -o option was not
specified in the compilation examples above, the executable is given
the default name "a.out".
Using the Bourne, Korn, or
Bash shells, you could set the environment variable to use four
threads (and thus up to four processors) and execute the compiled program "a.out" by entering the
following pair of commands:
export OMP_NUM_THREADS=4
./a.out
Note: C Shell and C Shell derivatives would use:
setenv OMP_NUM_THREADS 4
./a.out