Tutorial: https://enccs.github.io/intermediate-mpi/mpi-and-threads-pt1/#
Tutorial companion code: https://github.com/ENCCS/intermediate-mpi/tree/master/content/code
Comparing pure MPI (one thread per rank) vs hybrid MPI-threading (multiple threads per rank) solutions. MPI ranks are shown in red boxes. Total memory usage and message cost tend to be lower with the hybrid approach, because threads within a rank can share the same memory. However, realizing those benefits can require further work to reduce contention and eliminate race conditions.
MPI support for threading
Since version 2.0, MPI can be initialized in up to four different ways. The former approach using MPI_Init still works, but applications that wish to use threading should use MPI_Init_thread.
int MPI_Init_thread(int *argc, char ***argv, int required, int *provided)
The argc and argv may be NULL (and generally should be). required describes the level of threading support that is requested, and the value returned in *provided describes the level that the MPI runtime was able to provide. If this is not the level required, the program should inform the user and either use threading only at the level provided, or call MPI_Finalize and e.g. exit().
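A minimal sketch of this pattern follows; the requested level MPI_THREAD_FUNNELED (one of the levels described below) is just an example choice. The MPI standard defines the threading-level constants in increasing order, so a simple numeric comparison of *provided against the required level works:

#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    /* Request FUNNELED support; the runtime reports what it can actually provide. */
    int required = MPI_THREAD_FUNNELED;
    int provided;
    MPI_Init_thread(NULL, NULL, required, &provided);

    if (provided < required)
    {
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0)
            fprintf(stderr, "Requested threading level %d, but MPI provides only %d\n",
                    required, provided);
        MPI_Finalize();
        exit(EXIT_FAILURE);
    }

    /* ... hybrid MPI + threading work goes here ... */

    MPI_Finalize();
    return 0;
}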
The following threading levels are generally supported:
- MPI_THREAD_SINGLE - the rank is not allowed to use threads, which is basically equivalent to calling MPI_Init. With MPI_THREAD_SINGLE, the rank may use MPI freely and will not use threads.
- MPI_THREAD_FUNNELED - the rank can be multi-threaded, but only the main thread may call MPI functions. Ideal for fork-join parallelism such as that used in #pragma omp parallel, where all MPI calls are outside the OpenMP regions (see the sketch after this list).
- MPI_THREAD_SERIALIZED - the rank can be multi-threaded, but only one thread at a time may call MPI functions. The rank must ensure that MPI is used in a thread-safe way. One approach is to ensure that MPI usage is mutually excluded by all the threads, e.g. with a mutex. With MPI_THREAD_SERIALIZED, the rank can use MPI from any thread so long as it ensures the threads synchronize such that no thread calls MPI while another thread is doing so.
- MPI_THREAD_MULTIPLE - the rank can be multi-threaded and any thread may call MPI functions. The MPI library ensures that this access is safe across threads. Note that this makes all MPI operations less efficient, even if only one thread makes MPI calls, so it should be used only where necessary. With MPI_THREAD_MULTIPLE, the rank can use MPI from any thread; the MPI library ensures the necessary synchronization.
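As referenced in the MPI_THREAD_FUNNELED item above, a minimal sketch of the fork-join pattern (assuming OpenMP; work() is a hypothetical placeholder for the per-element computation): all MPI calls stay outside the OpenMP region, so only the main thread touches MPI.

#include <mpi.h>
#include <omp.h>
#include <stdio.h>

/* Hypothetical placeholder for the per-element computation done by the threads. */
static double work(int i) { return (double)i; }

int main(void)
{
    int provided;
    MPI_Init_thread(NULL, NULL, MPI_THREAD_FUNNELED, &provided);
    /* A real code should check that provided >= MPI_THREAD_FUNNELED (see above). */

    const int n = 1000000;
    double local = 0.0;

    /* Threads compute a local partial sum; no MPI calls inside the OpenMP region. */
#pragma omp parallel for reduction(+:local)
    for (int i = 0; i < n; ++i)
        local += work(i);

    /* Back to a single thread: only the main thread calls MPI. */
    double global = 0.0;
    MPI_Allreduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0)
        printf("global sum = %g\n", global);

    MPI_Finalize();
    return 0;
}

Compiling such a sketch needs both the MPI wrapper and the OpenMP flag, e.g. mpicc -fopenmp, as in the build commands shown below.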
Querying the MPI runtime threading level
When writing a library, sometimes MPI will be initialized outside your code. If you wish to use threading, you have to honor the requirements established at the time MPI was initialized (or give an error). This can be done with MPI_Query_thread.
int MPI_Query_thread(int *provided)
The value returned in *provided describes the level that the MPI runtime is providing. If this is not the level required, the library should inform the user and either use threading only at the level provided, or return an error to its caller.
It is possible to influence the threading support available from some MPI implementations with environment variables, so it can be wise to use such a query even if your code is managing the call to MPI_Init_thread.
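A minimal library-side sketch (the function name mylib_check_threading and the required level MPI_THREAD_SERIALIZED are hypothetical examples):

#include <mpi.h>
#include <stdio.h>

/* Hypothetical library entry point: returns 0 if the runtime's threading
   level is sufficient for this library, -1 otherwise. MPI must already
   have been initialized by the caller. */
int mylib_check_threading(void)
{
    int provided;
    MPI_Query_thread(&provided);

    if (provided < MPI_THREAD_SERIALIZED)  /* example requirement */
    {
        fprintf(stderr,
                "mylib: MPI_THREAD_SERIALIZED or better is required (got %d)\n",
                provided);
        return -1;
    }
    return 0;
}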
main thread
Similarly, MPI regards the thread that called MPI_Init_thread as the main thread for the purpose of MPI_THREAD_FUNNELED. If your code needs to identify that thread (e.g. to ensure that calls to your library happen from that thread, so that your library can use MPI), then you need to call MPI_Is_thread_main.
int MPI_Is_thread_main(int *flag)
A boolean value is returned in *flag to indicate whether the thread that called MPI_Is_thread_main is the main thread, i.e. the one that called MPI_Init_thread.
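A minimal sketch of such a guard (the function name mylib_on_main_thread is a hypothetical example):

#include <mpi.h>
#include <stdio.h>

/* Hypothetical guard: returns non-zero only when called from the main
   thread, i.e. the thread that called MPI_Init_thread. */
int mylib_on_main_thread(void)
{
    int flag;
    MPI_Is_thread_main(&flag);
    if (!flag)
        fprintf(stderr, "mylib: this call must come from the main MPI thread\n");
    return flag;
}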
code example
code: https://github.com/ENCCS/intermediate-mpi/tree/master/content/code/day-4/00_threading-query
Try to compile with:
mpicc -g -Wall -fopenmp -std=c11 threading-query.c -o threading-query
When you have the code compiling, try to run with:
mpiexec -np 2 ./threading-query
code example
code: https://github.com/ENCCS/intermediate-mpi/tree/master/content/code/day-4/10_integrate-pi
Try to compile with:
mpicc -g -Wall -fopenmp -std=c11 pi-integration.c -o pi-integration
Try to run with:
export OMP_NUM_THREADS=2
mpiexec -np 2 ./pi-integration 10000000