https://mpitutorial.com/tutorials/mpi-send-and-receive/
Overview of sending and receiving with MPI
A->B
MPI’s send and receive calls operate in the following manner. First, process A decides a message needs to be sent to process B. Process A then packs up all of its necessary data into a buffer for process B. These buffers are often referred to as envelopes since the data is being packed into a single message before transmission (similar to how letters are packed into envelopes before transmission to the post office). After the data is packed into a buffer, the communication device (which is often a network) is responsible for routing the message to the proper location. The location of the message is defined by the process’s rank.
Even though the message is routed to B, process B still has to acknowledge that it wants to receive A's data. Once it does this, the data has been transmitted. Process A is then notified that the data has been transmitted and may go back to work.
More precisely: the MPI specification says that MPI_Send blocks until the send buffer can be reclaimed. This means that MPI_Send will return as soon as the network (or the MPI implementation) can buffer the message. If a send cannot be buffered, it will block until a matching receive is posted. However, a big enough network buffer should never be assumed.
Sometimes there are cases when A might have to send many different types of messages to B. Instead of B having to go through extra measures to differentiate all these messages, MPI allows senders and receivers to also specify message IDs with the message (known as tags). When process B only requests a message with a certain tag number, messages with different tags will be buffered by the network until B is ready for them.
MPI_Send(
    void* data,
    int count,
    MPI_Datatype datatype,
    int destination,
    int tag,
    MPI_Comm communicator)

MPI_Recv(
    void* data,
    int count,
    MPI_Datatype datatype,
    int source,
    int tag,
    MPI_Comm communicator,
    MPI_Status* status)
The first argument is the data buffer. The second and third arguments describe the count and type of elements that reside in the buffer. MPI_Send sends the exact count of elements, and MPI_Recv will receive at most the count of elements (more on this in the next lesson). The fourth and fifth arguments specify the rank of the sending/receiving process and the tag of the message. The sixth argument specifies the communicator, and the last argument (for MPI_Recv only) provides information about the received message.
Elementary MPI datatypes
The MPI_Send
and MPI_Recv
functions utilize MPI Datatypes as a means to specify the structure of a message at a higher level.
MPI datatype | C equivalent |
---|---|
MPI_SHORT | short int |
MPI_INT | int |
MPI_LONG | long int |
MPI_LONG_LONG | long long int |
MPI_UNSIGNED_CHAR | unsigned char |
MPI_UNSIGNED_SHORT | unsigned short int |
MPI_UNSIGNED | unsigned int |
MPI_UNSIGNED_LONG | unsigned long int |
MPI_UNSIGNED_LONG_LONG | unsigned long long int |
MPI_FLOAT | float |
MPI_DOUBLE | double |
MPI_LONG_DOUBLE | long double |
MPI_BYTE | char |
In a later lesson, you will learn how to create your own MPI datatypes for characterizing more complex types of messages.
Example program: ring
In this example, a value is passed around by all processes in a ring-like fashion.
// Author: Wes Kendall
// Copyright 2011 www.mpitutorial.com
// This code is provided freely with the tutorials on mpitutorial.com. Feel
// free to modify it for your own use. Any distribution of the code must
// either provide a link to www.mpitutorial.com or keep this header intact.
//
// Example using MPI_Send and MPI_Recv to pass a message around in a ring.
//
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char** argv) {
  // Initialize the MPI environment
  MPI_Init(NULL, NULL);
  // Find out rank, size
  int world_rank;
  MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
  int world_size;
  MPI_Comm_size(MPI_COMM_WORLD, &world_size);
  // The ring program initializes a value from process zero, and the value is
  // passed around every single process. The program terminates when process
  // zero receives the value from the last process.
  // MPI_Send and MPI_Recv will block until the message has been transmitted.
  // Because of this, the printfs should occur in the order in which the value
  // is passed.
  // What "block" means here: when A sends a message to B, A blocks in
  // MPI_Send until the message has been delivered (or buffered), and only
  // then continues with the code after MPI_Send.
  // How to read this code: reason from each rank's point of view.
  // First assume world_rank != 0: the process blocks in MPI_Recv until it
  // receives the token from its predecessor, then sends the token to its
  // successor, blocking in MPI_Send until the successor has received it,
  // and then it is done.
  // Then assume world_rank == 0: it sends the token to its successor first,
  // then blocks in MPI_Recv waiting for the token from its predecessor;
  // once the token arrives, it is done.
  int token;
  // Receive from the lower process and send to the higher process. Take care
  // of the special case when you are the first process to prevent deadlock.
  if (world_rank != 0) {
    MPI_Recv(&token, 1, MPI_INT, world_rank - 1, 0, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);
    printf("Process %d received token %d from process %d\n", world_rank, token,
           world_rank - 1);
  } else {
    // Set the token's value if you are process 0
    token = -1;
  }
  MPI_Send(&token, 1, MPI_INT, (world_rank + 1) % world_size, 0,
           MPI_COMM_WORLD);
  // Now process 0 can receive from the last process. This makes sure that at
  // least one MPI_Send is initialized before all MPI_Recvs (again, to prevent
  // deadlock)
  if (world_rank == 0) {
    MPI_Recv(&token, 1, MPI_INT, world_size - 1, 0, MPI_COMM_WORLD,
             MPI_STATUS_IGNORE);
    printf("Process %d received token %d from process %d\n", world_rank, token,
           world_size - 1);
  }
  MPI_Finalize();
}
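Assuming an MPI implementation such as MPICH or Open MPI is installed, and the source is saved as ring.c (the file and binary names here are placeholders), the program can be compiled and run with:

```shell
mpicc ring.c -o ring
mpirun -n 4 ./ring
```

With four processes, the token -1 originates at process 0, each subsequent rank prints the token it received from its predecessor, and the run ends with process 0 receiving the token back from process 3.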