Task: Write a program that uses blocking and non-blocking operations, according to the assignment variant. Make sure the operations are executed across several processes: the distribution of the initial data must use non-blocking operations, and the collection of the results must use blocking operations.
b=min(A+C)

UPD:
Reworked the code so that the rank is checked only once.

#include <mpi.h>
#include <iostream>
#include <cstdlib>
#include <ctime>
#include <cfloat>
#include <algorithm>

#define N 13

int main(int argc, char* argv[])
{
    int rank, size;
    double* A = nullptr;
    double* C = nullptr;
    double localMin = DBL_MAX;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int pSize = N / size;      // Number of data points per process
    int remainder = N % size;  // Distribute remaining data points evenly

    // Consider remaining data points
    if (rank < remainder) {
        pSize++;
    }

    std::cout << "Proc" << rank << ":  pSize=" << pSize << std::endl;

    // Requests for the asynchronous sends are only needed on the root,
    // so start with null pointers and allocate inside the rank-0 branch
    // (delete[] on a null pointer is a safe no-op).
    MPI_Request* requestA = nullptr;
    MPI_Request* requestC = nullptr;

    if (rank == 0) {
        // Master process: generate the data and distribute it (asynchronous)
        requestA = new MPI_Request[size - 1];
        requestC = new MPI_Request[size - 1];
        srand((unsigned)time(0));
        A = new double[N];
        C = new double[N];
        for (int i = 0; i < N; i++) {
            A[i] = (rand() % 20) / 2.;
            C[i] = (rand() % 20) / 2.;
            std::cout << i << ". sum:" << A[i] + C[i] << std::endl;
        }

        int offset = pSize;
        for (int i = 1; i < size; i++) {
            int send_count = (remainder == 0 || i < remainder) ? pSize : pSize - 1;
            MPI_Isend(A + offset, send_count, MPI_DOUBLE, i, 0, MPI_COMM_WORLD, &requestA[i - 1]);
            MPI_Isend(C + offset, send_count, MPI_DOUBLE, i, 0, MPI_COMM_WORLD, &requestC[i - 1]);
            offset += send_count;
        }

        // Compute rank 0's own portion while the sends are still in flight;
        // the first pSize elements are not part of any pending MPI_Isend.
        double globalMin = localMin;
        for (int i = 0; i < pSize; i++) {
            globalMin = std::min(globalMin, A[i] + C[i]);
        }

        // Ensure all asynchronous sends completed before the buffers are freed
        MPI_Waitall(size - 1, requestA, MPI_STATUSES_IGNORE);
        MPI_Waitall(size - 1, requestC, MPI_STATUSES_IGNORE);

        for (int i = 1; i < size; i++) {
            double receivedMin;
            // Collect data from other processes (blocking)
            MPI_Recv(&receivedMin, 1, MPI_DOUBLE, i, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            globalMin = std::min(globalMin, receivedMin);
        }
        std::cout << "Minimum min(A+C) = " << globalMin << std::endl;
    }
    else {
        // Worker processes: blocking receive of the A and C blocks
        A = new double[pSize];
        C = new double[pSize];
        MPI_Recv(A, pSize, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        MPI_Recv(C, pSize, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

        for (int i = 0; i < pSize; i++) {
            double temp = A[i] + C[i];
            localMin = std::min(localMin, temp);
        }

        // Send local result of the process to master
        MPI_Send(&localMin, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
    }

    delete[] A;
    delete[] C;
    delete[] requestA;
    delete[] requestC;
    MPI_Finalize();
    return 0;
}


What I have tried:

I've tried everything and I don't know what's wrong.
Updated 11-Nov-23 0:33am
Comments
merano99 10-Nov-23 15:11pm    
"UPD: the teacher said that there should be only one check if(rank==0) - is it realistic to do so?"
To implement an MPI program with only one check if(rank==0), you could execute the same code in both the if and else branches. I think this is (gross) nonsense, but of course it would work.
The program will be longer, more confusing and probably more difficult to maintain, but if there is a good grade for it, I wouldn't argue with him.
Of course, it could also be that the teacher wants to see if you can do it yourself, because no one will do that for you.
w4de 11-Nov-23 6:35am    
I've updated the code in the thread. Please check whether I've implemented the single rank check correctly, and whether there are errors anywhere else in the code. Thank you for helping me.
merano99 11-Nov-23 9:46am    
It seems to work for now, but there are still shortcomings, here are some examples:
1. When declaring the MPI_Request variables, memory and effort should be reduced.
2. The calculation of globalMin should be implemented in such a way that time is not wasted.
3. Many comments are missing that could improve understanding.
4. Almost all checking of return values is missing

And also remember to confirm all solutions that were useful, unfortunately this has been forgotten so far.
w4de 15-Nov-23 6:02am    
He asked: which command for receiving data should work in parallel with Isend? And he said that the collection on rank 0 with the help of blocking calls should be............
w4de 16-Nov-23 1:46am    
help please

Well, this "question" is unanswerable as there is nothing to work with at all. "It's still broken in some way" (whatever IT is) is not anything anyone can work with.
 
You have tried to implement a tiny part of the suggestions, but unfortunately neither completely nor correctly.

The process with rank 0, for example, is usually responsible for distributing the task.
You are now trying to distribute N data items to size processes as follows.

C++
if (!rank)
{
    ...
    if (rank < remainder) {
        pSize++;
    }


It already starts with the fact that writing if(!rank) is poor style. And the logic probably goes wrong here too: what value does pSize get, and how much data is actually distributed, if N=20 and size=8?

I had suggested calculating pSize per process ...

Furthermore, the process with rank 0 should do its share of the work at the same time as all the others. In your version it first waits for everyone, and only then does process 0 start computing. That is simply wasteful.

C++
MPI_Waitall(3, request, MPI_STATUSES_IGNORE);
    
for (int i = (size - 1) * pSize; i < N; i++) {
   double temp = A[i] + C[i];
   localMin = min(localMin, temp);
}


I had suggested here

https://www.codeproject.com/Answers/5370748/I-dont-understand-how-to-fix-the-MPI-code-for-the#answer2

a (usual) MPI flow that avoids code redundancy, keeps everything parallel, and would avoid the need for Wait entirely. It would be good to follow that route.

I can therefore only refer to my suggestion again. Your program has so many quirks that it is no surprise the teacher is not satisfied with it.

// edit:
There are various problems with your design. Since more help seems to be needed here, I'll provide a slightly longer draft. I have intentionally left out some details so that it does not become a copy & paste solution. The design distributes a possible remainder to the first processes, lets process 0 take part in the computation, does the sending asynchronously, and the receiving with blocking calls. I tested the following framework with mpic++ under Ubuntu.
C++
#include <mpi.h>
#include <iostream>
#include <cstdlib>
#include <ctime>
#include <cfloat>
#include <algorithm>
using namespace std;

#define N 13

int main(int argc, char* argv[])
{
    int rank, size;
    double *A, *C;
    double localMin = DBL_MAX;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    srand((unsigned)time(0));

    // Master process: Initialization of the data
    if (rank == 0) {
        A = new double[N];
        C = new double[N];

        for (int i = 0; i < N; i++) {
            A[i] = (rand() % 20) / 2.;
            C[i] = (rand() % 20) / 2.;
            cout << i << ". sum:" << A[i] + C[i] << endl;
        }
    }

    int pSize = N / size;      // Number of data points per process
    int remainder = N % size;  // Distribute remaining data points evenly

    // Consider remaining data points
    if (rank < remainder) {
        pSize++;
    }

   cout << "Proc" << rank << ":  pSize=" << pSize << endl;

   // Master process: distribution of data to processes (asynchronous)
   // Note: Alternatively, you could also use scatter
   // Note: variable-length arrays are a compiler extension (accepted by
   // mpic++/g++); portable C++ would use std::vector<MPI_Request>.
   MPI_Request requestA[size - 1], requestC[size - 1];
   if (rank == 0) {
        int offset = pSize;
        for (int i = 1; i < size; i++) {
            int send_count = (remainder == 0 || i < remainder) ? pSize : pSize - 1;
            MPI_Isend(A + offset, ...);
            MPI_Isend(C + offset, ...);
            offset += send_count;
        }
    }
    else {
        // Blocking receive from A and C
        A = new double[pSize];
        C = new double[pSize];
        MPI_Recv(A, pSize, ...);
        MPI_Recv(C, pSize, ...);
    }

    for (int i = 0; i < pSize; i++) {
        double temp = A[i] + C[i];
        localMin = min(localMin, temp);
    }

    // Collecting the local minima (blocking)

    // 1. with MPI_Reduce (an optimized solution was not wanted here)
    
    // 2. with blocking MPI_Send and MPI_Recv instead of MPI_Reduce
    if (rank == 0) {
        // Wait for completion of the MPI_Isend processes (optional here)
        MPI_Waitall(size - 1, requestA, MPI_STATUSES_IGNORE);
        MPI_Waitall(size - 1, requestC, MPI_STATUSES_IGNORE);

        double globalMin = localMin;
        for (int i = 1; i < size; i++) {
            double receivedMin;
            // Collect data from other processes (blocking)
            MPI_Recv(&receivedMin, ...);
            globalMin = min(globalMin, receivedMin);
        }
        cout << "Minimum min(A+C) = " << globalMin << endl;
    }
    else {
       // Send local result of the process to master
       MPI_Send(&localMin, 1, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD);
    }

    delete[] A;
    delete[] C;
    MPI_Finalize();
}

With N=13 and 5 processes I get the output:
Proc2:  pSize=3
0. sum:4
1. sum:9
2. sum:15.5
3. sum:13
4. sum:2.5
5. sum:8
6. sum:5
7. sum:13
Proc1:  pSize=3
8. sum:12.5
9. sum:10
10. sum:10.5
11. sum:2.5
12. sum:17.5
Proc0:  pSize=3
Proc4:  pSize=2
Proc3:  pSize=2
Minimum min(A+C) = 2.5
 
Comments
w4de 8-Nov-23 5:04am    
I've updated the thread, please take a look
merano99 8-Nov-23 18:27pm    
I would not assume that it works now, because the calculation if (rank < hvost) { pSize++; } still does not calculate an individual size for each process. I would also not let (size - 1) processes calculate, but let process 0 participate in the calculation. This would also work if there is only one process. We are still missing concrete questions or at least your description of what does not work from your point of view.
w4de 10-Nov-23 1:48am    
Then how do I calculate the remainder correctly? I honestly don't understand myself what doesn't work, since the instructor just said that the blocking and non-blocking operations don't work correctly....
merano99 10-Nov-23 6:35am    
I have added an extended program framework to get further here.
w4de 10-Nov-23 6:38am    
Ok, I'll try it soon and write about the results. Thank you so much for your help!
If your teacher does not specify the exact issue in your code, it might help to review the code yourself and look for possible errors or improvements. You can consult the link below to figure it out and complete your assignment.
Blocking and Non-Blocking Algorithms – MC++ BLOG[^]
 
 
Comments
merano99 7-Nov-23 18:05pm    
The linked article deals with C++ threads, but the questioner uses MPI, i.e. (distributed) processes - not threads.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


