-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add nompi stub #425
base: develop
Are you sure you want to change the base?
Add nompi stub #425
Conversation
@cwsmith, build is failing for this PR. I think we should add |
@flagdanger and I are working on a very large update of PCU that should make this all easier by getting rid of the singleton. Pull request for that coming within the next week. |
@bobpaw sorting out a few things to make the pull request ready to merge, but #388 the draft pull request for replacing the singleton in PCU. Although we replaced the interface used in the test cases, this pull request is designed to work with no modification to existing codes by using a global PCU state when the old style interface is used with |
@jacobmerson, we will plan to use your PR for the future of core, but the sponsor wants this functionality soon, so we will use this branch until yours is finished. |
@bobpaw Does |
No, at the moment it only checks if PCU has been initialized using a global state variable. We could check with |
@bobpaw OK. I'm wondering if it will make sense to make the MPI check in that call, and if it fails call MPI_Init. But... handling MPI_Finalize could be a bit more tricky; other libraries may still be running and using MPI beyond the lifetime of PCU. I'd prefer symmetry in the interface so your original suggestion may be best. |
Added pcu_pnompi.c. Added SCOREC_NO_MPI CMake option. Removed direct MPI calls in parma, phasta, and PUMI. Added incomplete reimplementations of those MPI calls.
Add stubs for MPI_Init and MPI_Finalize. Add pcu_pnompi_types.h for MPI typedefs. Add status message about SCOREC_NO_MPI flag. Signed-off-by: Aiden Woodruff <[email protected]>
Signed-off-by: Aiden Woodruff <[email protected]>
Replace MPI_Comm_free with PCU_Comm_Free_One. Signed-off-by: Aiden Woodruff <[email protected]>
Signed-off-by: Aiden Woodruff <[email protected]>
Signed-off-by: Aiden Woodruff <[email protected]>
This will avoid object collisions when linking to libraries that already use MPI. - i.e. the intended use case for SCOREC_NO_MPI. Signed-off-by: Aiden Woodruff <[email protected]>
Changed add_nompi_msg to record receiver. - Since there's no MPI, sender == receiver but I want to quiet the warnings. Signed-off-by: Aiden Woodruff <[email protected]>
Add return type for pcu_pmpi_free, pcu_pmpi_split. Signed-off-by: Aiden Woodruff <[email protected]>
Add linked list traversal to find a matching message. Removed error message that did not actually indicate errors. - If no message is found it's perfectly reasonable and in fact correct to return 0 and that's how pmpi knows to return false. Signed-off-by: Aiden Woodruff <[email protected]>
Run smoke tests without mpirun given SCOREC_NO_MPI. Add SCOREC_NO_MPI to the GitHub Actions test matrix. Signed-off-by: Aiden Woodruff <[email protected]>
Signed-off-by: Aiden Woodruff <[email protected]>
I use clock_gettime, which is one of the many methods mpich uses depending on the system. Signed-off-by: Aiden Woodruff <[email protected]>
Also update TriBITS package file. Signed-off-by: Aiden Woodruff <[email protected]>
- Add mpi_test_depends function to replace set_test_properties. This function checks if SCOREC_NO_MPI is defined and assumes nonexistent tests were multiproc ones. Signed-off-by: Aiden Woodruff <[email protected]>
Signed-off-by: Aiden Woodruff <[email protected]>
- parma/group/parma_group.cc(runInGroups): remove PCU/MPI free that will be carried out by pcu::PCU::~PCU. - test/xgc_split.cc: remove direct mpi.h inclusion - test/pumiLoadMesh.cc: remove direct mpi.h inclusion. - test/modelInfo.cc: remove direct #include <mpi.h>. - phasta/phstream.cc(::getTime): use new pcu::Time. - test/fieldReduce.cc: use correct PCUObj.Peers. - pumi/pumi_sys.cc(pumi_sync): use PCU Barrier instead of MPI Barrier. - pcu/PCU.h: make PCU_Comm (holdover) functions extern C and PCU_Wtime. - pumi/pumi_ghost.cc(do_off_part_bridge): fix typo (forgot parens). - pumi/pumi_mesh.cc: capitalize GetMPIComm. - pumi/pumi_numbering(pumi_numbering_print): use pcu::PCU::Barrier instead of WORLD barrier. - pcu/PCU_C.h: optionally include mpi or stub. - pcu/pcu_c.cc: add split stub and include pcu_pmpi.h. - pcu/pcu_pmpi.c: remove leftover conflict markers. - pcu/pcu_pnompi.h: update function signatures, make extern C, and remove obsolete stubs. - pcu/pcu_pnompi.c: remove globals and use pcu_mpi_t*. - clean up formatting. - remove unused or (void)ed argument names. Signed-off-by: Aiden Woodruff <[email protected]>
- pcu/PCU.cc(PCU::~PCU): clean up message order if it exists (valgrind is happy now). Signed-off-by: Aiden Woodruff <[email protected]>
- pcu/pcu_pmpi.c: remove pcu_pmpi_switch, pcu_pmpi_comm which were used by old nompi stubs. - (pcu_pmpi_allgather, pcu_pmpi_allreduce): break long lines and clean up indentation. Signed-off-by: Aiden Woodruff <[email protected]>
- pcu/pcu_pnompi.c: restore function arguments in definitions that were removed in b76d8f1. Turns out they are required in C and the local compiler was not flagging it. Signed-off-by: Aiden Woodruff <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thank you. There are a few comments below.
Yes, I agree, we should squash-merge.
#include "pcu_pnompi_types.h" | ||
// Remove MPI calls. | ||
#define MPI_Init(argc, argv) do { \ | ||
(void) argc; \ | ||
(void) argv; \ | ||
} while (0) | ||
#define MPI_Finalize(void) | ||
#ifdef __cplusplus | ||
extern "C" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the scope of these macros? We still want to let users of PUMI run their application in parallel even if PUMI is only built for serial execution.
I think we should add a test that builds an MPI hello world app and links against a 'SCOREC_NO_MPI=on' build, and loads a serial pumi mesh etc.. This will help ensure that the symbols don't leak in the long run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a really good point. Right now, those macros would spread to every source that includes PCU.h. I think a simple #ifndef MPI_COMM_WORLD
may suffice in these cases. Then PUMI tests will use the stub MPI_Init and end-users can #include <mpi.h>
before PCU.h and they will get the real versions.
I will work on making a test case so that we can observe the negative first before trying that fix.
Also, it would require a major notice to end-users (since include order is not usually important). What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When building with SCOREC_NO_MPI=off then I assume that everything that worked prior to this PR would continue to work. Is that correct? If so, then I think we could count this as a new 'feature' with specific requirements (i.e., include order, no mpi headers, etc.).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add a test that builds an MPI hello world app and links against a 'SCOREC_NO_MPI=on' build, and loads a serial pumi mesh etc.
Should this be a GitHub action, similar to building the doc/
folder?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I see the Developer Guide on the Wiki which I will update after merging this PR. Is the user guide LaTeX anywhere? A quick Ctrl+F doesn't show any reference to any of the SCOREC_*
build options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a GitHub action, similar to building the doc/ folder?
Yes.
latex
The latex files for the user guide are here:
#include "pcu_pnompi_types.h" | ||
// Remove MPI calls. | ||
#define MPI_Init(argc, argv) do { \ | ||
(void) argc; \ | ||
(void) argv; \ | ||
} while (0) | ||
#define MPI_Finalize(void) | ||
#ifdef __cplusplus | ||
extern "C" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same scope question as in PCU.h
typedef int MPI_Comm; | ||
typedef int MPI_Request; | ||
typedef int MPI_Datatype; | ||
typedef int MPI_Op; | ||
#define MPI_COMM_WORLD 0 | ||
#define MPI_ANY_SOURCE -1 | ||
#define MPI_SUM 1 | ||
#define MPI_INT 2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same scope question as in PCU.h
@@ -11,7 +11,6 @@ | |||
#include <unistd.h> | |||
#include <iostream> | |||
#include <algorithm> | |||
#include <mpi.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If SCOREC_NO_MPI=ON, then unless the user is compiling with the mpi wrappers, then source code and tests cannot #include <mpi.h>
. Is that correct? If so, we should document that in the users and developers guide and that no MPI functions should be called directly within PUMI source.
- test/smokeTesting.cmake: indent if statements - test/testing.cmake: indent if statements in mpi_test. Signed-off-by: Aiden Woodruff <[email protected]>
- CMakeLists.txt: update SCOREC_NO_MPI option helpstring. Co-authored-by: Cameron Smith <[email protected]>
Before I make my example project I have a feeling I will need to add a config file, called something like |
We should probably just make a PUMI_config.h file (like this) and the first entry in it can be the |
- example/: new directory for example projects. I will move doc/ here. - example/mpi-nompi/: new example to test an mpi app using a nompi build of SCOREC. - README.md: describe example. - hello.cc: print size/rank info and then create a box mesh like in the first example. - CMakeLists.txt: add build code (from doc/ example) and tests on 1, 2, and 4 procs. - add tests for exact output. - .github/workflows/cmake.yml: add GitHub action for the mpi-nompi example. Signed-off-by: Aiden Woodruff <[email protected]>
Signed-off-by: Aiden Woodruff <[email protected]>
bf506a7
to
5218f25
Compare
Signed-off-by: Aiden Woodruff <[email protected]>
Signed-off-by: Aiden Woodruff <[email protected]>
- example/mpi-nompi/hello.cc: add allreduce sum code. - example/mpi-nompi/CMakeLists.txt: update test regexes. Signed-off-by: Aiden Woodruff <[email protected]>
Add nompi stub
Add a stub implementation of MPI in
pcu/pcu_pnompi.c
.Remove direct calls to MPI functions.
This adds a
SCOREC_NO_MPI
build option for users running on single-process, single-machine setups that do not want to compile with MPI. Message passing is simulated in memory.