Remove MPIData (#270)
* Test segmented buffer

* implement a buffer for alltoall

* fix buffer communication

* implement strategy

* add reporting for communication strats

* add reporting for point to point

* fix bug with NotDistributed

* Change initialisation, add check to all MPI calls PDVec makes.

* MPI workaround

* Make AllToAll the default

* add dot for alltoall

* Fix broken tests

* Docstrings, make sure both strategies are tested

* Fix broken test, more parallelism

* remove print

* more parallelism

* Fix communicator switching in tests

* test that communicator is preserved

* more docstring updates

* some renames, docstring updates

* report wait time

* add show methods to communicators

* More links and explanations

* Small tweak in docstring

* remove MPIData

* remove RMPI, move helpers to toplevel

* scrub all mentions of RMPI

* add missing file

* update MPI.Allreduce overload

* Make sum work on Apple Silicon

* Test on mac

* only include macos for v1

* fix action config

* fix path for macos

---------

Co-authored-by: Joachim Brand <[email protected]>
mtsch and joachimbrand authored Nov 25, 2024
1 parent 83f439f commit 0a137fc
Showing 31 changed files with 130 additions and 1,463 deletions.
8 changes: 5 additions & 3 deletions .github/workflows/actions.yml
@@ -9,8 +9,10 @@ jobs:
strategy:
matrix:
julia-version: ['1', 'nightly', '1.9']
julia-arch: [x64]
os: [ubuntu-latest]
include:
- julia-version: '1'
os: macos-latest
fail-fast: false
steps:
- name: "Checkout"
@@ -19,7 +21,6 @@
uses: julia-actions/setup-julia@v2
with:
version: ${{ matrix.julia-version }}
arch: ${{ matrix.julia-arch }}
- name: "Load cache"
uses: julia-actions/cache@v2
- name: "Build"
@@ -39,7 +40,8 @@
# with Pkg.develop(path="."). using Rimu, KrylovKit, StaticArrays at the end
# ensures everything is precompiled before the MPI job starts.
julia --color=yes --project=test -e "using Pkg; Pkg.instantiate(); Pkg.develop(path=\".\"); Pkg.add(\"MPI\"); Pkg.build(); using MPI; MPI.install_mpiexecjl(); using Rimu, KrylovKit, StaticArrays"
export PATH=$PATH:/home/runner/.julia/bin
export PATH=$PATH:/home/runner/.julia/bin # for linux
export PATH=$PATH:/Users/runner/.julia/bin # for macos
mpiexecjl -n 2 julia --code-coverage=user --depwarn=yes --project=test test/mpi_runtests.jl
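The two `PATH` exports are needed because `MPI.install_mpiexecjl()` places the wrapper script in the Julia depot's `bin` directory, which lives under a different home directory on the Linux and macOS runners. A possible alternative, sketched here under the assumption that MPI.jl's documented `destdir` keyword is used, would be to install the wrapper into a fixed location instead:

```julia
using MPI

# Install the mpiexecjl wrapper into a fixed directory (path chosen for illustration
# only) so that a single PATH entry works on both Linux and macOS runners.
MPI.install_mpiexecjl(destdir="/tmp/julia-bin")
```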
1 change: 0 additions & 1 deletion docs/make.jl
@@ -64,7 +64,6 @@ makedocs(;
"Dict vectors" => "dictvectors.md",
"BitString addresses" => "addresses.md",
"Stochastic styles" => "stochasticstyles.md",
"RMPI" => "RMPI.md",
"I/O" => "rimuio.md",
"Random numbers" => "randomnumbers.md",
"Documentation generation" => "documentation.md",
4 changes: 0 additions & 4 deletions docs/src/API.md
@@ -34,10 +34,6 @@ See [Module `DictVectors`](@ref)

See [Module `StatsTools`](@ref)

## RMPI

See [Module `RMPI`](@ref)

# Index

```@index
43 changes: 0 additions & 43 deletions docs/src/RMPI.md

This file was deleted.

3 changes: 1 addition & 2 deletions docs/src/index.md
@@ -113,8 +113,7 @@ needs to be done for communicating between different processes.
Using MPI parallelism with `Rimu` is easy. MPI is enabled automatically if
[`PDVec`](@ref) is used to store a vector. In that case, data will be stored in a
distributed fashion among the MPI ranks and only communicated between ranks when
necessary. Additional MPI-related functionality is provided by the module [`RMPI`](@ref
Rimu.RMPI).
necessary.
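
As a minimal sketch (the starting address and value are arbitrary), storing a vector in a [`PDVec`](@ref) is all that is needed to distribute it:

```julia
using Rimu

# Under MPI, the entries of a PDVec are spread across the ranks and only
# communicated between ranks when necessary.
dv = PDVec(BoseFS((1, 1, 1, 1)) => 1.0)
```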

## Compatibility

11 changes: 3 additions & 8 deletions docs/src/mpi.md
@@ -5,9 +5,6 @@ MPI should be fairly straightforward. Generally, [`PDVec`](@ref Main.DictVectors
work with MPI automatically, as long as MPI is set up correctly and a few common pitfalls
are avoided.

Rimu includes an unexported module [`RMPI`](@ref Main.Rimu.RMPI), which must be imported to access
additional MPI-related functionality.

## Configuring MPI

When running on a cluster, ensure that MPI.jl is using the system binary. See [the MPI.jl
@@ -16,7 +13,6 @@ documentation](https://juliaparallel.org/MPI.jl/latest/configuration/) for more
It is always a good idea to start your script with a quick test that ensures the MPI is set up correctly. One way to do this is to open with

```julia
using Rimu.RMPI
mpi_allprintln("hello")
```
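
Since these helpers are now provided by `Rimu` itself, no extra import is needed. A slightly fuller startup check might look as follows (sketch; the printed message is illustrative):

```julia
using Rimu

mpi_allprintln("hello")   # each rank prints the message in turn; MPI synchronizing
@mpi_root println("MPI is set up with ", mpi_size(), " rank(s)")
```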

@@ -60,10 +56,9 @@ srun mpi=pmi2 julia --project -tauto script.jl
### Using `@mpi_root`

Take care to not use reducing functions (such as `length`, `sum`, `norm`, ...) inside
[`@mpi_root`](@ref Main.Rimu.RMPI.@mpi_root) blocks. Doing so will only initiate the
distributed reduction on one rank only, which will cause the code to go out of sync and
freeze. As an example, to report the current length of a vector, calculate the length before
the [`@mpi_root`](@ref Main.Rimu.RMPI.@mpi_root) block:
[`@mpi_root`](@ref) blocks. Doing so initiates the distributed reduction on one rank only,
which causes the code to go out of sync and freeze. As an example, to report the current
length of a vector, calculate the length before the [`@mpi_root`](@ref) block:

```julia
len = length(pdvec)
```
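
Spelled out, the safe pattern computes the reduction collectively and restricts only the printing to the root rank (sketch, assuming a distributed `pdvec` already exists):

```julia
len = length(pdvec)                         # collective: every rank participates
@mpi_root println("vector length: ", len)   # printing happens on the root rank only

# By contrast, putting `length(pdvec)` inside the `@mpi_root` block would start the
# distributed reduction on one rank only and leave the other ranks waiting.
```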
2 changes: 1 addition & 1 deletion docs/src/randomnumbers.md
@@ -9,4 +9,4 @@ If you want FCIQMC runs to be reproducible, make sure to seed the RNG with
[Random.seed!](https://docs.julialang.org/en/v1/stdlib/Random/#Random.seed!).

MPI-distributed runs can also be made reproducible by seeding the RNG with
[`Rimu.RMPI.mpi_seed!`](@ref).
[`mpi_seed!`](@ref).
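
A typical pattern looks as follows (sketch; the seed value is arbitrary):

```julia
using Rimu, Random

Random.seed!(17)   # reproducible single-process runs
mpi_seed!(17)      # reproducible MPI runs: each rank is seeded deterministically,
                   # but differently, so the ranks stay statistically independent
```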
8 changes: 3 additions & 5 deletions scripts/BHM-example-mpi.jl
@@ -9,10 +9,8 @@
# [here](https://github.com/joachimbrand/Rimu.jl/blob/develop/scripts/BHM-example-mpi.jl).
# Run it with 2 MPI ranks with `mpirun -n 2 julia BHM-example-mpi.jl`.

# We start by importing `Rimu` and `Rimu.RMPI`, which contains MPI-related
# functionality.
# We start by importing `Rimu`.
using Rimu
using Rimu.RMPI

# Note that it is not necessary to initialise the MPI library, as this is already done
# automatically when Rimu is loaded.
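
A quick way to verify this is the standard MPI.jl query (sketch):

```julia
using Rimu
using MPI

MPI.Initialized()   # true: loading Rimu has already initialised the MPI library
```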
@@ -62,8 +60,8 @@ problem = ProjectorMonteCarloProblem(H;
last_step=10_000
);

# The [`@mpi_root`](@ref Main.Rimu.RMPI.@mpi_root) macro performs an action on the root rank
# only, which is useful for printing.
# The [`@mpi_root`](@ref) macro performs an action on the root rank only, which is useful
# for printing.
@mpi_root println("Running FCIQMC with ", mpi_size(), " rank(s).")

# Finally, we can run the computation.
2 changes: 2 additions & 0 deletions src/DictVectors/communicators.jl
@@ -1,3 +1,5 @@
import Rimu: mpi_rank, mpi_size, mpi_comm

struct CommunicatorError <: Exception
msg::String
end
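
These helpers are now provided at the top level of `Rimu` rather than in `Rimu.RMPI`; a minimal usage sketch:

```julia
using Rimu

mpi_rank()   # 0-based rank of the current process
mpi_size()   # total number of MPI ranks
mpi_comm()   # the MPI communicator used by Rimu (MPI.COMM_WORLD by default)
```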
8 changes: 7 additions & 1 deletion src/DictVectors/pdvec.jl
@@ -186,7 +186,7 @@ function PDVec{K,V,N}(

# This is a bit clunky. If you modify the communicator by hand, you have to make sure it
# knows to hold values of type W. When we introduce more communicators, they should
# probably be constructed by a function, similar to how it's done in RMPI.
# probably be constructed by a function.
IW = initiator_valtype(irule, W)
if isnothing(communicator)
if MPI.Comm_size(MPI.COMM_WORLD) > 1
@@ -531,6 +531,12 @@ function Base.mapreduce(f::F, op::O, t::PDVecIterator; kwargs...) where {F,O}
return merge_remote_reductions(t.vector.communicator, op, result)
end

# The following method is required to make `sum` work for PDVecs with MPI on ARM processors.
# The reason is that `sum` uses a non-default reduction operator, which is not supported by
# MPI.jl on non-Intel processors. This method is a workaround that uses the default
# reduction operator.
Base.sum(f, t::PDVecIterator; kwargs...) = mapreduce(f, +, t; kwargs...)
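
In practice this means reductions such as the following go through the predefined MPI `+` operator rather than a custom one (usage sketch, assuming a distributed `pdvec` already exists):

```julia
# With the method above, the distributed part of the reduction uses `+`, which maps
# to a predefined MPI operator, instead of Base's default `add_sum` reducer.
norm_squared = sum(abs2, values(pdvec))
```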

"""
all(predicate, keys(::PDVec); kwargs...)
all(predicate, values(::PDVec); kwargs...)
1 change: 0 additions & 1 deletion src/DictVectors/projectors.jl
@@ -109,7 +109,6 @@ end

# NOTE that this returns a `Float64` opposite to the convention for
# dot to return the promote_type of the arguments.
# NOTE: This operation should work for `MPIData` and is MPI synchronizing

"""
PopsProjector() <: AbstractProjector
2 changes: 0 additions & 2 deletions src/Interfaces/dictvectors.jl
@@ -75,8 +75,6 @@ StochasticStyle(::AbstractArray{T}) where {T} = default_style(T)
Create a "frozen" version of `dv` which can no longer be modified or used in the
conventional manner, but supports faster dot products.
If `dv` is an [`MPIData`](@ref Main.Rimu.RMPI.MPIData), synchronize its contents among the ranks first.
"""
freeze(v::AbstractVector) = copy(v)

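
A brief usage sketch of `freeze` (the vector contents are illustrative):

```julia
using Rimu, LinearAlgebra

dv = DVec(BoseFS((1, 1)) => 1.0)   # any AbstractDVec works here
fv = freeze(dv)                    # read-only snapshot that supports faster dot products
overlap = dot(fv, dv)
```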
50 changes: 0 additions & 50 deletions src/RMPI/RMPI.jl

This file was deleted.
