Further development of OBL-based solver #1854
mkhait started this conversation in Capability/Feature Development
Provided that PR #1733 is merged, the following items might be considered for further development of this solver:
1. Adaptive interpolator (MultivariableTableFunction)
Static interpolators are currently the biggest limitation of the OBL solver. Static interpolation means that operator values for every point in the discretized parameter space must be available before the simulation starts. This is fine for simulations with a low DOF count (~2-4) and a coarse OBL discretization (up to ~64-128 points per DOF), but it prevents running models with many components and a fine (hence accurate) OBL discretization, because of both the computational time needed to compute all points and the memory needed to store the computed operators. Adaptive interpolators instead compute the points required for interpolation on demand, in the course of the simulation. The biggest challenge for a platform-agnostic implementation of adaptive interpolators is sparse storage for the computed points. In DARTS, a std::unordered_map holding an array<float, N_OPS> per computed point is used on CPU, while on GPU a simple hash-table implementation is adopted.
This improvement is a prerequisite for items 2 and 4 (integration with Python-described operators and operator dump/reuse).
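To make the idea concrete, here is a minimal Python sketch of an adaptive multilinear interpolator with sparse node storage. It is not the DARTS or GEOSX implementation: the class name `AdaptiveInterpolator`, the `evaluate` callable and the uniform-grid parameters are all assumptions made for illustration, and a plain dict stands in for the hash map.

```python
from itertools import product
from math import floor


class AdaptiveInterpolator:
    """Multilinear interpolation on a uniform grid whose nodes are evaluated lazily."""

    def __init__(self, evaluate, axis_min, axis_max, n_points):
        self.evaluate = evaluate      # hypothetical callable: state tuple -> operator values
        self.axis_min = axis_min
        self.n_points = n_points
        self.step = [(hi - lo) / (n - 1)
                     for lo, hi, n in zip(axis_min, axis_max, n_points)]
        self.nodes = {}               # sparse storage: node index tuple -> operator values

    def _node(self, index):
        # Evaluate a grid node only the first time it is requested.
        if index not in self.nodes:
            state = tuple(lo + i * h
                          for lo, i, h in zip(self.axis_min, index, self.step))
            self.nodes[index] = tuple(self.evaluate(state))
        return self.nodes[index]

    def __call__(self, state):
        base, frac = [], []
        for x, lo, h, n in zip(state, self.axis_min, self.step, self.n_points):
            i = min(max(floor((x - lo) / h), 0), n - 2)   # keep the cell inside the grid
            base.append(i)
            frac.append((x - lo) / h - i)
        result = None
        # Weighted sum over the 2^N corners of the enclosing cell.
        for corner in product((0, 1), repeat=len(state)):
            weight = 1.0
            for c, t in zip(corner, frac):
                weight *= t if c else 1.0 - t
            values = self._node(tuple(b + c for b, c in zip(base, corner)))
            if result is None:
                result = [0.0] * len(values)
            for k, v in enumerate(values):
                result[k] += weight * v
        return result
```

With two DOFs and 65 points per axis, a single query evaluates only the four nodes of the enclosing cell, whereas a static table would need all 65 x 65 = 4225 nodes up front. States outside the grid are linearly extrapolated from the nearest cell in this sketch.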
2. Integration with Python-described operators
Once adaptive interpolators are available, operator values can, in principle, still be loaded from a file prepared beforehand (currently the only way to run an OBL simulation). One then has to hope that the part of the discretized parameter space visited by the GEOSX simulation always lies within the subspace described in that file. However, integrating the C++ part (the interpolator) with operators defined in Python, such that the former can call the latter to add points adaptively in the course of the simulation, would tremendously expand the capabilities of the OBL solver in GEOSX, since any formulation described in DARTS could immediately be used in GEOSX. Naturally, Python-described operators and properties are much easier to adjust and extend than C++ code.
3. Integration with properties implemented in GEOSX
The properties described in GEOSX can rather easily be used as building blocks to evaluate the operators needed by the OBL solver. This would provide an OBL alternative to the existing formulations in GEOSX. In addition, GEOSX-described operators could be mixed with Python-described ones, further extending the range of formulations available in GEOSX.
4. Support of operator dump/reuse for adaptive interpolators
When operator evaluation depends on expensive EOS computations, the ability to reuse already evaluated operators can substantially improve performance in subsequent simulations (e.g., in optimization/UQ scenarios). Computed operator values can be cached to a binary file and loaded back by a subsequent simulation, all but eliminating the already low cost of property computations within the OBL framework.
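A possible shape for such a cache is sketched below in Python. The file layout (a three-integer header followed by fixed-size records) and the function names `dump_nodes` and `load_nodes` are assumptions for illustration only; no existing DARTS or GEOSX file format is implied.

```python
import struct


def dump_nodes(path, nodes, n_dims, n_ops):
    """Write {node index tuple: operator values} as fixed-size binary records."""
    record = struct.Struct(f"<{n_dims}q{n_ops}d")   # int64 indices, float64 values
    with open(path, "wb") as f:
        f.write(struct.pack("<3q", n_dims, n_ops, len(nodes)))   # header
        for index, values in nodes.items():
            f.write(record.pack(*index, *values))


def load_nodes(path):
    """Read the records back into a dict keyed by node index tuple."""
    with open(path, "rb") as f:
        n_dims, n_ops, count = struct.unpack("<3q", f.read(24))
        record = struct.Struct(f"<{n_dims}q{n_ops}d")
        nodes = {}
        for _ in range(count):
            fields = record.unpack(f.read(record.size))
            nodes[fields[:n_dims]] = fields[n_dims:]
    return nodes
```

A subsequent run would load the dict before the first time step and only evaluate nodes that are still missing.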
5. Support of multiple operator tables (analogue of FIPNUM, SATNUM, PVTNUM, etc.)
At the moment, the same MultivariableTableFunction is used for all grid block elements. Defining multiple tables for specific subregions, as is done for other constitutive models, would enable FIPNUM, SATNUM and PVTNUM functionality. Moreover, dynamic switching between tables for the same grid block elements could also enable support for relative permeability hysteresis and similar effects.
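As a rough illustration, region-based table selection amounts to one extra indirection per cell. In the Python sketch below, `evaluate_operators`, `region_of_cell` and the two lambda "tables" are hypothetical stand-ins, not GEOSX names.

```python
def evaluate_operators(states, region_of_cell, tables):
    """Evaluate every cell with the operator table assigned to its region."""
    return [tables[region_of_cell[cell]](state) for cell, state in enumerate(states)]


# Two hypothetical tables standing in for separate MultivariableTableFunction instances.
tables = {
    0: lambda state: (1.0 * state[0],),
    1: lambda state: (2.0 * state[0],),
}
region_of_cell = [0, 0, 1]      # analogue of a PVTNUM/SATNUM array, 0-based here
states = [(10.0,), (20.0,), (30.0,)]

values = evaluate_operators(states, region_of_cell, tables)     # cell 2 uses table 1
region_of_cell[1] = 1                                           # dynamic switch for cell 1
switched = evaluate_operators(states, region_of_cell, tables)
```

Dynamic switching, as would be needed for hysteresis, then reduces to updating the region index of a cell between evaluations.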
6. Improve computational efficiency
6.1 Avoid atomics when computing fluxes
Currently, each flux is computed once, so adding its contribution to both neighbouring grid blocks requires atomics, which damages parallel performance. Implementing the flux computation as a loop over all grid blocks, with an inner loop over all neighbours of a given grid block, computes every flux twice (for TPFA, once by each neighbour) but makes the assembly embarrassingly parallel and removes the need for atomics. This should (significantly) improve assembly performance.
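The two loop structures can be compared on a toy problem. The Python sketch below uses a simplified single-phase TPFA flux, `t * (p_i - p_j)`, purely as an assumption to keep the example short; it is not the OBL flux, and the function names are illustrative. Python itself runs serially here, so the comments only mark where a parallel implementation would or would not need atomics.

```python
def assemble_by_connection(n_cells, connections, trans, pressure):
    """Connection-based loop: each flux is computed once and scattered to two cells."""
    residual = [0.0] * n_cells
    for (i, j), t in zip(connections, trans):
        flux = t * (pressure[i] - pressure[j])
        residual[i] += flux     # two different connections may update the same cell,
        residual[j] -= flux     # so a parallel version of this loop needs atomics
    return residual


def build_neighbours(n_cells, connections, trans):
    """Per-cell adjacency list of (neighbour, transmissibility) pairs."""
    neighbours = [[] for _ in range(n_cells)]
    for (i, j), t in zip(connections, trans):
        neighbours[i].append((j, t))
        neighbours[j].append((i, t))
    return neighbours


def assemble_by_cell(n_cells, neighbours, pressure):
    """Cell-based loop: each flux is computed twice, but iteration i writes only residual[i]."""
    residual = [0.0] * n_cells
    for i in range(n_cells):
        for j, t in neighbours[i]:
            residual[i] += t * (pressure[i] - pressure[j])
    return residual
```

Both functions produce the same residual; the cell-based variant trades duplicated flux evaluations for write-conflict-free iterations.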
6.2 Improve device utilization by exposing more parallelism
Currently, each thread is assigned to a specific grid block and computes all operators in MultivariableTableFunction and all equations/variables in assembly. Assigning each thread to a specific operator for interpolation, and to a specific position within the matrix block of a grid block for assembly (i.e., a specific equation and variable, such that the work currently performed by a single thread is performed by N_DOF x N_DOF threads), would reduce register pressure and improve utilization. The appropriate data layout will change, but overall memory coalescing (especially when writing matrix entries) can be improved.
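The finer-grained work decomposition for assembly boils down to an index mapping from a flat thread index to one matrix entry. The Python sketch below only illustrates that mapping; `work_item` is a hypothetical helper, and the layout in which the N_DOF x N_DOF entries of each diagonal block are stored contiguously is an assumption.

```python
def work_item(thread_id, n_dof):
    """Map a flat thread index to (grid block, equation, variable)."""
    block, local = divmod(thread_id, n_dof * n_dof)
    equation, variable = divmod(local, n_dof)
    return block, equation, variable
```

Under that assumed layout, thread `thread_id` writes entry `thread_id` of the block-value array, so consecutive threads touch consecutive memory locations.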