Merge kernels into rte-rrtmgp #21

goord · 2021-10-02T12:35:32Z

In this issue we can discuss a strategy for inserting our tuned kernels into the original rte-rrtmgp source code.

goord · 2021-10-02T12:53:18Z

Regarding the kernels, the quickest way would be to wrap every kernel launch in a C function that can be called from Fortran, using a void** argument type for every array.
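A minimal sketch of such a wrapper, to make the calling convention concrete. Everything named here (`scale_array`, `launch_scale_kernel`, the Fortran interface in the comment) is hypothetical and not taken from the rte-rrtmgp sources. A host loop stands in for the real GPU kernel launch so the example runs anywhere. It assumes the Fortran side holds the array address in a `type(c_ptr)` variable and passes it by reference, which is one way a `void**` argument arises.

```cpp
#include <cassert>

// Hypothetical stand-in for a tuned kernel launch. A real implementation
// would launch a GPU kernel on a device pointer; a host loop is used here
// so the sketch runs without a GPU.
static void launch_scale_kernel(double* data, int n, double factor) {
    for (int i = 0; i < n; ++i) {
        data[i] *= factor;
    }
}

// C-linkage wrapper callable from Fortran. Fortran passes arguments by
// reference by default, so scalars arrive as pointers and a type(c_ptr)
// variable holding the array address arrives as void**.
//
// Assumed Fortran interface (hypothetical):
//   subroutine scale_array(data, n, factor) bind(C, name="scale_array")
//     use iso_c_binding
//     type(c_ptr)    :: data
//     integer(c_int) :: n
//     real(c_double) :: factor
//   end subroutine
extern "C" void scale_array(void** data, const int* n, const double* factor) {
    assert(data != nullptr && *data != nullptr);
    launch_scale_kernel(static_cast<double*>(*data), *n, *factor);
}
```

With this shape the Fortran code never needs to know whether the pointer refers to host or device memory; that decision stays in the C layer.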

The memory is trickier:

- The F90 code runs on the CPU. Should we transfer input arrays to the GPU ourselves, or assume they are already there? Note that the OpenACC directives transfer the data every time step.
- Intermediate Fortran allocations could either be replaced by allocations from our GPU memory pool via the C layer (which yields the fastest code), or we allocate and transfer after the Fortran allocation, as the OpenACC code currently does.
- The same applies to deallocations.

For sustainability, it would be useful to set up a 'mirroring' administration that connects CPU arrays with their GPU equivalents.
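One possible shape for such a mirroring administration is a registry keyed on the host address. The sketch below is an assumption about how this could look, not an existing design: `MirrorRegistry`, `device_alloc`, `device_free` and `copy_to_device` are hypothetical names, and `std::malloc`/`std::memcpy` stand in for GPU allocation and host-to-device copies so the example runs on the CPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <unordered_map>

// Hypothetical device-memory hooks. std::malloc/std::memcpy stand in for
// GPU allocation and host-to-device copies so the sketch runs on the CPU.
static void* device_alloc(std::size_t bytes) { return std::malloc(bytes); }
static void device_free(void* ptr) { std::free(ptr); }
static void copy_to_device(void* dst, const void* src, std::size_t bytes) {
    std::memcpy(dst, src, bytes);
}

// Maps each registered CPU array (by address) to its GPU mirror.
class MirrorRegistry {
public:
    MirrorRegistry() = default;
    MirrorRegistry(const MirrorRegistry&) = delete;
    MirrorRegistry& operator=(const MirrorRegistry&) = delete;
    ~MirrorRegistry() {
        for (auto& kv : mirrors_) device_free(kv.second.device);
    }

    // Returns the mirror of `host`, allocating it on first use and
    // reallocating it if the requested size has changed.
    void* mirror(const void* host, std::size_t bytes) {
        auto it = mirrors_.find(host);
        if (it != mirrors_.end()) {
            if (it->second.bytes == bytes) return it->second.device;
            device_free(it->second.device);
            mirrors_.erase(it);
        }
        void* dev = device_alloc(bytes);
        mirrors_.emplace(host, Entry{dev, bytes});
        return dev;
    }

    // Returns the mirror of `host`, or nullptr if it was never registered.
    void* find(const void* host) const {
        auto it = mirrors_.find(host);
        return it == mirrors_.end() ? nullptr : it->second.device;
    }

    // Copies the current host contents into the mirror; false if unknown.
    bool upload(const void* host) {
        auto it = mirrors_.find(host);
        if (it == mirrors_.end()) return false;
        copy_to_device(it->second.device, host, it->second.bytes);
        return true;
    }

    // Frees the mirror of `host`, if any.
    void release(const void* host) {
        auto it = mirrors_.find(host);
        if (it == mirrors_.end()) return;
        device_free(it->second.device);
        mirrors_.erase(it);
    }

private:
    struct Entry {
        void* device;
        std::size_t bytes;
    };
    std::unordered_map<const void*, Entry> mirrors_;
};
```

Keying on the host address keeps the Fortran side unchanged, but it assumes the host array is not moved or reallocated while its mirror is alive; a Fortran deallocation would need a matching `release` call.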

goord · 2021-10-15T13:36:12Z

The memory solution above does not take into account automatic allocations inside subroutines in the Fortran code. If we want to replace those with our pool allocations, more changes need to be made to the Fortran code.
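To illustrate what replacing automatic allocations would involve: automatic arrays live from subroutine entry to exit, so a pool-based replacement needs an explicit acquire at entry and a release at exit. The sketch below shows one way to get that lifetime with a stack-like pool. `StackPool` and `PoolScope` are hypothetical names, not our actual memory pool, and a host buffer stands in for one large GPU allocation. On the Fortran side, `mark` and `rewind` would presumably be exposed through C wrappers and called at the start and end of each affected subroutine.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stack-like pool. A host buffer stands in for one large GPU
// allocation so the sketch runs on the CPU.
class StackPool {
public:
    explicit StackPool(std::size_t bytes) : storage_(bytes), top_(0) {}

    // Hands out `bytes` from the pool, rounded up to a multiple of kAlign.
    // Returns nullptr if the pool does not have enough space left.
    void* allocate(std::size_t bytes) {
        std::size_t rounded = (bytes + kAlign - 1) / kAlign * kAlign;
        if (rounded > storage_.size() - top_) return nullptr;
        void* ptr = storage_.data() + top_;
        top_ += rounded;
        return ptr;
    }

    // Current top of the stack; pass it to rewind() to free everything
    // allocated after this point.
    std::size_t mark() const { return top_; }

    void rewind(std::size_t mark) {
        assert(mark <= top_);
        top_ = mark;
    }

    std::size_t used() const { return top_; }

private:
    static constexpr std::size_t kAlign = 256;
    std::vector<unsigned char> storage_;
    std::size_t top_;
};

// RAII scope: everything allocated from the pool after construction is
// returned on destruction, mimicking the lifetime of automatic arrays.
class PoolScope {
public:
    explicit PoolScope(StackPool& pool) : pool_(pool), mark_(pool.mark()) {}
    ~PoolScope() { pool_.rewind(mark_); }
    PoolScope(const PoolScope&) = delete;
    PoolScope& operator=(const PoolScope&) = delete;

private:
    StackPool& pool_;
    std::size_t mark_;
};
```

A stack discipline fits here because nested subroutine calls release their temporaries in reverse order of acquisition; a general-purpose pool would work as well but needs per-allocation bookkeeping.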

goord assigned goord, isazi and Chiil Oct 2, 2021
