Merge kernels into rte-rrtmgp #21

Open
goord opened this issue Oct 2, 2021 · 2 comments

@goord
Collaborator

goord commented Oct 2, 2021

This issue is for discussing a strategy for integrating our tuned kernels into the original rte-rrtmgp source code.

@goord
Collaborator Author

goord commented Oct 2, 2021

Regarding the kernels, the quickest way would be to wrap every kernel launch in a C function that can be called from Fortran, with each array argument passed as void**.
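As a rough illustration of the wrapper idea, here is a minimal sketch in C. The names launch_scale and scale_kernel_standin are hypothetical and do not come from the rte-rrtmgp sources or from our kernels. The real kernel launch is replaced by a host loop so the sketch compiles with a plain C compiler. It assumes the Fortran side declares the array argument as type(c_ptr) without the value attribute in a bind(C) interface, which is what makes it arrive as void**, and passes scalars by reference.

```c
#include <stddef.h>

/* Stand-in for a tuned GPU kernel launch (assumption: the real code would
   launch a device kernel here). A host loop keeps the sketch self-contained. */
static void scale_kernel_standin(double* data, int n, double factor)
{
    for (int i = 0; i < n; ++i)
        data[i] *= factor;
}

/* Hypothetical wrapper callable from Fortran through bind(C).
   - array:  address of a pointer handle (type(c_ptr) passed by reference)
   - n:      element count, passed by reference (the Fortran default)
   - factor: scalar parameter, passed by reference */
void launch_scale(void** array, const int* n, const double* factor)
{
    /* Reject missing handles and empty arrays instead of launching. */
    if (array == NULL || *array == NULL || n == NULL || factor == NULL || *n <= 0)
        return;

    scale_kernel_standin((double*)*array, *n, *factor);
}
```

Passing the handle by reference also leaves room for a wrapper to update the pointer, for instance when an allocation is redirected to the GPU pool.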

Memory management is trickier:

  • The F90 code runs on the CPU. Should we transfer the input arrays to the GPU ourselves, or assume they are already there? Note that the OpenACC directives transfer the data every time step.
  • Intermediate Fortran allocations could either be replaced by allocations from our GPU memory pool via the C layer (which yields the fastest code), or we could allocate and transfer on the GPU after each Fortran allocation, as the OpenACC code currently does.
  • The same applies to deallocations.

For sustainability, it will be useful to set up a 'mirroring' administration that connects CPU arrays with their GPU equivalents.
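One possible shape for such a mirroring administration is a small table keyed on the host address. The sketch below is only an assumption about how it could look: every name (mirror_register, mirror_lookup, mirror_release) is hypothetical, the table is a fixed-size array, and device memory is simulated with malloc so the code runs without a GPU. In practice the stand-in allocator would be replaced by the memory pool or the device allocation call.

```c
#include <stddef.h>
#include <stdlib.h>

#define MIRROR_CAPACITY 64  /* hypothetical fixed table size */

typedef struct {
    const void* host;   /* address of the CPU (Fortran) array */
    void*       device; /* address of its GPU equivalent */
    size_t      bytes;  /* size of the mirrored buffer */
} mirror_entry;

static mirror_entry mirror_table[MIRROR_CAPACITY];
static size_t mirror_count = 0;

/* Stand-ins for device allocation (assumption: real code uses the GPU pool). */
static void* device_alloc_standin(size_t bytes) { return malloc(bytes); }
static void  device_free_standin(void* p)       { free(p); }

/* Return the device mirror of a host array, or NULL if none is registered. */
void* mirror_lookup(const void* host)
{
    for (size_t i = 0; i < mirror_count; ++i)
        if (mirror_table[i].host == host)
            return mirror_table[i].device;
    return NULL;
}

/* Register a host array and allocate its mirror. Registering the same
   address twice returns the existing mirror; the size is not rechecked. */
void* mirror_register(const void* host, size_t bytes)
{
    void* existing = mirror_lookup(host);
    if (existing != NULL)
        return existing;
    if (mirror_count == MIRROR_CAPACITY)
        return NULL;

    void* device = device_alloc_standin(bytes);
    if (device == NULL)
        return NULL;

    mirror_table[mirror_count].host = host;
    mirror_table[mirror_count].device = device;
    mirror_table[mirror_count].bytes = bytes;
    ++mirror_count;
    return device;
}

/* Free the mirror of a host array. Returns 1 on success, 0 if unknown. */
int mirror_release(const void* host)
{
    for (size_t i = 0; i < mirror_count; ++i) {
        if (mirror_table[i].host == host) {
            device_free_standin(mirror_table[i].device);
            mirror_table[i] = mirror_table[mirror_count - 1]; /* fill the gap */
            --mirror_count;
            return 1;
        }
    }
    return 0;
}
```

With a table like this, a wrapper could call mirror_lookup on an incoming host pointer and only allocate and transfer when no mirror exists yet, which would address the question of whether inputs are already on the GPU.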

@goord
Collaborator Author

goord commented Oct 15, 2021

The memory solution above does not account for automatic allocations inside subroutines in the Fortran code. Replacing those with our pool allocations would require further changes to the Fortran code.
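One way to mimic the lifetime of automatic arrays with a pool is a stack-style allocator with a mark on subroutine entry and a release on exit. The sketch below is an assumption about how that could work, not a description of our existing pool: pool_mark, pool_alloc and pool_release are hypothetical names, and a static host buffer stands in for GPU memory. On the Fortran side, each automatic array would have to become a pointer obtained through the C layer, which is the kind of extra change mentioned above.

```c
#include <stddef.h>

#define POOL_BYTES 4096           /* hypothetical pool size */
#define POOL_ALIGN sizeof(double) /* every allocation is rounded up to this */

/* Host buffer standing in for a block of GPU memory (assumption). */
static double pool_storage[POOL_BYTES / sizeof(double)];
static size_t pool_top = 0;       /* bytes currently in use */

/* Remember the current top of the pool, e.g. on subroutine entry. */
size_t pool_mark(void)
{
    return pool_top;
}

/* Carve the next block off the pool, or return NULL when it does not fit. */
void* pool_alloc(size_t bytes)
{
    size_t rounded = (bytes + POOL_ALIGN - 1) / POOL_ALIGN * POOL_ALIGN;
    if (rounded < bytes || rounded > POOL_BYTES - pool_top)
        return NULL;

    void* block = (unsigned char*)pool_storage + pool_top;
    pool_top += rounded;
    return block;
}

/* Drop everything allocated since the mark, e.g. on subroutine exit. */
void pool_release(size_t mark)
{
    if (mark <= pool_top)
        pool_top = mark;
}
```

A subroutine would call pool_mark once at the top, pool_alloc for each former automatic array, and pool_release with the saved mark before returning, so nested calls unwind in the same order as the Fortran call stack.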
