Updated Field API variants of the clouds dwarf (CPU and GPU) #96

Merged
merged 20 commits into from
Nov 4, 2024

Conversation


@wertysas wertysas commented Oct 1, 2024

This is an update of the Field API variants of cloudsc that aims to resemble the use of Field API in the IFS more closely than before. This includes the addition of a new global state type for the Field API variants: CLOUDSC_FIELD_STATE_TYPE and three types for aux vars, fluxes and tendencies that resemble similar types in the IFS:

  • CLOUDSC_AUX_TYPE
  • CLOUDSC_FLUX_TYPE
  • CLOUDSC_STATE_TYPE

There have also been updates to the build structure that make it possible to build the CPU Field API version without OpenACC or CUDA, and the GPU Field API version without CUDA support. The following ecbuild_options have been added to support this:

  • with-field-api enables field-api variants to be built
  • with-mapped-fields builds field-api variants with field pointers automatically mapped to the device by OpenACC (requires OpenACC and CUDA)
  • with-pinned-fields builds field-api variants with fields allocated in pinned memory (requires OpenACC and CUDA)
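
As a rough illustration of how these options might be passed on the command line, the sketch below prints (but does not run) two bundle build commands. It assumes the usual `./cloudsc-bundle build` entry point and the `--with-field-api` and `--with-gpu` spellings that appear later in this review. The `build_cmd` helper and its `cpu`/`gpu` argument are hypothetical, and option names changed during review, so check bundle.yml for the final spelling.

```shell
#!/bin/sh
# Dry run: print the bundle build command instead of executing it.
# $1 is a hypothetical profile name, either "cpu" or "gpu".
build_cmd() {
  flags="--with-field-api"
  if [ "$1" = "gpu" ]; then
    # Assumption: the GPU Field API variant also needs the GPU build enabled.
    flags="--with-gpu $flags"
  fi
  echo "./cloudsc-bundle build $flags"
}

build_cmd cpu
build_cmd gpu
```

The mapped-memory and pinned-memory options would be appended to the GPU flag string in the same way.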

@wertysas wertysas requested review from mlange05 and reuterbal October 1, 2024 09:16
@FussyDuck

FussyDuck commented Oct 1, 2024

CLA assistant check
All committers have signed the CLA.

@wertysas wertysas force-pushed the je-field-api-view-updates branch from 00d866e to 9754c1a Compare October 1, 2024 09:59
@wertysas wertysas requested a review from awnawab October 1, 2024 13:03
@wertysas wertysas marked this pull request as draft October 2, 2024 08:01
@wertysas wertysas force-pushed the je-field-api-view-updates branch from 7eee6fb to 3804590 Compare October 2, 2024 14:16
@wertysas wertysas force-pushed the je-field-api-view-updates branch from a62f0ce to 91f01f2 Compare October 4, 2024 08:52
@wertysas wertysas marked this pull request as ready for review October 4, 2024 16:14
Collaborator

@reuterbal reuterbal left a comment

Many thanks, this looks like it was a substantial piece of work that has been carried out very diligently and well structured!

I left a few small comments but nothing major.

To make sure this is tested in CI, please add --with-field-api in all four entries in the github workflow here:

- '' # Plain build without any options
- '--with-gpu --with-loki --with-atlas' # Enable Loki, Atlas, and GPU variants
- '--with-gpu --with-loki --with-atlas --with-mpi' # Enable Loki, Atlas, and GPU variants with MPI
- '--single-precision --with-gpu --with-loki --with-atlas --with-mpi' # Enable Loki, and GPU variants with MPI in a single-precision build

Further down, you may have to add the GPU-enabled field_api build target also to the ctest_exclude options, e.g., here:

ctest_exclude_pattern: '-gpu-|-scc-|-loki-c|-cuda-' # GPU variants don't work on CPU runners, loki-c variant causes SIGFPE

In addition, you will have to add the new target to the script here: https://github.com/ecmwf-ifs/dwarf-p-cloudsc/blob/develop/.github/scripts/verify-targets.sh
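
To check whether a new test name is already covered by the exclude pattern, the pattern can be tried against candidate names locally. This is a minimal sketch that uses `grep -E -v` as a stand-in for ctest's exclude matching. Only the scc-field name is taken from the README in this thread; the other two names are hypothetical placeholders.

```shell
#!/bin/sh
# Exclude pattern quoted from the workflow entry above.
pattern='-gpu-|-scc-|-loki-c|-cuda-'

# Candidate names: the first two are hypothetical, the third is the
# scc-field variant named in the README.
names='dwarf-cloudsc-fortran
dwarf-cloudsc-fortran-field
dwarf-cloudsc-gpu-scc-field'

# Keep only names that do NOT match the pattern. `-e` is needed because
# the pattern itself starts with a dash.
kept=$(printf '%s\n' "$names" | grep -E -v -e "$pattern")
printf '%s\n' "$kept"
```

Here the scc-field name is dropped because it contains both `-gpu-` and `-scc-`, while the two CPU-style names are kept.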

@reuterbal
Collaborator

Apologies, final request: Please update also the README section here:

- **dwarf-cloudsc-gpu-scc-field**: GPU-enabled and optimized version of
CLOUDSC that uses the SCC loop layout, and uses [FIELD API](https://github.com/ecmwf-ifs/field_api) (a Fortran library purpose-built for IFS data-structures that facilitates the
creation and management of field objects in scientific code) to perform device offload
and copyback. The intent is to demonstrate the explicit use of pinned host memory to speed up
data transfers, as provided by the shipped prototype implementation, and
investigate the effect of different data storage allocation layouts.
To enable this variant, a suitable CUDA installation is required and the
`--with-cuda` flag needs to be passed at the build stage. This variant lets the CUDA runtime
manage temporary arrays and needs a large `NV_ACC_CUDA_HEAPSIZE`
(e.g. `NV_ACC_CUDA_HEAPSIZE=8GB` for 160K columns).

Contributor

@awnawab awnawab left a comment

Many thanks @wertysas for this excellent and comprehensive piece of work! 🙏 Mirroring the F-API usage in the IFS is really great as it will allow us to test F-API related Loki recipes on cloudsc first.

I've left a few comments for you to address, but none of these require major changes.

${COMMON_MODULE}/cloudsc_field_state_mod.F90
${COMMON_MODULE}/cloudsc_flux_type_mod.F90
${COMMON_MODULE}/cloudsc_aux_type_mod.F90
${COMMON_MODULE}/cloudsc_state_type_mod.F90
Author

This should probably not be added here, since this is the SCC CUF variant, which is not based on the field variant of the GPU driver. My thinking is that a new loki_transform will be added through the dwarf-p-cloudsc/je-field-api-offload branch, which I am using to develop the Loki Field API offload transformation.

@awnawab
Contributor

awnawab commented Oct 24, 2024

I think the GNU --with-field and --with-gpu CI builds are failing because GNU has some limited OpenACC support, but that support is untested with F-API and so should not be enabled. You can add --cmake="FIELD_API_ENABLE_ACC=OFF" to the build flags for the GNU runners.
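
One way to apply this per compiler is a small helper that returns the extra flag only for GNU runners. This is a sketch, not the workflow's actual mechanism: the `extra_flags` function and the arch strings `gnu/11` and `nvhpc/23.5` are hypothetical, and only the `--cmake="FIELD_API_ENABLE_ACC=OFF"` flag comes from the comment above.

```shell
#!/bin/sh
# Hypothetical helper: return extra build flags for a given arch string.
extra_flags() {
  case "$1" in
    # GNU runners: switch off OpenACC support inside Field API.
    gnu*) echo '--cmake="FIELD_API_ENABLE_ACC=OFF"' ;;
    # All other compilers: no extra flags.
    *)    echo '' ;;
  esac
}

echo "gnu/11     -> $(extra_flags gnu/11)"
echo "nvhpc/23.5 -> $(extra_flags nvhpc/23.5)"
```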

…pped-field to without-mapped-fields, added field variants to verify-targets, and updated sync_host methods in scc_field driver
@awnawab
Contributor

awnawab commented Oct 30, 2024

Many thanks @wertysas for addressing all of my comments! 🙏 The CI currently fails because it can't find the scc-field targets. Their build condition is field_api_HAVE_ACC, which we explicitly disable. Changing that condition to HAVE_ACC should fix it. You might also want to set the environment variable DEV_ALLOC_SIZE=1073741824 (1 GiB) for the GPU-enabled F-API unit tests.
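
For reference, the suggested value is 1024 * 1024 * 1024 bytes. A minimal sketch of setting it in a shell before running the tests, computed rather than hard-coded so the unit is visible (how the variable reaches the CI job is left to the workflow):

```shell
#!/bin/sh
# Device allocation size in bytes: 1024^3 = 1073741824.
DEV_ALLOC_SIZE=$((1024 * 1024 * 1024))
export DEV_ALLOC_SIZE
echo "DEV_ALLOC_SIZE=$DEV_ALLOC_SIZE"
```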

Contributor

@awnawab awnawab left a comment

Thanks a lot for sorting out the CI issues, this looks great and good to go 👌

Collaborator

@reuterbal reuterbal left a comment

Many thanks for the updates and @awnawab for the diligent review! Just one final request to use a more meaningful option description, otherwise ready to go!

Comment on lines +76 to +95
- arch: nvhpc/21.9
  nvhpc_version: 21.9
  io_library_flag: ''
  build_flags: '--single-precision --with-gpu --with-loki --with-cuda --with-field'
  ctest_exclude_pattern: '-gpu-|-scc-|-loki-c|-cuda' # GPU variants don't work on CPU runners, loki-c variant causes SIGFPE
- arch: nvhpc/21.9
  nvhpc_version: 21.9
  io_library_flag: '--with-serialbox'
  build_flags: '--with-gpu --with-loki --with-cuda --with-field'
  ctest_exclude_pattern: '-gpu-|-scc-|-loki-c|-cuda' # GPU variants don't work on CPU runners, loki-c variant causes SIGFPE
- arch: nvhpc/21.9
  nvhpc_version: 21.9
  io_library_flag: ''
  build_flags: '--single-precision --with-gpu --with-loki --with-cuda --with-field --without-mapped-fields'
  ctest_exclude_pattern: '-gpu-|-scc-|-loki-c|-cuda' # GPU variants don't work on CPU runners, loki-c variant causes SIGFPE
- arch: nvhpc/21.9
  nvhpc_version: 21.9
  io_library_flag: '--with-serialbox'
  build_flags: '--with-gpu --with-loki --with-cuda --with-field --without-mapped-fields'
  ctest_exclude_pattern: '-gpu-|-scc-|-loki-c|-cuda' # GPU variants don't work on CPU runners, loki-c variant causes SIGFPE
Collaborator

[No action required!] I think we can skip the 21.9 builds. That compiler version is fairly outdated now and the 23.5 tests are much more meaningful. But that can be dealt with in a subsequent CI-cleanup that is on the horizon.

CMakeLists.txt Outdated
DEFAULT ON )

ecbuild_find_package( NAME loki )
ecbuild_find_package( NAME atlas )

ecbuild_add_option( FEATURE FIELD_API_DISABLE_MAPPED_MEMORY
DESCRIPTION "Use ACC mapped memory by default in Field API objects"
Collaborator

That description should probably read something like "Disable the use of ACC mapped memory in Field API objects"

Author

Yes, thank you for spotting that. I forgot to update the description when changing this option. I have changed it now.

Collaborator

@mlange05 mlange05 left a comment

Looks fantastic! Great contribution and enables a lot of great follow-on stuff.

Driver structure and data types look much closer to the full system in this, and the structural issues are solved quite elegantly (e.g. the packed FIELD_GANG implementation).

GTG from me. :shipit:

@reuterbal reuterbal merged commit 3d5c82a into develop Nov 4, 2024
32 checks passed
@reuterbal reuterbal deleted the je-field-api-view-updates branch November 4, 2024 21:13