Updated Field API variants of the clouds dwarf (CPU and GPU) #96
Merged
20 commits
- 3dc7920 FIELD state types for cloudsc skeleton (wertysas)
- 4f0ab23 Added Field API view functionality to field state module and Field ap… (wertysas)
- 50ba3b5 Added aux, flux and state types using field API for storage as in the… (wertysas)
- 8e0591c Fortran CPU version modified to use Field API as in the IFS (wertysas)
- 32828de Updated bundle and CMake, added with-field api option (wertysas)
- 99f4da3 moved aux, state and flux types into their own modules (wertysas)
- 45a31c0 CMake and bundle updates (wertysas)
- 6796466 Updated authors list (wertysas)
- 91f01f2 Restoring modified comments (wertysas)
- 97eb718 Cmake and bundle updates after PR comments (wertysas)
- 0b52301 Updated github CI builds (wertysas)
- 611b385 Updated Field API version in bundle and disables CUDA if --with-field… (wertysas)
- 77d19d2 Updated call signature of GPU field variant and switched field api pi… (wertysas)
- a8cc3a0 README updated and bug fixes in field gpu driver and CMake (wertysas)
- b5a13c8 Switched F-API fields to mapped by default and changed option with-ma… (wertysas)
- a3bc0eb added missing ENABLE before mapped feature in bundle and moved back I… (wertysas)
- d37f273 Updated workflows and tests to handle without-mapped-fields option, a… (wertysas)
- 125741e Passing flag to CMake to prevent F-API from breaking GNU workflows (wertysas)
- e455384 replaced gnu gpu tests with cpu tests, added F-API DEV_ALLOC_SIZE env… (wertysas)
- 128382c CMake clean (wertysas)
@@ -28,6 +28,9 @@ Balthasar Reuter ([email protected])
 prototype that validates runs against platform and language-agnostic
 off-line reference data via HDF5 or the Serialbox package. The kernel code
 also is slightly cleaner than the original version.
+- **dwarf-cloudsc-fortran-field**: A fortran version of CLOUDSC that uses Field API
+for the data structures. The intent of this version is to show how
+Field API is used in newer versions of the IFS.
 - **dwarf-cloudsc-c**: Standalone C version of the kernel that has
 been generated by ECMWF tools. This relies exclusively on the Serialbox
 validation mechanism.
@@ -81,13 +84,18 @@ Balthasar Reuter ([email protected])
 - **dwarf-cloudsc-gpu-scc-field**: GPU-enabled and optimized version of
 CLOUDSC that uses the SCC loop layout, and uses [FIELD API](https://github.com/ecmwf-ifs/field_api) (a Fortran library purpose-built for IFS data-structures that facilitates the
 creation and management of field objects in scientific code) to perform device offload
-and copyback. The intent is to demonstrate the explicit use of pinned host memory to speed-up
-data transfers, as provided by the shipped prototype implementation, and
-investigate the effect of different data storage allocation layouts.
+and copyback.
+The field api variant supports modern features of the FIELD API such as *field gangs* that group
+multiple fields and allocates them in one larger field, in order to reduce allocations and
+data transfers. Field gang support can be enabled at runtime by setting the environment
+variable `CLOUDSC_PACKED_STORAGE=ON`. If CUDA is available, then the field api variant also supports
+the use of allocating fields in pinned memory. This is enabled by setting the
+environment variable `CLOUDSC_FIELD_API_PINNED=ON` and will speed up data transfers between host and device.
 To enable this variant, a suitable CUDA installation is required and the
 `--with-cuda` flag needs to be passed at the build stage. This variant lets the CUDA runtime
-manage temporary arrays and needs a large `NV_ACC_CUDA_HEAPSIZE`
-(eg. `NV_ACC_CUDA_HEAPSIZE=8GB` for 160K columns.)
+manage temporary arrays and needs a large `NV_ACC_CUDA_HEAPSIZE` (eg. `NV_ACC_CUDA_HEAPSIZE=8GB` for 160K columns.).
+It is possible to disable Field API registering fields in the OpenACC data map, by passing the
+`--without-mapped-fields` flag at build stage.
 - **cloudsc-pyiface.py**: a combination of the cloudsc/cloudsc-driver routines
 of cloudsc-fortran with the uppermost `dwarf` program replaced with a
 corresponding Python script capable of HDF5 data load and
@@ -320,8 +328,9 @@ transfer overheads will dominate timings, and that most supported GPU
 variants aim to optimise compute kernel timings only. However, a
 dedicated variant `dwarf-cloudsc-gpu-scc-field` has been added to
 explore host-side memory pinning, which improves data transfer times
-and alternative data layout strategies. By default, this will allocate
-each array variable individually in pinned memory. A runtime flag
+and alternative data layout strategies. By default, pinned memory is turned off
+but can be turned on by setting the environment variable `CLOUDSC_FIELD_API_PINNED=ON`.
+This will allocate each array variable individually in pinned memory. A runtime flag
 `CLOUDSC_PACKED_STORAGE=ON` can be used to enable "packed" storage,
 where multiple arrays are stored in a single base allocation, eg.
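
As a rough illustration of the runtime switches described in the README hunks above, the following shell sketch sets the three environment variables before launching the GPU field variant. The variable names and the `8GB` figure come from the README text; the binary path `./bin/dwarf-cloudsc-gpu-scc-field` and the `CLOUDSC_BIN` helper variable are assumptions made for this example, and any command-line arguments the binary expects are not covered here.

```shell
# Runtime switches named in the README changes of this PR.
export CLOUDSC_PACKED_STORAGE=ON     # enable field gangs ("packed" storage)
export CLOUDSC_FIELD_API_PINNED=ON   # allocate fields in pinned host memory (CUDA builds only)
export NV_ACC_CUDA_HEAPSIZE=8GB      # README's example value for 160K columns

# Hypothetical binary location (assumption); override CLOUDSC_BIN for your build tree.
CLOUDSC_BIN="${CLOUDSC_BIN:-./bin/dwarf-cloudsc-gpu-scc-field}"

if [ -x "$CLOUDSC_BIN" ]; then
    # Arguments, if any are required, are left out of this sketch.
    "$CLOUDSC_BIN"
else
    echo "binary not found at $CLOUDSC_BIN; environment prepared only"
fi
```

Both `CLOUDSC_PACKED_STORAGE` and `CLOUDSC_FIELD_API_PINNED` are runtime settings according to the README, so they can be toggled between runs without rebuilding; `--with-cuda` and `--without-mapped-fields` are build-stage flags and are not shown here.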
[No action required!] I think we can skip the 21.9 builds. That compiler version is fairly outdated now and the 23.5 tests are much more meaningful. But that can be dealt with in a subsequent CI-cleanup that is on the horizon.