GPU Device Variable on Intel GPUs #4056
This adds GPU device variable support on Intel GPUs using an experimental feature of the Intel oneAPI compiler.

To make the user interface consistent, we have added a macro AMREX_DEVICE_GLOBAL_VARIABLE. For example, the user can define device variables as follows for all GPUs and CPUs.

    AMREX_DEVICE_GLOBAL_VARIABLE(amrex::Real, my_dg1);    // amrex::Real my_dg1;
    AMREX_DEVICE_GLOBAL_VARIABLE(amrex::Real, 4, my_dg2); // amrex::Real my_dg2[4];

Below are their declarations.

    extern AMREX_DEVICE_GLOBAL_VARIABLE(amrex::Real, my_dg1);
    extern AMREX_DEVICE_GLOBAL_VARIABLE(amrex::Real, 4, my_dg2);

GPU and CPU kernels can use the global variables if they see the declarations.

We have also added two functions for copying data to and from device global variables.

    //! Copy `nbytes` bytes from host to device global variable. `offset` is the
    //! offset in bytes from the start of the device global variable.
    template <typename T>
    void memcpy_from_host_to_device_global_async (T& dg, const void* src,
                                                  std::size_t nbytes, std::size_t offset = 0)

    //! Copy `nbytes` bytes from device global variable to host. `offset` is the
    //! offset in bytes from the start of the device global variable.
    template <typename T>
    void memcpy_from_device_global_to_host_async (void* dst, T const& dg,
                                                  std::size_t nbytes, std::size_t offset = 0)
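To make the two-argument and three-argument forms of the macro and the byte-based `offset` parameter concrete, here is a minimal host-only sketch. It is not the AMReX implementation: every `sketch_` / `SKETCH_` name is hypothetical, `double` stands in for `amrex::Real`, the copies are plain synchronous `std::memcpy` calls rather than asynchronous device transfers, and the argument-counting macro trick assumes a GCC- or Clang-style preprocessor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical CPU-only stand-in for the macro: pick the scalar or the array
// form from the number of arguments (2 or 3).
#define SKETCH_DGV_SCALAR(T, name) T name
#define SKETCH_DGV_ARRAY(T, n, name) T name[n]
#define SKETCH_DGV_PICK(a1, a2, a3, macro, ...) macro
#define SKETCH_DEVICE_GLOBAL_VARIABLE(...) \
    SKETCH_DGV_PICK(__VA_ARGS__, SKETCH_DGV_ARRAY, SKETCH_DGV_SCALAR, unused)(__VA_ARGS__)

SKETCH_DEVICE_GLOBAL_VARIABLE(double, sketch_dg1);    // double sketch_dg1;
SKETCH_DEVICE_GLOBAL_VARIABLE(double, 4, sketch_dg2); // double sketch_dg2[4];

// Host-only stand-ins for the two copy functions; `offset` is in bytes from
// the start of the variable, mirroring the signatures in the description.
template <typename T>
void sketch_copy_host_to_global (T& dg, const void* src,
                                 std::size_t nbytes, std::size_t offset = 0)
{
    std::memcpy(reinterpret_cast<char*>(&dg) + offset, src, nbytes);
}

template <typename T>
void sketch_copy_global_to_host (void* dst, T const& dg,
                                 std::size_t nbytes, std::size_t offset = 0)
{
    std::memcpy(dst, reinterpret_cast<char const*>(&dg) + offset, nbytes);
}

// Write one value into element 2 of the array, then read it back.
double sketch_roundtrip (double value)
{
    std::size_t const offset = 2 * sizeof(double); // byte offset of sketch_dg2[2]
    sketch_copy_host_to_global(sketch_dg2, &value, sizeof(double), offset);
    double out = 0.0;
    sketch_copy_global_to_host(&out, sketch_dg2, sizeof(double), offset);
    return out;
}
```

Calling `sketch_roundtrip(3.5)` updates only the third element of `sketch_dg2`, because the offset is counted in bytes rather than elements. Since the functions in this PR carry an `_async` suffix, real code would presumably need to synchronize before reading the host destination buffer.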
Commit looks good and straightforward.
My only question would be: do we want to name the function calls and macro after the generic Intel naming of the object? I don't have a good idea, but "global" and "device" are extremely overloaded at this point, and it feels like I'm going to forget what this is and need to look it up occasionally. "device_global_var"?
To me
Fair enough. As long as it works generally.