[SYCL][Doc] Extension spec for composite devices #11846
Add an extension specification for new APIs that allow an application to access card-level devices on PVC.
sycl/doc/extensions/proposed/sycl_ext_oneapi_composite_device.asciidoc
Fix typos. Co-authored-by: Marcos Maronas <[email protected]>
Thanks for finding those, @maarquitos14.
Some Intel GPU architectures are structured with multiple tiles on a single
card. Currently, this applies only to the Data Center GPU Max series (aka
PVC). By default, SYCL exposes each of these tiles as a separate root device,
I would suggest saying that it is not SYCL, and not even the UR, but the Intel(R) L0 driver implementation that does that. SYCL and the UR just expose what the L0 driver presents.
You need to read this specification from the point of view of a SYCL application developer. From that point of view the UR and Level Zero are just implementation details. The only thing that matters to this person is how the SYCL APIs expose the hardware.
namespace sycl {
namespace ext::oneapi::experimental {

std::vector<device> get_composite_devices();
@gmlueck: from the definition above:
A composite device has the same semantics as any other SYCL device, though the
performance characteristics might be different. The application may submit a
kernel to a composite device, and the implementation automatically schedules
work-items to each of the underlying tiles.
then `get_composite_devices()` would return the equivalent devices returned by `zeDeviceGet` when ZE_FLAT_DEVICE_HIERARCHY is set to COMPOSITE. When it is set to FLAT, then `get_composite_devices()` will return 0, as with FLAT the root devices returned are tiles, so the statement above "the implementation automatically schedules work-items to each of the underlying tiles." doesn't hold.
Is that correct?
The `get_composite_devices` function will return an empty list in both FLAT and COMPOSITE modes. It's only in COMBINED mode where it returns something interesting. In this mode, I believe the statement about distributing work-items to the underlying tiles is true, right?
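The behavior described in this reply can be sketched as a small standalone model. This is not SYCL or Level Zero code: `HierarchyMode`, `ModelDevice`, and `model_get_composite_devices` are hypothetical names invented for illustration, and the assumption that COMBINED mode yields exactly one composite device per multi-tile card is a simplification based on the PR description ("card-level devices"), not something the thread states.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the three ZE_FLAT_DEVICE_HIERARCHY modes named in
// the thread. This is not a type from SYCL or Level Zero.
enum class HierarchyMode { Flat, Composite, Combined };

// Hypothetical stand-in for a device handle. A real program would use
// sycl::device instead.
struct ModelDevice {
  std::string name;
};

// Models what the reply says get_composite_devices() returns in each mode,
// for a machine with `cards` multi-tile cards.
// Assumption: in COMBINED mode there is one composite device per card.
std::vector<ModelDevice> model_get_composite_devices(HierarchyMode mode,
                                                     std::size_t cards) {
  std::vector<ModelDevice> result;
  if (mode != HierarchyMode::Combined) {
    return result;  // FLAT and COMPOSITE: empty list, per the reply above
  }
  for (std::size_t i = 0; i < cards; ++i) {
    result.push_back(ModelDevice{"card" + std::to_string(i)});
  }
  return result;
}
```

Calling `model_get_composite_devices(HierarchyMode::Flat, 2)` or passing `HierarchyMode::Composite` gives an empty vector in this model, while `HierarchyMode::Combined` gives two entries, which mirrors the distinction drawn in the reply.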
== Impact to the ONEAPI_DEVICE_SELECTOR

The `ONEAPI_DEVICE_SELECTOR` is an environment variable that is specific to the
{dpcpp} implementation. Therefore, this section that describes the interaction
Don't know if this needs to be reworded. The ONEAPI_DEVICE_SELECTOR is being implemented in the UR, which means it will be used by all customers of the UR, not only the dpcpp implementation; see oneapi-src/unified-runtime#220.
Currently, DPC++ is the only SYCL implementation that uses the UR. I think we should keep the wording like this for now. We can revisit it if other SYCL implementations start using the UR.
In any event, I think the logic described in this section will probably be located in the DPC++ runtime, not in the UR.
…asciidoc Co-authored-by: Marcos Maronas <[email protected]>
Co-authored-by: Marcos Maronas <[email protected]>
…asciidoc Co-authored-by: John Pennycook <[email protected]>
I got feedback that this convention is more confusing than helpful.
@intel/llvm-gatekeepers I think this is ready to merge.
Initial implementation to support `sycl_ext_oneapi_composite_device` specified in #11846. Depends on oneapi-src/unified-runtime#1192. --------- Signed-off-by: Maronas, Marcos <[email protected]> Signed-off-by: Marcos Maronas <[email protected]>