First steps to enable SYCL backend in Python Interface #155
base: sycl-develop
Conversation
python/cutlass_library/generator.py (outdated diff)

    math_instructions = [
        MathInstruction(
            [16, 8, 16],
Should it be 8, 16, 16 to match the 8x16x16 (M,N,K) MMA operation for bfloat?
Yes, that probably makes sense. This value was likely based on the values used for CUDA devices, so it makes sense to adapt it for PVC.
I changed it in the latest commit.
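For reference, a minimal sketch of what the updated entry could look like, assuming MathInstruction takes the MMA shape as [M, N, K] followed by the operand/accumulator types, opcode class, and math operation; the data types shown are illustrative, not taken from this diff:

    # Sketch only: MMA shape given as [M, N, K], i.e. the 8x16x16 bfloat16
    # MMA discussed above; the element types here are illustrative.
    math_instructions = [
        MathInstruction(
            [8, 16, 16],
            DataType.bf16, DataType.bf16, DataType.f32,
            OpcodeClass.TensorOp,
            MathOperation.multiply_add),
    ]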
    @@ -7026,6 +7026,47 @@ def GenerateSM90(manifest, cuda_version):

    ###################################################################################################

    def GeneratePVC_TensorOp_16b_gemm(manifest, cuda_version):
What is cuda_version here?
It's the CUDA version, e.g., 12.4.0, defined here.
Right now, we don't use that parameter. If we get to a point where we need to make distinctions based on the SYCL version or similar, we can change this to reflect a version that we need.
For now, we only have this parameter to be compatible with the expected interface (via generate_function_name and generate_function).
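To make the interface-compatibility point concrete, a minimal sketch of that pattern is shown below; the function body is illustrative only and not the PR's actual implementation:

    # Sketch only: cuda_version is accepted to satisfy the signature expected
    # by generate_function_name/generate_function, but is not used for PVC.
    def GeneratePVC_TensorOp_16b_gemm(manifest, cuda_version):
        del cuda_version  # unused for now; kept for interface compatibility
        # ... populate the manifest with the PVC GEMM operations ...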
python/cutlass_library/generator.py (outdated diff)

    def GeneratePVC_TensorOp_16b_gemm(manifest, cuda_version):
        # TODO: Add remaining supported configurations
        layouts = [
            [[LayoutType.RowMajor, 8], [LayoutType.RowMajor, 8], [LayoutType.RowMajor, 8]]
Is 8 the alignment?
Yes, I think so.
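For readers following along, an annotated reading of that entry, assuming (as in the existing CUDA generators) that the second value in each pair is the alignment in elements for the A, B, and C/D operands:

    # Sketch only: [layout, alignment-in-elements] per operand. With 16-bit
    # elements, alignment 8 corresponds to 16-byte aligned accesses.
    layouts = [
        [[LayoutType.RowMajor, 8],   # A
         [LayoutType.RowMajor, 8],   # B
         [LayoutType.RowMajor, 8]],  # C/D
    ]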
Looks good, thanks!
I left some questions, but I think they will be more relevant for follow-up PRs.
Force-pushed from f4b9079 to 19524ee.
Looks fine to me. It's hard to see individual issues, but I also don't really have knowledge of the whole system.
python/cutlass/backend/compiler.py (outdated diff)

    if self._is_sycl():
        q = dpctl.SyclQueue(cutlass.sycl_device())
        module = dpctl.program.create_program_from_spirv(q, cubin_image)
    else:
        err, module = cuda.cuModuleLoadData(cubin_image)
        if err != cuda.CUresult.CUDA_SUCCESS:
            raise RuntimeError("Cuda Error: {}".format(err))

    if self._is_sycl():
        kernel = module.get_sycl_kernel(operation_name)
    else:
        err, kernel = cuda.cuModuleGetFunction(
            module, bytes(str.encode(operation_name)))
Suggested change (merge the two if self._is_sycl() branches into a single block):

    if self._is_sycl():
        q = dpctl.SyclQueue(cutlass.sycl_device())
        module = dpctl.program.create_program_from_spirv(q, cubin_image)
        kernel = module.get_sycl_kernel(operation_name)
    else:
        err, module = cuda.cuModuleLoadData(cubin_image)
        if err != cuda.CUresult.CUDA_SUCCESS:
            raise RuntimeError("Cuda Error: {}".format(err))
        err, kernel = cuda.cuModuleGetFunction(
            module, bytes(str.encode(operation_name)))
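As a standalone illustration of what the SYCL branch does, here is a hedged sketch of loading a SPIR-V image through dpctl and retrieving a kernel by name. The file name and kernel name are hypothetical placeholders, and the commented-out submit call is an assumption about how the kernel would later be launched, not something taken from this PR:

    import dpctl
    import dpctl.program

    # Placeholder inputs; in the PR, the SPIR-V bytes come from the DPC++
    # backend and operation_name is the generated kernel name.
    with open("gemm_kernel.spv", "rb") as f:   # hypothetical file
        spirv_image = f.read()
    operation_name = "example_gemm_kernel"     # hypothetical name

    q = dpctl.SyclQueue()  # the PR picks the device via cutlass.sycl_device()
    module = dpctl.program.create_program_from_spirv(q, spirv_image)
    kernel = module.get_sycl_kernel(operation_name)

    # Assumption: once the kernel arguments are prepared, the kernel could be
    # launched through dpctl's queue submission API, e.g.:
    # q.submit(kernel, args, [global_size])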
python/cutlass/backend/compiler.py (outdated diff)

    -    if self.backend == "nvrtc":
    -        # 3. compile
    +    # 3. compile
    +    if self.backend == "nvrtc": # with nvrtc backend
Suggested change (drop the trailing comment):

    if self.backend == "nvrtc":
python/cutlass/backend/compiler.py (outdated diff)

    @@ -303,6 +335,50 @@ def emit_compile_(self, operation_list, compilation_options, host_compilation_op
            if err != nvrtc.nvrtcResult.NVRTC_SUCCESS:
                raise RuntimeError("NVRTC Error: {}".format(err))

        elif self.backend == "dpcpp": # with DPC++ backend
Suggested change (drop the trailing comment):

    elif self.backend == "dpcpp":
First implementation steps towards supporting the SYCL backend in the CUTLASS Python Interface.

The main addition from this PR is support for using DPC++ instead of nvcc to compile device and host code. The support so far focuses on a simple GEMM; epilogues (e.g., with visitor) are not yet supported.

Compilation is currently only possible with development versions of DPC++: the -fsycl-rtc-mode flag, which was added to support CUTLASS nested parameter classes in free-function kernels as part of this work, is not yet available in releases. The activation of the SYCL backend via an environment variable is a temporary solution; a follow-up will look into a cleaner approach.
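As a rough illustration of the temporary environment-variable activation mentioned above (the variable name below is a purely hypothetical placeholder; the PR text does not spell it out here):

    import os

    # Hypothetical variable name for illustration only.
    use_sycl = os.environ.get("CUTLASS_USE_SYCL", "0") == "1"
    backend = "dpcpp" if use_sycl else "nvcc"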