Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove restriction that group_local_memory/group_local_memory_for_overwrite function requires to be inlined #16617

Open
wenju-he opened this issue Jan 14, 2025 · 0 comments
Labels
enhancement New feature or request

Comments

@wenju-he
Copy link
Contributor

wenju-he commented Jan 14, 2025

Is your feature request related to a problem? Please describe

Currently group_local_memory/group_local_memory_for_overwrite is implemented with a function wrapper to builtin __sycl_allocateLocalMemory call. Since a call to group_local_memory_for_overwrite represents a distinct allocation of group local memory, we have to inline group_local_memory_for_overwrite function before builtin __sycl_allocateLocalMemory can be resolved. The restriction causes two problems:

  1. __sycl_allocateLocalMemory is lowered in SYCLLowerWGLocalMemoryPass which must run after AlwaysInlinerPass. Though AlwaysInlinerPass is guaranteed to run in LLVM pass pipeline, but it isn't run at O2 pipeline start. This is problematic if there is conflict that backend compiler requires __sycl_allocateLocalMemory to be lowered earlier than the AlwaysInlinerPass. In addition, the dependence on AlwaysInlinerPass is implicit and make SYCLLowerWGLocalMemoryPass not self-contained. PR [SYCL][SYCLLowerWGLocalMemoryPass] Remove implicit dependency on AlwaysInlinerPass and move to PipelineStart #16356 does inlining within SYCLLowerWGLocalMemoryPass, and thus removes the restriction. However, the PR introduces a new attribute which might be a tech debt.
  2. syclcompat::local_mem directly calls group_local_memory_for_overwrite:
    template <typename AllocT> auto *local_mem() {
    sycl::multi_ptr<AllocT, sycl::access::address_space::local_space>
    As_multi_ptr =
    sycl::ext::oneapi::group_local_memory_for_overwrite<AllocT>(
    sycl::ext::oneapi::this_work_item::get_work_group<3>());
    . This use propagates the inlining dependency and function syclcompat::local_mem must be inlined as well. Please also note that spec doesn't allow the use of group_local_memory in library function.

I am not sure if the issue is directly related to the restriction that group_local_memory must be defined at kernel function scope. The behavior, which prevents definition in non-kernel function, aligns with OpenCL. Please also see below note in the spec:

NOTE: The restriction that group-local variables must be defined at kernel
functor scope may be lifted in a future version of this extension.

Describe the solution you would like

No response

Describe alternatives you have considered

No response

Additional context

No response

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant