-
Notifications
You must be signed in to change notification settings - Fork 754
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SYCL][SYCLLowerWGLocalMemoryPass] Remove implicit dependency on AlwaysInlinerPass and move to PipelineStart #16356
base: sycl
Are you sure you want to change the base?
Changes from 4 commits
e6ce584
51b1ec9
a4f8382
b42fd22
44db66a
97add86
a4fe915
d764d00
a765d79
0de9070
db2ad4a
442fe98
275d60d
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -9,15 +9,17 @@ | |
//===----------------------------------------------------------------------===// | ||
|
||
#include "llvm/SYCLLowerIR/LowerWGLocalMemory.h" | ||
#include "llvm/Demangle/Demangle.h" | ||
#include "llvm/IR/Function.h" | ||
#include "llvm/IR/IRBuilder.h" | ||
#include "llvm/IR/InstIterator.h" | ||
#include "llvm/Pass.h" | ||
#include "llvm/TargetParser/Triple.h" | ||
#include "llvm/Transforms/Utils/Cloning.h" | ||
|
||
using namespace llvm; | ||
|
||
#define DEBUG_TYPE "LowerWGLocalMemory" | ||
#define DEBUG_TYPE "sycllowerwglocalmemory" | ||
|
||
static constexpr char SYCL_ALLOCLOCALMEM_CALL[] = "__sycl_allocateLocalMemory"; | ||
static constexpr char SYCL_DYNAMIC_LOCALMEM_CALL[] = | ||
|
@@ -84,6 +86,42 @@ ModulePass *llvm::createSYCLLowerWGLocalMemoryLegacyPass() { | |
return new SYCLLowerWGLocalMemoryLegacy(); | ||
} | ||
|
||
// In sycl header __sycl_allocateLocalMemory builtin call is wrapped in | ||
// group_local_memory/group_local_memory_for_overwrite functions, which must be | ||
// inlined first before each __sycl_allocateLocalMemory call can be lowered to a | ||
// unique global variable. Inlining them here so that this pass doesn't have | ||
// implicit dependency on AlwaysInlinerPass. | ||
static bool inlineGroupLocalMemoryFunc(Module &M) { | ||
Function *ALMFunc = M.getFunction(SYCL_ALLOCLOCALMEM_CALL); | ||
if (!ALMFunc || ALMFunc->use_empty()) | ||
return false; | ||
|
||
bool Changed = false; | ||
for (auto *U : ALMFunc->users()) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. we need to use a work list here rather than the simple loop. This function https://github.com/intel/llvm/blob/sycl/sycl/include/syclcompat/memory.hpp#L71 needs to be updated as well, and this function won't be able to handle the nesting. The CI is currently green because there is no test requesting 2 distinct local memory objects using this function in the same kernel. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. done, thank you for the suggestion. Now I understand what you mean by |
||
auto *Caller = cast<CallInst>(U)->getFunction(); | ||
if (!Caller->hasFnAttribute(Attribute::AlwaysInline)) { | ||
// Already inlined. | ||
continue; | ||
} | ||
std::string FName = llvm::demangle(Caller->getName()); | ||
if (FName.find("sycl::_V1::ext::oneapi::group_local_memory") == | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Hardcoding current function name from DPC++ library is unfortunate. The code in the DPC++ header files can be changed at any time. To make it more robust, I thought we could go up in the call stack up-to the kernel function ignoring all functions in @Naghasan, do you have any thoughts on that? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree it is unfortunate, especially w.r.t. upstreaming. I don't know what the plans are for this one but if it is seen as important, we might want to improve this.
I don't think this is an issue TBH, I don't see any benefit in not inline the SYCL kernel in the wrapper, even in SPIR-V. I think relying on an attribute is probably the most flexible: this makes the compiler agnostic to header refactor and changes in API. It is also cheap to add. I also just realized There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
A new attribute "sycl_forceinline" is added in a4fe915 |
||
std::string::npos) { | ||
// Already inlined. | ||
continue; | ||
} | ||
for (auto *U2 : make_early_inc_range(Caller->users())) { | ||
auto *CI = cast<CallInst>(U2); | ||
InlineFunctionInfo IFI; | ||
[[maybe_unused]] auto Result = InlineFunction(*CI, IFI); | ||
assert(Result.isSuccess() && "inlining failed"); | ||
} | ||
Caller->eraseFromParent(); | ||
Changed = true; | ||
} | ||
|
||
return Changed; | ||
} | ||
|
||
// TODO: It should be checked that __sycl_allocateLocalMemory (or its source | ||
// form - group_local_memory) does not occur: | ||
// - in a function (other than user lambda/functor) | ||
|
@@ -317,9 +355,8 @@ static bool dynamicWGLocalMemory(Module &M) { | |
|
||
PreservedAnalyses SYCLLowerWGLocalMemoryPass::run(Module &M, | ||
ModuleAnalysisManager &) { | ||
bool MadeChanges = allocaWGLocalMemory(M); | ||
MadeChanges = dynamicWGLocalMemory(M) || MadeChanges; | ||
if (MadeChanges) | ||
return PreservedAnalyses::none(); | ||
return PreservedAnalyses::all(); | ||
bool Changed = inlineGroupLocalMemoryFunc(M); | ||
Changed |= allocaWGLocalMemory(M); | ||
Changed |= dynamicWGLocalMemory(M); | ||
return Changed ? PreservedAnalyses::none() : PreservedAnalyses::all(); | ||
jsji marked this conversation as resolved.
Show resolved
Hide resolved
|
||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why can we not rewrite the SYCL headers to 'inline' these calls? Is there a specific reason? Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't ask users to call
__sycl_allocateLocalMemory
internal intrsinsic when documented interface issycl::ext::something::something::group_local_memory<T>