[NVPTX] Do not run the NVVMReflect pass as part of the normal pipeline #121834
Conversation
@llvm/pr-subscribers-backend-nvptx

Author: Joseph Huber (jhuber6)

Changes

Summary:
This pass lowers the `__nvvm_reflect` builtin in the IR. However, this currently runs in the standard optimization pipeline, not just the backend pipeline. This means that if the user creates LLVM-IR without an architecture set, it will always delete the reflect code even if it is intended to be used later.

Pushing this into the backend pipeline will ensure that this works as intended, allowing users to conditionally include code depending on which target architecture the user ended up using. This fixes a bug in OpenMP and missing code in `libc`.

Full diff: https://github.com/llvm/llvm-project/pull/121834.diff

6 Files Affected:
diff --git a/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp b/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
index b3b2880588cc59..f6ec780d963d9a 100644
--- a/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
+++ b/llvm/lib/Target/NVPTX/NVPTXTargetMachine.cpp
@@ -255,7 +255,6 @@ void NVPTXTargetMachine::registerPassBuilderCallbacks(PassBuilder &PB) {
PB.registerPipelineStartEPCallback(
[this](ModulePassManager &PM, OptimizationLevel Level) {
FunctionPassManager FPM;
- FPM.addPass(NVVMReflectPass(Subtarget.getSmVersion()));
// Note: NVVMIntrRangePass was causing numerical discrepancies at one
// point, if issues crop up, consider disabling.
FPM.addPass(NVVMIntrRangePass());
diff --git a/llvm/lib/Target/NVPTX/NVVMReflect.cpp b/llvm/lib/Target/NVPTX/NVVMReflect.cpp
index 56525a1edc7614..a0e897584a9d32 100644
--- a/llvm/lib/Target/NVPTX/NVVMReflect.cpp
+++ b/llvm/lib/Target/NVPTX/NVVMReflect.cpp
@@ -21,6 +21,7 @@
#include "NVPTX.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/Analysis/ConstantFolding.h"
+#include "llvm/CodeGen/CommandFlags.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Function.h"
@@ -219,7 +220,12 @@ bool NVVMReflect::runOnFunction(Function &F) {
return runNVVMReflect(F, SmVersion);
}
-NVVMReflectPass::NVVMReflectPass() : NVVMReflectPass(0) {}
+NVVMReflectPass::NVVMReflectPass() {
+ // Get the CPU string from the command line if not provided.
+ StringRef SM = codegen::getMCPU();
+ if (!SM.consume_front("sm_") || SM.consumeInteger(10, SmVersion))
+ SmVersion = 0;
+}
PreservedAnalyses NVVMReflectPass::run(Function &F,
FunctionAnalysisManager &AM) {
diff --git a/llvm/test/CodeGen/NVPTX/nvvm-reflect-arch.ll b/llvm/test/CodeGen/NVPTX/nvvm-reflect-arch.ll
index ac5875c6ab1043..83cb3cde48de18 100644
--- a/llvm/test/CodeGen/NVPTX/nvvm-reflect-arch.ll
+++ b/llvm/test/CodeGen/NVPTX/nvvm-reflect-arch.ll
@@ -1,9 +1,9 @@
; Libdevice in recent CUDA versions relies on __CUDA_ARCH reflecting GPU type.
; Verify that __nvvm_reflect() is replaced with an appropriate value.
;
-; RUN: opt %s -S -passes='default<O2>' -mtriple=nvptx64 -mcpu=sm_20 \
+; RUN: opt %s -S -passes='nvvm-reflect' -mtriple=nvptx64 -mcpu=sm_20 \
; RUN: | FileCheck %s --check-prefixes=COMMON,SM20
-; RUN: opt %s -S -passes='default<O2>' -mtriple=nvptx64 -mcpu=sm_35 \
+; RUN: opt %s -S -passes='nvvm-reflect' -mtriple=nvptx64 -mcpu=sm_35 \
; RUN: | FileCheck %s --check-prefixes=COMMON,SM35
@"$str" = private addrspace(1) constant [12 x i8] c"__CUDA_ARCH\00"
diff --git a/llvm/test/CodeGen/NVPTX/nvvm-reflect-ocl.ll b/llvm/test/CodeGen/NVPTX/nvvm-reflect-ocl.ll
index 9d383218dce86a..bf8d6e2cca3071 100644
--- a/llvm/test/CodeGen/NVPTX/nvvm-reflect-ocl.ll
+++ b/llvm/test/CodeGen/NVPTX/nvvm-reflect-ocl.ll
@@ -1,8 +1,8 @@
; Verify that __nvvm_reflect_ocl() is replaced with an appropriate value
;
-; RUN: opt %s -S -passes='default<O2>' -mtriple=nvptx64 -mcpu=sm_20 \
+; RUN: opt %s -S -passes='nvvm-reflect' -mtriple=nvptx64 -mcpu=sm_20 \
; RUN: | FileCheck %s --check-prefixes=COMMON,SM20
-; RUN: opt %s -S -passes='default<O2>' -mtriple=nvptx64 -mcpu=sm_35 \
+; RUN: opt %s -S -passes='nvvm-reflect' -mtriple=nvptx64 -mcpu=sm_35 \
; RUN: | FileCheck %s --check-prefixes=COMMON,SM35
@"$str" = private addrspace(4) constant [12 x i8] c"__CUDA_ARCH\00"
diff --git a/llvm/test/CodeGen/NVPTX/nvvm-reflect-opaque.ll b/llvm/test/CodeGen/NVPTX/nvvm-reflect-opaque.ll
index 46ab79d9858cad..19c74df3037028 100644
--- a/llvm/test/CodeGen/NVPTX/nvvm-reflect-opaque.ll
+++ b/llvm/test/CodeGen/NVPTX/nvvm-reflect-opaque.ll
@@ -3,12 +3,12 @@
; RUN: cat %s > %t.noftz
; RUN: echo '!0 = !{i32 4, !"nvvm-reflect-ftz", i32 0}' >> %t.noftz
-; RUN: opt %t.noftz -S -mtriple=nvptx-nvidia-cuda -passes='default<O2>' \
+; RUN: opt %t.noftz -S -mtriple=nvptx-nvidia-cuda -passes='nvvm-reflect,simplifycfg' \
; RUN: | FileCheck %s --check-prefix=USE_FTZ_0 --check-prefix=CHECK
; RUN: cat %s > %t.ftz
; RUN: echo '!0 = !{i32 4, !"nvvm-reflect-ftz", i32 1}' >> %t.ftz
-; RUN: opt %t.ftz -S -mtriple=nvptx-nvidia-cuda -passes='default<O2>' \
+; RUN: opt %t.ftz -S -mtriple=nvptx-nvidia-cuda -passes='nvvm-reflect,simplifycfg' \
; RUN: | FileCheck %s --check-prefix=USE_FTZ_1 --check-prefix=CHECK
@str = private unnamed_addr addrspace(4) constant [11 x i8] c"__CUDA_FTZ\00"
@@ -43,7 +43,7 @@ exit:
declare i32 @llvm.nvvm.reflect(ptr)
-; CHECK-LABEL: define noundef i32 @intrinsic
+; CHECK-LABEL: define i32 @intrinsic
define i32 @intrinsic() {
; CHECK-NOT: call i32 @llvm.nvvm.reflect
; USE_FTZ_0: ret i32 0
diff --git a/llvm/test/CodeGen/NVPTX/nvvm-reflect.ll b/llvm/test/CodeGen/NVPTX/nvvm-reflect.ll
index 2ed9f7c11bcf9b..244b44fea9b83c 100644
--- a/llvm/test/CodeGen/NVPTX/nvvm-reflect.ll
+++ b/llvm/test/CodeGen/NVPTX/nvvm-reflect.ll
@@ -3,12 +3,12 @@
; RUN: cat %s > %t.noftz
; RUN: echo '!0 = !{i32 4, !"nvvm-reflect-ftz", i32 0}' >> %t.noftz
-; RUN: opt %t.noftz -S -mtriple=nvptx-nvidia-cuda -passes='default<O2>' \
+; RUN: opt %t.noftz -S -mtriple=nvptx-nvidia-cuda -passes='nvvm-reflect,simplifycfg' \
; RUN: | FileCheck %s --check-prefix=USE_FTZ_0 --check-prefix=CHECK
; RUN: cat %s > %t.ftz
; RUN: echo '!0 = !{i32 4, !"nvvm-reflect-ftz", i32 1}' >> %t.ftz
-; RUN: opt %t.ftz -S -mtriple=nvptx-nvidia-cuda -passes='default<O2>' \
+; RUN: opt %t.ftz -S -mtriple=nvptx-nvidia-cuda -passes='nvvm-reflect,simplifycfg' \
; RUN: | FileCheck %s --check-prefix=USE_FTZ_1 --check-prefix=CHECK
@str = private unnamed_addr addrspace(4) constant [11 x i8] c"__CUDA_FTZ\00"
@@ -43,7 +43,8 @@ exit:
declare i32 @llvm.nvvm.reflect(ptr)
-; CHECK-LABEL: define noundef i32 @intrinsic
+; CHECK-LABEL: define i32 @intrinsic
+
define i32 @intrinsic() {
; CHECK-NOT: call i32 @llvm.nvvm.reflect
; USE_FTZ_0: ret i32 0
The problem is that libdevice depends on this pass, and it carries a fair amount of code that will no longer benefit from removal of unused conditional branches.
I don't think this will make a considerable difference, since it's usually guarding some very shallow code paths. We still get full optimizations when the backend runs. If you think this is a major issue, I could acquiesce to making the non-backend version skip lowering if no SM version is provided.
With this PR, OpenMC works again with NVIDIA cards (OpenMC has been broken on NVIDIA since #119091).
FWIW, the pass should be super cheap if it starts by looking for the intrinsic and then the uses. Running it twice is a valid option.
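As a rough illustration of that cheapness argument (a sketch, not the actual pass code; the helper name is made up), the pass could bail out before visiting any function when the module contains no reflect calls at all:

#include "llvm/IR/Function.h"
#include "llvm/IR/Module.h"

// Hypothetical early-exit helper: true only if the module actually declares
// and uses one of the reflect entry points the pass rewrites.
static bool moduleHasReflectCalls(llvm::Module &M) {
  for (const char *Name : {"__nvvm_reflect", "llvm.nvvm.reflect"})
    if (llvm::Function *F = M.getFunction(Name))
      if (!F->use_empty())
        return true;
  return false;
}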
The first pass should delete all the uses. The concern is that by not trimming the intrinsic earlier we're losing some optimizations, but I feel like we're still getting a full optimization pipeline and this will likely be easily optimized out. If @Artem-B is really concerned, I'll just change it to keep the per-file run but ignore it if there's no SM passed.
The problem, IIUIC, is that in some compilation modes we may run optimization w/o the constants set properly for the reflect pass and running it may pick the wrong branch -- something that the late reflect pass would not be able to undo.
I don't think it's always the case. There are functions in libdevice where
It's a maybe. Considering that __nvvm_reflect() only depends on a string, its branches may be optimizable, as long as they have no other
The practical impact will likely be limited to the heavy functions with multiple
Reflect can be used with other parameters, so an SM-only check alone is, generally speaking, not sufficient as an on/off switch. I think the decision about where the reflect pass should run should be tied to the earliest point where the reflect inputs get set. For CUDA, that's the beginning of the pipeline. For OpenMP and stand-alone compilation it's probably somewhere closer to the back-end (or wherever we link in libdevice, do LTO, or otherwise reach the point where we finally know what we're actually compiling for).
Yeah, the point is to defer the lowering until the backend knows what the actual target is. The optimizations that run on the initial compile are usually more generic, so I wouldn't expect this to fire until the backend anyway.
Can we make the pass specialize only what is actually known at first, and do the rest later?
That's what I was suggesting, but I think it makes more sense to just make this a backend thing. The only difference it makes is having a few branches live slightly longer, and I really don't think that will make a noticeable difference.
I don't think this belongs in the backend or the middle-end optimization pipeline. It's really a job for whatever "frontend" is loading the bitcode for final code generation.
Summary: This pass lowers the `__nvvm_reflect` builtin in the IR. However, this currently runs in the standard optimization pipeline, not just the backend pipeline. This means that if the user creates LLVM-IR without an architecture set, it will always delete the reflect code even if it is intended to be used later. Pushing this into the backend pipeline will ensure that this works as intended, allowing users to conditionally include code depending on which target architecture the user ended up using. This fixes a bug in OpenMP and missing code in `libc`.
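For context, the pattern this protects looks roughly like the sketch below (the function names are invented; the `__nvvm_reflect("__CUDA_ARCH")` query is the same one the tests above check). If the call is folded before the real architecture is known, the wrong branch gets baked in and the late run cannot undo it:

extern "C" int __nvvm_reflect(const char *);

// Hypothetical stand-ins for an arch-specialized fast path and a portable
// fallback in a device library.
static double exp_sm70(double x) { return __builtin_exp(x); }
static double exp_generic(double x) { return __builtin_exp(x); }

// Library routine that branches on the reflected architecture, expecting
// NVVMReflect to fold the branch once the real sm_XX value is known.
double approx_exp(double x) {
  if (__nvvm_reflect("__CUDA_ARCH") >= 700)
    return exp_sm70(x);
  return exp_generic(x);
}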
I get @jhuber6's point about target-specific specialization. There is a benefit if we could do more "library" code IR generation without specifying all target details. We kinda do that now, and it broke stuff, but the direction is good. What is the downside of multiple specialization runs, with the earlier one(s) not specializing what they do not know for sure?
I think I could probably make @Artem-B happy if I just forwent the early pass when the architecture is not known. That'd leave CUDA with identical behavior while allowing this kind of use where we only specify the target during the final link + backend stage.
That would work, too.
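A minimal sketch of what that guarded early registration could look like, assuming an SM version of 0 means "no architecture was specified" (which is what the new default NVVMReflectPass constructor in this patch falls back to); the helper name is hypothetical:

#include "NVPTX.h"                  // in-tree header declaring NVVMReflectPass
#include "llvm/IR/PassManager.h"

// Hypothetical guard: only add the reflect pass at pipeline start when an
// explicit SM version is known; otherwise leave lowering to the backend.
static void maybeAddEarlyReflect(llvm::FunctionPassManager &FPM,
                                 unsigned SmVersion) {
  if (SmVersion != 0)
    FPM.addPass(llvm::NVVMReflectPass(SmVersion));
}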
It's a little annoying for NVPTX because we just default to `sm_30`.
Can you rely on the 'cuda' part of the triple instead?
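If it were keyed off the triple as suggested, the check might look like this small sketch (an assumption about the approach, not something this PR does):

#include "llvm/TargetParser/Triple.h"

// Hypothetical predicate: treat nvptx*-nvidia-cuda triples as CUDA
// compilations where the architecture is expected up front.
static bool isCudaCompilation(const llvm::Triple &TT) {
  return TT.isNVPTX() && TT.getOS() == llvm::Triple::CUDA;
}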
Here's my attempt, hopefully it's not too invasive. Eager to get this landed so OpenMC works again.
LGTM
ParseSubtargetFeatures(TargetName, /*TuneCPU*/ TargetName, FS);
ParseSubtargetFeatures(CPU.empty() ? "sm_30" : CPU,
CPU.empty() ? "sm_30" : CPU
-> getTargetName()
@@ -35,9 +35,10 @@ void NVPTXSubtarget::anchor() {}
NVPTXSubtarget &NVPTXSubtarget::initializeSubtargetDependencies(StringRef CPU,
                                                                StringRef FS) {
  // Provide the default CPU if we don't have one.
  TargetName = std::string(CPU.empty() ? "sm_30" : CPU);
  TargetName = std::string(CPU);
It could use a comment on why we may want to keep CPU empty in some cases.
✅ With the latest revision this PR passed the C/C++ code formatter.
I just checked in e7a83fc to fix a warning from this PR.
Was in the process of doing that myself, thanks for fixing it so fast.
LLVM Buildbot has detected a new failure on a builder. Full details are available at: https://lab.llvm.org/buildbot/#/builders/55/builds/5192

Here is the relevant piece of the build log for reference.
Summary:
This pass lowers the `__nvvm_reflect` builtin in the IR. However, this currently runs in the standard optimization pipeline, not just the backend pipeline. This means that if the user creates LLVM-IR without an architecture set, it will always delete the reflect code even if it is intended to be used later.

Pushing this into the backend pipeline will ensure that this works as intended, allowing users to conditionally include code depending on which target architecture the user ended up using. This fixes a bug in OpenMP and missing code in `libc`.