Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Clang] __has_builtin should return false for aux triple builtins #121839

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

sarnex
Copy link
Member

@sarnex sarnex commented Jan 6, 2025

Currently, __has_builtin will return true when passed a builtin that is only supported on the aux target. I found this when __has_builtin was called with an X86 builtin but the current target was SPIR-V.

We should instead return false for aux builtins.

@jhuber6
Copy link
Contributor

jhuber6 commented Jan 6, 2025

Ah, I see how it is. And it probably worked in the NVPTX / AMDGCN case because we had valid code paths. This is definitely something that should be fixed, because it makes no sense to say a builtin is available if you can't even use it. Though offloading languages do tend to hate anything that lets users peel back the layers and see them as two separate compile jobs.

Copy link
Contributor

@jhuber6 jhuber6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible that we could just skip generating the builtin IDs at all for the aux target? Or does that break something.

@sarnex
Copy link
Member Author

sarnex commented Jan 6, 2025

Is it possible that we could just skip generating the builtin IDs at all for the aux target? Or does that break something.

Seems like we can't, I tried this:

diff --git a/clang/lib/Basic/Builtins.cpp b/clang/lib/Basic/Builtins.cpp
index c6e54b89e848..904c64f05c36 100644
--- a/clang/lib/Basic/Builtins.cpp
+++ b/clang/lib/Basic/Builtins.cpp
@@ -64,8 +64,8 @@ void Builtin::Context::InitializeTarget(const TargetInfo &Target,
                                         const TargetInfo *AuxTarget) {
   assert(TSRecords.empty() && "Already initialized target?");
   TSRecords = Target.getTargetBuiltins();
-  if (AuxTarget)
-    AuxTSRecords = AuxTarget->getTargetBuiltins();
+  //  if (AuxTarget)
+  //  AuxTSRecords = AuxTarget->getTargetBuiltins();
 }

and got these failures

Failed Tests (3):
  Clang :: CodeGenHipStdPar/unsupported-builtins.cpp
  Clang :: Headers/__cpuidex_conflict.c
  Clang :: Sema/builtin-spirv-amdgcn-atomic-inc-dec-failure.cpp

and then reverted that and tried this

diff --git a/clang/lib/Basic/Builtins.cpp b/clang/lib/Basic/Builtins.cpp
index c6e54b89e848..34e1d0aa4b68 100644
--- a/clang/lib/Basic/Builtins.cpp
+++ b/clang/lib/Basic/Builtins.cpp
@@ -153,9 +153,9 @@ void Builtin::Context::initializeBuiltins(IdentifierTable &Table,
       Table.get(TSRecords[i].Name).setBuiltinID(i + Builtin::FirstTSBuiltin);
 
   // Step #3: Register target-specific builtins for AuxTarget.
-  for (unsigned i = 0, e = AuxTSRecords.size(); i != e; ++i)
-    Table.get(AuxTSRecords[i].Name)
-        .setBuiltinID(i + Builtin::FirstTSBuiltin + TSRecords.size());
+  //for (unsigned i = 0, e = AuxTSRecords.size(); i != e; ++i)
+  //  Table.get(AuxTSRecords[i].Name)
+  //      .setBuiltinID(i + Builtin::FirstTSBuiltin + TSRecords.size());
 
   // Step #4: Unregister any builtins specified by -fno-builtin-foo.
   for (llvm::StringRef Name : LangOpts.NoBuiltinFuncs) {

and it broke the same tests

Failed Tests (3):
  Clang :: CodeGenHipStdPar/unsupported-builtins.cpp
  Clang :: Headers/__cpuidex_conflict.c
  Clang :: Sema/builtin-spirv-amdgcn-atomic-inc-dec-failure.cpp

@sarnex
Copy link
Member Author

sarnex commented Jan 6, 2025

CodeGenHipStdPar/unsupported-builtins.cpp is pretty interesting actually, it looks like it tests for some behavior in CodeGen that seems like it's trying to fix the exact same problem

The other two tests seem to be actually unrelated breakages though.

@jhuber6
Copy link
Contributor

jhuber6 commented Jan 6, 2025

CodeGenHipStdPar/unsupported-builtins.cpp is pretty interesting actually, it looks like it tests for some behavior in CodeGen that seems like it's trying to fix the exact same problem

The other two tests seem to be actually unrelated breakages though.

Maybe @AlexVlx would know about this in general, but I'd definitely say it's problematic behavior.

Signed-off-by: Sarnie, Nick <[email protected]>
@sarnex sarnex marked this pull request as ready for review January 7, 2025 13:57
@llvmbot llvmbot added clang Clang issues not falling into any other category clang:frontend Language frontend issues, e.g. anything involving "Sema" labels Jan 7, 2025
@llvmbot
Copy link
Member

llvmbot commented Jan 7, 2025

@llvm/pr-subscribers-clang

Author: Nick Sarnie (sarnex)

Changes

Currently, __has_builtin will return true when passed a builtin that is only supported on the aux target. I found this when __has_builtin was called with an X86 builtin but the current target was SPIR-V.

If we know for sure the builtin can't be supported on the current target, the function should return false.

We can't simply check if it's an aux builtin, see the test mentioned in the comment.


Full diff: https://github.com/llvm/llvm-project/pull/121839.diff

4 Files Affected:

  • (modified) clang/include/clang/Basic/Builtins.h (+5)
  • (modified) clang/lib/Basic/Builtins.cpp (+19)
  • (modified) clang/lib/Lex/PPMacroExpansion.cpp (+17-3)
  • (added) clang/test/Preprocessor/builtin_aux_info.cpp (+12)
diff --git a/clang/include/clang/Basic/Builtins.h b/clang/include/clang/Basic/Builtins.h
index 63559d977ce6b6..0939f95b0922c1 100644
--- a/clang/include/clang/Basic/Builtins.h
+++ b/clang/include/clang/Basic/Builtins.h
@@ -74,6 +74,7 @@ struct Info {
   const char *Features;
   HeaderDesc Header;
   LanguageID Langs;
+  bool operator==(const Info &Other) const;
 };
 
 /// Holds information about both target-independent and
@@ -268,6 +269,10 @@ class Context {
   /// for AuxTarget).
   unsigned getAuxBuiltinID(unsigned ID) const { return ID - TSRecords.size(); }
 
+  // Return true if the AuxBuiltin ID represents a target-specific builtin that
+  // is always unsupported on the default target.
+  bool isAuxBuiltinIDAlwaysUnsupportedOnDefaultTarget(unsigned ID) const;
+
   /// Returns true if this is a libc/libm function without the '__builtin_'
   /// prefix.
   static bool isBuiltinFunc(llvm::StringRef Name);
diff --git a/clang/lib/Basic/Builtins.cpp b/clang/lib/Basic/Builtins.cpp
index 588183788de322..bd443b3b371e35 100644
--- a/clang/lib/Basic/Builtins.cpp
+++ b/clang/lib/Basic/Builtins.cpp
@@ -41,6 +41,14 @@ static constexpr Builtin::Info BuiltinInfo[] = {
 #include "clang/Basic/Builtins.inc"
 };
 
+bool Builtin::Info::operator==(const Builtin::Info &Other) const {
+  auto StrCompare = [](StringRef A, StringRef B) { return A == B; };
+  return Name == Other.Name && StrCompare(Type, Other.Type) &&
+         StrCompare(Attributes, Other.Attributes) &&
+         StrCompare(Features, Other.Features) && Header.ID == Other.Header.ID &&
+         Langs == Other.Langs;
+}
+
 const Builtin::Info &Builtin::Context::getRecord(unsigned ID) const {
   if (ID < Builtin::FirstTSBuiltin)
     return BuiltinInfo[ID];
@@ -183,6 +191,17 @@ unsigned Builtin::Context::getRequiredVectorWidth(unsigned ID) const {
   return Width;
 }
 
+bool Builtin::Context::isAuxBuiltinIDAlwaysUnsupportedOnDefaultTarget(
+    unsigned ID) const {
+  assert(isAuxBuiltinID(ID) && "Expected aux target builtin ID");
+  const auto &Record = getRecord(ID);
+  for (const auto &MainTargetBuiltin : TSRecords)
+    if (Record == MainTargetBuiltin)
+      return false;
+
+  return true;
+}
+
 bool Builtin::Context::isLike(unsigned ID, unsigned &FormatIdx,
                               bool &HasVAListArg, const char *Fmt) const {
   assert(Fmt && "Not passed a format string");
diff --git a/clang/lib/Lex/PPMacroExpansion.cpp b/clang/lib/Lex/PPMacroExpansion.cpp
index 347c13da0ad215..13d9a0094a5827 100644
--- a/clang/lib/Lex/PPMacroExpansion.cpp
+++ b/clang/lib/Lex/PPMacroExpansion.cpp
@@ -1804,8 +1804,9 @@ void Preprocessor::ExpandBuiltinMacro(Token &Tok) {
                                            diag::err_feature_check_malformed);
         if (!II)
           return false;
-        else if (II->getBuiltinID() != 0) {
-          switch (II->getBuiltinID()) {
+        auto BuiltinID = II->getBuiltinID();
+        if (BuiltinID != 0) {
+          switch (BuiltinID) {
           case Builtin::BI__builtin_cpu_is:
             return getTargetInfo().supportsCpuIs();
           case Builtin::BI__builtin_cpu_init:
@@ -1818,8 +1819,21 @@ void Preprocessor::ExpandBuiltinMacro(Token &Tok) {
             // usual allocation and deallocation functions. Required by libc++
             return 201802;
           default:
+            // We may get here because of aux builtins which may not be
+            // supported on the default target, for example if we have an X86
+            // specific builtin and the current target is SPIR-V. Sometimes we
+            // rely on __has_builtin returning true when passed a builtin that
+            // is not supported on the default target due to LangOpts but is
+            // supported on the aux target. See
+            // test/Headers/__cpuidex_conflict.c for an example. If the builtin
+            // is an aux builtin and it can never be supported on the default
+            // target, __has_builtin should return false.
+            if (getBuiltinInfo().isAuxBuiltinID(BuiltinID) &&
+                getBuiltinInfo().isAuxBuiltinIDAlwaysUnsupportedOnDefaultTarget(
+                    BuiltinID))
+              return false;
             return Builtin::evaluateRequiredTargetFeatures(
-                getBuiltinInfo().getRequiredFeatures(II->getBuiltinID()),
+                getBuiltinInfo().getRequiredFeatures(BuiltinID),
                 getTargetInfo().getTargetOpts().FeatureMap);
           }
           return true;
diff --git a/clang/test/Preprocessor/builtin_aux_info.cpp b/clang/test/Preprocessor/builtin_aux_info.cpp
new file mode 100644
index 00000000000000..041c7edfdcadac
--- /dev/null
+++ b/clang/test/Preprocessor/builtin_aux_info.cpp
@@ -0,0 +1,12 @@
+// REQUIRES: spirv-registered-target
+// REQUIRES: x86-registered-target
+
+// RUN: %clang_cc1 -fopenmp -triple=spirv64 -fopenmp-is-target-device \
+// RUN: -aux-triple x86_64-linux-unknown -E %s | FileCheck -implicit-check-not=BAD %s
+
+// CHECK: GOOD
+#if __has_builtin(__builtin_ia32_pause)
+  BAD
+#else
+  GOOD
+#endif

@AlexVlx AlexVlx requested a review from yxsamliu January 7, 2025 14:16
@AlexVlx
Copy link
Contributor

AlexVlx commented Jan 7, 2025

CodeGenHipStdPar/unsupported-builtins.cpp is pretty interesting actually, it looks like it tests for some behavior in CodeGen that seems like it's trying to fix the exact same problem

The other two tests seem to be actually unrelated breakages though.

It is not quite trying to fix a problem:). In hipstdpar mode there's no host/device segregation, and it is possible to codegen IR for host only bits (e.g., parts of the stdlib). We delay resolving whether a construct is or isn't viable to the ME, and just place in these special stubs where the otherwise unsupported construct (in this case a builtin) is used. Then, iff it turns out that the code path that reaches the unsupported construct is accessible, we error out - please see https://reviews.llvm.org/D155850 and https://reviews.llvm.org/D155856 for historical context.

I don't quite think that this change interacts with that at all (I'll look again), but it might interact in some subtle way with offload languages such as e.g. HIP which jump through some hoops to keep the AST consistent, so I've added @yxsamliu to this review.

@sarnex
Copy link
Member Author

sarnex commented Jan 7, 2025

It is not quite trying to fix a problem:). In hipstdpar mode there's no host/device segregation, and it is possible to codegen IR for host only bits (e.g., parts of the stdlib). We delay resolving whether a construct is or isn't viable to the ME, and just place in these special stubs where the otherwise unsupported construct (in this case a builtin) is used. Then, iff it turns out that the code path that reaches the unsupported construct is accessible, we error out - please see https://reviews.llvm.org/D155850 and https://reviews.llvm.org/D155856 for historical context.

I don't quite think that this change interacts with that at all (I'll look again), but it might interact in some subtle way with offload languages such as e.g. HIP which jump through some hoops to keep the AST consistent, so I've added @yxsamliu to this review.

Ah, thanks for the clarification and for adding reviewers!

@erichkeane
Copy link
Collaborator

I think I wrote the original patch. The point for allowing the 'aux' target to enable a builtin that is not supported on the 'default' target is for languages where 2-pass compilation is necessary for device/host, and the 'aux' target is used to capture that this is a 2-pass compilation.

So I think we're conflating "This should be legal in this program" and "this should be illegal to execute on THIS target". Historically, those were the same questions, but 2 pass compilation makes this no longer true. So right now this functionality is answering the 1st question, but we have no way for the 2nd question.

IMO, we probably have something where we need to have code-gen start diagnosing the 2nd question. Because something like:

void only_executed_on_host() {
__builtin_host_thing();
}

void device_entry_point() {
 // doesn't call executed_on_host
}

is a pattern that we need to support.

@jhuber6
Copy link
Contributor

jhuber6 commented Jan 7, 2025

I don't think it makes any sense for __has_builtin to return true when the target does not in-fact have the builtin. Most of the time this is used to guard target specific code, which will then be wrong if it's compiled on the device. Realistically the solution that makes sense to me is to not emit any of the aux target's builtins as available.

@erichkeane
Copy link
Collaborator

I don't think it makes any sense for __has_builtin to return true when the target does not in-fact have the builtin. Most of the time this is used to guard target specific code, which will then be wrong if it's compiled on the device. Realistically the solution that makes sense to me is to not emit any of the aux target's builtins as available.

From the user's perspective, the target DOES support the builtin though, since from their perspective, their target is 'SYCL host + device'. So this is actually not as clear as "it should never", because target/aux-target are machinations of the implementation, and don't properly respect the user's definition of 'target'.

@AlexVlx
Copy link
Contributor

AlexVlx commented Jan 7, 2025

I don't think it makes any sense for __has_builtin to return true when the target does not in-fact have the builtin. Most of the time this is used to guard target specific code, which will then be wrong if it's compiled on the device. Realistically the solution that makes sense to me is to not emit any of the aux target's builtins as available.

From the user's perspective, the target DOES support the builtin though, since from their perspective, their target is 'SYCL host + device'. So this is actually not as clear as "it should never", because target/aux-target are machinations of the implementation, and don't properly respect the user's definition of 'target'.

Right, this is essentially a consequence of trying to provide the single-source appearance / illusion.

@jhuber6
Copy link
Contributor

jhuber6 commented Jan 7, 2025

I don't think it makes any sense for __has_builtin to return true when the target does not in-fact have the builtin. Most of the time this is used to guard target specific code, which will then be wrong if it's compiled on the device. Realistically the solution that makes sense to me is to not emit any of the aux target's builtins as available.

From the user's perspective, the target DOES support the builtin though, since from their perspective, their target is 'SYCL host + device'. So this is actually not as clear as "it should never", because target/aux-target are machinations of the implementation, and don't properly respect the user's definition of 'target'.

I don't see how this is desirable behavior, since it just leads to stuff like this https://godbolt.org/z/na4rGn9od. This behavior makes __has_builtin effectively unusable if you ever use it inside a function shared with the host + device.

@yxsamliu
Copy link
Collaborator

yxsamliu commented Jan 8, 2025

I am afraid this will break all existing CUDA/HIP programs since they expect to be able to parse the builtins for both host and device targets.

In the spirit of single source, the compiler sees the entire code for all targets, including host target and all device targets. It is supposed to have AST for all host and device targets. Ideally, it should generate a heterogeneous IR that is for all host and device targets, and eventually generate a heterogeneous executable that includes everything. That would make some optimizations across host and device IR possible. It is just due to current limitation of LLVM/Clang, we generate and process IR separately for each host and device target.

I think at least during parsing/sema/AST clang could and should see code and AST for both host and device, since that helps avoid inconsistency between host and device. Therefore it is necessary to see builtins for both host and device. Clang only needs to make sure they are only emitted for the supported target.

@erichkeane
Copy link
Collaborator

I am afraid this will break all existing CUDA/HIP programs since they expect to be able to parse the builtins for both host and device targets.

In the spirit of single source, the compiler sees the entire code for all targets, including host target and all device targets. It is supposed to have AST for all host and device targets. Ideally, it should generate a heterogeneous IR that is for all host and device targets, and eventually generate a heterogeneous executable that includes everything. That would make some optimizations across host and device IR possible. It is just due to current limitation of LLVM/Clang, we generate and process IR separately for each host and device target.

I think at least during parsing/sema/AST clang could and should see code and AST for both host and device, since that helps avoid inconsistency between host and device. Therefore it is necessary to see builtins for both host and device. Clang only needs to make sure they are only emitted for the supported target.

This is/was my concern. However, upon thinking further, as long as we RECOGNIZE/Parse/etc the builtins, it is OK I think if we report "!__has_builtin".

That is:

void foo() {
#if __has_builtin(__builtin_x86_thing)
__builtin_x86_thing();
#else
__builtin_x86_thing(); // woops, not has_builtin!
#endif
}

NEEDS to still work, even if the __has_builtin answer is 'no'.

@jhuber6
Copy link
Contributor

jhuber6 commented Jan 8, 2025

I am afraid this will break all existing CUDA/HIP programs since they expect to be able to parse the builtins for both host and device targets.

In the spirit of single source, the compiler sees the entire code for all targets, including host target and all device targets. It is supposed to have AST for all host and device targets. Ideally, it should generate a heterogeneous IR that is for all host and device targets, and eventually generate a heterogeneous executable that includes everything. That would make some optimizations across host and device IR possible. It is just due to current limitation of LLVM/Clang, we generate and process IR separately for each host and device target.

I think at least during parsing/sema/AST clang could and should see code and AST for both host and device, since that helps avoid inconsistency between host and device. Therefore it is necessary to see builtins for both host and device. Clang only needs to make sure they are only emitted for the supported target.

I'm curious what would break, since I cannot imagine a single situation where someone would use __has_builtin and not immediately use said builtin. The current behavior makes this unusable in a device context. There's already ways to split the host/device compiles via the preprocessor (i.e. __HIP_DEVICE_COMPILE__ or __CUDA_ARCH) and it's pretty much baked into the CUDA compilation pipeline, and by extension HIP. Other languages like OpenMP and SYCL skirt around the idea, seeking to put these things into the AST instead of the processor, so it's more of an implementation detail there.

People can already use the preprocessor to emit different code depending on the target architecture, I don't see how this is any different. It's important that both sides match up as much as possible so things like sizes of structs are coherent, but I really don't see this case breaking anything more than it already is.

@jhuber6
Copy link
Contributor

jhuber6 commented Jan 8, 2025

This is/was my concern. However, upon thinking further, as long as we RECOGNIZE/Parse/etc the builtins, it is OK I think if we report "!__has_builtin".

That is:

void foo() {
#if __has_builtin(__builtin_x86_thing)
__builtin_x86_thing();
#else
__builtin_x86_thing(); // woops, not has_builtin!
#endif
}

NEEDS to still work, even if the __has_builtin answer is 'no'.

Yes, I think that makes sense as it would allow both sides to parse the same code without forcing the compiler to always emit an incompatible builtin call. Since we already allow users to split compilations with preprocessor values I think that's reasonable.

Comment on lines 1 to 3
// REQUIRES: spirv-registered-target
// REQUIRES: x86-registered-target

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These shouldn't be necessary, we're just emitting IR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed in 3ce5111, thx

// REQUIRES: x86-registered-target

// RUN: %clang_cc1 -fopenmp -triple=spirv64 -fopenmp-is-target-device \
// RUN: -aux-triple x86_64-linux-unknown -E %s | FileCheck -implicit-check-not=BAD %s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add lines for other targets.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hopefully addressed in 3ce5111, thx

@yxsamliu
Copy link
Collaborator

yxsamliu commented Jan 8, 2025

I am afraid this will break all existing CUDA/HIP programs since they expect to be able to parse the builtins for both host and device targets.
In the spirit of single source, the compiler sees the entire code for all targets, including host target and all device targets. It is supposed to have AST for all host and device targets. Ideally, it should generate a heterogeneous IR that is for all host and device targets, and eventually generate a heterogeneous executable that includes everything. That would make some optimizations across host and device IR possible. It is just due to current limitation of LLVM/Clang, we generate and process IR separately for each host and device target.
I think at least during parsing/sema/AST clang could and should see code and AST for both host and device, since that helps avoid inconsistency between host and device. Therefore it is necessary to see builtins for both host and device. Clang only needs to make sure they are only emitted for the supported target.

This is/was my concern. However, upon thinking further, as long as we RECOGNIZE/Parse/etc the builtins, it is OK I think if we report "!__has_builtin".

That is:

void foo() {
#if __has_builtin(__builtin_x86_thing)
__builtin_x86_thing();
#else
__builtin_x86_thing(); // woops, not has_builtin!
#endif
}

NEEDS to still work, even if the __has_builtin answer is 'no'.

If declarations are still there. Only making changes to __has_builtin, the risk of breaking existing code is low.

Comment on lines 1831 to 1832
if (getBuiltinInfo().isAuxBuiltinID(BuiltinID) &&
getBuiltinInfo().isAuxBuiltinIDAlwaysUnsupportedOnDefaultTarget(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this extra logic? I figured it would simply be if the builtin is only in the aux target, then we return false.

Copy link
Member Author

@sarnex sarnex Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this extra logic? I figured it would simply be if the builtin is only in the aux target, then we return false.

If you mean why isn't the check just if (getBuiltinInfo().isAuxBuiltinID(BuiltinID)), it's because doing that breaks test/Headers/__cpuidex_conflict.c (the last RUN line in the test).

That test has x86 for both the default and aux target, but the builtin the test cares about is MSVC only and the default target env isn't MSVC, so that builtin is considered unsupported and not registered, but the aux target is MSVC, so it is registered and isAuxBuiltinID returns true, and then returning false in __has_builtin breaks the test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems like some weird behavior, since it's just going to end up with the GNU environment emitting an MSVC builtin. Maybe @boomanaiden154 can chime in on whether or not that's intended behavior.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming this is related to the openmp test in __cpuidex_conflict.c? I believe that is related to the following comment from @AaronBallman in https://reviews.llvm.org/D150646:

I think these changes exacerbate an issue that we had but didn't know about, because I'm able to reproduce the redefinition problem in both Clang 16 and trunk: https://godbolt.org/z/8Yq1vzEnT

What's happening is that the aux-triple is not being honored when compiling for an offload device (OpenMP, SYCL, etc) and so the wrong builtins are being enabled, leading to exposing __cpuidex as a builtin on non-Windows targets. This means the workaround to look for _MSC_EXTENSIONS doesn't solve the underlying issue -- we already try to only expose the intrinsic when MS extensions are enabled:

TARGET_HEADER_BUILTIN(__cpuidex, "vi*ii", "nh", INTRIN_H, ALL_MS_LANGUAGES, "")
and
if (!LangOpts.MicrosoftExt && (BuiltinInfo.Langs & MS_LANG))

The changes in this patch are causing us problems with our downstream at Intel. I think we might want to consider reverting the change temporarily (and on the 17.x branch) because they're exacerbating an existing problem. That gives us time to solve the problem of exposing the wrong set of builtins, and hopefully means we can remove the _MSC_EXTENSIONS workaround in the header file. WDYT?

The test is intended to make sure that the __cpuidex definiton from cpuid.h isn't used (and thus doesn't conflct) with the MSVC builtin definition. I'm not really familiar enough with OpenMP to tell if I set up the test correctly, and it could very well be the test that needs to be adjusted.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late response.

My interpretation of Aaron's comment seems like he expects the behavior we see today (aux triple builtins enabled on offloading targets), thus making the OpenMP offloading part of the cpuid LIT test correct.

@AaronBallman Do you mind clarifying what you think should happen here? Sorry, your comment was from a while back.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@boomanaiden154 If we break the aux triple use part of the test, do you expect it to affect actual users or could we just XFAIL it and proceed with this change (assuming everyone agrees the proposed behavior is correct) and fix it later?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wouldn't expect the aux triple part of the test to break any actual users, so that should be fine. Nothing in cpuid.h makes sense to use on an offloading target and I wouldn't really expect anyone to have done so.

Then again, I wouldn't have expected people to include cpuid.h when compiling for windows, but I can see that being a vastly more popular configuration than somehow trying to use MS builtins on code that gets offloaded.

Copy link
Member Author

@sarnex sarnex Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, are you okay with me disabling that part of the test and leaving it to you to figure out the correct guard when you have time?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should work for me. Can you add a TODO(boomanaiden154) as a comment when you disable it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, thanks!

Signed-off-by: Sarnie, Nick <[email protected]>
@sarnex
Copy link
Member Author

sarnex commented Jan 13, 2025

Sorry, I kind of got lost in the above discussion. Are additional changes required in Clang to parse builtins unsupported on the current target to make sure we don't break anything or is only changing the result of __has_builtin acceptable?

@jhuber6
Copy link
Contributor

jhuber6 commented Jan 13, 2025

Sorry, I kind of got lost in the above discussion. Are additional changes required in Clang to parse builtins unsupported on the current target to make sure we don't break anything or is only changing the result of __has_builtin acceptable?

I believe the conclusion was that we should emit the builtin functions, but __has_builtin should return false for builtins only present in the aux triple. For the one test that complicates this behavior, I don't think the use-case is completely well-formed but was hoping @boomanaiden154 would chime in on what the expected behavior is. I would prefer if this is just a simply check for "not in aux builtin".

@sarnex
Copy link
Member Author

sarnex commented Jan 13, 2025

I believe the conclusion was that we should emit the builtin functions, but __has_builtin should return false for builtins only present in the aux triple. For the one test that complicates this behavior, I don't think the use-case is completely well-formed but was hoping @boomanaiden154 would chime in on what the expected behavior is. I would prefer if this is just a simply check for "not in aux builtin".

Thanks, hopefully we can figure out the intended behavior for that test.

Copy link
Contributor

@boomanaiden154 boomanaiden154 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes to __cpuidex_conflict.c LGTM. I will put that on my TODO list.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants