This repository has been archived by the owner on Jan 20, 2024. It is now read-only.

Jc d2d memcpy #243

Closed

Conversation

JonChesterfield
Contributor

No description provided.

changpeng and others added 30 commits September 6, 2023 21:17
…in the first function-attrs pass

Fix lit test too

Summary:
  Argument attributes like NoAlias and ReadOnly could affect memoryssa and thus earlyCSE in the function simplification pipeline.
https://reviews.llvm.org/D145210 adjusted the placement of PostOrderFunctionAttrs, which left the argument attributes unavailable for use
in the pipeline. This work (initiated by @nikic) unconditionally performs argument attribute inference in the first function-attrs pass.

Reviewers:
  aeubanks and nikic

Differential Revision:
  https://reviews.llvm.org/D156397

Change-Id: If9d1a1b165b708dddc03dfb4d33de2ee48e42844
Replace the ASO variant with upstream Debug.cpp

Change-Id: I5a1b0ae8d49a9d8d7ab49ce7a37eb46bde9d8c1b
Change-Id: I98f9dbf2b938cfe1774bc25b22bb543f46027f6d
Change-Id: I63bbe1eb7b76198ea4d1663bf3be6041f8d3db40
…ssert messages

Change-Id: Ia59e5bd3f9645213a15e68a959757749bef564aa
Change-Id: Id991d428db0c107790505e631257c3122e33281c
unxfails 3 tests
xfails: clang/test/CodeGen/X86/sm3-error.c

Change-Id: If8017b561cc0534a1c717119ecf861f8d6288d5a
Change-Id: If4d53d1e9fe69bd8d21eb155c0552acb03706d55
…upstream no longer uses named variables for master-worker handshaking in generic kernels

Change-Id: I50ca6c010f4c0b7c58b706385ceb71cea32f7c28
Change-Id: I415b21f1ce259e2ec018765ed97748d715e83de6
Change-Id: Id4842d20619920792cfb76b939695427857bf139
… target launch and data transfer operations

Implemented RAII objects, initialized at target entry points, that
invoke tool-supplied callbacks. Updated status of target callbacks as
implemented.
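
The RAII pattern described above can be sketched roughly as follows. This is only an illustration of the idea: `ScopedRegion`, `RegionCallback`, and `runTargetEntry` are hypothetical names, and the callback signature is an assumption, not the OMPT interface used by the patch.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical callback type; the real tool callbacks have different signatures.
using RegionCallback = std::function<void(const std::string &Event)>;

// Invokes the callback once on construction and once on destruction,
// so the "end" event fires on every exit path from the enclosing scope.
class ScopedRegion {
public:
  ScopedRegion(std::string RegionName, RegionCallback Callback)
      : Name(std::move(RegionName)), CB(std::move(Callback)) {
    if (CB)
      CB(Name + ":begin");
  }
  ~ScopedRegion() {
    if (CB)
      CB(Name + ":end");
  }
  ScopedRegion(const ScopedRegion &) = delete;
  ScopedRegion &operator=(const ScopedRegion &) = delete;

private:
  std::string Name;
  RegionCallback CB;
};

// Hypothetical entry point: the guard is created first, then the work runs.
std::vector<std::string> runTargetEntry() {
  std::vector<std::string> Log;
  {
    ScopedRegion Guard("target",
                       [&Log](const std::string &E) { Log.push_back(E); });
    Log.push_back("work");
  }
  return Log;
}
```

Tying the end event to the destructor means early returns in the entry point do not need their own callback invocation.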

Depends on D127365

Patch from John Mellor-Crummey <[email protected]>
With contributions from:
Dhruva Chakrabarti <[email protected]>
Jan-Patrick Lehr <[email protected]>

Reviewed By: jdoerfert, dhruvachak, jplehr

Differential Revision: https://reviews.llvm.org/D127367

Change-Id: Ic6fcee7059aa4e6237e81e2a702adca6a26bcdc6
Change-Id: I1a79f3e6361fece67bee02fb55a62849d650b15d
Change-Id: Ib53689bd8e7eed86af1cff01a97eb3d6592fe9e4
The attributes changes were left out of Clang 17.
Attributes that used to take a string literal now accept an unevaluated
string literal instead, which means they reject numeric escape sequences
and string literals with an encoding prefix, but the latter was already
ill-formed in most cases.

We need to know that we are going to parse an unevaluated string literal
before we do, so that we can reject numeric escape sequences.
We therefore derive from Attrs.td which attribute parameters are expected
to be string literals.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D156237

Change-Id: I8dcf4c4de75a3f7b089d04cf25b9a20682fa72ff
…source_symbol' and 'uuid' attributes as unevaluated

This is complementary to D156237.
These attributes have custom parsing logic.

Reviewed By: cor3ntin

Differential Revision: https://reviews.llvm.org/D159024

Change-Id: Icb6d3e0f9ea02b4058a567e5c998be71a3aea7c2
Change-Id: Ie972cb4507ff7b20727e7d1ce7275b18786f2efb
Change-Id: Ia0a6077c7a6eaef44d8a915123e11b4cf24489b0
Change-Id: I3c8e3576cef0a1cbf77bc7e76c0af527938128de
Change-Id: I17ec52cc181708d1134949c66781d2ae0be8ba9f
Change-Id: Iad8bd30ae767ce015d6d39987c7e8e4b07d186df
Change-Id: Id69edef44c1f1fb3eefe9a71413a00c831ba10ee
Revert "[OpenMPOpt] Allow indirect calls in AAKernelInfoCallSite (#65836)"

Change-Id: I790c81ab7d92e0f828e81535cb131c6c45248138
Only a subset of the fields of DbgVariable are meaningful at any time,
and some fields are re-used for multiple purposes (for example
FrameIndexExprs is used with a throw-away frame-index of 0 to hold a
single DIExpression without needing to add another member). The exact
invariants must be reverse-engineered by inspecting the actual use of
the class, its imprecise/outdated doc-comment, and some asserts.

Refactor DbgVariable into a sum type by inheriting from std::variant.
This makes the active fields for any given state explicit and removes
the need to re-use fields in disparate contexts. As a bonus, it seems to
reduce the size on my x86_64 linux box from 144 bytes to 96 bytes.

There is some potential cost to `std::get` as it must check the active
alternative even when context or an assert obviates it. To try to help
ensure the compiler can optimize out the checks the patch also adds a
helper `get` method which uses the noexcept `std::get_if`.

Some of the extra cost would also be avoided more cleanly with a
refactor that exposes the alternative types in the public interface,
which will come in another patch.
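
A minimal sketch of the shape described above, assuming made-up alternatives: `FrameIndexLoc`, `RegisterLoc`, `ConstantValue`, and the `Variable` class are hypothetical stand-ins, not the real `DbgVariable` states. It shows a class deriving from `std::variant` and a `get` helper built on the noexcept `std::get_if`.

```cpp
#include <cassert>
#include <cstdint>
#include <variant>

// Hypothetical alternatives; the actual states of DbgVariable differ.
struct FrameIndexLoc { int FrameIndex; };
struct RegisterLoc { unsigned Reg; };
struct ConstantValue { std::int64_t Value; };

using VarState =
    std::variant<std::monostate, FrameIndexLoc, RegisterLoc, ConstantValue>;

// Sum type by inheritance: exactly one alternative is active at a time.
class Variable : public VarState {
public:
  using VarState::VarState; // inherit the variant constructors

  template <typename T> bool holds() const {
    return std::holds_alternative<T>(asVariant());
  }

  // std::get_if never throws; the assert documents the expected state and,
  // in release builds, the check can be optimized away.
  template <typename T> const T &get() const {
    const T *P = std::get_if<T>(&asVariant());
    assert(P && "wrong alternative is active");
    return *P;
  }

private:
  const VarState &asVariant() const { return *this; }
};
```

With this, `Variable V(RegisterLoc{5});` makes `V.get<RegisterLoc>().Reg` the only meaningful field, and asking for any other alternative trips the assert.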

Differential Revision: https://reviews.llvm.org/D158675

[NFC][AsmPrinter] Remove dead multi-MMI handling from DwarfFile::addScopeVariable

Differential Revision: https://reviews.llvm.org/D158676

[NFC][AsmPrinter] Expose std::variant-ness of DbgVariable

Differential Revision: https://reviews.llvm.org/D158677

[NFC][AsmPrinter] Use std::visit in constructVariableDIEImpl

This potentially has a slightly positive performance impact, as
std::visit can be implemented as a `switch`-like jump rather than
a series of `if`s.

More importantly, the reader can be confident there is no overlap between the
cases.
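
For illustration, a small `std::visit` sketch with one handler per alternative. The `Loc` alternatives, the `Overloaded` helper, and `describe` are hypothetical and unrelated to the real `constructVariableDIEImpl`. The point is that every alternative must be handled and exactly one handler runs.

```cpp
#include <string>
#include <variant>

// Hypothetical alternatives, for illustration only.
struct FrameIndexLoc { int FrameIndex; };
struct RegisterLoc { unsigned Reg; };

using Loc = std::variant<std::monostate, FrameIndexLoc, RegisterLoc>;

// Common C++17 helper that merges several lambdas into one overload set.
template <class... Ts> struct Overloaded : Ts... {
  using Ts::operator()...;
};
template <class... Ts> Overloaded(Ts...) -> Overloaded<Ts...>;

// Dispatches on the active alternative; leaving out a handler for any
// alternative is a compile error rather than a silent fall-through.
std::string describe(const Loc &L) {
  return std::visit(
      Overloaded{
          [](std::monostate) { return std::string("none"); },
          [](const FrameIndexLoc &F) {
            return "frame:" + std::to_string(F.FrameIndex);
          },
          [](const RegisterLoc &R) { return "reg:" + std::to_string(R.Reg); }},
      L);
}
```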

Differential Revision: https://reviews.llvm.org/D158678

Change-Id: Ie5b1fead7b4a4407f73b295530e46e5ce37f638e
Jenkins and others added 28 commits October 18, 2023 08:31
Change-Id: Ifa714fcf388fb2cb35410a9b6e2e2f38257d07e6
  - device-libs
    - Move amdgcn to lib/llvm/lib/clang/<ver>/lib/amdgcn
    - Create symlink amdgcn -> lib/llvm/lib/clang/<ver>/lib/amdgcn

Change-Id: I9d0715c966fd962bfcbda8815ab8966f780b2268
Change-Id: Ief661bcea6c0205a4c0ba51aded0bc2b74ad4b10
…ad slices

Second try at A-Wadhwani's https://reviews.llvm.org/D132096, which was reverted.
The original patch had three issues:
* https://reviews.llvm.org/D134032, which bjope kindly fixed. That patch is merged into this one.
* [GHI #57796](llvm/llvm-project#57796). Fixed and added a test.
* [GHI #57821](llvm/llvm-project#57821). I believe this is an undefined behavior which is not the fault of the original patch. Please see the issue for more details.

Original diff summary:

This patch adds additional vector types to be considered when doing promotion in
SROA, based on the types of the store and load slices. This provides more
promotion opportunities, by potentially using an optimal "intermediate" vector
type.

For example, the following code would currently not be promoted to a vector,
since `__m128i` is a `<2 x i64>` vector.
```
#include <cstring>     // std::memcpy
#include <emmintrin.h> // __m128i

__m128i packfoo0(int a, int b, int c, int d) {
  int r[4] = {a, b, c, d};
  __m128i rm;
  std::memcpy(&rm, r, sizeof(rm));
  return rm;
}
```
```
packfoo0(int, int, int, int):
  mov     dword ptr [rsp - 24], edi
  mov     dword ptr [rsp - 20], esi
  mov     dword ptr [rsp - 16], edx
  mov     dword ptr [rsp - 12], ecx
  movaps  xmm0, xmmword ptr [rsp - 24]
  ret
```
By also considering the types of the elements, we could find that the `<4 x i32>` type would be valid for promotion, hence removing the memory accesses for this function. In other words, we can explore other new vector types, with the same size but different element types based on the load and store instructions from the Slices, which can
provide us more promotion opportunities.

Additionally, the step for removing duplicate elements from the `CandidateTys` vector was not using an equality comparator, which has been fixed.
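
As a sketch of why the comparator matters (this is an assumed reading of the fix, not the SROA code itself): `std::unique` drops an element when the predicate holds between it and the last kept element, so it needs an equality predicate. Handing it the less-than ordering used for sorting would keep duplicates and drop distinct entries instead. `VecTy` and `dedupCandidates` are hypothetical names.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a candidate vector type.
struct VecTy {
  unsigned NumElements;
  unsigned ElementBits;
};

// Sort with a strict ordering, then remove adjacent duplicates with a
// matching equality predicate.
void dedupCandidates(std::vector<VecTy> &Tys) {
  auto Less = [](const VecTy &A, const VecTy &B) {
    return std::tie(A.NumElements, A.ElementBits) <
           std::tie(B.NumElements, B.ElementBits);
  };
  auto Equal = [](const VecTy &A, const VecTy &B) {
    return A.NumElements == B.NumElements && A.ElementBits == B.ElementBits;
  };
  std::sort(Tys.begin(), Tys.end(), Less);
  Tys.erase(std::unique(Tys.begin(), Tys.end(), Equal), Tys.end());
}
```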

Differential Revision: https://reviews.llvm.org/D143225

Change-Id: I5b75f0a6ca59bc55af5202b0cb9d1641072cc95c
Fix a crash when compiling Skia. See https://reviews.llvm.org/D143225#4180342
for more details

Change-Id: I0779cbaa76f12ccf2327e234c19970ae9d3d2272
Change-Id: I1e4983691ff7eb6457484e8ccace2ded0084a4c1
…oad's config files. They will be added again if they are made publicly available. In the meantime, HSA is used to detect all kinds of GFX94* devices.

Change-Id: If2fcd3b3d4fff66115f31202eced08d9472cb673
Fixed assertion failure

  Basic Block in function 'main' does not have terminator!
  label %land.end

caused by premature setting of CodeGenIP upon entry to
emitTargetDataCalls, where subsequent evaluation of logical
expression created new basic blocks, leaving CodeGenIP pointing to
the wrong basic block. CodeGenIP is now set near the end of the
function, just prior to generating a comparison of the logical
expression result (from the if clause) which uses CodeGenIP to
insert new IR.

Fixes SWDEV-422794/AOMP issue #601

Test already exists in smoke-fail/issue601_if_clause

Change-Id: I792141db01b0f030705ec0742c9d9fb1255f036a
This implements the following event types:
  * DeviceInitialize
  * DeviceLoad
  * Target
  * TargetDataOp
  * TargetSubmit

Add class equality operators
Adapt CTORs for more convenient manual usage
Fix errors in toString methods

Change-Id: Id335e412a3c90bdc5ea1f691290c1bc84012b51c
Change-Id: I14f4df348163a9d019527f001a9df3d8a4c305b4

This patch gets us one step closer to being able to run check-openmp.

Summary of changes:
- enable <TRIPLE>-LTO tests for non-AMDGPU architectures
- compile OpenMP offloading tests using the AOMP pattern which
  specifies -Xopenmp-target and -march
- enable the compiler selection to use the installed version of AOMP
  instead of using AOMP directly from the build folder
- enable a way to run check-openmp for AOMP:
     AOMP=<path to rocm folder> AOMP_GPU=gfx90a make check-openmp
- Overall, this brings down the number of check-openmp fails
  from 243 to 108.

Change-Id: I14f4df348163a9d019527f001a9df3d8a4c305b4
Change-Id: Ibf46e633bebbce5ef6067564e1e8ef9d605e2cf2
….amdgcn.ballot

Change-Id: I9016008de8ffaabee40870fe1254a7b1a1eb13c7
…ith llvm.amdgcn.ballot"

This reverts commit ac84482.

premature

Change-Id: I8fe8a0d7e5b30715e33c99bda0ed236e11cd5b4a
Change-Id: I828359e27d379e1dd2c1dcdbda7d1ca09dcbad00
…ack to detect AMD GPUs for which the PCI ID is yet unknown.

- Added new command line argument -hsa which enables the HSA detection
  algorithm.
- Removed method getRuntimeCapabilities (no call-sites)
- TODO: I will move the method isHomogeniousSystemOf to the plugins. I don't see any
  reason for this method to be implemented in OffloadArch. The method
  doesn't work if the PCI ID of a GPU is unknown.

Change-Id: Ia0fd44f6d5786eaf513296ac4d731c00d92170d6
Change-Id: I0a0c54ac90501f9b5d0b156259eeec021422c0a1
xfail: clang/test/Driver/cl-offload.cu
xfail: llvm/test/tools/llvm-objdump/ELF/AMDGPU/kd-gfx11.s

Change-Id: Ibca6ffdf3e6b2b0fee56bfe40b391920b0025351
Change-Id: I9419e45a805a02321ec4c958ba8cf26b471d27cb
Change-Id: I9d34efc5eda2ab4ba2b0b57d2310c20fa755bfb1
Change-Id: Ie8605f374a310448a9b7641353842524b2fac184
Change-Id: I47fb7c186f71a990c4445f28869f39df6d8288d1
Change-Id: I1151032336b181c744802c2c2e710025fe14a8a4
Change-Id: Ie4bec760ec88f7762d84a74535576420422b6af5
  locally : reverts d3921e4 [OpenMP] Basic BumpAllocator for (AMD)GPUs (#69806)

Change-Id: Id512e729870279855744ce65bfc69e2155fb68ee
Change-Id: I9ba82077bfaf2baf4076840f90d133c617fa3605
Change-Id: I86584ffca31489ab151a64cda1e10b99347d4a1e
@JonChesterfield
Contributor Author

sigh
