update to alpaka 0.7.0 #213
Merged: sbastrakov merged 9 commits into alpaka-group:dev from psychocoderHPC:topic-updateToAlpaka0.7.0 on Sep 1, 2021
- CMakeLists.txt can be included in Projects via add_subdirectory and find_package
- add mechanism to extract version number from header file (copied from alpaka - thanks to Jan Stephan)
- add cuplaConfig.cmake and cuplaTargets.cmake
- add CMake integration tests
  - build external project, which uses cupla via
    - add_subdirectory() and internal alpaka
    - add_subdirectory() and external alpaka
    - find_package()
- add cmake option to build examples
87a8f454e1 ROCm container update outdated keys cb3b026e68 Update Zenodo file 09735c2946 Update author list 2dc9443aeb Fix bug, if alpaka_add_library() and Clang as CUDA compiler is used together f0cfd5281c fix `alpaka_add_(executable|library)` e0dadf9d7d Prepare for 0.7.0 release 4a2fb705ea Incorporate 0.6.1 changelog 5a69368d6a TaskKernelOacc: copyin(all used local vars) 77e6c836f7 Fix outdated comments in the OpenMP schedule example ad77239f6d CI: fix warp test 6c403ae440 Add missing <limits> header and fix integer namespace 60d0dcc54e fix HIP compiler options not recognized bc7a7c1eac Changelog for 0.7.0 3e3ddaf81d CI: fix MSVC+nvcc issues 7d9d2bf363 Prefer TBBConfig.cmake over FindTBB.cmake 6906420ad4 CI: remove MSVC+nvcc CUDA 11.0 and 11.1 tests 0c19198201 CI: update CUDA to latest versions 31df413c73 fix CI: MSVC + nvcc compile 36cd9a8460 fix OpenMP schedule support 36a19a3b7a Update version number to 0.7.0-rc1 6b027bdd93 Add support for clang 11 a215cefe4d Ädd a workaround for nvcc 10 on Windows failing to compile unit tests 85bda8d147 Add missing combinations to OmpSchedule unit tests 2f014af8cd Fix OpenMP schedule support not compiling in some configurations 8ca08f7c00 Fix OpenMP schedule support not compiling in some configurations 5875a14d28 First-class CMake CUDA support 2707e3bb8b example: fix warning (NVCC+OpenMP) 0ba29ea736 add Front and Contains type list meta functions f0bbebf4a9 CI: add CUDA 11.3 support fea9400db4 fix CUDA 11.3 compile issues 64a22ad0ba CMake CUDA: dev compile options not propergated f88412c7de CtxBlockOacc: Fix assert in DeclareSharedVar f9cb4f322d drop support for clang < 9 as CUDA compiler fd52e36626 CI: disable GCC 10.3 + NVCC debug tests bd6934475f Fix CtxBlockOacc: SyncBlockThreads f428bb0448 suggestions from the review 87279eb79c cmake: use `alpaka_set_compiler_options` b2ef7bcdab refactor cmake flags 474d36473c Add a workaround for Intel compiler bug 2ade15bccd Attempt to add a workaround for failing builds with gcc 5 and 
6 7c659671fc Rebase and fixes ee7b91a28f Rework implementation of openMP schedule support 21728a6ec8 Prepare omp::Schedule for future compile-time dispatch or OpenMP schedule 394498cef4 Address reviewers' comments d1f09c0e90 Set default values for CUDA and HIP fast math flags to off fe5c3f9238 Add support for CMake 3.20 7affeb9677 manual formatting fixes 2c91ce3b59 remove section comments 835d706121 fix static shared memory alignment 77220d99d6 remove magic number fd5e862507 Compared float by error tolerance to avoid compiler warning. 8b41234ae2 Specified 32-bit precision ints on shfl() and added width parameter. 06a1f598fd removed unused param in Shfl<WarpSingleThread> c95371330f Added warp::shfl functionality. 2d589bb1a7 Fix BlockSharedMemStMemberImpl::getVarPtr for last var 689ce54115 Added warp info. to cheatsheet. Fixed some other names there. 8eea243922 improve HIP test environemnt 1d505dfebc enable callbacks for HIP f28253a201 increase HIP requirement 2d774b8c8b Address reviewers' comments c02c82675b Add a new example demonstrating kernel specialization for a specific backend 1654b470d2 CI: fix deadlock when using ubuntu 20.04 container c764069894 fix queue test e5bd702a10 document return value of `empty()` and `isComplete()` f45e3af066 fix CPU static shared memory implementation ed3c123b22 Add ALPAKA_ASSERT_OFFLOAD 56849b1bc2 Fix ALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLED checked without defined 9a8d3a463a document ADL behavior of math functions da7589b32d let math traits find user provided implementations via ADL aab9ec48be replace static functions of math trait implementations with operator() c9eb2bd4fe change cross for HIP on MacOS to dash 062c682e30 fix table markup in README.md eb31bcd89b Fix declareVar<BlockSharedMemStOmp5>(): only thread 0 should alloc 72785f6f41 drop support for VS 2017 and 32bit Windows 50d0748450 use ubuntu-18.04 for gcc-5 and gcc-6 builds ad362f3a7a increase the static shared memory to 47 KiB 4f4f8c38e4 fix zenodo file fedc0f8607 HIP: 
fix missing fast math compiler option 132f9df878 fix OpenPower CPU name detection 24c7b5e800 changelog for 0.6.0 1b3edad5cf Update the contributor list in the readme be435660bb update zenodo file ca51f5093f ICC: Update compiler package 8c90f81cf2 CI: Add Latest Intel Compiler 9a11b5bd74 add supported CUDA versions to README.md ce3a8cc67e add diagnosis of unsupported compilers for CUDA 11 c0dc355cba add CI runs for CUDA 11.2 ec56e2e3f5 fix gh-pages REVERT: 394eb4e5ee fix zenodo file REVERT: d0ec3773e6 update release date REVERT: db8335fc10 update CHANGELOG.md REVERT: b8572d922c TaskKernelOacc: copyin(all used local vars) REVERT: 942728e5dd Fix outdated comments in the OpenMP schedule example REVERT: 1647a864e1 CI: fix warp test REVERT: 7e7c094948 Add missing <limits> header and fix integer namespace REVERT: ef9c036661 fix HIP compiler options not recognized REVERT: b8ad556471 CI: test MSVC 2019+nvcc CUDA 11.2 REVERT: b87d96b5a1 update changelog REVERT: b4a038ba70 CI: update CUDA to latest versions REVERT: c11aac38e6 update changelog REVERT: 91d2d78ba0 Prefer TBBConfig.cmake over FindTBB.cmake REVERT: 7121e843f2 fix CI: MSVC + nvcc compile REVERT: 818f326ec4 fix OpenMP schedule support REVERT: c01b5264f8 update CHANGELOG.md REVERT: 85be3fd305 Ädd a workaround for nvcc 10 on Windows failing to compile unit tests REVERT: 4c523460cc Add missing combinations to OmpSchedule unit tests REVERT: 1406bcae80 Fix OpenMP schedule support not compiling in some configurations REVERT: ecdf30ac91 Fix OpenMP schedule support not compiling in some configurations REVERT: 10501bf6c0 example: fix warning (NVCC+OpenMP) REVERT: ed0b9d953b CMake CUDA: dev compile options not propergated REVERT: 04a61c522c CI: disable GCC 10.3 + NVCC debug tests REVERT: afd504eed5 update changelog REVERT: 5fd0733808 CtxBlockOacc: Fix assert in DeclareSharedVar REVERT: 5a0a5bd7f6 cmake: fix macOSX compile REVERT: 38afc0f4bf Prepare omp::Schedule for future compile-time dispatch or OpenMP schedule REVERT: 
344e5b432a update CHANGELOG.md REVERT: dd3e64ecd1 Add a workaround for Intel compiler bug REVERT: 9da628c3f6 Attempt to add a workaround for failing builds with gcc 5 and 6 REVERT: 894b24a15b Rebase and fixes REVERT: d343232c62 Rework implementation of openMP schedule support REVERT: 6b72e24997 Fix BlockSharedMemStMemberImpl::getVarPtr for last var REVERT: ddd6ccadad fix static shared memory alignment REVERT: 04b70eeda4 update changelog REVERT: 1acca2798c fix queue test REVERT: 8cf2897ef5 document return value of `empty()` and `isComplete()` REVERT: 967fa3286c fix CPU static shared memory implementation REVERT: 316b9e79b6 backport some PR's to 0.6.1 REVERT: e05d8c3d8a Add ALPAKA_ASSERT_OFFLOAD REVERT: 3e58682b96 Fix ALPAKA_ACC_CPU_B_OMP2_T_SEQ_ENABLED checked without defined REVERT: a485b284f8 Fix declareVar<BlockSharedMemStOmp5>(): only thread 0 should alloc REVERT: 81f08aaefb use ubuntu-18.04 for gcc-5 and gcc-6 builds REVERT: 9a4544c6db set version to 0.6.1 REVERT: a803a50ba6 HIP: fix missing fast math compiler option REVERT: 579af47ffb fix OpenPower CPU name detection REVERT: b7a20dd1d5 changelog for 0.6.0 REVERT: a29be15a0c Update the contributor list in the readme REVERT: 8edd26f07d update zenodo file REVERT: daf599b113 increase version to 0.6.0 git-subtree-dir: alpaka git-subtree-split: 87a8f454e102767e8bfc0b754c68fc665ee5b96a
…dateToAlpaka0.7.0
alpaka changed its math interfaces. This makes the cupla math functions compatible with the new interfaces.
Fix a CUDA bug that we already worked around in alpaka (alpaka-group/alpaka#1293).
Mention alpaka 0.7.X as required alpaka version.
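The math-interface change referred to above corresponds to two entries in the alpaka commit list: "replace static functions of math trait implementations with operator()" and "let math traits find user provided implementations via ADL". The following is a minimal, self-contained sketch of that general pattern, not alpaka's or cupla's actual code. All names (`sketch`, `SqrtImpl`, `mySqrt`, `user::Meters`) are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <cmath>

namespace sketch
{
    namespace user
    {
        // Hypothetical user-defined type that ships its own sqrt overload.
        struct Meters
        {
            double value;
        };

        inline Meters sqrt(Meters m)
        {
            return Meters{std::sqrt(m.value)};
        }
    } // namespace user

    // Hypothetical trait implementation written as a function object:
    // the work is done in operator() instead of a static member function.
    struct SqrtImpl
    {
        template<typename T>
        auto operator()(T const& arg) const
        {
            using std::sqrt;  // fallback for built-in arithmetic types
            return sqrt(arg); // unqualified call, so ADL can select user::sqrt
        }
    };

    // Hypothetical free-function front end that forwards to the trait object.
    template<typename T>
    auto mySqrt(T const& arg)
    {
        return SqrtImpl{}(arg);
    }
} // namespace sketch
```

With this layout, `sketch::mySqrt(9.0)` resolves to `std::sqrt`, while `sketch::mySqrt(sketch::user::Meters{16.0})` picks up the user-provided overload through argument-dependent lookup. A wrapper library that forwards to such traits would need to call the function object rather than a static member, which is presumably the kind of adjustment this commit makes.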
psychocoderHPC force-pushed the topic-updateToAlpaka0.7.0 branch from ee43a2f to 03c6a76 on September 1, 2021 08:43
sbastrakov previously approved these changes on Sep 1, 2021
Do not apply warning options to the cupla target if CUDA or HIP is used.
psychocoderHPC force-pushed the topic-updateToAlpaka0.7.0 branch 2 times, most recently from 3fc6e84 to ac52664 on September 1, 2021 11:21
Work around the following error:
```
/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9220): error #167: argument of type "const void *" is incompatible with parameter of type "const float *"
/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9231): error #167: argument of type "const void *" is incompatible with parameter of type "const float *"
/usr/lib/gcc/x86_64-linux-gnu/5/include/avx512fintrin.h(9244): error #167: argument of type "const void *" is incompatible with parameter of type "const double *"
```
psychocoderHPC force-pushed the topic-updateToAlpaka0.7.0 branch from ac52664 to 5ae04bd on September 1, 2021 11:24
sbastrakov approved these changes on Sep 1, 2021
Update to alpaka 0.7.0
CI: compile for CUDA 9.2 in debug mode to work around header incompatibilities
This PR is built on top of #203. Please merge this PR; #203 will then be merged automatically.
Review
Please check the last 6 commits only!