My environment is:
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A

OS: Arch Linux (x86_64)
GCC version: (conda-forge gcc 12.3.0-7) 12.3.0
Clang version: 17.0.6
CMake version: version 3.29.3
Libc version: glibc-2.39

Python version: 3.11.9 | packaged by conda-forge | (main, Apr 19 2024, 18:36:13) [GCC 12.3.0] (64-bit runtime)
Python platform: Linux-6.9.1-arch1-1-x86_64-with-glibc2.39
Is CUDA available: True
CUDA runtime version: 12.1.66
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 4070 Ti SUPER
Nvidia driver version: 550.78
cuDNN version: Probably one of the following:
/usr/lib/libcudnn.so.8.9.7
/usr/lib/libcudnn_adv_infer.so.8.9.7
/usr/lib/libcudnn_adv_train.so.8.9.7
/usr/lib/libcudnn_cnn_infer.so.8.9.7
/usr/lib/libcudnn_cnn_train.so.8.9.7
/usr/lib/libcudnn_ops_infer.so.8.9.7
/usr/lib/libcudnn_ops_train.so.8.9.7
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

Versions of relevant libraries:
[pip3] numpy==1.26.4
[pip3] nvidia-nccl-cu12==2.20.5
[pip3] torch==2.3.0
[pip3] triton==2.3.0
[conda] numpy 1.26.4 pypi_0 pypi
[conda] nvidia-nccl-cu12 2.20.5 pypi_0 pypi
[conda] torch 2.3.0 pypi_0 pypi
[conda] triton 2.3.0 pypi_0 pypi
ROCM Version: Could not collect
Neuron SDK Version: N/A
vLLM Version: N/A
vLLM Build Flags:
CUDA Archs: 8.0; ROCm: Disabled; Neuron: Disabled
GPU Topology:
        GPU0    CPU Affinity    NUMA Affinity    GPU NUMA ID
GPU0    X       0-23            0                N/A
and I got the following build error:
FAILED: /tmp/tmp3dq5g304.build-temp/csrc/sgmv_flashinfer/sgmv_all.o
~/.conda/envs/punica/bin/nvcc --generate-dependencies-with-compile --dependency-output /tmp/tmp3dq5g304.build-temp/csrc/sgmv_flashinfer/sgmv_all.o.d -I/workspace/punica/third_party/cutlass/include -punica/third_party/flashinfer/include -I/tmp/pip-build-env-2gz00_u5/overlay/lib/python3.11/site-packages/torch/include -I/tmp/pip-build-env-2gz00_u5/overlay/lib/python3.11/site-packages/torch/include/torch/csrc/api/include -I/tmp/pip-build-env-2gz00_u5/overlay/lib/python3.11/site-packages/torch/include/TH -I/tmp/pip-build-env-2gz00_u5/overlay/lib/python3.11/site-packages/torch/include/THC -I~/.conda/envs/punica/include -I~/.conda/envs/punica/include/python3.11 -c -c /workspace/punica/csrc/sgmv_flashinfer/sgmv_all.cu -o /tmp/tmp3dq5g304.build-temp/csrc/sgmv_flashinfer/sgmv_all.o --expt-relaxed-constexpr --compiler-options ''"'"'-fPIC'"'"'' -O3 -DTORCH_API_INCLUDE_EXTENSION_H '-DPYBIND11_COMPILER_TYPE="_gcc"' '-DPYBIND11_STDLIB="_libstdcpp"' '-DPYBIND11_BUILD_ABI="_cxxabi1011"' -DTORCH_EXTENSION_NAME=_kernels -D_GLIBCXX_USE_CXX11_ABI=0 -gencode=arch=compute_80,code=sm_80 -ccbin ~/.conda/envs/punica/bin/gcc -std=c++17
~/.conda/envs/punica/include/cuda/std/barrier(144): error: no instance of overloaded "operator new" matches the argument list
            argument types are: (unsigned long, cuda::std::__4::__barrier_base<cuda::std::__4::__empty_completion, 2> *)
      new (&__b->__barrier) __barrier_base(__expected);
      ^

1 error detected in the compilation of "/workspace/punica/csrc/sgmv_flashinfer/sgmv_all.cu".
What might be causing this? Thank you for your help!