Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RuntimeError: Error building extension 'cpu_adam' #260

Closed
AayushSameerShah opened this issue Oct 4, 2023 · 1 comment
Closed

RuntimeError: Error building extension 'cpu_adam' #260

AayushSameerShah opened this issue Oct 4, 2023 · 1 comment

Comments

@AayushSameerShah
Copy link

Loading dataset

from xturing.datasets.text_dataset import TextDataset

dataset = TextDataset({
    "text": sample["text"],
    "target": sample["target"]
})
# >>> [2023-10-04 11:09:41,704] [INFO] [real_accelerator.py:110:get_accelerator] Setting ds_accelerator to cuda (auto detect)

Loading model

from xturing.models import BaseModel

# Load the model
model = BaseModel.create('llama2_lora_int8')

Simple configuration

finetuning_config = model.finetuning_config()

finetuning_config.num_train_epochs = 1
finetuning_config.learning_rate = 1e-3
finetuning_config.weight_decay = 0.01
finetuning_config.eval_steps = 200
finetuning_config.save_steps = 1000
finetuning_config.logging_steps = 200

Training

model.finetune(dataset=dataset)

🔴 Error

Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
trainable params: 4194304 || all params: 6742609920 || trainable%: 0.06220594176090199
You are using a CUDA device ('NVIDIA RTX A5000') that has Tensor Cores. To properly utilize them, you should set `torch.set_float32_matmul_precision('medium' | 'high')` which will trade-off precision for performance. For more details, read https://pytorch.org/docs/stable/generated/torch.set_float32_matmul_precision.html#torch.set_float32_matmul_precision
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
Using /root/.cache/torch_extensions/py310_cu118 as PyTorch extensions root...
Creating extension directory /root/.cache/torch_extensions/py310_cu118/cpu_adam...
Detected CUDA files, patching ldflags
Emitting ninja build file /root/.cache/torch_extensions/py310_cu118/cpu_adam/build.ninja...
Building extension module cpu_adam...
Allowing ninja to set a default number of workers... (overridable by setting the environment variable MAX_JOBS=N)

[1/3] c++ -MMD -MF cpu_adam.o.d -DTORCH_EXTENSION_NAME=cpu_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/usr/local/lib/python3.10/dist-packages/deepspeed/ops/csrc/includes -I/usr/local/cuda/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.10/dist-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -std=c++17 -g -Wno-reorder -L/usr/local/cuda/lib64 -lcudart -lcublas -g -march=native -fopenmp -D__SCALAR__ -D__ENABLE_CUDA__ -DBF16_AVAILABLE -c /usr/local/lib/python3.10/dist-packages/deepspeed/ops/csrc/adam/cpu_adam.cpp -o cpu_adam.o 
FAILED: cpu_adam.o 
c++ -MMD -MF cpu_adam.o.d -DTORCH_EXTENSION_NAME=cpu_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/usr/local/lib/python3.10/dist-packages/deepspeed/ops/csrc/includes -I/usr/local/cuda/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.10/dist-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -fPIC -std=c++17 -O3 -std=c++17 -g -Wno-reorder -L/usr/local/cuda/lib64 -lcudart -lcublas -g -march=native -fopenmp -D__SCALAR__ -D__ENABLE_CUDA__ -DBF16_AVAILABLE -c /usr/local/lib/python3.10/dist-packages/deepspeed/ops/csrc/adam/cpu_adam.cpp -o cpu_adam.o 
In file included from /usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/Device.h:4,
                 from /usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include/torch/python.h:8,
                 from /usr/local/lib/python3.10/dist-packages/torch/include/torch/extension.h:6,
                 from /usr/local/lib/python3.10/dist-packages/deepspeed/ops/csrc/adam/cpu_adam.cpp:7:
/usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/python_headers.h:12:10: fatal error: Python.h: No such file or directory
   12 | #include <Python.h>
      |          ^~~~~~~~~~
compilation terminated.
[2/3] /usr/local/cuda/bin/nvcc  -DTORCH_EXTENSION_NAME=cpu_adam -DTORCH_API_INCLUDE_EXTENSION_H -DPYBIND11_COMPILER_TYPE=\"_gcc\" -DPYBIND11_STDLIB=\"_libstdcpp\" -DPYBIND11_BUILD_ABI=\"_cxxabi1011\" -I/usr/local/lib/python3.10/dist-packages/deepspeed/ops/csrc/includes -I/usr/local/cuda/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include/torch/csrc/api/include -isystem /usr/local/lib/python3.10/dist-packages/torch/include/TH -isystem /usr/local/lib/python3.10/dist-packages/torch/include/THC -isystem /usr/local/cuda/include -isystem /usr/include/python3.10 -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 --compiler-options '-fPIC' -O3 --use_fast_math -std=c++17 -U__CUDA_NO_HALF_OPERATORS__ -U__CUDA_NO_HALF_CONVERSIONS__ -U__CUDA_NO_HALF2_OPERATORS__ -gencode=arch=compute_86,code=sm_86 -gencode=arch=compute_86,code=compute_86 -DBF16_AVAILABLE -c /usr/local/lib/python3.10/dist-packages/deepspeed/ops/csrc/common/custom_cuda_kernel.cu -o custom_cuda_kernel.cuda.o 
ninja: build stopped: subcommand failed.

And then...

CalledProcessError                        Traceback (most recent call last)
--- ommited ---

   1622 if verbose:
   1623     print(f'Building extension module {name}...', file=sys.stderr)
-> 1624 _run_ninja_build(
   1625     build_directory,
   1626     verbose,
   1627     error_prefix=f"Error building extension '{name}'")

File /usr/local/lib/python3.10/dist-packages/torch/utils/cpp_extension.py:1911, in _run_ninja_build(build_directory, verbose, error_prefix)
   1909 if hasattr(error, 'output') and error.output:  # type: ignore[union-attr]
   1910     message += f": {error.output.decode(*SUBPROCESS_DECODE_ARGS)}"  # type: ignore[union-attr]
-> 1911 raise RuntimeError(message) from e

RuntimeError: Error building extension 'cpu_adam'

Any solution to this? I am only using a single GPU.

@StochasticRomanAgeev
Copy link
Contributor

StochasticRomanAgeev commented Oct 30, 2023

Hi @AayushSameerShah,
You need to install python3-dev to use this libraries.
Here is the link to next installation steps.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants