[docker] bump neuron sdk to 2.21 #2656

Merged: 1 commit, Jan 14, 2025
74 changes: 55 additions & 19 deletions serving/docker/pytorch-inf2.Dockerfile
@@ -12,21 +12,28 @@
 FROM ubuntu:22.04
 ARG djl_version
 ARG djl_serving_version
+ARG python_version=3.10
+
+# PyTorch and Vision
 ARG torch_version=2.1.2
 ARG torchvision_version=0.16.2
-ARG python_version=3.10
-ARG neuronsdk_version=2.20.2
-ARG torch_neuronx_version=2.1.2.2.3.2
-ARG transformers_neuronx_version=0.12.313
-ARG neuronx_distributed_version=0.9.0
-ARG neuronx_cc_version=2.15.143.0
-ARG neuronx_cc_stubs_version=2.15.143.0
-ARG torch_xla_version=2.1.5
+
+# Neuron SDK components
+ARG neuronsdk_version=2.21.0
+ARG torch_neuronx_version=2.1.2.2.4.0
+ARG transformers_neuronx_version=0.13.322
+ARG neuronx_distributed_version=0.10.0
+ARG neuronx_distributed_inference_version=0.1.0
+ARG neuronx_cc_version=2.16.345.0
+ARG neuronx_cc_stubs_version=2.16.345.0
+ARG torch_xla_version=2.1.6
+ARG libneuronxla_version=2.1.681.0
+
 ARG transformers_version=4.45.2
 ARG accelerate_version=0.29.2
 ARG diffusers_version=0.28.2
 ARG pydantic_version=2.6.1
-ARG optimum_neuron_version=0.0.24
+ARG optimum_neuron_version=0.0.27
 ARG huggingface_hub_version=0.25.2
 # %2B is the url escape for the '+' character
 ARG vllm_wheel="https://publish.djl.ai/neuron_vllm/vllm-0.6.2%2Bnightly-py3-none-any.whl"
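
Every version above is exposed as a Docker build ARG, so a single pin can be overridden at image build time without editing the Dockerfile. A minimal sketch, run from serving/docker; the image tag and version values are illustrative, not taken from this PR:

    # The tag and version values are illustrative only.
    # Any of the ARG pins above can be overridden the same way with --build-arg.
    docker build \
        --build-arg djl_version=0.31.0 \
        --build-arg djl_serving_version=0.31.0 \
        -f pytorch-inf2.Dockerfile \
        -t djl-serving:pytorch-inf2-local .
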
@@ -72,24 +79,53 @@ COPY config.properties /opt/djl/conf/
 COPY partition /opt/djl/partition
 RUN mkdir -p /opt/djl/bin && cp scripts/telemetry.sh /opt/djl/bin && \
     echo "${djl_serving_version} inf2" > /opt/djl/bin/telemetry && \
+    # Install python and djl serving
     scripts/install_python.sh && \
     scripts/install_djl_serving.sh $djl_version $djl_serving_version && \
     scripts/install_djl_serving.sh $djl_version $djl_serving_version ${torch_version} && \
+    # Install inferentia packages
     scripts/install_inferentia2.sh && \
-    pip install accelerate==${accelerate_version} safetensors torchvision==${torchvision_version} \
-    neuronx-cc==${neuronx_cc_version} torch-neuronx==${torch_neuronx_version} transformers-neuronx==${transformers_neuronx_version} \
-    torch_xla==${torch_xla_version} neuronx-cc-stubs==${neuronx_cc_stubs_version} huggingface-hub==${huggingface_hub_version} \
-    neuronx_distributed==${neuronx_distributed_version} protobuf sentencepiece jinja2 \
-    diffusers==${diffusers_version} opencv-contrib-python-headless Pillow --extra-index-url=https://pip.repos.neuron.amazonaws.com \
-    pydantic==${pydantic_version} optimum optimum-neuron==${optimum_neuron_version} tiktoken blobfile && \
-    pip install transformers==${transformers_version} ${vllm_wheel} && \
-    echo y | pip uninstall triton && \
+    # Configure pip and install python packages
+    pip config set global.extra-index-url "https://pip.repos.neuron.amazonaws.com" && \
+    pip install \
+        accelerate==${accelerate_version} \
+        safetensors \
+        torchvision==${torchvision_version} \
+        neuronx-cc==${neuronx_cc_version} \
+        torch-neuronx==${torch_neuronx_version} \
+        torch_xla==${torch_xla_version} \
+        neuronx-cc-stubs==${neuronx_cc_stubs_version} \
+        huggingface-hub==${huggingface_hub_version} \
+        libneuronxla==${libneuronxla_version} \
+        neuronx_distributed==${neuronx_distributed_version} \
+        protobuf \
+        sentencepiece \
+        jinja2 \
+        diffusers==${diffusers_version} \
+        opencv-contrib-python-headless \
+        Pillow \
+        pydantic==${pydantic_version} \
+        optimum \
+        tiktoken \
+        blobfile && \
+    # Install packages with no-deps flag
+    pip install --no-deps \
+        neuronx_distributed_inference==${neuronx_distributed_inference_version} \
+        optimum-neuron==${optimum_neuron_version} \
+        transformers-neuronx==${transformers_neuronx_version} \
+        ${vllm_wheel} \
+        transformers==${transformers_version} && \
+    # Install s5cmd and patch OSS DLC
     scripts/install_s5cmd.sh x64 && \
     scripts/patch_oss_dlc.sh python && \
+    # Create user and set permissions
     useradd -m -d /home/djl djl && \
     chown -R djl:djl /opt/djl && \
-    rm -rf scripts && pip3 cache purge && \
-    apt-get clean -y && rm -rf /var/lib/apt/lists/*
+    # Cleanup
+    rm -rf scripts && \
+    pip3 cache purge && \
+    apt-get clean -y && \
+    rm -rf /var/lib/apt/lists/*

 LABEL maintainer="[email protected]"
 LABEL dlc_major_version="1"
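
The reworked RUN step configures the Neuron package index once with `pip config set global.extra-index-url` instead of passing `--extra-index-url` inline, and installs the tightly pinned Neuron packages with `--no-deps` so pip does not pull in or replace dependencies that are already pinned above. A minimal sketch of inspecting the resulting environment in a built image; the image tag is assumed from the sketch above, not from this PR:

    # Substitute whatever tag your build produced.
    docker run --rm --entrypoint /bin/bash djl-serving:pytorch-inf2-local -c \
        'pip list | grep -i -E "neuron|torch"; pip check'
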
8 changes: 4 additions & 4 deletions serving/docker/scripts/install_inferentia2.sh
@@ -16,12 +16,12 @@ echo "deb https://apt.repos.neuron.amazonaws.com ${VERSION_CODENAME} main" >/etc
 curl -L https://apt.repos.neuron.amazonaws.com/GPG-PUB-KEY-AMAZON-AWS-NEURON.PUB | apt-key add -

 # https://awsdocs-neuron.readthedocs-hosted.com/en/latest/release-notes/releasecontent.html#inf2-packages
-apt-get update -y && apt-get install -y aws-neuronx-collectives=2.22.26.0* \
-  aws-neuronx-runtime-lib=2.22.14.0* \
-  aws-neuronx-tools=2.19.0.0
+apt-get update -y && apt-get install -y aws-neuronx-collectives=2.23.133.0* \
+  aws-neuronx-runtime-lib=2.23.110.0* \
+  aws-neuronx-tools=2.20.204.0

 # TODO: Remove this hack after aws-neuronx-dkms install no longer throws an error, this bypasses the `set -ex`
 # exit criteria. The package is installed and functional after running, just throws an error on install.
-apt-get install -y aws-neuronx-dkms=2.18.12.0 || echo "Installed aws-neuronx-dkms with errors"
+apt-get install -y aws-neuronx-dkms=2.19.64.0 || echo "Installed aws-neuronx-dkms with errors"

 export PATH=/opt/aws/neuron/bin:$PATH
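
The script pins each system package to an exact Neuron release, with a trailing `*` so the pin tolerates apt revision suffixes. When bumping these pins for a future SDK release, the versions actually published in the Neuron apt repository can be listed before editing the script. A sketch, assuming the repository and GPG key were already added as in the lines above:

    # List every version the Neuron apt repository currently publishes for each pinned package.
    apt-get update -y
    for pkg in aws-neuronx-collectives aws-neuronx-runtime-lib aws-neuronx-tools aws-neuronx-dkms; do
        apt-cache madison "$pkg"
    done
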