Docker image needs to precompile when pulled & run on different CPU #14

Open
BSchilperoort opened this issue Jan 8, 2025 · 1 comment
Labels
wontfix This will not be worked on


@BSchilperoort

When the Docker image is built, the Julia packages are precompiled for the CPU architecture of the build machine. When someone runs the image on a machine with a different CPU, Julia has to precompile all packages again. This happens every time the container is started, unless the user rebuilds the image locally.

It should be possible to precompile for multiple CPU targets at build time, and the following Dockerfile is meant to do that:

Dockerfile precompile

FROM ghcr.io/ewatercycle/remotebmi-julia:0.1.0

LABEL org.opencontainers.image.source="https://github.com/eWaterCycle/ewatercycle-wflowjl"

# Install Wflow
RUN julia -e 'using Pkg; Pkg.add(PackageSpec(name="Wflow", version="0.8.1"))'
RUN julia -e 'import Pkg; Pkg.add("PackageCompiler")'

RUN apt update
RUN apt install build-essential -y

# https://words.yuvi.in/post/pre-compiling-julia-docker/
# https://github.com/docker-library/julia/issues/79
# note that the -cx16 flag allows it to run on more cpus but can cause performance and stability problems
ENV JULIA_CPU_TARGET="generic,-cx16;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)"

RUN echo "using Wflow" > run.jl
RUN echo "import RemoteBMI.Server: run_bmi_server" >> run.jl
RUN echo "port = parse(Int, get(ENV, \"BMI_PORT\", \"50051\"))" >> run.jl
RUN echo "run_bmi_server(Wflow.Model, \"0.0.0.0\", port)" >> run.jl

RUN echo "using Wflow" > compile.jl
RUN echo "using RemoteBMI" >> compile.jl
RUN echo "import RemoteBMI.Server: run_bmi_server" >> compile.jl
RUN echo "port = parse(Int, get(ENV, \"BMI_PORT\", \"50051\"))" >> compile.jl

RUN julia -e 'using PackageCompiler; create_sysimage(["Wflow", "RemoteBMI"], sysimage_path="wflow.so", precompile_execution_file="compile.jl")'

# chmod central depot path so all users can access it
RUN chmod -R 777 ${JULIA_DEPOT_PATH}

# Expose port and start server
EXPOSE 50051
CMD ["julia", "-J", "wflow.so", "run.jl"]

However, this breaks on the machine of @sverhoeven (WSL + Docker), with the following error:

stefanv@ID14300:~$ docker run ghcr.io/ewatercycle/wflowjl-remotebmi:0.2.0-cx16
ERROR: Unable to find compatible target in cached code image.
Target 0 (tigerlake): Rejecting this target due to use of runtime-disabled features
stefanv@ID14300:~$ docker run -u 10000 ghcr.io/ewatercycle/wflowjl-remotebmi:0.2.0-cx16
ERROR: Unable to find compatible target in cached code image.
Target 0 (tigerlake): Rejecting this target due to use of runtime-disabled features

This should have been prevented by the generic,-cx16 target in JULIA_CPU_TARGET. I have no clue why it still breaks and no idea how to fix it. I could not find any more information on this anywhere.
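
One detail in the error output may help narrow this down: the image reports a single target, tigerlake, which is not one of the three names listed in JULIA_CPU_TARGET (generic, sandybridge, haswell). That suggests the sysimage was built only for the native CPU of the build machine and the variable never reached the create_sysimage step. A possible cause, which is an assumption and not verified against the PackageCompiler version in this image, is that create_sysimage takes its own cpu_target keyword that defaults to native instead of reading the environment variable. The sketch below only does the comparison between the requested targets and the ones named in the error; the function names are hypothetical.

```python
import re


def requested_targets(cpu_target: str) -> list[str]:
    """Return the CPU names from a JULIA_CPU_TARGET string.

    Targets are separated by ';'. Within a target, the first
    comma-separated field is the CPU name; the remaining fields are
    feature flags or options such as clone_all or base(1).
    """
    return [t.split(",")[0].strip() for t in cpu_target.split(";") if t.strip()]


def reported_targets(log: str) -> list[str]:
    """Return the CPU names from 'Target N (name): ...' lines in Julia's error."""
    return re.findall(r"^Target \d+ \(([^)]+)\):", log, flags=re.MULTILINE)


# Values copied from the Dockerfile and the error output in this issue.
spec = "generic,-cx16;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)"
log = (
    "ERROR: Unable to find compatible target in cached code image.\n"
    "Target 0 (tigerlake): Rejecting this target due to use of "
    "runtime-disabled features"
)

wanted = requested_targets(spec)
found = reported_targets(log)
# Targets present in the image that were never requested.
unexpected = [name for name in found if name not in wanted]
print("requested:", wanted)
print("in image: ", found)
print("unexpected:", unexpected)
```

If the assumption about the cpu_target keyword holds, passing the same target string to create_sysimage explicitly would be the thing to try next.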

Workaround

If you need to start the container many times and the repeated precompilation is too much overhead, the workaround is to rebuild the Docker image (or Apptainer .sif file) on the machine where you run the container.

@BSchilperoort added the wontfix label Jan 8, 2025
@sverhoeven

Precompilation takes a while, so pass delay=120 to the remotebmi.BmiClientDocker constructor and check model.logs to confirm that precompilation is complete and the server is up and running (look for "[ Info: Listening on: 0.0.0.0:50051, thread id: 1" in the log).
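
A minimal sketch of that log check, in case a fixed delay turns out to be too short or too long. The helper names are hypothetical and nothing here is part of remotebmi; the only things taken from the comment above are delay=120, model.logs, and the "Listening on" line. Whether model.logs is an attribute or a method is not confirmed here, so the helper takes any callable that returns the log text.

```python
import time
from typing import Callable

# Text the server log contains once it is ready, per the comment above.
READY_MARKER = "Listening on: 0.0.0.0:50051"


def server_is_ready(logs: str, marker: str = READY_MARKER) -> bool:
    """Return True if any log line contains the ready marker."""
    return any(marker in line for line in logs.splitlines())


def wait_until_ready(
    get_logs: Callable[[], str],
    timeout: float = 120.0,
    interval: float = 2.0,
) -> bool:
    """Poll get_logs() until the marker appears or timeout seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if server_is_ready(get_logs()):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Hypothetical usage, assuming `model` is a remotebmi.BmiClientDocker
# created with delay=120 as suggested above:
#   ready = wait_until_ready(lambda: model.logs)   # or model.logs() if it is a method
```

The check still runs at least once when timeout is 0, so it can also be used as a one-off test after the delay has passed.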
