Remove cuda based images #903
makes sense to me -- that's what I've been doing for my needs, e.g. building on top of the jupyterhub cuda images (https://github.com/boettiger-lab/k8s/blob/main/images/Dockerfile.gpu#L1 is my current gpu setup).
@eitsupi I'm thinking I'll drop a JupyterHub-based image into the old https://github.com/rocker-org/ml repo.
Thanks, that might make sense. However, when we look here, there are multiple images for ML use. Which one is agreed to be the base image? Since it is not practical to cover all of these, I imagine it would probably be easiest to provide documentation and sample Dockerfiles explaining how to install R and RStudio on these images.
Yes, great points, thanks for raising these issues! I'll document these things, and I don't intend to cover all of those images. You've probably noticed that actually quite few of those images include the NVIDIA CUDA libraries.

I do intend to provide a pre-built image with my recommended configuration as well, which will use the CUDA image on the latest Ubuntu, as I indicated above. Jupyter's tensorflow image only provides a cuda `latest` tag, while versioned tags exist only for their pytorch base image. In my experience, and in surveys I have seen from colleagues at computing centers, pytorch is far more widely used at this time. So while I agree with you that the full set of images discussed there looks intimidating to users, I think the choice I indicated above is the right one, and I completely agree that we want to document how to customize it.

Given the recent introduction of JupyterHub's Fancy Profiles, which can build directly from a Dockerfile, it is easier than ever to bring your own Dockerfile (a natural pattern for codespaces and gitlab use as well). There are obviously a lot of ways to set these things up, and, just as Rocker has always done, the rocker/ml repo will show just one opinionated way to go about it rather than something comprehensive or overly flexible; experienced users will always be able to adapt it (e.g. I will go with Dirk's …).

I know you've grown wary of all the python and cuda issues over here, so it sounds like addressing these in a different repo would be helpful too. For simplicity, the ml/cuda image will not attempt the strong versioning promises we try to keep here.
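To make the "build on top of the Jupyter cuda images" pattern concrete, here is a minimal Dockerfile sketch. The base image tag, package list, and kernel-registration step are illustrative assumptions on my part, not the final rocker/ml recipe:

```dockerfile
# Sketch only: the base tag and package choices are illustrative.
FROM quay.io/jupyter/pytorch-notebook:cuda12-latest

USER root
# Install R and a few common system libraries via apt (Ubuntu base assumed)
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
      r-base r-base-dev libcurl4-openssl-dev libssl-dev && \
    rm -rf /var/lib/apt/lists/*

# Drop back to the notebook user defined by the Jupyter stacks
USER ${NB_UID}
# Register R as a Jupyter kernel so it appears alongside Python
RUN R -q -e 'install.packages("IRkernel", repos = "https://cloud.r-project.org"); IRkernel::installspec()'
```

A user wanting RStudio instead of (or alongside) the R kernel would extend this further; the point is that the delta on top of the Jupyter image stays small and transparent.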
As a user, I find pre-built images easier to work with than a base image plus a virtual environment manager. IMHO containers [like the ones here] …
You could also use b-data's/my CUDA-based JupyterLab R docker stack.
Most people are simply building on existing Rocker or Jupyter images.
Thanks @benz0li! Your work is excellent as well. And yes, I totally get where you're coming from on containers vs virtual envs. I think that's definitely true for 'production containers', but it is perhaps a bit different for these 'dev containers', where the goal is to support an end user customizing the environment further using patterns with which they are already familiar. However, as you already know, the official jupyter stacks are conda based, the python geospatial pangeo community is deeply conda based, and users know and expect conda. Hence the design I proposed above: it provides a concise Dockerfile that transparently extends the base Jupyter cuda image. Python installs are handled by conda, while R installs are handled by Dirk's excellent …

As I noted above, this is certainly an opinionated setup, a bit different from existing setups but closely aligned with the official Jupyter images. I've tested it in a range of classroom and research settings over the past year or so, alongside the other images discussed above. Moreover, I think this provides a good way forward to maintain some cuda options in a separate repo in the rocker project while avoiding the headaches @eitsupi noted at the top. Big thanks to you both!
Yes. That was one reason I created my own docker stacks. Other reasons: Rocker images' use of s6-overlay and Jupyter images' handling of the user's home directory. People may install Conda / Mamba at user level.

Both the version-stable Rocker images and the Jupyter Docker Stacks are very popular, and @eitsupi as well as @mathbunnyru do a great job improving and maintaining them.
Regarding dev containers: see my (CUDA-based) Data Science dev containers (I am trying to serve a larger community with a unified setup).
I am fed up with the number of questions about cuda and Python setup and with the maintenance hassle, and I strongly believe that users should install whatever version of Python they want using uv, on top of whichever cuda image they want to use (and then use rig to install and use any version of R).
The situation has changed dramatically from a few years ago, when there was neither rig nor uv, and I think the significance of the old style of pre-built image is declining.
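A minimal sketch of that uv + rig pattern on a plain CUDA base image. The base tag, the Python and R versions, and the installer paths are illustrative assumptions; the install commands follow each tool's documented one-line installer:

```dockerfile
# Sketch only: tags, versions, and paths are illustrative.
FROM nvidia/cuda:12.4.1-cudnn-runtime-ubuntu22.04

RUN apt-get update && \
    apt-get install -y --no-install-recommends curl ca-certificates && \
    rm -rf /var/lib/apt/lists/*

# uv: let the user pin any Python version they want
RUN curl -LsSf https://astral.sh/uv/install.sh | sh && \
    ~/.local/bin/uv python install 3.12

# rig: likewise for R
RUN curl -Ls https://github.com/r-lib/rig/releases/download/latest/rig-linux-latest.tar.gz | \
    tar xz -C /usr/local && \
    rig add 4.4
```

The appeal of this layout is that the cuda layer, the Python version, and the R version are each chosen independently, so there is no matrix of pre-built tags to maintain.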
@cboettig @noamross Thoughts?