Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump 2.7.0, Python 3.10, and libprotobuf 3.19 #189

Merged
merged 7 commits into from
Feb 6, 2022

Conversation

hmaarrfk
Copy link
Contributor

@hmaarrfk hmaarrfk commented Dec 5, 2021

TODO:

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I was trying to look for recipes to lint for you, but it appears we have a merge conflict.
Please try to merge or rebase with the base branch to resolve this conflict.

Please ping the 'conda-forge/core' team (using the @ notation in a comment) if you believe this is a bug.

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@hmaarrfk
Copy link
Contributor Author

hmaarrfk commented Dec 6, 2021

I disabled python 3.10 so we can focus on the real failures.

@hmaarrfk
Copy link
Contributor Author

hmaarrfk commented Dec 6, 2021

I feel like for some reason the sys/stat.h file isn't included in the check.

I'm kinda tempted to define HAVE_SYS_STAT_H somewhere in the build.sh file

@hmaarrfk
Copy link
Contributor Author

hmaarrfk commented Dec 6, 2021

fingers crossed.

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

I do have some suggestions for making it better though...

For recipe:

  • It looks like the 'tensorflow-base' output doesn't have any tests.

@hmaarrfk
Copy link
Contributor Author

@conda-forge-admin please rerender

@hmaarrfk
Copy link
Contributor Author

Wheel is not really a requirement: tensorflow/tensorflow#53308

@hmaarrfk hmaarrfk force-pushed the bump_2.7.0 branch 2 times, most recently from a064433 to e22910e Compare December 29, 2021 08:47
@hmaarrfk hmaarrfk changed the title WIP: Bump 2.7.0 WIP: Bump 2.7.0, Python 3.10, and libprotobuf 3.19 Dec 29, 2021
@hmaarrfk
Copy link
Contributor Author

@h-vetinari sorry for squashing all your commits, I was just cleaning up the PR to make it more manageable.

I have good feelings about the current state.

Do comment on anything you see that is strange in the feedstock.

@ngam
Copy link
Contributor

ngam commented Feb 6, 2022

especially for a very popular package like tensorflow

Again, we have a CPU option for a reason, no? Why would anyone download the -gpu variant if they don't want to use it?

The HPC case should be solved separately

How?

then we shouldn't have cuda stuff at all

Way to throw the baby out with the bathwater

What I mean here is: Most systems with GPUs will already have cudatoolkit, etc. installed, but we still provide them anyway so that the users can have a fully contained and reproducible and customizable environment.

@hmaarrfk
Copy link
Contributor Author

hmaarrfk commented Feb 6, 2022

For your HPC application, I believe you should use

https://github.com/conda/conda/blob/d50ce102a5ce5fb006d69761ca8c9d4e870ab250/conda/base/context.py#L855

18:44 $ conda info

     active environment : dev
    active env location : /home/mark/mambaforge/envs/dev
            shell level : 2
       user config file : /home/mark/.condarc
 populated config files : /home/mark/mambaforge/.condarc
                          /home/mark/.condarc
          conda version : 4.11.0
    conda-build version : 3.21.8
         python version : 3.9.9.final.0
       virtual packages : __linux=5.13.0=0
                          __glibc=2.34=0
                          __unix=0=0
                          __archspec=1=x86_64
       base environment : /home/mark/mambaforge  (writable)
      conda av data dir : /home/mark/mambaforge/etc/conda
  conda av metadata url : None
           channel URLs : https://conda.anaconda.org/mark.harfouche/linux-64
                          https://conda.anaconda.org/mark.harfouche/noarch
                          https://conda.anaconda.org/conda-forge/linux-64
                          https://conda.anaconda.org/conda-forge/noarch
          package cache : /home/mark/mambaforge/pkgs
                          /home/mark/.conda/pkgs
       envs directories : /home/mark/mambaforge/envs
                          /home/mark/.conda/envs
               platform : linux-64
             user-agent : conda/4.11.0 requests/2.27.1 CPython/3.9.9 Linux/5.13.0-28-generic ubuntu/21.10 glibc/2.34
                UID:GID : 1000:1000
             netrc file : None
           offline mode : False

18:44 $ CONDA_OVERRIDE_CUDA=11.2 conda info

     active environment : dev
    active env location : /home/mark/mambaforge/envs/dev
            shell level : 2
       user config file : /home/mark/.condarc
 populated config files : /home/mark/mambaforge/.condarc
                          /home/mark/.condarc
          conda version : 4.11.0
    conda-build version : 3.21.8
         python version : 3.9.9.final.0
       virtual packages : __cuda=11.2=0
                          __linux=5.13.0=0
                          __glibc=2.34=0
                          __unix=0=0
                          __archspec=1=x86_64
       base environment : /home/mark/mambaforge  (writable)
      conda av data dir : /home/mark/mambaforge/etc/conda
  conda av metadata url : None
           channel URLs : https://conda.anaconda.org/mark.harfouche/linux-64
                          https://conda.anaconda.org/mark.harfouche/noarch
                          https://conda.anaconda.org/conda-forge/linux-64
                          https://conda.anaconda.org/conda-forge/noarch
          package cache : /home/mark/mambaforge/pkgs
                          /home/mark/.conda/pkgs
       envs directories : /home/mark/mambaforge/envs
                          /home/mark/.conda/envs
               platform : linux-64
             user-agent : conda/4.11.0 requests/2.27.1 CPython/3.9.9 Linux/5.13.0-28-generic ubuntu/21.10 glibc/2.34
                UID:GID : 1000:1000
             netrc file : None
           offline mode : False

@hmaarrfk hmaarrfk deleted the bump_2.7.0 branch February 6, 2022 23:46
@h-vetinari
Copy link
Member

The HPC case should be solved separately

How?

I was thinking of an override for the virtual package, but apparently that's done already:

CONDA_OVERRIDE_CUDA=11.2

Thanks @hmaarrfk!

@ngam
Copy link
Contributor

ngam commented Feb 6, 2022

I'm just one voice, but I vote to keep it

@hmaarrfk do you agree with this being the default and that people should use the override on an HPC, etc.? (Thanks for the info btw and the link, I always learn from you both!)

@h-vetinari
Copy link
Member

Again, we have a CPU option for a reason, no?

Those exist as a compatibility layer to other channels (I speak of tensorflow-cpu not the CPU build). Conda(-forge) encourages usage of plain tensorflow so that the CPU/GPU split can be abstracted away for a package using tensorflow that doesn't care about the backend - in order that the dependency handling is unified, yet people get what's appropriate for each individual installation - __cuda plays a key role in making that determination.

I see your point about the intent of -gpu, but since tensorflow-gpu wraps tensorflow-base =*=cuda (where the __cuda dependency is set), I don't think we can remove that dependency from the wrapper.

@h-vetinari
Copy link
Member

h-vetinari commented Feb 6, 2022

but since tensorflow-gpu wraps tensorflow-base =*=cuda (where the __cuda dependency is set), I don't think we can remove that dependency from the wrapper.

Scratch that; if we want to (and it's an "if" worth discussing...), I think we could move __cuda from tensorflow-base to tensorflow, and not add it for tensorflow-gpu.

@ngam
Copy link
Contributor

ngam commented Feb 6, 2022

but since tensorflow-gpu wraps tensorflow-base =*=cuda (where the __cuda dependency is set), I don't think we can remove that dependency from the wrapper.

Scratch that; if we want to (and it's an "if" worth discussing...), I think we could use __cuda from tensorflow-base to tensorflow, and not add it for tensorflow-gpu.

Okay, I like this latter part! Thanks for understanding and entertaining the protest :) Let's please have it such that when someone goes for the cuda builds specifically, they can actually get the package

$ mamba create -q -n test tensorflow-gpu==2.7.0
Encountered problems while solving:
  - nothing provides __cuda needed by tensorflow-2.7.0-cuda102py310hcf4adbc_0

@h-vetinari
Copy link
Member

h-vetinari commented Feb 7, 2022

Let's please have it such that when someone goes for the cuda builds specifically, they can actually get the package

I'd be OK with that. We can add it to the pile of things to do for the next PR.

Thanks for understanding and entertaining the protest :)

Happy to see we found common ground. Still, I'll allow myself to note that the chance of this would be higher next time (talking only about: with me) with a less... presumptuous... tone. 🙃

@isuruf
Copy link
Member

isuruf commented Feb 7, 2022

Scratch that; if we want to (and it's an "if" worth discussing...), I think we could move __cuda from tensorflow-base to tensorflow, and not add it for tensorflow-gpu.

That's not going to work, since some downstream packages rely on tensorflow.

@isuruf
Copy link
Member

isuruf commented Feb 7, 2022

I suggest reading up on the PRs that added CONDA_OVERRIDE_CUDA for the rationale on why it was done that way
instead of immediately assuming that it was done wrong.

@h-vetinari
Copy link
Member

That's not going to work, since some downstream packages rely on tensorflow.

Do you mean tensorflow-base? tensorflow itself would still have the __cuda dependency.

@isuruf
Copy link
Member

isuruf commented Feb 7, 2022

Nope. I meant tensorflow. That's why I said it wouldn't work.
Also, just asking for tensorflow-gpu doesn't give the user the best possible variant because we don't know if the user has cuda 10.2 or 11.0 or 11.0 or 11.2.

@h-vetinari
Copy link
Member

h-vetinari commented Feb 7, 2022

Nope. I meant tensorflow. That's why I said it wouldn't work.

To be clear, I'm at best lukewarm on this change, but I don't understand how anything would break for tensorflow consumers if we moved

        - __cuda  # [cuda_compiler_version != "None"]

from tensorflow-base to tensorflow.

because we don't know if the user has cuda 10.2 or 11.0 or 11.0 or 11.2.

That however sounds like it seals the deal, and that people without a detectable GPU will need CONDA_OVERRIDE_CUDA.

@isuruf
Copy link
Member

isuruf commented Feb 7, 2022

To be clear, I'm at best lukewarm on this change, but I don't understand how anything would break for tensorflow consumers if we moved

If you ask for conda install tensorflow-base=*cuda110* downstream_package, you'll still get the error of __cuda not being found. So, we haven't improved the situation at all.

@h-vetinari
Copy link
Member

Yes OK, but now we're talking about tensorflow-base as a consumer again (which I consider something of an implementation detail to break the circular dependencies here, and hence not something that people should depend on directly).

But in any case, I agree with you that moving __cuda somewhere doesn't help because then -gpu would be severely hampered.

@ngam
Copy link
Contributor

ngam commented Feb 7, 2022

For @isuruf: From general experience in the conda-forge ecosystem, do we think of HPC as a special case or more like one of the main use cases of conda-forge? Isn't that kind of why we maintain older glibc, cos6, etc. --- because many of us (users) are actually at institutions that never update these things?

I am trying to find more info on this, but it's been long my experience that HPC compute nodes don't have internet access and you'd often need to prepare an environment on a login node (with relatively different setup, but they try to make them as similar as possible). So in this case, without the override, it is relatively difficult to get tensorflow working properly for an average user. In fact, I would argue we are likely breaking people's setups with this change.

(I saw a conversation between you two discussing something similar and I believe isuruf's point was that we maintain such a common denominator because a lot of computational scientists are at institutions with really old stuff... I will try to find the exact link. Update: here it is: conda-forge/conda-forge.github.io#1436)

@ngam
Copy link
Contributor

ngam commented Feb 7, 2022

with a less... presumptuous... tone

Sorry, that wasn't the intention. However, I still maintain that this particular change wasn't a good idea. I really don't think the logic adds up --- but I am happy to listen and be accommodating here. The reason I say the logic doesn't add up is simply that we offer crazy packages (e.g. cudatoolkit, the mother of all cuda stuff) that doesn't have the __cuda constraint. So I really don't understand the storage savings we are making here and it doesn't really make sense in the context of customizability, reproducibility, and relocatability that are some of the most useful and powerful features of conda/conda-forge. I will gladly acknowledge that for an average user (e.g. myself a few months ago) I always ended up with pytorch and tensorflow from conda-forge that are cuda-endowed on a cpu machine just because I asked for pytorch and tensorflow, even though I didn't need the -gpu variants. So I understand we have a problem here that needs to be resolved; I just don't see this being a good resolution or a realistic path forward.

@ngam
Copy link
Contributor

ngam commented Feb 7, 2022

Fwiw: the tf2.x convention seems to be that a package without a variant specification (e.g. "tensorflow") contains both the tensorflow-cpu and tensorflow-gpu compilation, whereas the tensorflow-cpu is the one lacking the gpu features.

For more:

@hmaarrfk
Copy link
Contributor Author

hmaarrfk commented Feb 7, 2022

Please let's stop this conversation in this thread.

System dependency management is a comma wide thing, and we won't solve it here.

Please open an issue either with conda (conda/conda) or conda forge wide.

@ngam
Copy link
Contributor

ngam commented Feb 7, 2022

The conversation is only tangentially about the system-wide implications. This conversation is about the breaking change introduced in this PR and how it results in problems (without override). I have no interest in pursuing this any further, so I won't open an issue about it myself. I just think it should be addressed in this feedstock at some point...

@h-vetinari
Copy link
Member

that we offer crazy packages (e.g. cudatoolkit, the mother of all cuda stuff) that doesn't have the __cuda constraint.

Such a constraint has been there ever since conda-forge published its own cudatoolkit builds. That it's only formulated as a run_constrained (applicable-if-package-is-present) and not as a runtime-dependence is primarily (AFAICT) because we need to build GPU packages in a CI without GPUs.

This conversation is about the breaking change introduced in this PR and how it results in problems (without override). [...] I just think it should be addressed in this feedstock at some point...

The CONDA_OVERRIDE_CUDA approach unbreaks the very specific cases where something was broken, and so far it looks like the emergence of broad agreement that this is not just sufficient, but the solution (as it pertains to this feedstock). Anything beyond that is then a larger issue that should be discussed elsewhere, as @hmaarrfk noted.

@ngam
Copy link
Contributor

ngam commented Feb 8, 2022

But that's a REALLY bad idea!

No, it isn't.

@h-vetinari, sorry, you are right. I was wrong. This is a good idea with a minor caveat for some use cases. SORRY! Now hopefully it is clearer (through documentation) to anyone who would think like I did...

@h-vetinari
Copy link
Member

All good, I'm happy we turned this discussion into something positive - thanks for raising those PRs.

@h-vetinari
Copy link
Member

h-vetinari commented Feb 10, 2022

@maresb: TF2.8 was just now released on PyPI with 3.10 support. 🚀

Came into this thread again looking for something different, but just to note: conda-forge tf 2.7 already has builds for python 3.10.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants