Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tensorflow 2.7.0 (+ other improvements) #176

Closed
wants to merge 11 commits into from

Conversation

h-vetinari
Copy link
Member

@h-vetinari h-vetinari commented Nov 22, 2021

I hope that people aren't being stressed out by me opening this up right after #144 was merged; but given our iteration speed, I wanted to get a head start on this - we also have a new dependency that's not yet in conda-forge, so things are going to take a bit anyway.

Relative to the list in #175 (as of now), the only part that's not done yet is removing the bazel-workarounds from #144.

Shrunk down todo-list:

Closes #162 (hopefully!)
Closes #151

@conda-forge-linter
Copy link

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@h-vetinari
Copy link
Member Author

h-vetinari commented Nov 22, 2021

The diff is pretty busted due to the shuffle with tensorflow-base, but --patience (unfortunately not available in github) yields a much better result:

>git show ac888473b051c91db720ff2ba05ab344b1f8809e --patience
commit ac888473b051c91db720ff2ba05ab344b1f8809e
Author: H. Vetinari <[email protected]>
Date:   Mon Nov 22 23:27:24 2021 +1100

    remove output tensorflow-base; now obsolete since tf-estimator is built here

diff --git a/recipe/meta.yaml b/recipe/meta.yaml
index add4358..7ba8de1 100644
--- a/recipe/meta.yaml
+++ b/recipe/meta.yaml
@@ -97,12 +97,18 @@ requirements:
     - keras >=2.7,<2.8

 outputs:
-  - name: tensorflow-base
+  # as of tensorflow 2.7.0, we don't need tensorflow-base anymore;
+  # we still build it as a compatibility output (see below), but the
+  # main package is being built in the "tensorflow" output directly.
+  - name: tensorflow
     script: build_pkg.sh  # [not win]
     script: build_pkg.bat  # [win]
     build:
       string: cuda{{ cuda_compiler_version | replace('.', '') }}py{{ CONDA_PY }}h{{ PKG_HASH }}_{{ PKG_BUILDNUM }}  # [cuda_compiler_version != "None"]
       string: cpu_py{{ CONDA_PY }}h{{ PKG_HASH }}_{{ PKG_BUILDNUM }}  # [cuda_compiler_version == "None"]
+      # weigh down cpu implementation and give cuda preference
+      track_features:
+        - tensorflow-cpu          # [cuda_compiler_version == "None"]
       entry_points:
         - toco_from_protos = tensorflow.lite.toco.python.toco_from_protos:main
         - tflite_convert = tensorflow.lite.python.tflite_convert:main
@@ -223,55 +229,16 @@ outputs:
         # all packages.
         # https://github.com/conda-forge/conda-forge.github.io/issues/1528
         - openssl
-        - {{ pin_subpackage('tensorflow-base', exact=True) }}
+        - {{ pin_subpackage('tensorflow', exact=True) }}
       run:
         - python
-        - {{ pin_subpackage('tensorflow-base', exact=True) }}
+        - {{ pin_subpackage('tensorflow', exact=True) }}
     test:
       requires:
         - pip
       commands:
         - pip check

-
-  - name: tensorflow
-    build:
-      string: cuda{{ cuda_compiler_version | replace('.', '') }}py{{ CONDA_PY }}h{{ PKG_HASH }}_{{ PKG_BUILDNUM }}  # [cuda_compiler_version != "None"]
-      string: cpu_py{{ CONDA_PY }}h{{ PKG_HASH }}_{{ PKG_BUILDNUM }}  # [cuda_compiler_version == "None"]
-
-      # weigh down cpu implementation and give cuda preference
-      track_features:
-        - tensorflow-cpu          # [cuda_compiler_version == "None"]
-
-    requirements:
-      build:
-        # Keep the other compilers here so as to help solve for the
-        # required version of libc
-        - {{ compiler('c') }}
-        - {{ compiler('cxx') }}
-        # Keep the cuda compiler here since it helps package solvers
-        # decide on the cuda variant
-        # https://github.com/conda-forge/tensorflow-feedstock/issues/162
-        - {{ compiler('cuda') }}  # [cuda_compiler_version != "None"]
-      host:
-        - python
-        # This ensures that a consistent version of openssl is chosen between
-        # all packages.
-        # https://github.com/conda-forge/conda-forge.github.io/issues/1528
-        - openssl
-        - {{ pin_subpackage('tensorflow-base', exact=True) }}
-      run:
-        - python
-        - {{ pin_subpackage('tensorflow-estimator', exact=True) }}
-        - {{ pin_subpackage('tensorflow-base', exact=True) }}
-    test:
-      requires:
-        - pip
-      imports:
-        - tensorflow
-      commands:
-        - pip check
-
   - name: libtensorflow
     script: cp_libtensorflow.sh
     build:

@h-vetinari h-vetinari mentioned this pull request Nov 23, 2021
12 tasks
@h-vetinari h-vetinari force-pushed the bump branch 2 times, most recently from a5df2bc to 1b61331 Compare November 23, 2021 11:05
@ngam
Copy link
Contributor

ngam commented Nov 24, 2021

@h-vetinari is anyone actively working on the tensorflow-io-gcs-filesystem pkg? It looked easy when I came across it, so I could try to help by giving it a go... or maybe you want to keep it within here?

@hmaarrfk
Copy link
Contributor

might as well give it a go.

It doesn't appear like anybody has made a PR yet:
https://github.com/conda-forge/staged-recipes/pulls?q=is%3Apr+is%3Aopen++tensorflow-io-gcs-filesystem

@ngam
Copy link
Contributor

ngam commented Nov 24, 2021

Okay, will try to do so soon. Desirably, I should follow your build strategy here, right? So that it all follows the same logic...

@hmaarrfk
Copy link
Contributor

Not sure if there is a build strategy to follow. Try to get the CPU version up first. Then we can work on GPU migration.

@h-vetinari
Copy link
Member Author

Sounds great @ngam! Just link it here once you have a PR and people who want to can provide feedback.

@ngam
Copy link
Contributor

ngam commented Nov 25, 2021

conda-forge/staged-recipes#17069

I am new to all of this business and so if something looks weird, you're likely right --- keen to hear any and all feedback :) and of course if you think I am slowing things down, you can ignore my pr

I am simply trying to follow their dev doc re building... and debugging against the tensorflow-feedstock...

@ngam
Copy link
Contributor

ngam commented Nov 25, 2021

As part of this pr, can someone look into this?

--define=PROTOBUF_INCLUDE_PATH=${PREFIX}/include

I think it should be --define=PROTOBUF_INCLUDE=${PREFIX}/include (not --define=PROTOBUF_INCLUDE_PATH=${PREFIX}/include)

@njzjz
Copy link
Member

njzjz commented Nov 25, 2021

export TF_SYSTEM_LIBS="
absl_py
astor_archive
astunparse_archive
boringssl
com_github_googlecloudplatform_google_cloud_cpp
com_github_grpc_grpc
com_google_protobuf
curl
cython
dill_archive
flatbuffers
gast_archive
gif
icu
libjpeg_turbo
org_sqlite
png
pybind11
snappy
zlib
"

Here I think we need to add nsync, as it's in the list of requirements:

@hmaarrfk
Copy link
Contributor

@ngam it seems that PROTOCOL_INCLUDE_PATH is correct:
https://github.com/tensorflow/tensorflow/blob/master/.bazelrc#L313

@ngam
Copy link
Contributor

ngam commented Nov 25, 2021

@ngam it seems that PROTOCOL_INCLUDE_PATH is correct:
https://github.com/tensorflow/tensorflow/blob/master/.bazelrc#L313

Thanks for checking; this confirms it. Weirdly, the io repo doesn't seem to have anything like this and it's building the bazelrc in the build steps, it seems.

However, to the point, when I tried to invoke --define=PROTOBUF_INCLUDE=${PREFIX}/include without _PATH the build goes further on a local linux64 (otherwise it fails saying something like /google/protobuf/... not found or something).

Anyway, I underestimated how difficult it is going to be to build the io thingy --- it will likely make more sense to build it as a split here because it depends on tensorflow and the whole thing is confusing (also, in their build docs, they say centos7 defaults are no good and so must be updated??? (see here) maybe that's why I am having much better luck --- while still failing --- locally on an ubuntu 20.04). I will keep trying and see what where I get... stay tuned.

@@ -79,33 +75,40 @@ requirements:
- zlib
# requirements specified by the package itself, see
# github.com/tensorflow/tensorflow/blob/v{{ version }}/tensorflow/tools/pip_package/setup.py
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we should add the new requirement libclang >= 9.0.1

See https://github.com/tensorflow/tensorflow/blob/c256c071bb26e1e13b4666d1b3e229e110bc914a/tensorflow/tools/pip_package/setup.py#L85

However, I think conda-forge's libclang package is not the same thing as pypi's libclang package.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we shouldn't match this particular requirement, because the compilers/runtimes/libs are taken care of much better by conda than by pip anyway.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then we may add a patch to setup.py, otherwise pip check will fail.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@h-vetinari
Copy link
Member Author

Here I think we need to add nsync, as it's in the list of requirements:

It's possible that we don't need TF_SYSTEM_LIBS in the build script at all anymore? At least, reading the upstream definition and how it's used, It just checks that the right versions are installed.

@h-vetinari
Copy link
Member Author

It's possible that we don't need TF_SYSTEM_LIBS in the build script at all anymore?

That was stupid(ly naïve), but I added a clearer comment about the situation.

recipe/meta.yaml Outdated
@@ -304,7 +297,9 @@ outputs:
- snappy
- sqlite
- zlib
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

opencl is missing here

@hmaarrfk
Copy link
Contributor

hmaarrfk commented Dec 5, 2021

I don't understand why this is blocked on tensorflow_io (and related depenency)

They are pretty explicit that it isn't a hard requirement (yet)
tensorflow/tensorflow#51460

We should be able to move forward without this for now.

- gast ==0.4.0
- python-flatbuffers >=1.12,<3
- six >=1.12
- tensorflow-io-gcs-filesystem >=0.21.0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really do not think this is a dependency.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for checking upstream, I did not see that this was conditional.

However, I'm almost tempted to open an issue upstream about the circularity with tensorflow-io (if they make it unconditional, as announced).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should definitely open an issue to raise your concerns.

@hmaarrfk
Copy link
Contributor

hmaarrfk commented Dec 6, 2021

The thing with tensorflow-base was that it did help with the cyclic dependency on tensorflow-estimator. @h-vetinari maybe you can bring up the issue of cyclical dependency to tensorflow from a general package management standpoint.

For now, I think we should keep tensorflow-base as a "behind the scenes" package to help building.

@izahn
Copy link

izahn commented Dec 6, 2021

The thing with tensorflow-base was that it did help with the cyclic dependency on tensorflow-estimator. @h-vetinari maybe you can bring up the issue of cyclical dependency to tensorflow from a general package management standpoint.

For now, I think we should keep tensorflow-base as a "behind the scenes" package to help building.

Maybe OT here, but why does conda have a problem with packages depending on each other? Would it be difficult to fix conda so that it allows packages that have run time dependencies on each other?

@hmaarrfk
Copy link
Contributor

hmaarrfk commented Dec 6, 2021

@izahn the pacakges need to be installed at "build" time in the environment so that they are available for compilation.

The way that tensorflow is doing it now, is that they:

  1. They create a new dependency.
  2. Tensorflow does not depend on this dependency.
  3. That dependency depends on Tensorflow Version N
  4. They compile the new dependency.
  5. The next version of tensorflow, depends on the new dependency.

The problem is, that you now need to build all tensorflow from version "0" to build new dependencies.

In short, the runtime dependency doesn't matter so much, as the compile time dependnecy cycle.

@izahn
Copy link

izahn commented Dec 6, 2021

In short, the runtime dependency doesn't matter so much, as the compile time dependnecy cycle.

But I don't think tensorflow actually has a build-time dependency on tensorlow-estimator, only a run-time dependency. Conda however won't let you have mutual run-time dependencies as far as I can tell.

@hmaarrfk
Copy link
Contributor

hmaarrfk commented Dec 6, 2021

hmm, let me see if i can remove the build time dependency on tensorflow that might alleviate the issue.

@h-vetinari
Copy link
Member Author

The thing with tensorflow-base was that it did help with the cyclic dependency on tensorflow-estimator.

I'm happy to keep it if it's necessary, or even just if it makes our lives easier. I was only acting on the statement previously that it should be possible to remove now that tf-estimator is being built here. I'd be happy if that was the case (e.g. not a build dep), but it's not a crucial point for me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Help mamba decide on CUDA variant Package pinnings
6 participants