Can't get running on Ubuntu 20.04 #4

Open
tkschuler opened this issue Feb 3, 2023 · 0 comments

I have had a lot of trouble installing this package and getting it to run. After several attempts, I reached the point below, where the package at least installs without errors:

The most important thing was to get the right versions of JAX and TensorFlow installed.

I am running cuda-nvcc 12.0.140 and cudatoolkit 11.8.0 with cuDNN 8.4.1.50.

  1. Install a GPU-enabled TensorFlow conda environment with cudatoolkit>=11.4 and cudnn>=8.2 (needed for JAX support) before trying to install the BLE.
    I have verified this is working with python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"
  2. The 'flax' module no longer provides 'optim', so downgrade:
    python3.9 -m pip uninstall flax
    python3.9 -m pip install flax==0.5.3
  3. Install ble without acme.
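
To make the environment easier to compare against a working setup, the versions that ended up installed after the steps above could be recorded with something like the following. This is only a minimal sketch: the distribution names in PACKAGES are assumptions based on the packages mentioned here, and installed_versions is a hypothetical helper, not part of the BLE.

```python
from importlib import metadata

# Distribution names are assumptions based on the packages mentioned above;
# adjust them if pip lists the packages under different names.
PACKAGES = ["jax", "jaxlib", "flax", "tensorflow", "balloon_learning_environment"]


def installed_versions(names):
    """Return {name: version string or None} without importing the packages."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Because it reads package metadata rather than importing anything, it should still run when one of the imports itself is broken.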

Now I can't run the benchmark example. I get this error:
ImportError: /home/schuler/anaconda3/envs/tf/lib/python3.9/site-packages/courier/python/libserialization_cc_proto.so: undefined symbol: _ZNK6google8protobuf7Message11GetTypeNameEv

Nor can I create the balloon environment. I get this error:
>>> env = balloon_env.BalloonEnv()
2023-02-03 13:25:28.522181: E external/org_tensorflow/tensorflow/compiler/xla/pjrt/pjrt_stream_executor_client.cc:2389] Execution of replica 0 failed: INTERNAL: Failed to execute XLA Runtime executable: run time error: custom call 'xla.gpu.custom_call' failed: jaxlib/gpu/prng_kernels.cc:33: operation gpuGetLastError() failed: out of memory.
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/schuler/anaconda3/envs/tf/lib/python3.9/site-packages/gin/config.py", line 1605, in gin_wrapper
    utils.augment_exception_message_and_reraise(e, err_str)
  File "/home/schuler/anaconda3/envs/tf/lib/python3.9/site-packages/gin/utils.py", line 41, in augment_exception_message_and_reraise
    raise proxy.with_traceback(exception.__traceback__) from None
  File "/home/schuler/anaconda3/envs/tf/lib/python3.9/site-packages/gin/config.py", line 1582, in gin_wrapper
    return fn(*new_args, **new_kwargs)
  File "/home/schuler/anaconda3/envs/tf/lib/python3.9/site-packages/balloon_learning_environment/env/balloon_env.py", line 145, in __init__
    self.arena = balloon_arena.BalloonArena(self._feature_constructor_factory,
  File "/home/schuler/anaconda3/envs/tf/lib/python3.9/site-packages/balloon_learning_environment/env/balloon_arena.py", line 151, in __init__
    self._atmosphere = standard_atmosphere.Atmosphere(jax.random.PRNGKey(0))
  File "/home/schuler/anaconda3/envs/tf/lib/python3.9/site-packages/balloon_learning_environment/env/balloon/standard_atmosphere.py", line 74, in __init__
    self.reset(key)
  File "/home/schuler/anaconda3/envs/tf/lib/python3.9/site-packages/balloon_learning_environment/env/balloon/standard_atmosphere.py", line 82, in reset
    alpha = jax.random.uniform(key).item()
  File "/home/schuler/anaconda3/envs/tf/lib/python3.9/site-packages/jax/_src/random.py", line 265, in uniform
    return _uniform(key, shape, dtype, minval, maxval) # type: ignore
jaxlib.xla_extension.XlaRuntimeError: INTERNAL: Failed to execute XLA Runtime executable: run time error: custom call 'xla.gpu.custom_call' failed: jaxlib/gpu/prng_kernels.cc:33: operation gpuGetLastError() failed: out of memory.
  In call to configurable 'BalloonEnv' (<class 'balloon_learning_environment.env.balloon_env.BalloonEnv'>)
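
Since the failure is an out-of-memory error on the very first JAX random call, one thing that may be worth ruling out is JAX's default GPU memory preallocation competing with another GPU user such as TensorFlow. The sketch below is a diagnostic under that assumption, not a confirmed fix. XLA_PYTHON_CLIENT_PREALLOCATE and XLA_PYTHON_CLIENT_MEM_FRACTION are JAX's documented allocator settings; configure_jax_gpu_memory is a hypothetical helper name.

```python
import os


def configure_jax_gpu_memory(preallocate=False, mem_fraction=None):
    """Set JAX's GPU allocator environment variables.

    These are read when the JAX GPU backend initializes, so this should be
    called before jax (or anything that imports jax) is imported.
    """
    os.environ["XLA_PYTHON_CLIENT_PREALLOCATE"] = "true" if preallocate else "false"
    if mem_fraction is not None:
        # Fraction of total GPU memory JAX may claim when preallocating.
        os.environ["XLA_PYTHON_CLIENT_MEM_FRACTION"] = str(mem_fraction)


# Turn preallocation off, then retry the failing call in the same session.
configure_jax_gpu_memory(preallocate=False)

# Afterwards (left commented out so the sketch runs without the BLE installed):
# from balloon_learning_environment.env import balloon_env
# env = balloon_env.BalloonEnv()
```

If the environment can be created with preallocation disabled, that would point to GPU memory contention rather than a broken install.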
