[fbgemm_gpu] Set LD_LIBRARY_PATH in install script #2671

Closed
wants to merge 1 commit into from

q10
Contributor

@q10 q10 commented Jan 8, 2025

  • Set LD_LIBRARY_PATH in the fbgemm_gpu install script to fix the issue with fbgemm_gpu being unable to locate libnvrtc.so on start
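
As a rough illustration of the idea (not the change made in this PR, whose diff is not shown here), the sketch below builds an environment in which a library directory comes first in LD_LIBRARY_PATH. The helper name `prepend_ld_library_path` and the example directory are hypothetical. The dynamic loader reads LD_LIBRARY_PATH when a process starts, so the value has to be in place before Python is launched; that is why the sketch returns an environment for a child process instead of mutating the current one.

```python
import os


def prepend_ld_library_path(env, lib_dir):
    """Return a copy of env with lib_dir placed first in LD_LIBRARY_PATH.

    Hypothetical helper for illustration; it does not modify os.environ.
    """
    updated = dict(env)
    current = updated.get("LD_LIBRARY_PATH", "")
    # Drop empty entries (the loader treats them as the current directory)
    # and any existing copy of lib_dir so it is not listed twice.
    parts = [p for p in current.split(os.pathsep) if p and p != lib_dir]
    updated["LD_LIBRARY_PATH"] = os.pathsep.join([lib_dir] + parts)
    return updated


if __name__ == "__main__":
    # Assumed location of the environment's shared libraries.
    child_env = prepend_ld_library_path(os.environ, "/opt/conda/envs/build_binary/lib")
    print(child_env["LD_LIBRARY_PATH"])
```

The returned mapping can be passed as the `env` argument of `subprocess.run` so that the child interpreter resolves shared libraries from the chosen directory first.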

@facebook-github-bot facebook-github-bot added the CLA Signed label Jan 8, 2025
- Set LD_LIBRARY_PATH in the fbgemm_gpu install script to fix the issue with fbgemm_gpu
being unable to locate libnvrtc.so on start
@q10 q10 force-pushed the bm/fix-fbgemm-install branch from 8e93669 to 5f1700c Compare January 8, 2025 19:52
@q10 q10 changed the title [WIP] Fix FBGEMM installation [fbgemm_gpu] Set LD_LIBRARY_PATH in install script Jan 8, 2025
@facebook-github-bot
Contributor

@q10 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Contributor

@TroyGarden has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@TroyGarden
Contributor

thanks for the prompt fix.

@TroyGarden TroyGarden closed this Jan 8, 2025
@TroyGarden TroyGarden reopened this Jan 8, 2025
TroyGarden added a commit to TroyGarden/torchrec that referenced this pull request Jan 22, 2025
Summary:
# context
* addresses the following error when running the GitHub test
```
+++ conda run -n build_binary python -c 'import torch; import fbgemm_gpu; import torchrec'
+++ local cmd=run
+++ case "$cmd" in
+++ __conda_exe run -n build_binary python -c 'import torch; import fbgemm_gpu; import torchrec'
+++ /opt/conda/bin/conda run -n build_binary python -c 'import torch; import fbgemm_gpu; import torchrec'
ERROR:root:Could not load the library 'fbgemm_gpu_tbe_index_select.so': /lib64/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /opt/conda/envs/build_binary/lib/python3.10/site-packages/fbgemm_gpu/fbgemm_gpu_tbe_index_select.so)
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/opt/conda/envs/build_binary/lib/python3.10/site-packages/fbgemm_gpu/__init__.py", line 62, in <module>
    _load_library(f"{library}.so")
  File "/opt/conda/envs/build_binary/lib/python3.10/site-packages/fbgemm_gpu/__init__.py", line 21, in _load_library
    raise error
  File "/opt/conda/envs/build_binary/lib/python3.10/site-packages/fbgemm_gpu/__init__.py", line 17, in _load_library
    torch.ops.load_library(os.path.join(os.path.dirname(__file__), filename))
  File "/opt/conda/envs/build_binary/lib/python3.10/site-packages/torch/_ops.py", line 1357, in load_library
    ctypes.CDLL(path)
  File "/opt/conda/envs/build_binary/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /lib64/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by /opt/conda/envs/build_binary/lib/python3.10/site-packages/fbgemm_gpu/fbgemm_gpu_tbe_index_select.so)
    main()
  File "/home/ec2-user/actions-runner/_work/torchrec/torchrec/test-infra/.github/scripts/run_with_env_secrets.py", line 98, in main
    run_cmd_or_die(f"docker exec -t {container_name} /exec")
  File "/home/ec2-user/actions-runner/_work/torchrec/torchrec/test-infra/.github/scripts/run_with_env_secrets.py", line 39, in run_cmd_or_die
    raise RuntimeError(f"Command {cmd} failed with exit code {exit_code}")
RuntimeError: Command docker exec -t d5cfe23625bf3b1538b808a1344090ae72ff3977990bc1f780c7a46435a384ec /exec failed with exit code 1
```
* the issue was previously fixed for another test by D67949409 ([pytorch#2671](pytorch#2671))
* this diff applies the same fix to the validate_binaries test.
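
One way to confirm this kind of mismatch is to list the `GLIBCXX_*` version tags embedded in each candidate `libstdc++.so.6`. The sketch below is a diagnostic aid only, not part of the fix: it scans the raw bytes of a library for version strings, similar in spirit to running `strings` on the file. The function names are hypothetical, and the paths in the usage section are taken from the error message above.

```python
import os
import re

# Matches version tags such as GLIBCXX_3.4.29; names like GLIBCXX_FORCE_NEW are skipped.
_GLIBCXX = re.compile(rb"GLIBCXX_(\d+(?:\.\d+)*)")


def glibcxx_versions(data):
    """Return the GLIBCXX version numbers found in raw library bytes, sorted numerically."""
    found = {m.group(1).decode("ascii") for m in _GLIBCXX.finditer(data)}
    return sorted(found, key=lambda v: tuple(int(x) for x in v.split(".")))


def provides(path, version):
    """Report whether the library file at path mentions the given GLIBCXX version."""
    with open(path, "rb") as f:
        return version in glibcxx_versions(f.read())


if __name__ == "__main__":
    # The system path appears in the error; the environment path is an assumed location.
    for lib in ("/lib64/libstdc++.so.6",
                "/opt/conda/envs/build_binary/lib/libstdc++.so.6"):
        if os.path.exists(lib):
            print(lib, provides(lib, "3.4.29"))
```

If the system copy lacks the required tag while the environment's copy has it, that is consistent with the loader picking up the wrong library, which is the situation an LD_LIBRARY_PATH adjustment is meant to address.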

# details
* previous failures
{F1974496108}

Differential Revision: D68511145