openmm in container openfe does not work with gpus #1091

Open
TakumiHirao opened this issue Jan 24, 2025 · 2 comments

@TakumiHirao

We are trying to run openfe in a Singularity container.

We obtained the SIF file with the following command:

singularity pull oras://ghcr.io/openfreeenergy/openfe:latest-apptainer

We then tested OpenMM inside the container:

python -m openmm.testInstallation

The output of this command is shown below.

OpenMM Version: 8.1.1
Git Revision: ec797acabe5de4ce9f56c92d349baa889f4b0821
 
There are 3 Platforms available:
 
1 Reference - Successfully computed forces
2 CPU - Successfully computed forces
3 CUDA - Error computing forces with CUDA platform
 
CUDA platform error: Error initializing FFT: 5
 
Median difference in forces between platforms:
 
Reference vs. CPU: 6.2944e-06
 
All differences are within tolerance.

OpenMM detects the CUDA platform, but it fails to compute forces on it.
Within OpenMM, the following exception is raised:

openmm.OpenMMException: Error initializing FFT: 5

Has anyone else run into a similar problem?
We would appreciate any advice.
Our environment is as follows.

OS: Ubuntu 22.04.5 LTS
CUDA driver version: 560.35.03
CUDA Version: V12.6.77
GPU: NVIDIA GeForce RTX 4090
singularity: 3.10.3

Sincerely,
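
For anyone reproducing this, the per-platform lines of the `python -m openmm.testInstallation` report above can be checked programmatically. The following is a minimal sketch that only parses the text format shown in this issue; the function name `platform_status` is hypothetical and not part of OpenMM.

```python
import re

# Matches report lines such as
# "3 CUDA - Error computing forces with CUDA platform".
PLATFORM_LINE = re.compile(r"^\s*(\d+)\s+(\S+)\s+-\s+(.*)$")


def platform_status(report: str) -> dict[str, bool]:
    """Map each platform name to True if it reported a successful force computation."""
    status = {}
    for line in report.splitlines():
        match = PLATFORM_LINE.match(line)
        if match:
            _, name, message = match.groups()
            status[name] = message.startswith("Successfully")
    return status


# Platform lines copied from the report in this issue.
report = """\
1 Reference - Successfully computed forces
2 CPU - Successfully computed forces
3 CUDA - Error computing forces with CUDA platform
"""

print(platform_status(report))
```

With the report from this issue, the CUDA entry comes back as failed while Reference and CPU pass, which matches the behaviour described above.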

@IAlibay
Member

IAlibay commented Feb 1, 2025

Hi @TakumiHirao,

Thanks for raising this issue, and apologies for not getting back to you sooner. I have seen similar FFT error messages on CUDA runs in the past; they are usually down to an incompatibility with certain cudatoolkit versions.

@atravitz @mikemhenry could you have a look at the cudatoolkit version that gets shipped with the Singularity and Docker images? I believe 11.8 might be the cause of some of this weird behaviour.

@TakumiHirao could you possibly give the full output of nvidia-smi? That might help us narrow things down.
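
One possible way to check which CUDA-related packages a container ships is to list the environment's packages as JSON and filter them. The sketch below assumes the container's environment can be listed with something like `conda list --json` (or the equivalent for whichever package manager the image uses), which has not been confirmed in this thread. The helper `cuda_packages` is hypothetical, and the sample listing is an illustrative placeholder, not real output from the openfe image.

```python
import json


def cuda_packages(listing_json: str) -> dict[str, str]:
    """Return {name: version} for packages whose name contains 'cuda'."""
    packages = json.loads(listing_json)
    return {
        pkg["name"]: pkg["version"]
        for pkg in packages
        if "cuda" in pkg["name"].lower()
    }


# Placeholder listing for illustration only. The cudatoolkit version here
# mirrors the version suspected in this thread; it is NOT confirmed output.
sample_listing = json.dumps([
    {"name": "openmm", "version": "8.1.1"},
    {"name": "cudatoolkit", "version": "11.8.0"},
])

print(cuda_packages(sample_listing))
```

Running the real listing command inside the container and passing its output to this function would show the toolkit version that was actually shipped.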

@TakumiHirao
Author

TakumiHirao commented Feb 3, 2025

@IAlibay
Thank you for your reply.
Here is the full output of nvidia-smi:

+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 4090        Off |   00000000:01:00.0 Off |                  Off |
|  0%   35C    P8             11W /  450W |      35MiB /  24564MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A      1544      G   /usr/lib/xorg/Xorg                              8MiB |
|    0   N/A  N/A      1794      G   /usr/bin/gnome-shell                           10MiB |
+-----------------------------------------------------------------------------------------+

Sincerely,
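
When comparing versions, note that the "CUDA Version" field in the `nvidia-smi` header is the highest CUDA version the installed driver supports, not the toolkit version installed inside the container. The following minimal sketch pulls the driver and CUDA versions out of the header line shown above; the name `parse_smi_header` is hypothetical.

```python
import re

# Matches the nvidia-smi banner line, e.g.
# "| NVIDIA-SMI 560.35.03   Driver Version: 560.35.03   CUDA Version: 12.6 |"
HEADER = re.compile(
    r"NVIDIA-SMI\s+(?P<smi>[\d.]+)\s+"
    r"Driver Version:\s+(?P<driver>[\d.]+)\s+"
    r"CUDA Version:\s+(?P<cuda>[\d.]+)"
)


def parse_smi_header(text: str):
    """Return a dict with 'smi', 'driver' and 'cuda' versions, or None if no header is found."""
    match = HEADER.search(text)
    return match.groupdict() if match else None


# Header line copied from the nvidia-smi output in this issue.
header_line = (
    "| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03"
    "      CUDA Version: 12.6     |"
)

print(parse_smi_header(header_line))
```

The value extracted here (12.6) can then be set against whatever cudatoolkit version the container reports.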
