I'm running CUDA 10.1 with the latest versions of TF and PyTorch on a Tesla K80 and a 1080 Ti.
I'm running the stable version (0.1.1 -- I was unable to get the ESPnet version running) with a patched train.py implementing data_parallel_workaround() from master.
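(For anyone unfamiliar with that helper: as far as I understand it, it just splits each batch across the visible GPUs using the torch.nn.parallel primitives. Rough sketch below -- the names and exact signature are mine, not necessarily what master does.)

```python
import torch
from torch.nn.parallel import replicate, scatter, parallel_apply, gather

def data_parallel_workaround(model, *inputs):
    # Assumed sketch only: split the batch across all visible GPUs,
    # run the forward pass on each replica, and gather the outputs
    # back on the first device. Signature is illustrative.
    device_ids = list(range(torch.cuda.device_count()))
    output_device = device_ids[0]
    replicas = replicate(model, device_ids)
    scattered = scatter(inputs, device_ids)
    replicas = replicas[:len(scattered)]
    outputs = parallel_apply(replicas, scattered)
    return gather(outputs, output_device)
```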
The model seems to be training -- but very inefficiently. If I watch GPU usage with nvidia-smi, I see only intermittent GPU-Util spikes, with CPU utilization at about 25% (8 cores @ 4.8 GHz).
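To put numbers on that, I'm planning to drop a crude probe into the training loop to see how long each step waits on the loader versus actually computing -- something like this, where data_loader and train_step are placeholders for whatever train.py builds:

```python
import time
import torch

# Crude probe (placeholder names): how long does each step wait on the
# loader vs. spend computing on the GPU?
load_start = time.time()
for i, batch in enumerate(data_loader):      # placeholder: the existing loader
    load_time = time.time() - load_start

    step_start = time.time()
    loss = train_step(batch)                 # placeholder: the existing train step
    torch.cuda.synchronize()                 # so GPU work is counted in step_time
    step_time = time.time() - step_start

    if i % 50 == 0:
        print(f"step {i}: data wait {load_time:.3f}s, compute {step_time:.3f}s")
    load_start = time.time()
```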
hparams that may be relevant:
# Data loader
pin_memory=True,
num_workers=12,
# Training:
batch_size=12,
Do I just need to dramatically increase num_workers to feed the GPUs more data? GPU temps look fine and the data is on a very fast SSD, so I'm not sure what I'm doing wrong.
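The experiment I have in mind is just sweeping num_workers on the loader and measuring batches/sec -- a sketch only, with dataset and collate_fn standing in for whatever train.py actually constructs:

```python
import time
from torch.utils.data import DataLoader

# Sketch: sweep num_workers and measure batches/sec to find the sweet spot.
# `dataset` and `collate_fn` are placeholders for the objects train.py builds.
for workers in (4, 8, 12, 16, 24):
    loader = DataLoader(dataset,
                        batch_size=12,
                        shuffle=True,
                        num_workers=workers,
                        pin_memory=True,
                        collate_fn=collate_fn)
    start, n = time.time(), 0
    for batch in loader:
        n += 1
        if n == 100:
            break
    print(f"num_workers={workers}: {n / (time.time() - start):.1f} batches/sec")
```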
FWIW, here's what shows up in the python.exe stack: