Error when Running Training Command #28
Comments
The first error is probably because you are not using hydra-core==1.1.0 and hydra-submitit-launcher==1.1.5.
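A quick way to confirm which versions are actually installed in the active environment is sketched below. It only uses the standard-library `importlib.metadata`; the helper name `check_versions` and the `EXPECTED` table are made up for illustration, and the pinned versions are simply the ones suggested in this comment.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions suggested in this thread (taken from the comment above).
EXPECTED = {
    "hydra-core": "1.1.0",
    "hydra-submitit-launcher": "1.1.5",
}


def check_versions(expected, get_version=version):
    """Return {package: installed_version_or_None} for every mismatch."""
    problems = {}
    for name, wanted in expected.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            problems[name] = found
    return problems


if __name__ == "__main__":
    mismatches = check_versions(EXPECTED)
    for name, found in mismatches.items():
        print(f"{name}: expected {EXPECTED[name]}, found {found}")
    if not mismatches:
        print("All pinned packages match.")
```

If a mismatch is reported, reinstalling with `pip install hydra-core==1.1.0 hydra-submitit-launcher==1.1.5` in the same environment should bring it in line.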
@medric49 Ah, ok. I had hydra-core==1.2.0 and hydra-submitit-launcher==1.2.0. I installed the correct versions, and when I ran the training I did not get the override error. I have started the training again, so hopefully I will not get the second error that I listed in my original post. Thanks for the help; I will let you know if I run into any issues.
@medric49 I ran the training command and after about 6 hours, I got this:
| train | F: 916000 | S: 458000 | E: 916 | L: 1000 | R: 461.0975 | BS: 458000 | FPS: 0.7374 | T: 6:45:55
but then I got this error:
RuntimeError: DataLoader worker (pid 320456) is killed by signal: Killed.
Here is the full error. I'm not sure what's causing this error (maybe batch size or out of memory), so I would appreciate any help.
I am not sure what the issue could be, but it looks like the machine ran out of memory, which would explain why your system killed the program.
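One way to test the out-of-memory guess is to log how much RAM is still available while training runs. The sketch below assumes a Linux machine (it reads `/proc/meminfo`); the function names are hypothetical and are not part of this repository.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: integer value}.

    Most fields are reported in kB; the unit suffix is ignored here.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def available_fraction(info):
    """Fraction of total RAM the kernel reports as still available."""
    return info["MemAvailable"] / info["MemTotal"]


def read_available_fraction(path="/proc/meminfo"):
    """Read the live value on Linux (assumption: the file exists)."""
    with open(path) as f:
        return available_fraction(parse_meminfo(f.read()))


# Example use inside a training loop (hypothetical placement):
#     print(f"available RAM: {read_available_fraction():.1%}")
```

If the reported fraction keeps shrinking until just before the worker is killed, memory exhaustion is the likely cause. On Linux, `dmesg` usually also records when the kernel's OOM killer terminates a process.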
@medric49 Ah, ok. Currently, the
Okay.
which is lower than 1000000.
Hello,
When I ran the training command given in the readme
python train.py task=quadruped_walk
I got this error:
File "/home/anavani/anaconda3/lib/python3.9/site-packages/hydra/_internal/defaults_list.py", line 168, in ensure_overrides_used
raise ConfigCompositionException(msg)
hydra.errors.ConfigCompositionException: Could not override 'task'.
Did you mean to override task@_global_?
To append to your default list use +task=quadruped_walk
I changed the command to
python train.py +task=quadruped_walk
and this seemed to fix the issue. However, after I let it train for a bit, I got this error:
It seems as if the +task=quadruped_walk is causing an EOF error, but I'm not sure what is causing the second error. I would really appreciate any help. @denisyarats @Aladoro @desaixie @medric49