File "h5py/h5f.pyx", line 85, in h5py.h5f.open OSError: Unable to open file (file signature not found) #2

Open
chowkamlee81 opened this issue Jan 14, 2019 · 8 comments


@chowkamlee81

Kindly help us solve this issue.
I have tried with both Python 3.5 and Python 2.7 and still get the same error. Please advise.

File "main.py", line 316, in <module>
main()
File "main.py", line 190, in main
train(train_loader, model, criterion, optimizer, epoch, tsbd)
File "main.py", line 222, in train
for i, (img, speed, target, mask) in enumerate(loader):
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 623, in __next__
return self._process_next_batch(batch)
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 658, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
OSError: Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 138, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 138, in <listcomp>
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/ubuntu/Pytorch/carla_cil_pytorch-master/carla_loader.py", line 113, in __getitem__
with h5py.File(file_name, 'r') as h5_file:
File "/usr/local/lib/python3.5/dist-packages/h5py/_hl/files.py", line 394, in __init__
swmr=swmr)
File "/usr/local/lib/python3.5/dist-packages/h5py/_hl/files.py", line 170, in make_fid
fid = h5f.open(name, flags, fapl=fapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 85, in h5py.h5f.open
OSError: Unable to open file (file signature not found)

terminate called without an active exception
Aborted (core dumped)
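
The "file signature not found" message means the HDF5 library could not find the format signature it expects in the file. As a minimal sketch, the check below looks for that 8-byte signature using only the standard library. The function name `has_hdf5_signature` is made up for illustration and is not part of h5py or this repository.

```python
import os

# The 8-byte HDF5 format signature that starts the superblock.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def has_hdf5_signature(path):
    """Return True if the HDF5 signature is found at offset 0, 512, 1024, ..."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        offset = 0
        while offset + len(HDF5_SIGNATURE) <= size:
            f.seek(offset)
            if f.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return True
            # The superblock may follow a user block, so the next candidate
            # offsets are 512, 1024, 2048, and so on.
            offset = 512 if offset == 0 else offset * 2
    return False
```

A file that passes this check can still be truncated or damaged further in, so it only rules out the specific failure shown in the traceback above.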

@onlytailei
Owner

Have you loaded the dataset correctly?

$ python main.py --batch-size 1000 --workers 16 \
    --train-dir "path/to/AgentHuman/SeqTrain/" \
    --eval-dir "path/to/AgentHuman/SeqVal/" \
    --gpu 0 \
    --id training

train-dir and eval-dir should point to the directories where your dataset is located.
You should download the dataset from https://github.com/carla-simulator/imitation-learning/blob/master/README.md
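
A quick way to confirm that the two paths really point at the data is to count the .h5 files in each directory before starting training. This is only a sketch: `count_h5_files` is a hypothetical helper, and it assumes the .h5 files sit directly inside the given directory rather than in subfolders.

```python
import glob
import os


def count_h5_files(directory):
    """Return the number of *.h5 files directly inside `directory`."""
    if not os.path.isdir(directory):
        raise FileNotFoundError("not a directory: {}".format(directory))
    return len(glob.glob(os.path.join(directory, "*.h5")))
```

If the count is zero for either directory, the path passed to --train-dir or --eval-dir is probably wrong.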

@chowkamlee81
Author

I have downloaded the 24 GB of SeqTrain and SeqVal data. Other files are read, but the .h5 files cannot be read. Exceptions are thrown, so training does not run. Kindly help me move forward.

@chowkamlee81
Author

I also ran the same command as in your README, pointing to SeqTrain and SeqVal, as shown below.
Kindly help.
python main.py --batch-size 1000 --workers 16 \
    --train-dir "path/to/AgentHuman/SeqTrain/" \
    --eval-dir "path/to/AgentHuman/SeqVal/" \
    --gpu 0 \
    --id training

@onlytailei
Owner

I suggest you re-download the dataset and unzip it again.

@zhangjunwang

@chowkamlee81 I have the same problem. Have you solved it?
It runs for several steps and then fails with:

Epoch: [1][0/6578] Time 81.694 (81.694) Data 81.122 (81.122) Branch loss 0.471 (0.471) Speed loss 0.484 (0.484) Uncertain Loss 0.9547 (0.9547) Ori Loss 1.3344 (1.3344) Control Uncertain 1.3382 (1.3382) Speed Uncertain 0.8209 (0.8209)
[2019-07-11 21:52:43.142186]: Epoch: [1][10/6578] Time 2.939 (12.065) Data 2.923 (11.986) Branch loss 0.379 (0.417) Speed loss 0.159 (0.323) Uncertain Loss 0.5379 (0.7395) Ori Loss 1.1081 (1.3106) Control Uncertain 1.0327 (1.1592) Speed Uncertain 0.8081 (0.8032)
[2019-07-11 21:54:33.479006]: Epoch: [1][20/6578] Time 19.291 (11.574) Data 19.275 (11.520) Branch loss 0.347 (0.394) Speed loss 0.216 (0.266) Uncertain Loss 0.5627 (0.6600) Ori Loss 1.1855 (1.2432) Control Uncertain 0.9598 (1.0789) Speed Uncertain 0.7053 (0.7981)
Traceback (most recent call last):
File "main.py", line 400, in <module>
main()
File "main.py", line 201, in main
train(train_loader, model, criterion, optimizer, epoch, tsbd)
File "main.py", line 236, in train
for i, (img, speed, target, mask) in enumerate(loader):
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 568, in __next__
return self._process_next_batch(batch)
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/dataloader.py", line 608, in _process_next_batch
raise batch.exc_type(batch.exc_msg)
OSError: Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/_utils/worker.py", line 99, in _worker_loop
samples = collate_fn([dataset[i] for i in batch_indices])
File "/usr/local/lib/python3.5/dist-packages/torch/utils/data/_utils/worker.py", line 99, in <listcomp>
samples = collate_fn([dataset[i] for i in batch_indices])
File "/home/zjw/Documents/imitation-learning/carla_cil_pytorch-uncertain_open/carla_loader.py", line 114, in __getitem__
with h5py.File(file_name, 'r') as h5_file:
File "/home/zjw/.local/lib/python3.5/site-packages/h5py/_hl/files.py", line 394, in __init__
swmr=swmr)
File "/home/zjw/.local/lib/python3.5/site-packages/h5py/_hl/files.py", line 170, in make_fid
fid = h5f.open(name, flags, fapl=fapl)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py/h5f.pyx", line 85, in h5py.h5f.open
OSError: Unable to open file (file signature not found)
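
The traceback shows that a file failed to open but not which one, which makes a failure partway through an epoch hard to track down. One possible approach, sketched here under assumptions, is to wrap the file-opening step so the path ends up in the error message. `load_with_path_in_error` and its `loader` argument are hypothetical names, not part of carla_loader.py. In the real loader, `loader` would be whatever code opens the file with `h5py.File(file_name, 'r')`.

```python
def load_with_path_in_error(path, loader):
    """Call loader(path); on OSError, re-raise with the file path included."""
    try:
        return loader(path)
    except OSError as exc:
        # Keep the original message and add the path of the offending file.
        raise OSError("could not read {!r}: {}".format(path, exc)) from exc
```

With this in place, the worker's error message names the file, so a single bad file can be inspected or removed without re-downloading the whole dataset.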

@zhangjunwang

@onlytailei

@MohanadOdema

Hi,
I encountered the same error and found that the problem is with one of the .h5 files, 'data_06790.h5', in the training set SeqTrain/.
It is probably corrupt and wasn't saved properly. I removed it from the training dataset and training is now working fine.
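
To find files like this without waiting for training to crash, the dataset directory can be scanned up front. The sketch below assumes h5py is installed and that an unreadable file raises `OSError` on open, as in the tracebacks in this thread. `opens_with_h5py` and `find_unreadable_h5` are illustrative names, not functions from the repository.

```python
import os


def opens_with_h5py(path):
    """Return True if h5py can open the file read-only, False on OSError."""
    import h5py  # imported lazily so the scan helper works with other checkers

    try:
        with h5py.File(path, "r"):
            return True
    except OSError:
        return False


def find_unreadable_h5(root, can_open=opens_with_h5py):
    """Walk `root` and return a sorted list of .h5 paths that fail `can_open`."""
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".h5"):
                path = os.path.join(dirpath, name)
                if not can_open(path):
                    bad.append(path)
    return sorted(bad)
```

Calling `find_unreadable_h5("path/to/AgentHuman/SeqTrain/")` would list the files to remove or re-download. Opening a file does not read every dataset in it, so a file that passes may still be damaged internally.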

@brightyoun

> Hi,
> I encountered the same error and found that the problem is with one of the .h5 files, 'data_06790.h5', in the training set SeqTrain/.
> It is probably corrupt and wasn't saved properly. I removed it from the training dataset and training is now working fine.

thank you!
