not downloading all files, -R not working, many errors #321

Open
joaoluizcarvalho opened this issue Jun 2, 2024 · 6 comments

Comments

@joaoluizcarvalho

joaoluizcarvalho commented Jun 2, 2024

Every month I download the new set of MAME ROMs from bda.retroroms.info using the following command:

wget -r --level=1 -np -nH -nc --cut-dirs=2 --http-user=myusername --http-password=mypassword -R "index.html*" -R "desktop.ini" https://bda.retroroms.info:82/downloads/mame/mame-0266/

However, since Fedora upgraded to wget2, I have been observing the following issues:

  1. The summary line reports "Errors: 100+".
  2. Only a small subset of the files in the directory is downloaded. If I delete the folder and run the command again, a different subset is downloaded each time.
  3. The -R option doesn't seem to work, because the index.html files are kept.

Here is an example of the terminal output:

9 files 100% [===========================================================================================================================================================================================>] 2.19M 1013KB/s
11 files 100% [===========================================================================================================================================================================================>] 20.83M 1.38MB/s
index.html?C=M&O=D 100% [===========================================================================================================================================================================================>] 2.16K --.-KB/s
index.html 100% [===========================================================================================================================================================================================>] 607 --.-KB/s
zerotimeb.zip 100% [===========================================================================================================================================================================================>] 494 --.-KB/s
[Files: 23 Bytes: 22.98M [1.23MB/s] Redirects: 0 Todo: 0 Errors: 113

Note that only 23 files were downloaded, but the directory I requested had 67 files and 1 empty folder. The complete file list can be seen here, by browsing into /downloads/mame/mame-0266/

@joaoluizcarvalho
Author

joaoluizcarvalho commented Jun 4, 2024

I also tried adding --max-threads=1 but that did not solve the issue: 90+ errors.

30 files 100% [===========================================================================================================================================================================================>] 24.86M 1.88MB/s
[Files: 30 Bytes: 24.79M [1.32MB/s] Redirects: 0 Todo: 0 Errors: 91

@rockdaboot
Owner

That is possibly a rate limiter on the server. Please check whether --wait, --random-wait, or --limit-rate helps here.

@joaoluizcarvalho
Author

I tried --wait=2 and it downloaded all files successfully, but it still says "Errors: 69".

Is it possible to display the error messages? I tried --verbose, but it didn't help.

Also, the issue with -R "index.html*" not working remains.

@rockdaboot
Owner

Use --progress=none to see the textual output including response codes.

I'll take a look into why -R keeps the files in a few days.

@joaoluizcarvalho
Author

I just ran the command again, this time with --wait=2 and --progress=none. All files were downloaded and there were 69 errors: one 404 (Not Found) and 68 401 (Unauthorized).

The 404 error and one of the 401 errors refer to https://bda.retroroms.info:82/robots.txt. I don't understand why it is trying to download that file, as it is not in the requested folder: https://bda.retroroms.info:82/downloads/mame/mame-0266/

This was the command:

wget -r --level=1 -np -nH -nc --cut-dirs=2 --http-user=myuser --http-password=mypassw --max-threads=1 --wait=2 -R "index.html*" -R "desktop.ini" --progress=none https://bda.retroroms.info:82/downloads/mame/mame-0266/

And these were the first lines of the output:

[0] Downloading 'https://bda.retroroms.info:82/robots.txt' ...
HTTP ERROR response 401 [https://bda.retroroms.info:82/robots.txt]
[0] Downloading 'https://bda.retroroms.info:82/robots.txt' ...
HTTP ERROR response 404 [https://bda.retroroms.info:82/robots.txt]

The remaining 67 401 errors refer to each of the 67 zip files in the requested folder. Note that all of these zip files were successfully downloaded. For example:

[0] Downloading 'https://bda.retroroms.info:82/downloads/mame/mame-0266/aim65.zip' ...
HTTP ERROR response 401 [https://bda.retroroms.info:82/downloads/mame/mame-0266/aim65.zip]
[0] Downloading 'https://bda.retroroms.info:82/downloads/mame/mame-0266/aim65.zip' ...
Saving 'mame-0266/aim65.zip'
HTTP response 200 [https://bda.retroroms.info:82/downloads/mame/mame-0266/aim65.zip]
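
The per-code counts reported above can be reproduced from a saved log rather than counted by hand. The following is a minimal sketch, not part of wget2: it assumes the --progress=none output was redirected to a file (the name wget2.log is hypothetical) and that error lines have exactly the "HTTP ERROR response NNN [URL]" shape shown in the excerpts above. The function name tally_http_errors is also just an illustration.

```shell
# Count "HTTP ERROR response NNN" lines per status code in a saved log.
# Assumption: the log was captured with something like
#   wget ... --progress=none > wget2.log 2>&1
# and its lines match the format quoted in this thread.
tally_http_errors() {
    # Keep only the "HTTP ERROR response NNN" part of each matching line,
    # count occurrences of each code (field 4), and sort numerically.
    grep -o 'HTTP ERROR response [0-9][0-9]*' "$1" \
        | awk '{ count[$4]++ } END { for (code in count) print code, count[code] }' \
        | sort -n
}
```

Calling `tally_http_errors wget2.log` prints one line per status code followed by its count, which makes it easy to see whether the total is dominated by 401 responses or by something else such as 503.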

@joaoluizcarvalho
Author

I also tried running with --progress=none, but without --wait=2. A bunch of 503 errors occurred, and only 16 of the 67 zip files were downloaded. This was successfully fixed using --wait=2, though (as commented above).
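
Since a 401 followed by a 200 for the same URL still ends in a saved file, the error total alone does not show which downloads actually failed. As a rough sketch under the same assumptions (a saved --progress=none log in a hypothetical file, lines shaped like the excerpts in this thread, URLs without spaces), the helper below lists only the URLs that logged an error and never logged a later "HTTP response 200". The name list_failed_urls is illustrative.

```shell
# Print URLs that logged an HTTP error but never an "HTTP response 200".
# Assumption: each response line ends with the URL in square brackets,
# as in the log excerpts quoted in this thread.
list_failed_urls() {
    awk '
        /^HTTP ERROR response/ { failed[$NF] = 1 }   # remember every URL that errored
        /^HTTP response 200/   { ok[$NF] = 1 }       # remember every URL that succeeded
        END {
            for (url in failed)
                if (!(url in ok)) {
                    clean = url
                    gsub(/^\[|\]$/, "", clean)       # strip the surrounding brackets
                    print clean
                }
        }
    ' "$1" | sort
}
```

Running `list_failed_urls wget2.log` on a log from a run without --wait=2 should list the zip files that were rejected (for example with 503) and never retried successfully, plus robots.txt, while omitting files whose only error was the initial 401.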
