-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add option to optionally skip downloading the contents of a file #79
Conversation
htmlproofer/plugin.py
Outdated
# Download the entire contents as to not break previous behaviour. | ||
for _ in response.iter_content(chunk_size=1024): | ||
pass |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we don't actually care about downloading the contents (we seemingly don't do any verification of contents), then we could always just drop these lines, with stream=True
set the body of the response isn't downloaded automatically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't as long as the url status is correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I just simplify this MR to just skip the download entirely and drop the read_max_bytes
configuration? I'm hesitant to break existing behaviour, so I'll list out a few possible routes this MR could go:
- Keep this MR as it is, with
read_max_bytes
limiting the size of downloaded content. - Opt for a different configuration value, such as
skip_downloads
, which triggers this new behaviour of skipping the download of the response's body. - Don't bother adding any configurable behaviour and just establish a connection via
stream=True
and skip any downloads.
@manuzhang Which would be your preference?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd prefer option 2.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@manuzhang I've implemented option 2 and updated this MR, as well as the original issue title.
This PR does a couple of things:
1 . The first commit prevents the downloading of entire files into memory by streaming the response and reading small chunks as a time.
2. The second commit adds the configuration option to optionally skip downloading of a remote URLs content.
Closes #76.