-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
is_url: allow urls with parameters #92
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! One test case since the given example could be ambiguous in later iterations.
tests/test_idutils.py
Outdated
def test_url(): | ||
"""Test URL validation.""" | ||
assert idutils.is_url( | ||
"https://archive.softwareheritage.org/swh:1:dir:44ac666e75004dd2a27ca0e09e73aecc0e8b426f;origin=https://github.com/amykwebster/MIPseq_2021;visit=swh:1:snp:46bcbe75339295c3d68fcf1257022aae8e6bee40;anchor=swh:1:rev:27839dcc9ef1587086be195349310fb70fbfcaf1" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you also add an example above in the big list of sample identifiers
variable? I have a feeling that this should eventually also be identified as a swh
identifier.
We can do this later of course by including the http(s)://archive.softwareheritage.org
part in the regex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
swh was actually already in the identifiers list, but with a url that would have failed the validation check. I expanded the test so all the urls in the identifiers variable are now checked.
It is useful to link to software directly with a SWHID. #OR2024 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM – it was basically the same as my PR, but with a few extras.
as i've commented on my PR: i was denied permission to push to this branch, so i just added the test from here to my PR and merged that |
❤️ Thank you for your contribution!
Description
Software heritage uses urls with parameters, like https://archive.softwareheritage.org/swh:1:dir:44ac666e75004dd2a27ca0e09e73aecc0e8b426f;origin=https://github.com/amykwebster/MIPseq_2021;visit=swh:1:snp:46bcbe75339295c3d68fcf1257022aae8e6bee40;anchor=swh:1:rev:27839dcc9ef1587086be195349310fb70fbfcaf1 . However the idutils is_url function will reject these as not being a valid url.
This PR makes the is_url function more permissive. Software heritage ids and related urls are already in idutils, so it's a bit odd for idutlils to say that they aren't urls.
Checklist
Ticks in all boxes and 🟢 on all GitHub actions status checks are required to merge:
Third-party code
If you've added third-party code (copy/pasted or new dependencies), please reach out to an architect.
Reminder
By using GitHub, you have already agreed to the GitHub’s Terms of Service including that: