RFE: disable metadata check for a given mirror's IP block #406
I guess we wouldn't want to let arbitrary mirrors just configure this themselves, as a bad mirror owner could maliciously include an IP in its block and force that IP to use content from their mirror. But we could allow it to be set by MM admins for specific use cases.
Are you sure? Each section gets a weighted shuffle by bandwidth to distribute load. We do not generate the same list twice (or rather: for each request we shuffle again) to better distribute the load. My expectation was that DNF takes the list and goes through it until it finds one that matches the checksums from the metalink. So we definitely have different assumptions here about how DNF works. The whole private mirror setup kind of relies on the fact that DNF tries the first mirror in the list first.
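For reference, the list being shuffled is the set of `<url>` entries in the metalink's `<resources>` block for `repomd.xml`; a rough sketch with placeholder hostnames and paths (not a real MirrorManager response) looks like this:

```xml
<resources maxconnections="1">
  <!-- MirrorManager re-shuffles this ordering (weighted by mirror bandwidth) on every
       request; the hostnames and paths here are placeholders -->
  <url protocol="https" type="https" location="DE" preference="100">https://mirror1.example.org/fedora/updates/38/x86_64/repodata/repomd.xml</url>
  <url protocol="https" type="https" location="US" preference="99">https://mirror2.example.com/fedora/updates/38/x86_64/repodata/repomd.xml</url>
  <url protocol="https" type="https" location="CZ" preference="98">https://mirror3.example.net/fedora/updates/38/x86_64/repodata/repomd.xml</url>
</resources>
```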
That sounds doable. Overall it feels like an extension of the private mirror concept.
Not sure if a metalink works if the checksums are missing, like you said.
That is not a problem to do. We already have a couple of MirrorManager admin-only options.
My memory was that dnf always tries the first server in the list first for downloads beyond the initial repodata, but will do some kind of randomized round-robin thing for downloading the repodata. I might be wrong on that, though. I might have been remembering the randomization that applies to the creation of the list as you describe above.

I don't think it's actually terribly important to this issue, though. For the use case in question, what we really want is basically a way to make the metalink system always result in the use of a single specific mirror. If you have control of the SUT's repo config you can of course just overwrite it somehow, but often we cannot or do not want to rewrite the SUT's repo config.

edit: I'll try hand-constructing a metalink of the format I'm envisioning and see if it actually works, that shouldn't be too hard.
From my point of view the main question is how DNF reacts if no checksums are in the metalink. If that works, it should be doable to implement what you need.
So in a quick test (sorry, I went on PTO...) this seems to work. I constructed a metalink file thus, with no timestamp, file size or verification block:
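A minimal metalink in roughly this shape, with a single mirror URL and no `<mm0:timestamp>`, `<size>` or `<verification>` elements (the hostname and path below are illustrative placeholders, not the actual test file):

```xml
<?xml version="1.0" encoding="utf-8"?>
<metalink version="3.0" xmlns="http://www.metalinker.org/">
 <files>
  <file name="repomd.xml">
   <resources maxconnections="1">
    <!-- single mirror, no timestamp/size/verification data -->
    <url protocol="https" type="https" location="US" preference="100">https://infra-mirror.example.org/fedora/updates/38/x86_64/repodata/repomd.xml</url>
   </resources>
  </file>
 </files>
</metalink>
```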
and saved it as /tmp/test.metalink. I created a repo file thus:
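A plausible sketch of such a repo file, assuming a repo id of `test-metalink` and that `metalink=` is pointed at the local file with a `file://` URL:

```ini
[test-metalink]
name=Test metalink repo
metalink=file:///tmp/test.metalink
enabled=1
gpgcheck=0
```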
then I ran dnf against that repo, and it was happy to use the checksum-less metalink.
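A command along these lines (using the hypothetical `test-metalink` repo id from the sketch above) is enough to exercise the metadata download path:

```console
$ dnf --disablerepo='*' --enablerepo=test-metalink makecache
```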
I was talking to @thrix about mirroring issues for CI systems today. It seems that he's noticed a category of problems where CI systems hit issues with the metadata validity check system we have (the one where the metalink contains a list of 'valid' metadata checksums, and dnf hits mirrors until it finds metadata with a checksum in the list).
I suggested that maybe putting the CI systems in the IP block associated with a specific mirror that is close to them and known to be rapidly updated could help, but forgot that even if you're in the IP block for a mirror, dnf will still do the metadata validity check, and (IIRC) still use the "pick a mirror at random to do the metadata download" approach. So it doesn't really help with issues with the metadata validity check.
So I wondered: could we add a setting like 'hosts in the IP block use this mirror regardless', which would make that mirror the only one in the metalink response, and make the metalink response not contain the `<verification>` stuff, to disable the validity check (assuming dnf just skips the validity check if the metalink doesn't include that data)?

For instance, I could then use this for openQA to ensure it only ever uses the internal infra mirror and skips the metadata checks. Other CI systems could use it similarly to use a known-good mirror that is close to them in network terms.