pkg: compute checksums for packages without checksums #9384
It occurs to me that maybe we should add some indication that packages have had their hashes added by dune in this way. Hashes in opam have some air of legitimacy, since they've at least gone through a manual code review process when the package was added to the repo. If dune just adds its own hashes to packages that otherwise lack them, people might start to trust our hashes to the same degree that they trust opam's.
I suppose we could print the hashes that we've had to compute to let the user know we're doing this. FYI, opam also installs packages that lack checksums without any ceremony. I'm not even sure if it saves the hash of what it ends up downloading.
I added a message saying that dune is generating its own checksum. It also prints the source archive URL so users can vet suspicious-looking URLs. I also changed the test to use a text file rather than an archive, since creating the archive (tar) was producing different results on different platforms.
let target = Path.relative temp_dir "archive" in
Fetch.fetch ~unpack:false ~checksum:None ~target (OpamUrl.of_string url)
>>| function
| Ok () -> Some (Dune_digest.file target |> Checksum.of_dune_digest)
FYI, the internals of Fetch.fetch can already give you the newly calculated checksum. It should be enough to just tweak the API to get it somehow.
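A minimal sketch of that suggestion, using only the OCaml standard library (4.14 or later) rather than dune's internals: a fetch-like function hands back the digest it computed while obtaining the file, so the caller does not need to hash the target a second time. The name `fetch_local`, its result type, and the use of the stdlib `Digest` (MD5) module are assumptions for illustration; this is not the real `Fetch.fetch` signature.

```ocaml
(* Hypothetical stand-in for a fetcher: copies a local file to [target]
   and returns the hex MD5 digest of the bytes it copied, so the caller
   does not have to re-read [target] to compute a checksum. *)
let fetch_local ~src ~target : (string, string) result =
  try
    let contents = In_channel.with_open_bin src In_channel.input_all in
    Out_channel.with_open_bin target (fun oc ->
      Out_channel.output_string oc contents);
    Ok (Digest.to_hex (Digest.string contents))
  with
  | Sys_error msg -> Error msg
```

With a shape like this, the lockdir code could record the returned checksum directly instead of calling a separate digest function on the downloaded file.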
LGTM for a start, but I see a couple of things that are fishy:
- Why do we include checksums for sources computed from the local file system? Seems a bit pointless since they can't be shared anyway.
- The build step is going to refetch the sources again. That seems wasteful. We need to reuse the build rules somehow when fetching here to avoid this.
User_message.print
(User_message.make
[ Pp.textf
"Package %S has source archive which lacks a checksum."
- "Package %S has source archive which lacks a checksum."
+ "Package %S has a source archive which lacks a checksum."
"Package %S has source archive which lacks a checksum."
(Package_name.to_string package_name)
; Pp.textf "The source archive will be downloaded from: %s" url
; Pp.text "Dune will compute its own checksum for this source archive."
I feel that it would be good to print the computed checksum as well, just for reference for the user.
> sed -e "s#$PWD#<pwd>#"
> }

$ echo "Hello, World!" > foo.txt
The alternative would be to create a tarball and run md5sum and then compare it to the one in the lock file, without ever printing it.
file://<pwd>/foo.txt
Dune will compute its own checksum for this source archive.
Warning: Failed to retrieve source archive from:
file://<pwd>/foo.txt
Maybe also print what dune will do in that case (that is, not include a checksum).
I agree that we should avoid fetching the archive a second time. Conceptually it makes sense to have a rule which downloads the archive and a second rule which extracts it. How hard would this be to do in practice though? Currently we don't use rules when generating the lockdir or in The alternative would be to download the archive to a well-known place in _build and have |
It would be rather tricky. I'll try and think about it a little more.
If dune doesn't know something is a build target in _build, it will consider it stale and just delete it. So it would have to live elsewhere. We could introduce this cache in some other directory, but it seems a bit premature to do this. For now what you have is fine. Let's just document the issues and solve them later.
Wouldn't |
It's an option, but not a particularly good one. The dune cache deduplicates all files, which makes it more efficient. Also, we'd have to write yet another cache cleanup mechanism.
Btw, you can set |
Deduplication would happen automatically if you use |
The dune cache is much better than that. It deduplicates on a per-file basis, so tarballs of different versions of the same package will end up sharing the majority of the contents. But even without that, creating a secondary cache system parallel to the one we already have is somewhat of a last resort.
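To illustrate why per-file deduplication helps across package versions, here is a toy content-addressed store written against the OCaml standard library only. It is a model of the idea, not dune's cache implementation; `store`, `add_archive`, and the sample file lists are hypothetical.

```ocaml
(* Toy content-addressed store: file contents are keyed by their digest,
   so identical files are stored once however many archives contain them. *)
let store : (Digest.t, string) Hashtbl.t = Hashtbl.create 16

(* Add every file of an unpacked archive; return a manifest mapping each
   path to the digest under which its contents are stored. *)
let add_archive files =
  List.map
    (fun (path, contents) ->
      let key = Digest.string contents in
      Hashtbl.replace store key contents;
      (path, key))
    files

let () =
  (* Two versions of the same package: only foo.ml differs. *)
  let v1 = [ "foo.ml", "let x = 1"; "README", "docs" ] in
  let v2 = [ "foo.ml", "let x = 2"; "README", "docs" ] in
  let _ = add_archive v1 in
  let _ = add_archive v2 in
  (* Four files were added, but the unchanged README is stored only once. *)
  Printf.printf "blobs stored: %d\n" (Hashtbl.length store)
```

A cache keyed on whole tarballs would store both versions in full, whereas keying on individual files lets the unchanged ones be shared.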
Signed-off-by: Stephen Sherratt <[email protected]>
It sounds like there's no obvious idiomatic place to put the file so it can be reused. Re-using the archive is an optimization and doesn't affect functionality. To avoid dealing with ongoing merge conflicts, would it make sense to merge this now and work out where to put the file in a new issue?
That sounds reasonable to me. Out of curiosity, how many packages do not have checksums in opam-repository?
Not sure how many, but the |
fixes #9336