Lazy files #1190

Closed

jonludlam
Contributor

The idea of this PR is to be able to register files in the fake filesystem without loading their contents until they are actually accessed. This means that a directory listing will show the file as present, but the usual callback is called only when the file is actually opened.

The motivation behind this is to reduce the size of toplevels: rather than compiling in all of the cmi files, as is currently done, we just register the files lazily. The compiler is pretty good at not actually opening the files until they're needed, so this seems to end up reducing the amount of data transferred quite a bit.

The end goal is to produce a JavaScript toplevel for each and every library in opam, to be hosted on v3.ocaml.org, so keeping the size down is an important part of this.

Does this approach make sense?

@hhugo
Member

hhugo commented Dec 15, 2021

I don't understand what you mean by "the usual callback is called". Are you referring to the callback registered by Sys_js.mount? Also, when do you plan to actually load the files' content?
Note that cmis don't have to be included in the toplevel itself.

@jonludlam
Contributor Author

jonludlam commented Dec 15, 2021

Yes, the callback registered by Sys_js.mount. The file's content is loaded when it's opened - the only change this PR makes is that it causes files registered with the new call to appear when listing the directory. I'm using it like this:

https://github.com/jonludlam/js_top_worker/blob/main/lib/worker.ml

The idea is that the toplevel is running in a worker thread, and the cmi file is only actually downloaded (synchronously) from the server when the compiler is trying to open it.
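The behaviour described here can be modelled without js_of_ocaml. The following is a minimal sketch using only the standard library, not the code in this PR: `register_lazy`, `fetch`, `readdir` and `read` are hypothetical names, and `fetch` stands in for the callback registered by Sys_js.mount (which in the worker setup would perform the synchronous download).

```ocaml
(* A toy in-memory filesystem: each registered path maps to a slot that is
   either already loaded or still waiting to be fetched. *)
type slot =
  | Loaded of string
  | Pending

let files : (string, slot) Hashtbl.t = Hashtbl.create 16

(* Counts fetches so that the laziness is observable. *)
let fetch_count = ref 0

(* Hypothetical stand-in for the mount callback. *)
let fetch path =
  incr fetch_count;
  if Filename.check_suffix path ".cmi" then Some ("contents of " ^ path)
  else None

(* Hypothetical name: make a file visible without loading its contents. *)
let register_lazy path = Hashtbl.replace files path Pending

(* Listing a directory only consults the table, so it never fetches. *)
let readdir dir =
  Hashtbl.fold
    (fun path _ acc ->
      if Filename.dirname path = dir then Filename.basename path :: acc
      else acc)
    files []
  |> List.sort compare

(* Opening a file fetches its contents on first access and caches them. *)
let read path =
  match Hashtbl.find_opt files path with
  | None -> None
  | Some (Loaded s) -> Some s
  | Some Pending -> (
      match fetch path with
      | Some s ->
          Hashtbl.replace files path (Loaded s);
          Some s
      | None -> None)

let () =
  register_lazy "/cmis/stdlib.cmi";
  register_lazy "/cmis/list.cmi";
  (* Both files are listed, yet nothing has been fetched. *)
  assert (readdir "/cmis" = [ "list.cmi"; "stdlib.cmi" ]);
  assert (!fetch_count = 0);
  (* The first read fetches; the second is served from the cache. *)
  assert (read "/cmis/list.cmi" = Some "contents of /cmis/list.cmi");
  assert (read "/cmis/list.cmi" = Some "contents of /cmis/list.cmi");
  assert (!fetch_count = 1)
```

In this model a path can be registered and listed even though `fetch` later returns `None` for it, so a file may appear in the directory and still fail to open.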

@hhugo
Member

hhugo commented Dec 21, 2021

Have you looked at #833 / #435 by any chance?

Regarding the MR, I'm not sure about the current approach. There is nothing to enforce that the callback will return something for files registered lazily.

Have you considered introducing something like the following instead?

val mount_lazy : path:string -> (prefix:string -> path:string -> string Lazy.t option) -> unit
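To illustrate why this shape closes the gap, here is a small standard-library model of a mount table whose callbacks return a lazy value (written `string Lazy.t` so that it type-checks). It is a sketch under assumptions, not the js_of_ocaml implementation: `mounts`, `lookup`, `file_exists`, `read` and the way the prefix is stripped from the path are all hypothetical. Returning `Some` answers "does the file exist?" and hands over the means to obtain the content in one step, while forcing is deferred until the file is read.

```ocaml
(* Callback shape taken from the suggested signature. *)
type callback = prefix:string -> path:string -> string Lazy.t option

(* Toy mount table: (prefix, callback) pairs, most recent first. *)
let mounts : (string * callback) list ref = ref []

let mount_lazy ~path (f : callback) = mounts := (path, f) :: !mounts

(* Ask the first mount whose prefix matches; the callback receives the
   path relative to that prefix (an assumption of this model). *)
let lookup name =
  List.find_map
    (fun (prefix, f) ->
      let n = String.length prefix in
      if String.length name >= n && String.sub name 0 n = prefix then
        f ~prefix ~path:(String.sub name n (String.length name - n))
      else None)
    !mounts

(* Existence is decided without forcing the content. *)
let file_exists name = Option.is_some (lookup name)

(* Reading forces the lazy value; Lazy caches the result. *)
let read name = Option.map Lazy.force (lookup name)

(* Counts how often content is actually produced. *)
let downloads = ref 0

let known =
  List.map
    (fun name -> (name, lazy (incr downloads; "contents of " ^ name)))
    [ "stdlib.cmi"; "list.cmi" ]

let () =
  mount_lazy ~path:"/cmis/" (fun ~prefix:_ ~path -> List.assoc_opt path known);
  assert (file_exists "/cmis/list.cmi");
  assert (not (file_exists "/cmis/missing.cmi"));
  assert (!downloads = 0);
  assert (read "/cmis/list.cmi" = Some "contents of list.cmi");
  assert (read "/cmis/list.cmi" = Some "contents of list.cmi");
  assert (!downloads = 1)
```

The lazy values are created once and stored in `known`, so repeated lookups share the same cached result; a callback that built a fresh `lazy` on every call would redo the work each time.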

@jonludlam
Contributor Author

I hadn't seen that, no - looks very interesting!

You're right, there's nothing enforcing that the file ends up with any content via the callback. I'll investigate your suggestion - thanks!

In general though, does this seem a reasonable approach for reducing the amount of data transferred?

@hhugo
Member

hhugo commented Dec 21, 2021

The general idea seems reasonable.

I assume the toplevel first scans directories to learn which files are on disk and then only loads them on demand. Is that right?

An alternative approach would be to use the feature introduced in ocaml/ocaml#706, which would allow loading cmis without relying on the filesystem.
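The general pattern behind such a feature is an overridable loader hook: the type checker asks a mutable function for a unit's signature instead of searching directories itself. The sketch below models only that pattern with the standard library. The `signature` record, `load`, `load_from_disk` and `find_signature` are hypothetical stand-ins; the compiler's real hook (referred to later in this thread as the Persistent_signature interface) has its own types, and its exact signature is not shown here and may differ between compiler versions.

```ocaml
(* Toy stand-in for a compiled interface. *)
type signature = { unit_name : string; source : string }

(* Default loader: pretends to search the filesystem and finds nothing,
   as would happen in a browser with no cmi files installed. *)
let load_from_disk ~unit_name:_ = None

(* Overridable hook, consulted whenever a signature is needed. *)
let load : (unit_name:string -> signature option) ref = ref load_from_disk

(* What a type checker would call when it first meets a module name. *)
let find_signature unit_name = !load ~unit_name

let () =
  assert (find_signature "List" = None);
  (* Install a loader that serves signatures from memory (in a worker this
     could be a download) and falls back to the previous loader. *)
  let remote = Hashtbl.create 8 in
  Hashtbl.replace remote "List" { unit_name = "List"; source = "in-memory" };
  let previous = !load in
  load :=
    (fun ~unit_name ->
      match Hashtbl.find_opt remote unit_name with
      | Some s -> Some s
      | None -> previous ~unit_name);
  assert (find_signature "List" <> None);
  assert (find_signature "Missing" = None)
```

With a hook like this, no directory listing is needed at all: a unit is looked up by name at the moment the compiler asks for it.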

@jonludlam
Contributor Author

The Persistent_signature interface works perfectly, thanks! Closing this PR in favour of using that.

@jonludlam jonludlam closed this Jan 14, 2022