treeshake imports based on guest content #117
It'll be excellent to have this!
One thing that concerns me a bit is that we're relying on a different lexer here to find the list of used imports. If that lexer disagrees with SpiderMonkey's for whatever reason, that could lead to really bizarre and hard-to-debug issues.
And I could immediately see that happening in practice, too: we support top-level await, which means code could dynamically decide to import component interfaces, which the lexer wouldn't be able to see.
We could alternatively consider making StarlingMonkey emit the required information itself: it could write out a file containing a list of imports as recorded during parsing/execution. That'd require changing StarlingMonkey to be able to receive the information that it should do so, and where to store the resulting file, but that'd all be doable, I think. @guybedford, do you see any blockers with that approach?
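The dynamic-import concern can be illustrated with a minimal sketch. The function below is a deliberately naive, line-based stand-in for a static scan; it is not the lexer ComponentizeJS uses, and the function name and the sample specifiers are assumptions made only for illustration. It shows that a specifier assembled at runtime and passed to `import()` never appears in the statically collected list.

```rust
/// Collect module specifiers from static `import ... from "spec"` statements.
/// Hypothetical helper: a naive line-based scan, not a real JavaScript lexer.
fn static_import_specifiers(source: &str) -> Vec<String> {
    let mut specifiers = Vec::new();
    for line in source.lines() {
        let line = line.trim();
        // Only static import declarations begin a statement with `import `.
        if !line.starts_with("import ") {
            continue;
        }
        // The specifier is the quoted string that follows `from`.
        if let Some(idx) = line.find(" from ") {
            let rest = line[idx + " from ".len()..]
                .trim()
                .trim_end_matches(';')
                .trim();
            let spec = rest.trim_matches(|c| c == '"' || c == '\'');
            if !spec.is_empty() {
                specifiers.push(spec.to_string());
            }
        }
    }
    specifiers
}
```

Given guest source that statically imports one interface and builds a second specifier at runtime before calling `await import(name)`, only the first is returned, so a treeshaker driven by this list would drop the second import.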
We do not currently support dynamic Note that ComponentizeJS performs source transforms using the same lexer already, as it has to perform a replacement of the world imports with the generated bindings modules.

Separately, we don't currently support transitive imports at all - which we probably should as well. But since this is a missing feature, it could again be a follow-up to this PR to support transitive rewriting and not just rewriting of the top-level source.

If we did want to move more of this process into StarlingMonkey, what would be needed is a full loader API - the ability to define source transforms and resolver hooks at the engine level. One problem here is how to communicate the source transform without causing the code for the source transform to be linked unnecessarily into the final build. So a solution would likely need to be found for that communication problem (e.g. via a Wizer-time dedicated "resolve" / "loadSource" host call technique).
Thank you for the excellent context, @guybedford! With all that understood, I think it's more than fine to land this with the external lexer, and eventually look into moving more of this and other things into StarlingMonkey itself.
Signed-off-by: karthik2804 <[email protected]>
486735b to ff30f7d
I made an attempt to update this PR based on #116 (comment). I am not entirely sure if I am on the right track. I am also not sure how to debug the failing test (my suspicion is that it has something to do with transitive imports in the
I have updated the PR to fix the failing test and also addressed the comment.
I tested this out with a simple scenario and it seems to work well!

I've added an update here to properly handle export aliases. It doesn't fully handle the alias deduping logic yet, which is supposed to only support aliases when they don't conflict. To handle that we should move the alias conflict detection logic to a preprocessing stage that is not related to the

It could be nice to even handle treeshaking for function import bindings off of an interface ie

This code looks good to land to me. Please test it out once more there, @karthik2804, and let me know if you are happy to release.

We will need a test that verifies the export and import tree-shaking in this PR to land it, though. Just let me know if you want to discuss how to set up the wiring for that.
We just need a simple test to land this.
// TODO: move populate_export_aliases to a preprocessing
// step that doesn't require esm_bindgen, so that we can
// do alias deduping here as well.
This may bring in possibly unused interfaces in some alias conflict cases, which would require the refactoring discussed around alias preprocessing. I'm okay to land this without that to start.
I've pushed up a simple failing test case here to verify that imports and exports get treeshaking applied. In the process I discovered two remaining issues:
There is therefore still quite some work to do here. Sorry I can't be of more help.
An alternative to (1) would be to carefully ensure that in all bindgen, exports ordering is defined to be the ordering after treeshaking. Either way, though, this requires careful review of the ordering.
@guybedford can you walk me through the 2 remaining issues a little more? I do not fully grasp them. My apologies if I am being slow.
6bdbcef to 067d072
@karthik2804 I took another look at this today and was able to resolve most of the remaining concerns I had:
The tests for imports treeshaking are now fully passing.

There is only one final thing that we need to discuss now, and that is the exports treeshaking, which still has a failing test. Strictly speaking, I believe exports are supposed to be mandatory for component generation - that is, a component can only be encoded to a world when it provides all of the expected exports. Not providing an export is typically seen as an error.

Therefore I'd like to understand more about exports treeshaking and how it might be used in this case. And if we want to support this exports treeshaking feature, then we must also add support for removing unused exports from the world itself before encoding the component.
@guybedford That sounds exciting! On the note of exports tree shaking - the original intent of the PR was only imports tree shaking. The reason we started analyzing the guest exports was to implement what #116 took a stab at, i.e. retain the fetch event if the guest does not export an explicit
067d072 to 6f70bb3
for (k, _) in &engine_resolve.worlds[engine_world].imports {
    guest_imports.push(engine_resolve.name_world_key(k));
}
I was taking a stab at this this afternoon and found that this returned all the imports provided in the WIT definition, not just the ones actively used by the engine, whose list I was able to get with the following code snippet.
// Componentize the StarlingMonkey engine so that we can pull in the imports that are actually used.
let bytes = include_bytes!("../../../StarlingMonkey/host-apis/wasi-0.2.0/preview1-adapter-release/wasi_snapshot_preview1.wasm");
let component = wit_component::ComponentEncoder::default()
    .adapter("wasi_snapshot_preview1", bytes)
    .map_err(|e| e.to_string())?
    .module(&engine)
    .map_err(|e| e.to_string())?
    .encode()
    .map_err(|e| e.to_string())?;
let decoded = wit_component::decode(&component).unwrap();
let resolve = decoded.resolve();
let packages = decoded.package();
let world_id = resolve.select_world(packages, None).unwrap();
let world = &resolve.worlds[world_id];
// Merge the imports actually used by the engine with the imports from the guest content.
for (import_key, _) in world.imports.iter() {
    guest_imports.push(resolve.name_world_key(import_key));
}
@karthik2804 okay, in that case, let's remove the exports treeshaking entirely for now then. As for the imports listing, we should probably have two lists like
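One way to picture the two-list idea is the sketch below. It assumes the imports are already available as plain name strings (for example, the output of `name_world_key`), and the function and parameter names are hypothetical rather than taken from the PR. A world import is kept only if the engine or the guest uses it, and the world's declared order is preserved.

```rust
use std::collections::HashSet;

/// Return the world imports used by either the engine or the guest,
/// preserving the order in which the world declares them.
/// Hypothetical helper; the names are not from the ComponentizeJS source.
fn retained_imports(
    world_imports: &[&str],
    engine_imports: &[&str],
    guest_imports: &[&str],
) -> Vec<String> {
    // Union of everything that is actually referenced.
    let used: HashSet<&str> = engine_imports
        .iter()
        .chain(guest_imports.iter())
        .copied()
        .collect();
    // Filter the declared imports against that union.
    world_imports
        .iter()
        .filter(|name| used.contains(**name))
        .map(|name| name.to_string())
        .collect()
}
```

Keeping the engine list separate from the guest list means the engine's own requirements are never dropped just because the guest source does not mention them.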
We could possibly pre-compute the list of engine exports as a part of the build step? Run it between the build of StarlingMonkey and the build of the splicer?
Also, with the exports tree-shaking removed, would we still need #116? cc: @tschneidereit
I've updated the exports treeshaking to instead provide a missing guest export error and added a test for that. For the incoming handler, we probably do still want to support that case, as that is about the engine itself? I'm happy to just document for now that If so, let's land and release.
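A missing-export check of the kind described above could look roughly like the following. This is only a sketch under assumptions: the function name, the string-based representation of exports, and the error wording are hypothetical and do not reflect the actual ComponentizeJS implementation or its error message.

```rust
/// Check that the guest provides every export the world expects.
/// Returns an error naming the first missing export.
/// Hypothetical helper; the message text is illustrative only.
fn check_guest_exports(expected: &[&str], guest_exports: &[&str]) -> Result<(), String> {
    for name in expected {
        if !guest_exports.contains(name) {
            return Err(format!(
                "guest content is missing expected export '{}'",
                name
            ));
        }
    }
    Ok(())
}
```

Failing early here matches the point made earlier in the thread that a component can only be encoded to a world when it provides all of the expected exports.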
That makes sense to me! I really appreciate you guiding me and helping take this over the line! Thanks!
Ok will do. This is a fantastic feature to have, thanks for working on it! Hopefully we can get to interface analysis for cases like
Released in 0.10.0.
Thank you!
This is a WIP as I have not tested it much yet. I am not sure how transitive imports will be passed down, since we only pass in the imports directly present in the guest content.