-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add in-place optimization for array map #75571
Conversation
r? @cramertj (rust_highfive has picked a reviewer for you, use r? to override) |
Since it effectively has two implementations now maybe the drop test should also be changed to exercise both code paths. |
Ah yea I will add further tests once I actually check that it compiles |
e4aeb96
to
c76be7c
Compare
r? @scottmcm |
r? @Amanieu |
@JulianKnodt Sorry for forgetting this PR. You might want to check out https://www.llvm.org/docs/CommandGuide/FileCheck.html#tutorial, as we use that for codegen tests. Do you have an example of the IR emitted with and without this change? We could maybe help identify the core things that are worth checking. Looking at what it currently emits in https://rust.godbolt.org/z/ezfPxr, maybe you're trying to get rid of that 1000-item Though, looking at the LLVM signature of the API,
I'm not sure that this in-place is actually a win, because at the llvm level it needs to be Makes me wonder if this just needs NRVO to be turned on, instead of actually needing code changes... |
@scottmcm no worries, I've been busy with other things so I've not been working on this PR as hard as I should be. Thank you for the update, I didn't know exactly how to view the LLVM output(and also I'm not particularly familiar with LLVM), but I think the main component is just to remove the |
Have you verified that this in fact produces better assembly code in practice? |
Hm? It appears that it compiles differently on CI as compared to my own machine as welll.... |
30bd2ee
to
7c1a86c
Compare
Triage: Seems CI is still red here. @JulianKnodt |
Ah yea I think there was some issue I was having where locally my test outputs were passing but they didn't work on CI here, I can double check |
This adds a codegen test which once annotated can check that the optimization is applied in the case expected. I'm not sure how to annotate it, though, so I've left it unmarked.
This instead changes the code to use one union type for the constructed array, which basically rms the numbers of moves that LLVM has to do as well better mimicking C-like ptrs allowing LLVM to optimize it better. I also updated the test to basically check for NRVO optimizations
Just pulling the latest seems to have fixed it, |
Good to hear the tests are passing in CI now! Would you might uploading (gist or whatever) the |
LLVM Output
|
Thanks. Unfortunately, that confirmed what I was afraid of. Putting these tests in godbolt https://rust.godbolt.org/z/9MoTeb says that %iter1.i.i = alloca %"std::iter::Map<std::array::IntoIter<u32, 4_usize>, [closure@/home/kadmin/projects/rust/src/test/codegen/array-map.rs:14:9: 14:21]>", align 16
%tmp.i.i.i.i = alloca i128, align 8
%array.i.i = alloca i128, align 8 I worry I might have steered you wrong earlier and that my demo was just getting lucky with unrolling or something. More generally, it might just not be possible to fix this until there's a resolution to #79914 or similar -- copying the arrays into wrappers, be that |
☔ The latest upstream changes (presumably #79607) made this pull request unmergeable. Please resolve the merge conflicts. Note that reviewers usually do not review pull requests until merge conflicts are resolved! Once you resolve the conflicts, you should change the labels applied by bors to indicate that your PR is ready for review. Post this as a comment to change the labels:
|
I'm going to leave this up, but probably won't update it too much while the manually drop issue is still pending. |
Triage: I'm gonna switch this to S-blocked now. |
@JulianKnodt Now that #82098 has gone in, I think I'm going to close this since it'll need redoing to with with that anyway, and as you say it's probably not worth doing that until the ManuallyDrop issue can be resolved. You can of course reopen at any point, or make a new one if you'd like. |
This adds an in-place optimization for array map when the input and output types are the same size. It's still a little awkward because
transmute
does not work between dependently sized types.Tracking Issue: #75243
cc: @the8472