Expose experimental LLVM features for automatic differentiation and GPU offloading #109

Open
3 of 9 tasks
nikomatsakis opened this issue Jul 22, 2024 · 8 comments

Comments

@nikomatsakis
Contributor

nikomatsakis commented Jul 22, 2024

Metadata
Owner(s) @ZuseZ4
Team(s) compiler, lang
Goal document 2024h2/Rust-for-SciComp

Summary

Expose experimental LLVM features for automatic differentiation and GPU offloading.

Tasks and status

  • Development (@ZuseZ4)
    • upstream the autodiff frontend
    • upstream the autodiff backend
    • upstream the autodiff middle-end
    • upstream the offload frontend
    • enable the offload backend
    • upstream the offload middle-end
  • Lang-team experiment (lang Team)
  • Standard reviews (compiler Team)
nikomatsakis added this to the 2024h2 milestone Jul 22, 2024
rust-lang locked and limited conversation to collaborators Jul 25, 2024
@nikomatsakis
Contributor Author

This issue is intended for status updates only.

For general questions or comments, please contact the owner(s) directly.

@ZuseZ4
Contributor

ZuseZ4 commented Aug 24, 2024

During the first month, I focused on automatic differentiation. I cleaned up my rustc fork and made my first two upstreaming PRs, for the frontend and the backend. Once they are merged, I will continue by posting PRs for the remaining middle-end. While waiting for reviews, I have been improving the docs a bit, mainly the pages about debugging Enzyme crashes. I am especially proud that, thanks to those docs, we recently got our first Enzyme core issue with a full LLVM-IR reproducer from a Rust dev, even though the developer reporting the issue had no previous compiler/LLVM experience. Such detailed issues make fixing bugs in Enzyme core much easier.

On the GPU side, I mainly have to thank nikic, who reliably updates the LLVM backend of rustc every few weeks or months. Thanks to his latest update, rustc now supports a sufficiently new LLVM, which ships most of the GPU/offloading work that I want to expose on the Rust side. Once my first two autodiff patches have settled, I'll look a bit more into setting up documentation for the GPU feature.

@ZuseZ4
Contributor

ZuseZ4 commented Sep 12, 2024

During the last three weeks, my first autodiff PR for the backend, which includes the Enzyme submodule and 13 additional files, got merged! I also got a ton of feedback from reviewers, especially on my frontend PR (thanks to jieyouxu). Now that the backend is merged, I put up my third PR, covering the changes I made to rustc_codegen_llvm. I am currently at RustConf, so I won't be able to address much of the feedback this week, but I am happy to talk to everyone else who is attending, and I will try to get both PRs ready to merge next week.
Once the two open PRs are merged, my changes to ~55 of 85 files will be upstream, so we're making good progress.

On the GPU side there are again not many updates, due to my current autodiff focus. However, thanks to another LLVM submodule update, our development in rustc can now use some nicer APIs that recently got merged into LLVM.

@ZuseZ4
Contributor

ZuseZ4 commented Sep 16, 2024

As another short update, my talk "When unsafe code is slow - Automatic Differentiation in Rust" got accepted as a tech talk for the LLVM dev meeting. There I'll present a lot of benchmarks and some analysis comparing Rust-Enzyme with the C++ frontend of Enzyme, and show one application which we had to port from Python/JAX to Rust/Enzyme.
The full program of the dev meeting is available here.
To prepare for the talk, I spent some time trying to fix the benchmark infrastructure in Enzyme core, to make sure everyone can reproduce our benchmarks.

@ZuseZ4
Contributor

ZuseZ4 commented Sep 30, 2024

Thanks to some support from the bootstrap team, dist builds with autodiff support enabled now work.
That allowed us to add Rust to our autodiff fork of the compiler explorer: https://enzyme.mit.edu/explorer/
Unfortunately, we still have some dist issues about finding std in the compiler explorer build, so help here would be appreciated.
Other than that, this morning my PR to add Enzyme/autodiff support to the test infra got merged: rust-lang/rust#131044
This should allow us to add the larger frontend PR to the merge queue later today: rust-lang/rust#129458

@ZuseZ4
Contributor

ZuseZ4 commented Oct 22, 2024

I've been travelling a lot for the last two weeks, but hope to be able to get back to work next Monday. Since the last update we got:

  1. The Autodiff frontend got merged! This included over 2k LoC and 30 files, so the remaining diff is now much smaller.
  2. The Autodiff middle-end, the last missing AD piece, is probably getting a redesign. Right now we use Enzyme as a library, which means that we must write FFI wrappers around Enzyme's C/C++ functions and have to differentiate functions one by one. If we switch to an LLVM pass-based approach instead, we can drop a lot of glue code (simplifying the review process) and get some features for free that the pass already handles for us (e.g. differentiating higher-order derivatives in the right order). Julia also just moved from the library to the pass-based approach. C/C++ always used the pass-based approach, which had a few limitations in the past that recently got fixed. Finally, a pass-based approach improves reproducibility, since all information will now be in the LLVM-IR. In summary, this seems like a good moment to move Rust over as well.
  3. I opened a tracking issue for the GPU offload feature and made the first PR to enable LLVM's offload feature.
  4. I started working with some Enzyme and Bootstrap contributors to get a compiler explorer instance with Rust-AD to work.
  5. I am giving one tech talk and two workshop talks at the LLVM Dev Conference. I will share the slides (and videos, if possible) afterwards. The three talks are about ML in Rust, GPU programming in Rust, and the performance benefits of safe over unsafe code.
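
To make the ordering point in item 2 concrete: a second derivative is the derivative of a function that is itself a derivative, so the inner derivative has to be generated before the outer one can be taken. The sketch below shows that nesting with forward-mode dual numbers in plain Rust. It is only an illustration of the concept; it is not how Enzyme works (Enzyme differentiates LLVM-IR), and the names `Dual`, `f`, `df`, and `d2f` are hypothetical.

```rust
use std::ops::{Add, Mul};

/// A dual number `val + eps * d`; `d` carries the derivative.
/// (Hypothetical helper type, not part of rustc or Enzyme.)
#[derive(Clone, Copy, Debug, PartialEq)]
struct Dual<T> {
    val: T,
    d: T,
}

impl<T: Add<Output = T>> Add for Dual<T> {
    type Output = Dual<T>;
    fn add(self, rhs: Self) -> Self {
        // Sum rule: (a + b)' = a' + b'
        Dual { val: self.val + rhs.val, d: self.d + rhs.d }
    }
}

impl<T: Copy + Add<Output = T> + Mul<Output = T>> Mul for Dual<T> {
    type Output = Dual<T>;
    fn mul(self, rhs: Self) -> Self {
        // Product rule: (a * b)' = a' * b + a * b'
        Dual { val: self.val * rhs.val, d: self.d * rhs.val + self.val * rhs.d }
    }
}

/// f(x) = x^3 + x, generic so it runs on f64, Dual<f64>, or Dual<Dual<f64>>.
fn f<T: Copy + Add<Output = T> + Mul<Output = T>>(x: T) -> T {
    x * x * x + x
}

/// First derivative: seed the derivative slot with 1 and read it back.
fn df(x: f64) -> f64 {
    f(Dual { val: x, d: 1.0 }).d
}

/// Second derivative: the inner dual number tracks f', the outer one
/// differentiates that inner computation again.
fn d2f(x: f64) -> f64 {
    let inner = Dual { val: x, d: 1.0 };
    let outer = Dual { val: inner, d: Dual { val: 1.0, d: 0.0 } };
    f(outer).d.d
}
```

For f(x) = x^3 + x, calling `df(2.0)` evaluates 3x^2 + 1 and `d2f(2.0)` evaluates 6x at x = 2.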

Help Wanted:
I would appreciate it if someone could look into fixing our Rust dist build used in the Enzyme compiler explorer. I have spent quite a few hours trying different configurations, but have been unable to get rid of the error

error[E0463]: can't find crate for `std`

Any help would be appreciated, I can share more information if someone has time to investigate further.

@ZuseZ4
Contributor

ZuseZ4 commented Nov 27, 2024

  1. The redesign of our autodiff middle/backend, which I described in the last update, has been implemented. This reduced the code remaining to be upstreamed from 2.5k to 1.1k LoC. I split the code into two PRs ("Autodiff Upstreaming - rustc_codegen_ssa, rustc_middle", rust#133429, and "Autodiff Upstreaming - rustc_codegen_llvm changes", rust#130060). Both are now small enough to be reviewed and have received their first round of feedback, so they will hopefully land at the beginning of December. Afterwards, everything needed to run autodiff will be available on nightly (at least as an MVP), so we can discuss building and shipping it by default.

  2. The talks which I gave at LLVM Dev caused some interesting follow-up discussions. Most companies still use Rust "only" for classical SWE, but given that it's getting more and more common I also see more interest outside of Academia in using it for (scientific) computing, ML, HPC, etc, which I find exciting. I also got some offers from people in industry to help with the GPU work.

  3. The preprint of the first paper making use of std::autodiff is available on arXiv! https://arxiv.org/abs/2411.17011v1
    The code is also available here: https://github.com/ChemAI-Lab/molpipx/. It includes both Python/JAX and Rust implementations, because JAX JIT compilation times are unbearably slow here. In certain configurations it takes more than a day to JIT, but only 30 minutes to compile in Rust.

  4. Once autodiff is upstreamed (especially including some small follow-up PRs which are needed to achieve the best performance), I will also publish some very promising runtime results that we have on a larger set of benchmarks.

  5. Last month I asked for help with our compiler explorer, and I'm happy that we have indeed gotten the needed support since then, thank you! (See "fix rustc installation", EnzymeAD/enzyme-explorer#15.) Our compiler explorer for Rust with std::autodiff support is now available under https://enzyme.mit.edu/explorer/ (just select Rust).

@ZuseZ4
Contributor

ZuseZ4 commented Jan 3, 2025

Happy New Year everyone! After a few more rounds of feedback, the next autodiff PR recently got merged: rust-lang/rust#130060
With that, I only have one last PR open to have a fully working autodiff MVP upstream. A few features had to be removed during upstreaming to simplify the reviewing process, but they should be easier to bring back as single PRs.

Beginning next week, I will also work on an MVP for the batching feature of LLVM/Enzyme, which enables some AoS and SoA vectorization. It mostly re-uses the existing autodiff infrastructure, so I expect the PRs for it to be much smaller.
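
For readers unfamiliar with the two layouts mentioned above, here is a minimal sketch of array-of-structures (AoS) versus structure-of-arrays (SoA) data in plain Rust. The types and functions (`PointAos`, `PointsSoa`, `aos_to_soa`, `norms_sq_aos`, `norms_sq_soa`) are hypothetical and only illustrate the layouts; they say nothing about how Enzyme's batching will be exposed in rustc.

```rust
/// Array of Structures: each element keeps its fields next to each other.
#[derive(Clone, Copy)]
struct PointAos {
    x: f64,
    y: f64,
}

/// Structure of Arrays: each field is stored in its own contiguous buffer,
/// which is often friendlier to SIMD vectorization.
struct PointsSoa {
    x: Vec<f64>,
    y: Vec<f64>,
}

/// Convert an AoS slice into the SoA layout.
fn aos_to_soa(points: &[PointAos]) -> PointsSoa {
    PointsSoa {
        x: points.iter().map(|p| p.x).collect(),
        y: points.iter().map(|p| p.y).collect(),
    }
}

/// Squared norm of every point, reading the AoS layout.
fn norms_sq_aos(points: &[PointAos]) -> Vec<f64> {
    points.iter().map(|p| p.x * p.x + p.y * p.y).collect()
}

/// The same computation, reading the SoA layout field by field.
fn norms_sq_soa(points: &PointsSoa) -> Vec<f64> {
    points
        .x
        .iter()
        .zip(&points.y)
        .map(|(x, y)| x * x + y * y)
        .collect()
}
```

Both functions compute the same values; only the memory layout of the batch differs.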

On the GPU side, there has been a recent push by another developer to add a new AMD GPU target to the Rust compiler. This is something that I would have needed for the LLVM offload project anyway, so I'm very happy to see movement here: rust-lang/compiler-team#823


2 participants