Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Move supports_attn_softmax logic to build.rs #1085

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

EricLBuehler
Copy link
Owner

@sgrebnov this is the solution I mentioned in #935. What do you think?

@EricLBuehler EricLBuehler marked this pull request as draft January 23, 2025 00:26
Copy link

Code Metrics Report
  ===============================================================================
 Language            Files        Lines         Code     Comments       Blanks
===============================================================================
 C Header                2           35           28            0            7
 Dockerfile              1           41           22           10            9
 JSON                   12          105          104            0            1
 Python                 69         2926         2534           77          315
 Shell                   1           58           22           18           18
 Plain Text              3         3723            0         2413         1310
 TOML                   18          627          556            2           69
 YAML                    2           21           19            2            0
-------------------------------------------------------------------------------
 Jupyter Notebooks       4            0            0            0            0
 |- Markdown             2           77           32           31           14
 |- Python               2          205          178            1           26
 (Total)                            282          210           32           40
-------------------------------------------------------------------------------
 Markdown               46         3802            0         2891          911
 |- BASH                 6          103          100            0            3
 |- JSON                 1           12           12            0            0
 |- Python               7          121          109            0           12
 |- Rust                15          512          433            0           79
 |- TOML                 2           75           63            0           12
 (Total)                           4625          717         2891         1017
-------------------------------------------------------------------------------
 Rust                  309        99703        89365         1933         8405
 |- Markdown           149         1690           25         1540          125
 (Total)                         101393        89390         3473         8530
===============================================================================
 Total                 467       111041        92650         7346        11045
===============================================================================
  

@sgrebnov
Copy link
Contributor

@EricLBuehler - this looks great! I'm just trying to learn whether supports_attn_softmax is required to compile vs in runtime. My previous impression was that it is required in runtime as I was able to build the app with metal acceleration enabled using GH macos runner that does not support acceleration and run successfully on another device with metal acceleration supported.

@EricLBuehler
Copy link
Owner Author

EricLBuehler commented Jan 23, 2025

Hi @sgrebnov!

I'm just trying to learn whether supports_attn_softmax is required to compile vs in runtime.

The reason why we need the check is that the attn_softmax implementation for Metal uses intrinsic types only present in Metal 3.1+. So really, it only depends on the installed Metal framework version, which is why it should be fine to check this at build time (unless the Metal framework version changes during runtime?).

I don't love how this method modifies files in the project - do you perhaps have a better way to pass the information to attention.rs?

@Jeadie
Copy link
Contributor

Jeadie commented Jan 23, 2025

At least in our use case, the metal framework version changes at runtime. Why is this? We want to deliver compiled versions of our metal binary to arbitrary metal users. Therefore at compile time, the metal version is based on our CI computers. At runtime, it will be whichever version the user has locally.

@sgrebnov
Copy link
Contributor

sgrebnov commented Jan 27, 2025

@EricLBuehler - I'm learning about bfloat and it seems the app will fail if compiled with supports_attn_softmax but run on device with metal version < 310. As described by @Jeadie the scenario we would like to add support for is the following: the binaries are compiled once as part of CI/CD and then redistributed and run on different devices (different hardware/software version include Metal framework version).

As bf16 in Metal supported on all macOS 14+ devices, can we rely on OS version (in runtime) to detect where supports_attn_softmax should be enabled? Similar to metal-rs logic https://github.com/gfx-rs/metal-rs/blob/master/src/device.rs#L437?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants