aarch64: Support FEAT_LSFE #201

taiki-e · 2025-01-12T14:18:44Z

Armv9.6 added atomic float instructions for binary{16,32,64} and bfloat16 as FEAT_LSFE (Large System Float Extension).

This PR optimizes AArch64 {16,32,64}-bit atomic float add/sub/max/min when FEAT_LSFE is enabled.
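For context, here is a minimal sketch of the kind of compare-and-swap loop that an atomic float add presumably falls back to when no native instruction is available. This is an illustration only, not the code in this PR: `fetch_add_f32` is a hypothetical helper, and it assumes the float is stored as its bit pattern in a `std` `AtomicU32`.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Hypothetical helper (not from this PR): atomically adds `val` to the f32
/// whose bit pattern is stored in `a`, and returns the previous value.
pub fn fetch_add_f32(a: &AtomicU32, val: f32, order: Ordering) -> f32 {
    // Start from a relaxed snapshot of the current bit pattern.
    let mut prev = a.load(Ordering::Relaxed);
    loop {
        // Compute the new value in floating point, then convert back to bits.
        let next = (f32::from_bits(prev) + val).to_bits();
        // Publish the result only if nobody changed the value in between.
        match a.compare_exchange_weak(prev, next, order, Ordering::Relaxed) {
            Ok(_) => return f32::from_bits(prev),
            // Lost the race (or failed spuriously): retry with the observed value.
            Err(actual) => prev = actual,
        }
    }
}
```

A single hardware instruction performing the same read-modify-write avoids the retry loop, which is the motivation for using FEAT_LSFE where it is enabled.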

LLVM's assembler supports the FEAT_LSFE instructions only from LLVM 20 (llvm/llvm-project@67ff5ba), so on LLVM 19 or older the instructions are emitted with the .inst directive.

Run-time detection is also implemented, but for now it is used only in tests. AFAIK no CPU implements this feature yet, so the optimized path is currently selected only when the feature is enabled at compile time.

taiki-e added the O-aarch64 label (Target: Armv8-A, Armv8-R, or later processors in AArch64 mode) Jan 12, 2025

taiki-e mentioned this pull request Jan 12, 2025

Optimize atomic floats on nvptx #34

Open

taiki-e force-pushed the aarch64-lsfe branch from f666498 to 8ef05df Compare January 12, 2025 14:21

taiki-e added the A-float label (Area: related to atomic float) Jan 12, 2025

aarch64: Support FEAT_LSFE

b942dd3

taiki-e force-pushed the aarch64-lsfe branch from 8ef05df to b942dd3 Compare January 12, 2025 16:48

taiki-e force-pushed the main branch 5 times, most recently from 53c8409 to 378f6cd Compare January 15, 2025 15:07
