Romain Dolbeau edited this page Feb 3, 2025 · 18 revisions

What's going on here

Update: the arm-sve-clean branch was merged; the code should be available from FFTW3 3.3.11 onward. On the arm-sve-clean branch, masking is always used. --enable-sve creates codelets for 128-, 256-, 512-, 1024- and 2048-bit SIMD. A codelet is only used if the hardware vector width is equal to or larger than the codelet width. Because the code needs the Arm C Language Extensions (ACLE) for SVE, it requires Arm HPC Compiler version 19.3 or newer (earlier versions have a minor bug triggered by this code), or GCC 10 or newer.
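The width rule above can be sketched in plain C. This is only an illustration and not code taken from FFTW3: the helper names codelet_usable and count_usable_codelets are hypothetical, and the hardware width is passed in as a plain number. On real SVE hardware it could presumably be obtained from the ACLE intrinsic svcntb(), which returns the vector length in bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Codelet widths (in bits) that --enable-sve generates, per the text above. */
static const unsigned sve_codelet_bits[] = {128, 256, 512, 1024, 2048};
#define N_SVE_CODELET_WIDTHS \
    (sizeof sve_codelet_bits / sizeof sve_codelet_bits[0])

/* Hypothetical helper (not an FFTW3 function): a codelet is eligible when
 * the hardware vector is at least as wide as the codelet. */
static int codelet_usable(unsigned hw_bits, unsigned codelet_bits)
{
    return hw_bits >= codelet_bits;
}

/* Hypothetical helper: how many of the generated codelet widths are
 * eligible on hardware with hw_bits-wide vectors. */
static size_t count_usable_codelets(unsigned hw_bits)
{
    /* SVE vector lengths are multiples of 128 bits. */
    assert(hw_bits >= 128 && hw_bits % 128 == 0);

    size_t n = 0;
    for (size_t i = 0; i < N_SVE_CODELET_WIDTHS; i++)
        if (codelet_usable(hw_bits, sve_codelet_bits[i]))
            n++;
    return n;
}
```

With this sketch, a 512-bit machine such as the A64FX would be eligible for the 128-, 256- and 512-bit codelets, while a 128-bit implementation would only be eligible for the 128-bit ones.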

The riscv-v-clean branch adds support for version 1.0 of the 'V' extension through compiler built-in functions, using the standard intrinsics.

riscv-v-clean uses a generated vtw.h file in simd-support/; it needs to migrate to the macro-based solution used for SVE upstream.

SVE configuration

Currently configure does not add the SVE option to the compiler flags automatically, so when configuring FFTW3 you need to:

  • enable SVE (and probably NEON) explicitly
  • enable a counter for performance evaluation: cntvct (recommended) is always available but sometimes of dubious accuracy [1], while pmccntr is cycle-accurate but privileged by default; see https://github.com/rdolbeau/enable_arm_pmu
  • enable SVE in the compiler flags

For instance:

./configure --enable-neon --enable-sve --enable-fma --enable-armv8-cntvct-el0  CFLAGS="-O3 -march=armv8.2-a+sve" CXXFLAGS="-O3 -march=armv8.2-a+sve" FFLAGS="-O3 -march=armv8.2-a+sve"
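Since SVE has to be enabled in the compiler flags by hand, it can be useful to check that a translation unit was really compiled with it. A minimal sketch, relying on the ACLE feature macro __ARM_FEATURE_SVE, which compilers define when SVE code generation is enabled (for instance with -march=armv8.2-a+sve). The helper names built_with_sve and require_sve_build are hypothetical and not part of FFTW3.

```c
#include <assert.h>

/* Hypothetical helper: returns 1 if this file was compiled with SVE
 * enabled, 0 otherwise. The ACLE feature macro __ARM_FEATURE_SVE is
 * set by the compiler when SVE code generation is on. */
static int built_with_sve(void)
{
#if defined(__ARM_FEATURE_SVE)
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: abort early if the flags did not enable SVE. */
static void require_sve_build(void)
{
    assert(built_with_sve() && "rebuild with SVE enabled in CFLAGS");
}
```

Compiling this with the same CFLAGS as the configure line and calling built_with_sve() is a quick way to see whether the flags took effect. It says nothing about whether the machine running the binary supports SVE.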

[1] The counter increments at an implementation-specific rate; Linux reports it like this: arch_timer: cp15 timer(s) running at 54.00MHz (phys). This is on a Raspberry Pi 4; the Fujitsu A64FX counter runs at 100 MHz, the Graviton 3 counter at 1050 MHz, and the Ampere Altra counter at 25 MHz.
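To show why the counter rate matters for accuracy, here is a small sketch that converts cntvct ticks to time for a given frequency. The helper names ticks_to_seconds and timer_resolution_ns are hypothetical and not part of FFTW3. The frequency is passed in as a number, though on AArch64 it can normally be read from the cntfrq_el0 register.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: elapsed time in seconds for a number of counter
 * ticks, given the counter frequency in Hz. */
static double ticks_to_seconds(uint64_t ticks, uint64_t freq_hz)
{
    assert(freq_hz != 0);
    return (double)ticks / (double)freq_hz;
}

/* Hypothetical helper: duration of a single tick in nanoseconds,
 * i.e. the best resolution the counter can offer. */
static double timer_resolution_ns(uint64_t freq_hz)
{
    assert(freq_hz != 0);
    return 1e9 / (double)freq_hz;
}
```

With the rates quoted above, one tick lasts 10 ns on the A64FX (100 MHz) and 40 ns on the Ampere Altra (25 MHz), which is many CPU cycles per tick. That is one way to read the "dubious accuracy" caveat when timing short transforms.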

Acknowledgements

This work has partly been done as part of the European Processor Initiative project.

The European Processor Initiative (EPI) (FPA: 800928) has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement EPI-SGA1: 826647.
