--enable-generic-simd256 causes memory error on fftw_plan_many_dft_r2c and fftw_plan_many_dft_c2 #328

Open
maxmarsc opened this issue Jun 9, 2023 · 0 comments

First of all I think this issue might be related to :

I compiled both fftw (double precision) and fftwf (single precision) 3.3.10 for x86_64, using GCC 9, with the following flags:

    --enable-avx
    --enable-avx2
    --enable-avx512
    --enable-avx-128-fma
    --enable-generic-simd128
    --enable-generic-simd256

The issue only occurs with fftw, not with fftwf.

The following code reproduces the bug:

#include "fftw3.h"

int main() {
  int fft_size       = 256;
  int channels       = 1;
  int transform_size = fft_size / 2 + 1; /* complex outputs per transform */

  /* In-place transform: 2 * transform_size doubles per channel */
  double* inplace_work_buffer = fftw_alloc_real(channels * transform_size * 2);

  int rank     = 1;                  /* we are computing 1d transforms */
  int n[]      = {fft_size};         /* 1d transforms of length fft_size */
  int howmany  = channels;           /* how many transforms to compute */
  int idist    = transform_size * 2; /* distance between inputs, in doubles */
  int odist    = transform_size;     /* distance between outputs, in fftw_complex */
  int istride  = 1;
  int ostride  = 1;
  int* inembed = nullptr;
  int* onembed = nullptr;

  /* The memory error is reported during planning, before any execute call */
  auto* plan = fftw_plan_many_dft_r2c(rank, n, howmany, inplace_work_buffer, inembed,
                         istride, idist,
                         reinterpret_cast<fftw_complex*>(inplace_work_buffer),
                         onembed, ostride, odist, FFTW_MEASURE);
  fftw_destroy_plan(plan);
  fftw_free(inplace_work_buffer);
}

Running under ASan gives the following output:

=================================================================
==1185224==ERROR: AddressSanitizer: unknown-crash on address 0x612000000430 at pc 0x5629290259c4 bp 0x7ffdc0967a50 sp 0x7ffdc0967a40
READ of size 32 at 0x612000000430 thread T0
    #0 0x5629290259c3 in LDA /foo/bar/build/source/stft/fftwf/src/fftwf/simd-support/simd-generic256.h:60
    #1 0x562929026df1 in n2fv_16 /foo/bar/build/source/stft/fftwf/src/fftwf/dft/simd/generic-simd256/../common/n2fv_16.c:284
    #2 0x56292936dcd6 in apply_extra_iter /foo/bar/build/source/stft/fftwf/src/fftwf/dft/direct.c:111
    #3 0x562927eef746 in fftw_dft_solve /foo/bar/build/source/stft/fftwf/src/fftwf/dft/solve.c:29
    #4 0x562927edeb8c in measure /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/timer.c:136
    #5 0x562927eded07 in fftw_measure_execution_time /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/timer.c:159
    #6 0x562927ed9376 in evaluate_plan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:460
    #7 0x562927ed9cd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
    #8 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
    #9 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
    #10 0x562927edd50c in fftw_mkplan_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:970
    #11 0x562927edd7bf in fftw_mkplan_f_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:986
    #12 0x562927eeb443 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/dft/indirect.c:206
    #13 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
    #14 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
    #15 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
    #16 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
    #17 0x562927edd50c in fftw_mkplan_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:970
    #18 0x562927edd7bf in fftw_mkplan_f_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:986
    #19 0x562929355373 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/dft/buffered.c:199
    #20 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
    #21 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
    #22 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
    #23 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
    #24 0x562927edd50c in fftw_mkplan_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:970
    #25 0x5629293746c1 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/rdft/ct-hc2c.c:198
    #26 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
    #27 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
    #28 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
    #29 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
    #30 0x562927edd50c in fftw_mkplan_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:970
    #31 0x5629293727b5 in mkcldw /foo/bar/build/source/stft/fftwf/src/fftwf/rdft/ct-hc2c-direct.c:334
    #32 0x56292937409c in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/rdft/ct-hc2c.c:173
    #33 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
    #34 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
    #35 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
    #36 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
    #37 0x562927ed358c in mkplan0 /foo/bar/build/source/stft/fftwf/src/fftwf/api/apiplan.c:42
    #38 0x562927ed35db in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/api/apiplan.c:56
    #39 0x562927ed39ca in fftw_mkapiplan /foo/bar/build/source/stft/fftwf/src/fftwf/api/apiplan.c:124
    #40 0x562927ed60a9 in fftw_plan_many_dft_r2c /foo/bar/build/source/stft/fftwf/src/fftwf/api/plan-many-dft-r2c.c:41
    #41 0x5629267f1666 in CATCH2_INTERNAL_TEST_4 /foo/bar/tests/fft_tests.cc:55
    #42 0x56292688a6bd in Catch::TestInvokerAsFunction::invoke() const src/catch2/internal/catch_test_case_registry_impl.cpp:149
    #43 0x56292687e866 in Catch::TestCaseHandle::invoke() const (/foo/bar/build/tests/libstft_tests+0x269866)
    #44 0x56292687d9bb in Catch::RunContext::invokeActiveTestCase() src/catch2/internal/catch_run_context.cpp:508
    #45 0x56292687d6f5 in Catch::RunContext::runCurrentTest(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) src/catch2/internal/catch_run_context.cpp:473
    #46 0x56292687bfde in Catch::RunContext::runTest(Catch::TestCaseHandle const&) src/catch2/internal/catch_run_context.cpp:238
    #47 0x562926828373 in execute src/catch2/catch_session.cpp:110
    #48 0x5629268297b3 in Catch::Session::runInternal() src/catch2/catch_session.cpp:332
    #49 0x5629268292cc in Catch::Session::run() src/catch2/catch_session.cpp:263
    #50 0x5629268211e6 in int Catch::Session::run<char>(int, char const* const*) src/catch2/../catch2/catch_session.hpp:41
    #51 0x5629268210d4 in main src/catch2/internal/catch_main.cpp:36
    #52 0x7fe9cf443082 in __libc_start_main ../csu/libc-start.c:308
    #53 0x5629267f02bd in _start (/foo/bar/build/tests/libstft_tests+0x1db2bd)

0x612000000440 is located 0 bytes to the right of 256-byte region [0x612000000340,0x612000000440)
allocated by thread T0 here:
    #0 0x7fe9cfa6b005 in __interceptor_memalign ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:169
    #1 0x562927ed67ea in fftw_kernel_malloc /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/kalloc.c:91
    #2 0x562927ed6548 in fftw_malloc_plain /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/alloc.c:28
    #3 0x5629293550b9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/dft/buffered.c:196
    #4 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
    #5 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
    #6 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
    #7 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
    #8 0x562927edd50c in fftw_mkplan_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:970
    #9 0x5629293746c1 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/rdft/ct-hc2c.c:198
    #10 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
    #11 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
    #12 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
    #13 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
    #14 0x562927edd50c in fftw_mkplan_d /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:970
    #15 0x5629293727b5 in mkcldw /foo/bar/build/source/stft/fftwf/src/fftwf/rdft/ct-hc2c-direct.c:334
    #16 0x56292937409c in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/rdft/ct-hc2c.c:173
    #17 0x562927ed96c2 in invoke_solver /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:486
    #18 0x562927ed9bd7 in search0 /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:529
    #19 0x562927eda15d in search /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:600
    #20 0x562927edafe9 in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/kernel/planner.c:711
    #21 0x562927ed358c in mkplan0 /foo/bar/build/source/stft/fftwf/src/fftwf/api/apiplan.c:42
    #22 0x562927ed35db in mkplan /foo/bar/build/source/stft/fftwf/src/fftwf/api/apiplan.c:56
    #23 0x562927ed39ca in fftw_mkapiplan /foo/bar/build/source/stft/fftwf/src/fftwf/api/apiplan.c:124
    #24 0x562927ed60a9 in fftw_plan_many_dft_r2c /foo/bar/build/source/stft/fftwf/src/fftwf/api/plan-many-dft-r2c.c:41
    #25 0x5629267f1666 in CATCH2_INTERNAL_TEST_4 /foo/bar/tests/fft_tests.cc:55
    #26 0x56292688a6bd in Catch::TestInvokerAsFunction::invoke() const src/catch2/internal/catch_test_case_registry_impl.cpp:149
    #27 0x56292687e866 in Catch::TestCaseHandle::invoke() const (/foo/bar/build/tests/libstft_tests+0x269866)
    #28 0x56292687d9bb in Catch::RunContext::invokeActiveTestCase() src/catch2/internal/catch_run_context.cpp:508
    #29 0x56292687d6f5 in Catch::RunContext::runCurrentTest(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) src/catch2/internal/catch_run_context.cpp:473

SUMMARY: AddressSanitizer: unknown-crash /foo/bar/build/source/stft/fftwf/src/fftwf/simd-support/simd-generic256.h:60 in LDA
Shadow bytes around the buggy address:
  0x0c247fff8030: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
  0x0c247fff8040: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c247fff8050: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
  0x0c247fff8060: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
  0x0c247fff8070: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c247fff8080: 00 00 00 00 00 00[00]00 fa fa fa fa fa fa fa fa
  0x0c247fff8090: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c247fff80a0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c247fff80b0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c247fff80c0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c247fff80d0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==1185224==ABORTING

And valgrind --leak-check=full reports:

==1280516== Memcheck, a memory error detector
==1280516== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1280516== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==1280516== Command: ./build/tests/libstft_tests bug\ report
==1280516== 
==1280516== Invalid read of size 8
==1280516==    at 0x21B279A: LDA (simd-generic256.h:60)
==1280516==    by 0x21B36C4: n2fv_16 (n2fv_16.c:284)
==1280516==    by 0x24920C3: apply_extra_iter (direct.c:111)
==1280516==    by 0x13B8A3E: fftw_dft_solve (solve.c:29)
==1280516==    by 0x13B13B6: measure (timer.c:136)
==1280516==    by 0x13B1468: fftw_measure_execution_time (timer.c:159)
==1280516==    by 0x13AF1DA: evaluate_plan (planner.c:460)
==1280516==    by 0x13AF4E3: search0 (planner.c:529)
==1280516==    by 0x13AF695: search (planner.c:600)
==1280516==    by 0x13AFAB3: mkplan (planner.c:711)
==1280516==    by 0x13B073E: fftw_mkplan_d (planner.c:970)
==1280516==    by 0x13B088B: fftw_mkplan_f_d (planner.c:986)
==1280516==  Address 0x4fcc900 is 0 bytes after a block of size 256 alloc'd
==1280516==    at 0x483E340: memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1280516==    by 0x13AE134: fftw_kernel_malloc (kalloc.c:91)
==1280516==    by 0x13ADFFB: fftw_malloc_plain (alloc.c:28)
==1280516==    by 0x24858DC: mkplan (buffered.c:196)
==1280516==    by 0x13AF2C2: invoke_solver (planner.c:486)
==1280516==    by 0x13AF45B: search0 (planner.c:529)
==1280516==    by 0x13AF695: search (planner.c:600)
==1280516==    by 0x13AFAB3: mkplan (planner.c:711)
==1280516==    by 0x13B073E: fftw_mkplan_d (planner.c:970)
==1280516==    by 0x2494DE4: mkplan (ct-hc2c.c:198)
==1280516==    by 0x13AF2C2: invoke_solver (planner.c:486)
==1280516==    by 0x13AF45B: search0 (planner.c:529)
==1280516== 
==1280516== Invalid read of size 8
==1280516==    at 0x21B279E: LDA (simd-generic256.h:60)
==1280516==    by 0x21B36C4: n2fv_16 (n2fv_16.c:284)
==1280516==    by 0x24920C3: apply_extra_iter (direct.c:111)
==1280516==    by 0x13B8A3E: fftw_dft_solve (solve.c:29)
==1280516==    by 0x13B13B6: measure (timer.c:136)
==1280516==    by 0x13B1468: fftw_measure_execution_time (timer.c:159)
==1280516==    by 0x13AF1DA: evaluate_plan (planner.c:460)
==1280516==    by 0x13AF4E3: search0 (planner.c:529)
==1280516==    by 0x13AF695: search (planner.c:600)
==1280516==    by 0x13AFAB3: mkplan (planner.c:711)
==1280516==    by 0x13B073E: fftw_mkplan_d (planner.c:970)
==1280516==    by 0x13B088B: fftw_mkplan_f_d (planner.c:986)
==1280516==  Address 0x4fcc908 is 8 bytes after a block of size 256 alloc'd
==1280516==    at 0x483E340: memalign (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==1280516==    by 0x13AE134: fftw_kernel_malloc (kalloc.c:91)
==1280516==    by 0x13ADFFB: fftw_malloc_plain (alloc.c:28)
==1280516==    by 0x24858DC: mkplan (buffered.c:196)
==1280516==    by 0x13AF2C2: invoke_solver (planner.c:486)
==1280516==    by 0x13AF45B: search0 (planner.c:529)
==1280516==    by 0x13AF695: search (planner.c:600)
==1280516==    by 0x13AFAB3: mkplan (planner.c:711)
==1280516==    by 0x13B073E: fftw_mkplan_d (planner.c:970)
==1280516==    by 0x2494DE4: mkplan (ct-hc2c.c:198)
==1280516==    by 0x13AF2C2: invoke_solver (planner.c:486)
==1280516==    by 0x13AF45B: search0 (planner.c:529)
==1280516== 
==1280516== 
==1280516== HEAP SUMMARY:
==1280516==     in use at exit: 226,376 bytes in 2,457 blocks
==1280516==   total heap usage: 58,871 allocs, 56,414 frees, 34,196,978 bytes allocated
==1280516== 
==1280516== LEAK SUMMARY:
==1280516==    definitely lost: 0 bytes in 0 blocks
==1280516==    indirectly lost: 0 bytes in 0 blocks
==1280516==      possibly lost: 0 bytes in 0 blocks
==1280516==    still reachable: 226,376 bytes in 2,457 blocks
==1280516==         suppressed: 0 bytes in 0 blocks
==1280516== Reachable blocks (those to which a pointer was found) are not shown.
==1280516== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==1280516== 
==1280516== For lists of detected and suppressed errors, rerun with: -s
==1280516== ERROR SUMMARY: 16 errors from 2 contexts (suppressed: 0 from 0)

Note: the stack traces show that I'm using Catch2 rather than a plain main function, but running the same code from a main function reproduces the issue as well.

Some more details I gathered:

  • The same happens for the inverse transform, fftw_plan_many_dft_c2r, with the same setup (only the odist and idist values are swapped)
  • Rebuilding without the --enable-generic-simd256 flag makes the issue go away
  • With fft_size values of 32, 64, or 128, the bug does not appear