Download them and then compute their product like so:
using MatrixMarket, NPZ, SparseArrays, CUDA, CUDA.CUSPARSE
# load from saved files
A_ub = SparseMatrixCSC{Float32}(MatrixMarket.mmread("A_ub.txt"))
c = Array{Float32}(npzread("c.txt"))
# transfer to GPU
A_ub = CuSparseMatrixCSR{Float32}(A_ub)
c = CuArray{Float32}(c)
# now do this a few times and observe that the numbers on the screen change
A_ub * c
For example, on my first run I get
julia> A_ub * c
1002000-element CuArray{Float32, 1, CUDA.DeviceMemory}:
 495.9215
 511.35
 505.44818
 483.6953
 488.43948
 493.39288
 491.31964
 494.82507
 495.3517
 504.84515
 484.67084
 498.8777
 497.72232
 493.01538
 478.84924
 488.24252
 502.58365
 501.54272
 497.06326
 509.82855
 511.73764
 512.2767
 ⋮
and on my second run I get
julia> A_ub * c
1002000-element CuArray{Float32, 1, CUDA.DeviceMemory}:
 495.92154
 511.35
 505.44818
 483.6953
 488.43948
 493.39288
 491.31964
 494.82507
 495.3517
 504.84515
 484.67084
 498.8777
 497.72232
 493.01538
 478.84924
 488.24252
 502.58365
 501.54272
 497.0633
 509.82855
 511.73764
 512.2767
 ⋮
with the first and fourth-to-last displayed numbers differing from the first run.
I know that floating point arithmetic is non-associative and that parallelism of e.g. large sums can cause operations to be grouped differently across multiple evaluations. Is that what's going on here?
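One way to gauge whether the discrepancy is rounding-level is to measure it in units in the last place (ulps). The following is a minimal CPU-only sketch, not part of the original report: it copies the two differing entries from the printed outputs above into hypothetical vectors `run1` and `run2` and compares them using Base Julia's `eps` and `nextfloat`.

```julia
# The two entries that differ between runs, copied from the printed output.
# `run1` and `run2` are illustrative names, not variables from the report.
run1 = Float32[495.9215, 497.06326]
run2 = Float32[495.92154, 497.0633]

# Spacing between adjacent Float32 values at this magnitude (one ulp).
ulp = eps.(run1)

# Difference between the two runs, measured in ulps.
diff_in_ulps = (run2 .- run1) ./ ulp
println(diff_in_ulps)

# Check whether each second-run value is the next representable Float32
# after the corresponding first-run value.
println(nextfloat.(run1) == run2)
```

For these values, the difference should come out as a single ulp per entry, which is the size of discrepancy one would expect from the same sum being accumulated in a different order.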
Expected behavior
Reproducible computation.
Version info
Details on Julia:
Julia Version 1.11.1
Commit 8f5b7ca12ad (2024-10-16 10:53 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 64 × AMD Ryzen Threadripper PRO 5975WX 32-Cores
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, znver3)
Threads: 1 default, 0 interactive, 1 GC (on 64 virtual cores)
Environment:
I know that floating point arithmetic is non-associative and that parallelism of e.g. large sums can cause operations to be grouped differently across multiple evaluations. Is that what's going on here?
Yes, it's exactly what is happening here.
For CUSPARSE, we don't have any option to make the routines deterministic.
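To illustrate the effect without a GPU, here is a small sketch in plain Julia. The first part shows that regrouping a Float32 sum changes the rounded result; the second part shows one possible way to compare two runs with a tolerance instead of exact equality. The values in `y1` and `y2` are taken from the printed outputs in the report, and the tolerance `4 * eps(Float32)` is an assumption chosen for illustration, not a recommendation from CUDA.jl.

```julia
# Float32 addition is not associative: the same three values, grouped
# differently, round to different results.
x, y, z = 1.0f8, -1.0f8, 1.0f0
left  = (x + y) + z   # x + y is exactly 0, so the result is z
right = x + (y + z)   # y + z rounds back to y, so the result is 0
println(left, " vs ", right)

# Entries from the two runs in the report (first, second, fourth-to-last).
y1 = Float32[495.9215, 511.35, 497.06326]
y2 = Float32[495.92154, 511.35, 497.0633]

# Exact comparison fails, but a tolerance-based comparison succeeds.
println(y1 == y2)
println(isapprox(y1, y2; rtol = 4 * eps(Float32)))
```

When results from a parallel reduction have to be checked against each other, a comparison like `isapprox` with a tolerance scaled to the element type is usually more robust than `==`; the right tolerance depends on the size and conditioning of the sums involved.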
Describe the bug
I'm encountering what appears to be nondeterministic rounding when multiplying by a sparse matrix.
To reproduce
I tried to upload a Matrix Market file and an npy file to GitHub, but they were too large, so I uploaded both files to my website as .txt:
A_ub file
c file
Details on CUDA: