Fused block jacobi
Adds a more performant path for the block Jacobi case inside BTDS
(GPU only, BlockCrs only). It fuses the residual and the solve
into one kernel and doesn't convert vectors to SIMD-packed
format. It also inverts diagonal blocks fully in shared memory to speed up the numeric phase.

Signed-off-by: Brian Kelley <[email protected]>
brian-kelley committed Feb 26, 2025
1 parent 8eca3f9 commit 29fe448
Showing 4 changed files with 1,132 additions and 59 deletions.
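
The diff itself is not reproduced here. As a rough illustration of what the commit message means by fusing the residual and the solve, the following is a minimal serial sketch in plain Python, not the commit's GPU kernel. It assumes the block Jacobi update has the form x_i <- x_i + D_i^{-1} (b_i - sum_j A_ij x_j) over a block-CRS matrix. The names `invert_block`, `matvec`, and `fused_block_jacobi_sweep`, and the `rowptr`/`colind`/`blocks` layout, are hypothetical and chosen only for this example.

```python
def invert_block(a):
    """Invert a small dense block by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    # Build the augmented matrix [A | I].
    m = [[float(v) for v in a[r]] + [1.0 if r == c else 0.0 for c in range(n)]
         for r in range(n)]
    for col in range(n):
        # Pick the row with the largest pivot candidate in this column.
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[piv][col] == 0.0:
            raise ValueError("singular diagonal block")
        m[col], m[piv] = m[piv], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    # The right half of the augmented matrix now holds the inverse.
    return [row[n:] for row in m]


def matvec(block, v):
    """Dense block times vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in block]


def fused_block_jacobi_sweep(rowptr, colind, blocks, dinv, x, b, bs):
    """One block Jacobi sweep over a block-CRS matrix (hypothetical layout).

    For each block row, the residual is accumulated in a small local buffer
    and the inverted diagonal block is applied immediately, so no full-length
    residual vector is ever stored. On a GPU, each block row would map to one
    team or thread block; here the loop is serial.
    """
    x_new = list(x)
    for i in range(len(rowptr) - 1):
        r = b[i * bs:(i + 1) * bs]  # local copy of b_i
        for k in range(rowptr[i], rowptr[i + 1]):
            j = colind[k]
            ax = matvec(blocks[k], x[j * bs:(j + 1) * bs])
            r = [ri - ai for ri, ai in zip(r, ax)]  # r = b_i - sum_j A_ij x_j
        dx = matvec(dinv[i], r)  # the "solve" step: D_i^{-1} r
        for t in range(bs):
            x_new[i * bs + t] = x[i * bs + t] + dx[t]
    return x_new
```

Here `dinv[i]` would be computed once up front as `invert_block` of the diagonal block of row `i`, which corresponds loosely to the numeric phase mentioned in the message. For a matrix with only diagonal blocks, a single sweep from a zero initial guess already returns the exact solution, which makes a convenient sanity check.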