Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: use DifferentiationInterface for sparse AD #468

Merged
merged 17 commits into from
Oct 3, 2024
Merged

Conversation

avik-pal
Copy link
Member

@avik-pal avik-pal commented Oct 1, 2024

The only part where we don't use DI is structured Jacobians. That still uses SparseDiffTools

TODO:

@avik-pal avik-pal requested a review from gdalle October 1, 2024 21:22
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Copy link
Collaborator

@gdalle gdalle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Congrats on the adaptation and thank you for trusting me with this!
I've never used LinearSolve before so take my comments with a grain of salt, but I find it needlessly complicated to mix the SparseDiffTools legacy syntax with the new sparse API of ADTypes. Do you think it is a problem?

docs/Project.toml Show resolved Hide resolved
docs/src/basics/sparsity_detection.md Show resolved Hide resolved
docs/src/basics/sparsity_detection.md Outdated Show resolved Hide resolved
docs/src/basics/sparsity_detection.md Show resolved Hide resolved
docs/src/basics/sparsity_detection.md Outdated Show resolved Hide resolved
docs/src/basics/sparsity_detection.md Outdated Show resolved Hide resolved
src/internal/jacobian.jl Outdated Show resolved Hide resolved
src/internal/jacobian.jl Show resolved Hide resolved
src/internal/jacobian.jl Outdated Show resolved Hide resolved
src/internal/jacobian.jl Outdated Show resolved Hide resolved
@avik-pal
Copy link
Member Author

avik-pal commented Oct 1, 2024

but I find it needlessly complicated to mix the SparseDiffTools legacy syntax with the new sparse API of ADTypes.

This preserves support for structured matrices. Once the support for that lands in SparseMatrixColorings we can just delete that check and everything will just work.

@avik-pal
Copy link
Member Author

avik-pal commented Oct 1, 2024

To clarify the rationale behind marking AutoSparse as deprecated:

Previously we had sparse backends (simple wrapper over dense backend) and we had sparsity specification inside the problem (sparsity, jac_prototype, colorvec). But now the AutoSparse backend holds the sparsity detection + coloring information. So this creates chances for ambiguity, say user says sparsity = Symbolics..Detector() and AutoSparse(..., Tracer...Detector()) what should take precedence? We could choose to make the autodiff via the solver have higher precedence. But the other way is also perfectly valid. This mostly causes confusion without a clear benefit (at least to me)

Now what does the new-API look like? Users always provide a dense_ad to the solver. If the function has sparsity information in the form of a detector/colorvec/prototype (basically any hint that the jacobian is sparse) we construct a AutoSparse backend using that information. From DI's perspective it always sees the final autodiff that we constructed and not the one user passed in.

@gdalle
Copy link
Collaborator

gdalle commented Oct 2, 2024

Right, so essentially you reconstruct an AutoSparse from

  • the backend provided to the solver
  • the sparsity detector / coloring provided to the function

That seems reasonable for now, but it might lead to a breaking change when we swing back to full AutoSparse support?

@gdalle
Copy link
Collaborator

gdalle commented Oct 2, 2024

This preserves support for structured matrices. Once the support for that lands in SparseMatrixColorings we can just delete that check and everything will just work.

Just to clarify, with structured matrices, there are two things we can optimize:

  • the coloring, for which an optimal solution is often known
  • the decompression, because storage is e.g. bandwise instead of columnwise

At the moment, if I understand correctly:

  • the coloring is in ArrayInterface.jl, reused by SparseDiffTools.jl?
  • the decompression is in these extensions of FiniteDiff.jl, reused by SparseDiffTools.jl

I'll try to come up with a prototype putting everything in SparseMatrixColorings.jl

@gdalle
Copy link
Collaborator

gdalle commented Oct 2, 2024

On this branch you have optimized coloring and decompression for Diagonal, Bidiagonal, Tridiagonal and BandedMatrix: gdalle/SparseMatrixColorings.jl#132. Wanna try it out?

@avik-pal
Copy link
Member Author

avik-pal commented Oct 2, 2024

On this branch you have optimized coloring and decompression for Diagonal, Bidiagonal, Tridiagonal and BandedMatrix: gdalle/SparseMatrixColorings.jl#132. Wanna try it out?

I will stack a PR on top of this and try it out

@avik-pal
Copy link
Member Author

avik-pal commented Oct 2, 2024

That seems reasonable for now, but it might lead to a breaking change when we swing back to full AutoSparse support?

What do you mean by full AutoSparse support?

I am generally against having 2 APIs to accomplish the exact same thing unless each API provides additional disjoint benefits. If you are referring to the coloring_algorithm, we can always rename colorvec to coloring and construct AutoSparse based on the type of coloring.

@ChrisRackauckas what do you think about this new API proposal for v4?

@gdalle
Copy link
Collaborator

gdalle commented Oct 2, 2024

What do you mean by full AutoSparse support?

I meant that for users who pass an AutoSparse backend to NonLinearSolve, it might be surprising to see that the choices of coloring_algorithm / sparsity_detector inside the backend are discarded, in favor of the choices colorvec / sparsity given in the function itself?

If you want to deprecate this way of handling sparsity, I would almost prefer throwing an error if someone gives you an AutoSparse.

@avik-pal
Copy link
Member Author

avik-pal commented Oct 2, 2024

If you want to deprecate this way of handling sparsity, I would almost prefer throwing an error if someone gives you an AutoSparse.

Yes I agree. The function currently throws a depwarn, that will become an error in v4

@gdalle
Copy link
Collaborator

gdalle commented Oct 2, 2024

I added support for BlockBandedMatrix and BandedBlockBandedMatrix to the same PR. Now we have the exact same colorings as ArrayInterface, although decompression could use a little more work (but it's not terrible).

@gdalle
Copy link
Collaborator

gdalle commented Oct 2, 2024

What else would you need to drop SparseDiffTools completely? In terms of functionality it should be all there, performance may still lag a bit behind but honestly I'm not even sure.

@avik-pal
Copy link
Member Author

avik-pal commented Oct 2, 2024

see the other PR. tests pass locally for me, so we can go ahead and remove sparsedifftools. I will make some additional pushes to remove sparse diff tools from the tests and docs

avik-pal and others added 3 commits October 2, 2024 12:55
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@avik-pal
Copy link
Member Author

avik-pal commented Oct 2, 2024

Added a big table to clarify the selection mechanism https://github.com/SciML/NonlinearSolve.jl/blob/ap/sparse_di/docs/src/basics/sparsity_detection.md

@oscardssmith
Copy link
Contributor

Is there anything we should be doing to pass down the sparsity info that we get down to LinearSolve? It seems like knowing that should be able to make the solve faster.

@avik-pal
Copy link
Member Author

avik-pal commented Oct 2, 2024

We are giving LinearSolve the exact matrix it needs to solve

@avik-pal
Copy link
Member Author

avik-pal commented Oct 2, 2024

We assume for SparseMatrixCSC jacobians the sparsity struture cannot arbitrarily change over iterations

@avik-pal avik-pal linked an issue Oct 2, 2024 that may be closed by this pull request
@gdalle
Copy link
Collaborator

gdalle commented Oct 2, 2024

Yeah that goes without saying for me but if you prepare DI's Jacobian with a given sparsity pattern, you'll get incorrect outputs if the sparsity changes between points

avik-pal and others added 2 commits October 2, 2024 13:16
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@gdalle
Copy link
Collaborator

gdalle commented Oct 2, 2024

Fair warning: you're probably going to see an epic slowdown when computing sparse Jacobians with FiniteDiff. The reason is that, in DI, sparse Jacobians rely on pushforwards, and I don't have a native pushforward operator in FiniteDiff so I make do with a derivative closure.

@oscardssmith
Copy link
Contributor

Can a native pushforward be added to FiniteDiff? it seems like it should be pretty straightforward...

@gdalle
Copy link
Collaborator

gdalle commented Oct 2, 2024

Knock yourselves out! I'll happily use it.

@gdalle
Copy link
Collaborator

gdalle commented Oct 2, 2024

In the meantime, maybe I'll try to use the sparse Jacobian inside FiniteDiff as a workaround

@oscardssmith
Copy link
Contributor

JuliaDiff/FiniteDiff.jl#191

@gdalle
Copy link
Collaborator

gdalle commented Oct 3, 2024

@avik-pal with the very newest versions of DifferentiationInterface (0.6.4) and SparseMatrixColorings (0.4.4), your structured tests from #470 should also pass.
If you just want the thing to work but you're okay with slightly suboptimal colorings until gdalle/SparseMatrixColorings.jl#139 is merged, then you can remove the dependency on a specific branch and use the latest from the registry. I added tests for structured matrices in gdalle/SparseMatrixColorings.jl#137 and they already succeed with the current version of the package, without the optimized implementation of gdalle/SparseMatrixColorings.jl#139.

@avik-pal avik-pal merged commit f1969a2 into master Oct 3, 2024
33 of 36 checks passed
@avik-pal avik-pal deleted the ap/sparse_di branch October 3, 2024 17:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Start using DifferentiationInterface.jl
3 participants