switch callback rootfinding algorithm from InternalFalsi to InternalITP to improve robustness
#917
Conversation
As suggested in #916, can you try converting this over to ITP and seeing if that solves it as well? That should be a very good way to have this mixture. https://github.com/SciML/SimpleNonlinearSolve.jl/blob/main/src/itp.jl
Sure, I'll take a look later (somehow missed the ITP file when eyeballing the SimpleNonlinearSolve repo)
It just merged a few minutes ago... 😅
If the algorithm is now non-differentiable (i.e. bisection isn't a differentiable algorithm), then the derivative needs to be directly computed. I added the derivative calculation handling.
It does look like this passes everything, though I'm curious what exactly the difference is between this and ITP. ITP to me seems like the same basic idea (mix Falsi with bisection) but with provable acceleration. Is there theory behind this version of the scheme?
Not really, although I guess you could prove that it takes fewer than min(2·falsi_iters, 2·bisection_iters) iterations, which gets rid of the worst-case issues. In any case, I just swapped to the ITP method with the latest commit.
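For reference, the ITP (Interpolate, Truncate, Project) scheme under discussion can be sketched as follows. This is a minimal Python illustration of the algorithm from the paper, not the Julia implementation in this PR; the parameter defaults `k1`, `k2`, and `n0` are illustrative choices, not the package's values:

```python
import math

def itp(f, a, b, eps=1e-10, k1=0.1, k2=2.0, n0=1):
    """Minimal sketch of the ITP (Interpolate-Truncate-Project) method.

    Each iteration takes the regula falsi point, nudges it toward the
    bisection midpoint (truncate), and clamps it to a radius that shrinks
    like bisection's bracket (project), so the worst case is at most n0
    iterations slower than pure bisection.
    """
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    n_max = math.ceil(math.log2((b - a) / (2 * eps))) + n0
    j = 0
    while b - a > 2 * eps:
        x_half = 0.5 * (a + b)
        r = eps * 2.0 ** (n_max - j) - 0.5 * (b - a)   # projection radius
        delta = k1 * (b - a) ** k2
        x_f = (fb * a - fa * b) / (fb - fa)            # interpolate: falsi point
        sigma = 1.0 if x_half >= x_f else -1.0
        # Truncate: move the falsi point toward the midpoint by delta
        x_t = x_f + sigma * delta if delta <= abs(x_half - x_f) else x_half
        # Project: keep the candidate within the bisection-guaranteed radius
        x_itp = x_t if abs(x_t - x_half) <= r else x_half - sigma * r
        y = f(x_itp)
        if y == 0.0:
            return x_itp
        if (y > 0) == (fb > 0):
            b, fb = x_itp, y
        else:
            a, fa = x_itp, y
        j += 1
    return 0.5 * (a + b)
```

The "provable acceleration" mentioned above comes from the projection step: because every accepted point lies within `r` of the midpoint, the bracket shrinks at least as fast as bisection does (up to the `n0` slack), while the interpolation step recovers superlinear behavior on well-behaved functions.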
So, the current status of the PR is that both of them are implemented and seem to work (at least for the local tests); I also added a couple of tests, including one for issue #916. There's a commented-out test in
I ran the downstream callback examples with both methods, measuring the outcome in terms of
[benchmark results table]
It might be worth playing around with the ITP parameters. I'm not sure if our current ones are good.
It would also be good to add bisection as an option, since ITP is supposed to be ~strictly better than it, and for 1d bounded rootfinding bisection should be taking fewer than 100 steps.
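The "fewer than 100 steps" figure follows from bisection's guaranteed halving: the iteration count is bounded by log2 of the bracket width over the tolerance, independent of `f`. A quick sketch (a hypothetical helper for illustration, not the package's implementation):

```python
def bisect(f, a, b, tol=1e-12):
    """Plain bisection; returns (root, iterations).

    Each step halves [a, b], so it needs at most
    ceil(log2((b - a) / tol)) iterations for any f that
    changes sign on the bracket.
    """
    fa = f(a)
    iters = 0
    while b - a > tol:
        m = 0.5 * (a + b)
        fm = f(m)
        if fm == 0.0:
            return m, iters
        if (fm > 0) == (fa > 0):
            a, fa = m, fm
        else:
            b = m
        iters += 1
    return 0.5 * (a + b), iters
```

For a unit-width bracket and a tolerance near machine epsilon (~2.2e-16), this gives roughly 52 steps, comfortably under 100 for any double-precision bracket.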
And maybe @yash2798 can play around with some ITP parameters.
Yup, but just in case there is a misinterpretation: the numbers in each row are aggregated for the whole problem, not normalized per callback (there are likely multiple events within the solution).
Ah, that makes a lot more sense.
Doing some research on the test failures, it seems like this ITP version runs into a similar issue as the Falsi-then-bisection on current master (the DDE test that fails passes with the interleaved version).
Found the issue (will fix it here soon): SciML/SimpleNonlinearSolve.jl#72
ITP fixed; I'll re-run the benchmarks later. It should be ready to launch the test suite again (in case the sign issue was the only root of the test failures).
Alright, so this is what it will look like once I push the commit that handles the DDE test failure by improving the eps-level termination handling (possibly the AD ones?):
[benchmark results table]
It's a little bit weird that Spring got worse, but other than that, it looks good.
There's a chance the previous result is "wrong" in the sense of taking fewer iterations because some of the crossings were incorrectly solved.
With the latest commit, the DDE test that failed should pass. I'm not sure about the AD tests (the result was actually fairly close, so maybe they were just one floating-point step from success?), and I think everything else was due to an Actions outage (except for the formatting failure, which I don't know how to handle; I'm not sure JuliaFormatter is picking up the toml file).
This downstream test failure from JumpProcesses seems like a real issue here: https://github.com/SciML/DiffEqBase.jl/actions/runs/5624812899/job/15243159120?pr=917#step:6:633 . Probably a floating-point error. Is this returning the left endpoint by default correctly?
Seems to be a numerical issue with loss of relative precision in the regula falsi formula. For example:

```julia
left = 5.642777923986901
right = 5.6427779240133455
fl = -4.665215073563809e-11
fr = 1.92987234680276e-16
(fr * left - fl * right) / (fr - fl) # = 5.642777924013346
```

The result is `nextfloat(right)`, so it lands outside the `[left, right]` interval and screws up the next iteration, where exponentiation fails. Not sure if there is a more stable formula, but I guess we can always project into the interval to prevent this issue.
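To make the failure mode concrete, the reported numbers can be replayed in any IEEE-754 double arithmetic (a Python sketch; the clamp at the end illustrates the "project into the interval" guard being discussed, not the PR's exact code):

```python
# Values reported in the thread, reproduced verbatim.
left = 5.642777923986901
right = 5.6427779240133455
fl = -4.665215073563809e-11
fr = 1.92987234680276e-16

# Textbook regula falsi point. Exactly, it lies inside [left, right]
# (the correction below `right` is far smaller than one ulp), but the
# rounded result lands one ulp past `right`.
x = (fr * left - fl * right) / (fr - fl)
outside = x > right

# Guard: project the candidate back into the bracket so the next
# iteration never sees an out-of-bounds point.
x_safe = min(max(x, left), right)
```

Since the loss of precision happens in the large products `fr * left` and `fl * right`, clamping is a cheap, unconditional fix regardless of which algebraic form of the update is used.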
Does reformulating as
It's one point better for this example (same as right), though I'm not sure if that generalizes. In any case, I think there should be no harm in just pushing it inside like I did (though I put the pushing after truncation and projection, just to be extra safe).
With the updated version I am ~90% sure both checks are unnecessary. The "proof" of this is that when it matters, |
IMHO the equality checks are needed anyway; I think you're right about the new formula not being OOB, but it should still be possible to match the edge values (I saw that when fixing the tests), and in that case you want to push it inside; otherwise you'll get stuck and hit maxiters. So if we already have the equality tests, I don't think there's a drawback to them being equality-or-greater tests.
I think you don't need the nudge, since even if the falsi returns an endpoint, it should get nudged away from the endpoint due to the attraction to the center that happens afterwards, but I've been looking at it less than you, so I trust you if you say it's necessary (I'm just trying to save cycles around the edges anyway 😄).
That's probably the case most of the time; it's just the edge cases where the attraction does nothing (in the example I was studying there was a 1e-19 or so bump, i.e. far below eps(5.0)).
Fair enough, but in terms of per-iteration overhead we're already doing a float^float with this algorithm :P
Yeah, that's a good point. I do kind of want us to fix the power to 1.5 or 2 so it turns into a multiplication (or a multiplication and a sqrt).
Either of them is a valid choice, and FWIW the benchmarks in the ITP paper use 2, so we could indeed roll with that until we have a tuning strategy.
Awesome! @DaniGlez can you make sure your fixes to ITP also end up in SimpleNonlinearSolve? That's the more externally used case. For dependency reasons we keep the two separate, but we should make sure the SimpleNonlinearSolve ITP is up to snuff. In fact, with this I'd say @yash2798 we should update the docs for interval nonlinear problems to recommend ITP (and tutorials) |
Absolutely, I'll get to it tomorrow |
Sure, I'll open a PR for updating the docs soon.
This PR interleaves the bisection and the regula falsi iterations to be able to handle tricky callbacks without going into maxiters. See discussion in #916
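The interleaving idea in the description can be sketched as follows (a hypothetical Python illustration of alternating the two updates, not the actual InternalFalsi/InternalITP Julia code in this PR):

```python
def interleaved_falsi_bisection(f, a, b, tol=1e-12, maxiters=200):
    """Alternate one regula falsi step with one bisection step.

    A stalled falsi step then costs at most a factor of two over pure
    bisection, avoiding falsi's slow worst cases without giving up its
    fast convergence on well-behaved crossings.
    """
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must bracket a sign change")
    for i in range(maxiters):
        if b - a <= tol:
            break
        if i % 2 == 0:
            x = (fb * a - fa * b) / (fb - fa)  # regula falsi step
            x = min(max(x, a), b)              # keep it inside the bracket
        else:
            x = 0.5 * (a + b)                  # bisection step
        fx = f(x)
        if fx == 0.0:
            return x
        if (fx > 0) == (fb > 0):
            b, fb = x, fx
        else:
            a, fa = x, fx
    return 0.5 * (a + b)
```

Because every other step is a plain bisection, the bracket width at least halves every two iterations, so the maxiters blow-ups described in #916 cannot occur for any bracketed sign change.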