Restart with free surface and latest deal.II is failing #5607
Comments
Just as a note to myself: this looks a bit like the problem we are observing with the GMG preconditioner, where the solver crashes upon switching from cheap to expensive iterations. This test uses the AMG preconditioner, however. GMG uses the GMRES solver, while AMG uses FGMRES, so maybe one of the recent changes in deal.II has carried the problem that was so far only in GMRES over into the FGMRES code? For reference, the error message on my system looks different: instead of throwing an FPE immediately, I get the following:
There has been substantial churn in the GMRES implementation in deal.II. Are you able to narrow down which patch to deal.II might have caused this? @kronbichler FYI -- perhaps related to your patches, perhaps not. |
It is quite likely related to #16760; that is at least where I would look first. Let me try to have a look at the failing test here.
I cannot reproduce this on any of my systems. I tried both with gdb and valgrind to see if any floating point exception or similar error would be triggered, but could not see any. The test still fails for me, here is the output I have:
and the runtime output is
As an additional comment: since #5607 (comment) mentions GMG, I just wanted to note that AMG gets used on my system. It looks like I am running the test in a way that does not trigger the problem?
Thanks for looking into this @kronbichler. I fixed the test in #5608 by adjusting the solver tolerances so that it doesn't fail anymore. If you want to see the failing test, you need to check out a version before that PR (e.g. 1671a65 or earlier). As an aside, we are using an FGMRES solver for the part that fails in this test, so any change in deal.II that only affects GMRES is unlikely to be the reason (it looks like #16760 may only affect GMRES?).

As I discussed with Wolfgang yesterday (but after he wrote the comment), the fact that the test is failing now may be caused by a slightly different "path" the solver is taking in terms of residual reduction. The test was always hard to solve, because we add a non-symmetric stabilization term to our matrix for this test, but inside the preconditioner we use a CG solver to compute an approximate inverse of one block of the matrix. Usually this non-symmetric term is small, but I would guess it can be large for this test, because of the nature of the test. It is inside this CG solver that the test was failing. We determined that this likely requires us to change that solver, not necessarily that anything is wrong with the outer FGMRES solver. So maybe this whole issue just serves as a reminder that "something" changed inside the FGMRES solver, not necessarily that it is any worse (or better) than before.
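To make the point about CG and the non-symmetric term concrete, here is a minimal sketch in plain Python. It is not the ASPECT or deal.II solver. The 2x2 matrix (identity plus a skew-symmetric part scaled by `c`), the right-hand side, and the helper names `cg` and `block` are all made up for illustration. It only shows that plain CG loses its finite-termination guarantee once the matrix stops being symmetric.

```python
import math


def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cg(A, b, tol=1e-10, max_iter=None):
    """Plain, unpreconditioned CG starting from x = 0 (b must be nonzero).

    Returns (x, iterations, relative_residual, converged).
    The algorithm is only guaranteed to work for symmetric positive
    definite A; nothing here checks that assumption.
    """
    n = len(b)
    if max_iter is None:
        max_iter = n  # exact-arithmetic CG needs at most n steps for SPD A
    x = [0.0] * n
    r = list(b)
    p = list(r)
    rr = dot(r, r)
    b_norm = math.sqrt(dot(b, b))
    rel = 1.0
    for k in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rr / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rr_new = dot(r, r)
        rel = math.sqrt(rr_new) / b_norm
        if rel <= tol:
            return x, k, rel, True
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x, max_iter, rel, False


def block(c):
    """Hypothetical 2x2 block: identity plus c times a skew-symmetric term."""
    return [[1.0, c], [-c, 1.0]]


if __name__ == "__main__":
    for c in (0.0, 1.0):
        x, its, rel, ok = cg(block(c), [1.0, 0.0])
        print(f"c={c}: converged={ok}, iterations={its}, rel. residual={rel:.3e}")
```

With `c = 0` the block is symmetric and CG finishes within the `n = 2` step budget. With `c = 1` the relative residual grows from 1 to sqrt(2) over the same two steps, so the solver reports failure. This toy says nothing about how large the stabilization term is in the actual test. It only illustrates why a solver meant for non-symmetric systems might be the safer choice for that block.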
You are right, it seems that I forgot |
I think this can be closed, as the test is fixed and the issue with our solver is tracked in #5613.
It looks like there was a change in deal.II master that broke our restart feature if the free surface is active. See e.g. #5605, which fails the test checkpoint_07_enable_free_surface_resume with deal.II master even though it only changed documentation; deal.II 9.5 works fine. Other affected PRs are #5606, #5604, and #5603. The test fails in the Stokes solver after the restart with:

I haven't been able to pin down the problematic change in deal.II yet. I just wanted to let everyone know that this is likely not the fault of any ASPECT PR (since deal.II 9.5 works fine). There have been a few changes in the FGMRES solver in deal.II lately, and I suspect it could be something in there.