
Account for cases where the A block is not symmetric #5613

Merged (2 commits) on May 29, 2024

Conversation

@gassmoeller (Member)

While looking into #5607 I found it weird that enabling the free surface would deteriorate the convergence behavior of the inner A block solver as much as it does. What was particularly weird was that the model would converge just fine with more cheap iterations. That made me look into the stabilization term that we apply for the free surface, namely here. This term is not necessarily symmetric with respect to the indices of the test functions. The crucial part is (scratch.phi_u[i]*g_hat) * (scratch.phi_u[j]*n_hat). You can imagine that scratch.phi_u[j] may be orthogonal to n_hat but not g_hat, which would result in different stabilization terms added if you flip the indices i and j. However, we choose CG as the solver for the A block in solver.cc, and CG requires the matrix to be symmetric. Using a non-symmetric matrix with a CG solver could be the reason for the convergence problems if the non-symmetric part of the matrix is large compared to the symmetric part.

So I tried replacing CG with FGMRES and the failing test checkpoint_07_enable_free_surface_resume would immediately pass with the solver settings that would fail with CG.

I think we can consider two options:

  1. Try to detect if the A block is not symmetric and then switch to a different solver for the A block; this is what I implemented in this PR. This is the straightforward approach, but it requires us to check for all the reasons why the A block can become non-symmetric (e.g. what about anisotropic viscosity?). Also, it seems the non-symmetric part of the A block is typically small, otherwise we would have seen more failures of the A block solver in the past. Thus, always using FGMRES for free surface models could cost performance without increasing stability for most models.
  2. We could always try the CG solver first, catch the exception if it fails, and try again with FGMRES. This option is nice because it simplifies the code and we cannot accidentally 'miss' cases where the matrix becomes non-symmetric. However, it means that in some models we will retry CG for each outer GMRES iteration, even if it fails every time. Maybe we can store this state in the preconditioner (e.g. if CG fails once, always use GMRES for as long as that preconditioner exists).

Has anyone else seen models where the inner A block solver would not converge in free surface models?
I would be grateful for feedback and opinions.

@bangerth (Contributor)

Anisotropic viscosity still leads to a symmetric matrix. You have a term of the form (eps_i, X eps_j) where X is a symmetric rank-4 tensor that must satisfy (eps_i, X eps_j) = (X eps_i, eps_j).

@bangerth (Contributor)

As for the term added to the matrix, are we sure that it is actually correct? It might be worth going back to the paper in which it was proposed and double-check.

@gassmoeller force-pushed the optimize_A_block_solver branch 2 times, most recently from d54f4a7 to cf7727d on April 1, 2024 13:32
@gassmoeller (Member, Author)

I discussed this with @tjhei last week and I think this is the right path forward. As far as we know this is the correct stabilization term, unless we want to derive a completely new stabilization (which would likely require a new free-surface paper). I also implemented the switch for GMG now, and added an input parameter to force the new solver. This is useful because we have occasionally observed models that crash in the A block preconditioner in the past (especially with GMG), and I want to be able to test BiCGStab for those models when I encounter them again, to see if we accidentally have a non-symmetric operation somewhere in GMG (@tjhei and @jdannberg agreed this is a good idea).

@@ -2365,7 +2400,7 @@ namespace aspect
   {
     const auto &pbs = sim.geometry_model->get_periodic_boundary_pairs();

-    for (const auto p: pbs)
+    for (const auto &p: pbs)
Member:

do you need to rebase for this?

@gassmoeller force-pushed the optimize_A_block_solver branch from cf7727d to a6f53ad on April 4, 2024 16:44
@tjhei (Member)

tjhei commented Apr 5, 2024

As far as I know, BiCGStab (in contrast to CG) also works if the system is indefinite rather than positive definite. That might be another reason to make this switch. Is this worth mentioning?

Comment on lines 300 to 318
if (A_block_is_symmetric)
  {
    SolverCG<dealii::LinearAlgebra::distributed::Vector<double>> solver(solver_control, mem);
    solver.solve(A_block, dst.block(0), utmp.block(0),
                 A_block_preconditioner);
  }
else
  {
    // Use BiCGStab for non-symmetric matrices.
    // Do not compute the exact residual, as this
    // is more expensive, and we only need an approximate solution.
    SolverBicgstab<dealii::LinearAlgebra::distributed::Vector<double>>
    solver(solver_control,
           mem,
           SolverBicgstab<dealii::LinearAlgebra::distributed::Vector<double>>::AdditionalData(/*exact_residual=*/ false));
    solver.solve(A_block, dst.block(0), utmp.block(0),
                 A_block_preconditioner);
  }
Contributor:

This is a question for @tjhei: What exactly is the matrix-free preconditioner we are using in the situations where A is not symmetric? Does the matrix-free preconditioner even know about the stabilization term? And is the preconditioner able to deal with nonsymmetric matrices?

Member:

Yes, we implemented the free surface terms. We are using Chebyshev, which should work okay for non-symmetric operators:

using ASmootherType = PreconditionChebyshev<GMGABlockMatrixType,VectorType>;

Comment on lines +354 to +356
solver.solve(stokes_matrix.block(0,0), dst.block(0), utmp,
             a_preconditioner);
Contributor:

I imagine a_preconditioner here is the Trilinos AMG? If so, are we building that with a flag that says that the matrix is symmetric?

Member Author:

I couldn't find such a flag for the deal.II PreconditionAMG class, and neither could I find one in ASPECT. What would that flag be? The flags we set are in source/simulator/assembly.cc:444-510.

Contributor:

Oh, ok. I remembered that one of them did.

Comment on lines 2001 to 2044
// The A block should be symmetric, unless there are free surface stabilization terms, or
// the user has forced the use of a different solver.
bool A_block_is_symmetric = true;
if (sim.mesh_deformation && sim.mesh_deformation->get_boundary_indicators_requiring_stabilization().empty() == false)
  A_block_is_symmetric = false;
else if (sim.parameters.force_nonsymmetric_A_block_solver)
  A_block_is_symmetric = false;
Contributor:

It's awkward that this code block is duplicated. Could we move it somewhere into class Simulator, or another central place? Or could we handle this in a different way (e.g., by having a signal that the different places that add to the matrix could attach to, and that could return a flag for "yes, we're adding nonsymmetric terms")?

Comment on lines -34 to +38
- 16 7.834139939590e+03 6.946604926601e+02 232 2487 1099 1099 0 8 204 351 945 "" -4.24736635e+03 1.03454127e+03 1.46649778e-01 4.32412417e-01
- 17 8.557022025369e+03 7.228820857786e+02 232 2487 1099 1099 0 8 203 335 951 "" -4.46158686e+03 1.03751816e+03 1.42299927e-01 4.16954513e-01
- 18 9.306644450925e+03 7.496224255564e+02 232 2487 1099 1099 0 8 202 298 937 "" -4.66800971e+03 1.03879161e+03 1.38655362e-01 4.03374485e-01
- 19 1.008145155748e+04 7.748071065529e+02 232 2487 1099 1099 0 8 204 367 967 output-advect_mesh_vertically/solution/solution-00001 -4.86664453e+03 1.03854521e+03 1.35606730e-01 3.91449902e-01
- 20 1.087985689896e+04 7.984053414826e+02 232 2487 1099 1099 0 8 201 270 933 "" -5.05764216e+03 1.03605211e+03 1.33084178e-01 3.81287482e-01
0 0.000000000000e+00 0.000000000000e+00 112 1252 551 551 0 0 29 31 147 output-advect_mesh_vertically/solution/solution-00000 0.00000000e+00 0.00000000e+00 4.13121111e-01 9.76074209e-01
1 3.204421975415e+02 3.204421975415e+02 112 1252 551 551 0 10 19 21 102 "" -2.98120932e+02 1.40693174e+02 3.72678474e-01 9.20843191e-01
2 6.600396137902e+02 3.395974162487e+02 112 1252 551 551 0 9 16 18 87 "" -5.92877177e+02 2.65474961e+02 3.45954229e-01 8.75035554e-01
3 1.017411216986e+03 3.573716031959e+02 112 1252 551 551 0 9 15 17 82 "" -8.85618107e+02 3.80416199e+02 3.20552313e-01 8.27810244e-01
4 1.395216935499e+03 3.778057185127e+02 112 1252 551 551 0 9 16 18 88 "" -1.17620125e+03 4.85741856e+02 2.97018455e-01 7.82260607e-01
5 1.795064994357e+03 3.998480588578e+02 112 1252 551 551 0 9 15 17 83 "" -1.46155984e+03 5.81161304e+02 2.76069213e-01 7.50424980e-01
6 2.211996178360e+03 4.169311840033e+02 136 1482 653 653 0 9 14 16 80 "" -1.74367151e+03 6.65722355e+02 2.56801962e-01 7.15037235e-01
7 2.649671268530e+03 4.376750901699e+02 136 1482 653 653 0 9 15 17 83 "" -2.01885582e+03 7.40213440e+02 2.38432655e-01 6.76166151e-01
8 3.112578719724e+03 4.629074511946e+02 136 1482 653 653 0 9 14 16 76 "" -2.28920278e+03 8.05083940e+02 2.21910521e-01 6.39259838e-01
9 3.602205867398e+03 4.896271476736e+02 136 1482 653 653 0 9 13 15 73 "" -2.55478374e+03 8.60548467e+02 2.07085325e-01 6.04371619e-01
10 4.120136388632e+03 5.179305212345e+02 136 1482 653 653 0 9 11 13 62 "" -2.81516434e+03 9.07046969e+02 1.94081345e-01 5.72112968e-01
11 4.667367515170e+03 5.472311265379e+02 190 2085 921 921 0 9 15 17 85 "" -3.07024403e+03 9.45141408e+02 1.83955069e-01 5.46774470e-01
12 5.238953994056e+03 5.715864788856e+02 190 2085 921 921 0 8 15 17 85 "" -3.31903241e+03 9.75741071e+02 1.74061970e-01 5.18832055e-01
13 5.841353373919e+03 6.023993798632e+02 190 2085 921 921 0 8 13 15 74 "" -3.56151073e+03 9.99284226e+02 1.65493151e-01 4.93370296e-01
14 6.474861718796e+03 6.335083448770e+02 190 2085 921 921 0 8 13 15 75 "" -3.79718369e+03 1.01639162e+03 1.58125916e-01 4.70308248e-01
15 7.139479446940e+03 6.646177281436e+02 190 2085 921 921 0 8 14 16 80 "" -4.02583721e+03 1.02786068e+03 1.51868566e-01 4.49988243e-01
+ 16 7.834139939604e+03 6.946604926640e+02 232 2487 1099 1099 0 8 204 297 945 "" -4.24736635e+03 1.03454127e+03 1.46649778e-01 4.32412417e-01
+ 17 8.557022025416e+03 7.228820858124e+02 232 2487 1099 1099 0 8 203 289 951 "" -4.46158686e+03 1.03751816e+03 1.42299927e-01 4.16954513e-01
+ 18 9.306644450899e+03 7.496224254831e+02 232 2487 1099 1099 0 8 202 263 937 "" -4.66800971e+03 1.03879161e+03 1.38655362e-01 4.03374485e-01
+ 19 1.008145155745e+04 7.748071065528e+02 232 2487 1099 1099 0 8 204 312 967 output-advect_mesh_vertically/solution/solution-00001 -4.86664453e+03 1.03854521e+03 1.35606730e-01 3.91449902e-01
+ 20 1.087985689896e+04 7.984053415080e+02 232 2487 1099 1099 0 8 201 245 933 "" -5.05764216e+03 1.03605211e+03 1.33084178e-01 3.81287482e-01
Contributor:

This is a nice improvement in the number of iterations, BTW. I don't know exactly how that translates into CPU time, but because the operations in the actual solver are so cheap compared to applying the preconditioner, I suspect it translates into CPU time savings as well.

@gassmoeller force-pushed the optimize_A_block_solver branch from a6f53ad to cdd857b on May 7, 2024 06:58
@gassmoeller (Member, Author)

/rebuild

@gassmoeller force-pushed the optimize_A_block_solver branch from cdd857b to 5f9ed91 on May 15, 2024 15:13
@gassmoeller (Member, Author)

I think I addressed all comments. I put the check for a non-symmetric A block into a helper function to avoid the duplication, and added a note saying that BiCGStab can also handle indefinite matrices. Is there anything else left to do for this PR?

Just a quick reference for the future: In my tests GMRES was actually somewhat faster than BiCGStab for all the test cases I could find. BiCGStab reduces the number of iterations compared to CG, but it also needs 2 matrix-vector products per iteration instead of 1 (which is what GMRES and CG need). GMRES would become slow if it needed many iterations, because of the many scalar products required to minimize the residual; but for the A block preconditioner the tolerance is loose, so GMRES rarely needs more than 10 iterations or so, and 10 scalar products seem to be cheaper than one matrix-vector product. However, this may change for larger models or harder problems, so let's go with BiCGStab.

@gassmoeller force-pushed the optimize_A_block_solver branch from 5f9ed91 to 80565d6 on May 16, 2024 12:16
@tjhei (Member) left a review:

I think this is good to go. @bangerth?

@bangerth merged commit 40eb9e7 into geodynamics:main on May 29, 2024
8 checks passed
@gassmoeller deleted the optimize_A_block_solver branch on May 30, 2024 02:46