Fix flaky curve fitting tests #6916

Merged: 6 commits into quantumlib:main, Jan 8, 2025

Conversation

@mhucka (Contributor) commented Jan 4, 2025

The test function test_calibrate_z_phases() in the file cirq-core/cirq/experiments/z_phase_calibration_test.py eventually ends up calling fit_exponential_decays() in xeb_fitting.py. That function uses SciPy's curve_fit() to fit an exponential function to data. For some unlucky combination of random seed and problem cases, curve_fit() can fail to find a solution, and raises RuntimeError when that happens.

This PR simply changes the call to curve_fit() to use a higher number for the maximum function evaluations. The following is a test case that fails without the change and succeeds with it:

./check/pytest -v --randomly-seed=3258636985 \
cirq-core/cirq/experiments/z_phase_calibration_test.py::test_calibrate_z_phases_no_options

Fixes #6906.
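
For context, here is a minimal, hypothetical sketch (not the Cirq code; the model, data, starting values, and `maxfev` value are illustrative) of the kind of exponential-decay fit that `fit_exponential_decays()` performs, with `maxfev` passed explicitly so that `curve_fit()` gets a larger evaluation budget before it gives up with a `RuntimeError`:

```python
# Hypothetical sketch of an exponential-decay fit like the one in xeb_fitting.py.
# The data, p0, and maxfev values here are made up for illustration.
import numpy as np
from scipy import optimize

def exponential_decay(cycle_depths, a, layer_fid):
    # Assumed model for illustration: fidelity ~ a * layer_fid ** cycle_depths
    return a * layer_fid**cycle_depths

cycle_depths = np.array([5, 25, 50, 100, 200])
fidelities = 0.95 * 0.99**cycle_depths  # fabricated, noise-free example data

# curve_fit raises RuntimeError when it exhausts its evaluation budget without
# converging; a larger maxfev gives it more room before giving up.
(a, layer_fid), pcov = optimize.curve_fit(
    exponential_decay,
    cycle_depths,
    fidelities,
    p0=(0.5, 0.5),
    bounds=((0, 0), (1, 1)),
    maxfev=1000,
)
print(a, layer_fid)  # close to 0.95 and 0.99 for this synthetic data
```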

The test function `test_calibrate_z_phases()` in the file
`cirq-core/cirq/experiments/z_phase_calibration_test.py` eventually
ends up calling `fit_exponential_decays()` in `xeb_fitting.py`. That
function uses SciPy's `curve_fit()` to fit an exponential function to
data. For some unlucky combination of random seed and problem cases,
`curve_fit()` can fail to find a solution, and raises `RuntimeError`
when that happens.

This change wraps the call to `curve_fit()` with a try/except for
`RuntimeError`; when it happens, the wrapper calls `curve_fit()` one
more time with an explicit setting of higher max function evaluations.
If that fails too, then the resulting new `RuntimeError` will not be
caught.

This is admittedly not a true solution to the problem. In my testing,
this prevents the original CI error (with a particular random seed),
but it's probably the case that the second run of `curve_fit()` could
get unlucky again. Still, this may be good enough for most cases in
practice.

If we continue to see failures, we should explore a more sophisticated
solution to this.
@CirqBot added the `size: S 10< lines changed <50` label on Jan 4, 2025

codecov bot commented Jan 4, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 97.86%. Comparing base (3e16e15) to head (5d5a896).
Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #6916   +/-   ##
=======================================
  Coverage   97.86%   97.86%           
=======================================
  Files        1084     1084           
  Lines       94309    94314    +5     
=======================================
+ Hits        92298    92304    +6     
+ Misses       2011     2010    -1     


@mhucka marked this pull request as ready for review January 4, 2025 01:09
@mhucka requested review from mrwojtek, vtomole and a team as code owners January 4, 2025 01:09
@mhucka requested review from viathor and pavoljuhas January 4, 2025 01:09
Comment on lines 651 to 668
def do_curve_fit():
    curve_fit = partial(
        optimize.curve_fit,
        exponential_decay,
        cycle_depths,
        fidelities,
        p0=(a_0, layer_fid_0),
        bounds=((0, 0), (1, 1)),
    )
    try:
        return curve_fit()
    except RuntimeError:  # pragma: no cover
        # Curve_fit didn't find a solution. Try once more w/ higher maxfev.
        # Default (in SciPy v.1.14) is 100*(1+len(p0)) = 300 for our p0.
        return curve_fit(maxfev=1000)

try:
    (a, layer_fid), pcov = do_curve_fit()
@pavoljuhas (Collaborator) commented Jan 7, 2025

The change in 0b6cccd is effectively the same as adding a maxfev=1000 argument. Let us just keep it simple.

Suggested change
Removed:

def do_curve_fit():
    curve_fit = partial(
        optimize.curve_fit,
        exponential_decay,
        cycle_depths,
        fidelities,
        p0=(a_0, layer_fid_0),
        bounds=((0, 0), (1, 1)),
    )
    try:
        return curve_fit()
    except RuntimeError:  # pragma: no cover
        # Curve_fit didn't find a solution. Try once more w/ higher maxfev.
        # Default (in SciPy v.1.14) is 100*(1+len(p0)) = 300 for our p0.
        return curve_fit(maxfev=1000)
try:
    (a, layer_fid), pcov = do_curve_fit()

Added:

try:
    (a, layer_fid), pcov = optimize.curve_fit(
        exponential_decay,
        cycle_depths,
        fidelities,
        p0=(a_0, layer_fid_0),
        bounds=((0, 0), (1, 1)),
        maxfev=1000,
    )

@pavoljuhas changed the title from "Fix issue #6906 – flaky curve fitting tests" to "Fix flaky curve fitting tests" on Jan 7, 2025
@pavoljuhas (Collaborator) left a comment

Instead of repeating the fit, let us just run it once with a larger maxfev.

@mhucka (Contributor, Author) commented Jan 7, 2025

So, the reason for the 2-step approach was the following. I noticed that when it was restarted, the second time around it seemed to take far fewer function evaluations. I admit I don't know if that's because it picks a different starting point, or I got lucky in the few times I inspected it, or what. But, when it failed, it seemed to do better the second time around. Since (a) 1000 was an arbitrary number and might not even be large enough for all cases, and (b) second runs after failures seemed to do better, I reasoned that this was a safer choice. ("Safer" in the sense of being more likely to lead to a finish and not another failure.)

This is admittedly weak reasoning, and not entirely logical. (E.g., maybe it can be just as unlucky the 2nd time around and not finish in 1000 steps.)

If you think this is not really sensible or worth it, it's fine with me to change it to the simpler one-shot version.

@pavoljuhas (Collaborator)

> I noticed that when it was restarted, the second time around it seemed to take far fewer function evaluations.

The two curve_fit calls are reproducible given the initial parameters in the p0 argument.
If the second call were changed to curve_fit() without maxfev, the test would fail.
The one-call version with a large maxfev should actually be faster, because it does not need to repeat the first 300 or so iterations.
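
As a small illustration of this point (with made-up data, not from the PR): given the same data and the same `p0`, `curve_fit()` is deterministic, so a retry after a `RuntimeError` replays the same evaluations rather than getting a fresh chance.

```python
# Sketch with fabricated data: identical curve_fit calls yield identical results,
# so a restart after a failure repeats work instead of "getting lucky".
import numpy as np
from scipy import optimize

def exponential_decay(cycle_depths, a, layer_fid):
    return a * layer_fid**cycle_depths

x = np.arange(1, 50)
y = 0.9 * 0.98**x

kwargs = dict(p0=(0.5, 0.5), bounds=((0, 0), (1, 1)))
fit1, _ = optimize.curve_fit(exponential_decay, x, y, **kwargs)
fit2, _ = optimize.curve_fit(exponential_decay, x, y, **kwargs)
assert np.allclose(fit1, fit2)  # same inputs, same fitted (a, layer_fid)
```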

@mhucka (Contributor, Author) commented Jan 7, 2025

OK, let's change it. I'll commit the changes later today.

@pavoljuhas (Collaborator) commented Jan 8, 2025

Sounds good, please make sure to update the commit message as needed.

mhucka and others added 3 commits January 7, 2025 19:25
Discussions during code review resulted in the decision to not do a
try-fail-retry approach, and instead simply call `curve_fit()` once
but with an explicit, high value of `maxfev`.

This solution also works for the original failure case. To check, here is
the command that fails without this change and succeeds with it (tested on
a Debian-based Linux system with Python 3.11, NumPy 1.24, and SciPy 1.15.0):

```bash
./check/pytest -v --randomly-seed=3258636985 \
 cirq-core/cirq/experiments/z_phase_calibration_test.py::test_calibrate_z_phases_no_options
```

The above should be executed from the top of the Cirq source tree.
@pavoljuhas merged commit 9ada8a8 into quantumlib:main on Jan 8, 2025
37 checks passed
Labels: size: S 10< lines changed <50
Projects: None yet
Development: Successfully merging this pull request may close these issues: Fix flaky CI checks on Ubuntu
Participants: 3