Fix flaky CI checks on Ubuntu #6906
Comments
This seems to be coming from calls to SciPy's `curve_fit()`.

What's frustrating about this is that I can't seem to get it to happen locally on a Linux system, even if I explicitly set the values of the parameters.

I've finally been able to reproduce it. I had to set the random seed to the same value as in one of the failures during a CI run.
The test function `test_calibrate_z_phases()` in `cirq-core/cirq/experiments/z_phase_calibration_test.py` eventually calls `fit_exponential_decays()` in `xeb_fitting.py`, which uses SciPy's `curve_fit()` to fit an exponential function to the data. For some unlucky combinations of random seed and problem instance, `curve_fit()` can fail to find a solution, and raises `RuntimeError` when that happens. This change wraps the call to `curve_fit()` in a try/except for `RuntimeError`; when the error occurs, the wrapper calls `curve_fit()` one more time with an explicitly higher maximum number of function evaluations. If that attempt also fails, the resulting new `RuntimeError` is not caught. This is admittedly not a true fix. In my testing it prevents the original CI error (with a particular random seed), but the second run of `curve_fit()` could in principle get unlucky again. Still, this may be good enough in practice; if we continue to see failures, we should explore a more sophisticated solution.
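The retry pattern described above can be sketched as follows. This is an illustrative standalone version, not the actual Cirq code: the model function and the names `exp_decay`, `fit_with_retry`, and `retry_maxfev` are hypothetical, and the default `maxfev` boost is an assumed value.

```python
import numpy as np
from scipy.optimize import curve_fit

def exp_decay(x, a, layer_fid):
    """Toy exponential-decay model: a * layer_fid**x."""
    return a * layer_fid**x

def fit_with_retry(xs, ys, retry_maxfev=10_000):
    """Fit exp_decay to (xs, ys).

    If curve_fit raises RuntimeError (failure to converge), retry once
    with a higher cap on function evaluations.  A RuntimeError from the
    second attempt is deliberately allowed to propagate.
    """
    try:
        popt, _ = curve_fit(exp_decay, xs, ys, p0=(1.0, 0.9))
    except RuntimeError:
        # Second and final attempt with a raised maxfev.
        popt, _ = curve_fit(exp_decay, xs, ys, p0=(1.0, 0.9),
                            maxfev=retry_maxfev)
    return popt
```

The `maxfev` keyword is SciPy's cap on the number of function evaluations; raising it gives the optimizer more room, but as noted above, it does not guarantee convergence.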
This will be fixed by PR #6906 once it's accepted.
This workflow is designed to let you run pytest with various debugging options. It's meant to be executed manually using "Run workflow" on https://github.com/quantumlib/Cirq/actions/workflows/debug.yaml. Clicking the "Run workflow" button there will make GitHub present a form interface with options for the characteristics of the run, including:

- give it a specific test to run (in the form of a path to a test file)
- set the number of repetitions of the test
- set the Python version to use
- set the random seed
- turn on verbose logging

This workflow was developed as an aid to debugging issue quantumlib#6906, involving curve fitting that failed on occasion, but it is general and should be useful for other debugging needs.
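A manually triggered workflow with form inputs like those listed above would use GitHub Actions' `workflow_dispatch` trigger. This is a hypothetical sketch of what the relevant part of such a `debug.yaml` might look like; the input names and defaults here are illustrative assumptions, not the actual file.

```yaml
# Illustrative sketch only -- input names are assumptions.
on:
  workflow_dispatch:
    inputs:
      test-path:
        description: 'Path to a specific test file to run'
        required: false
      repetitions:
        description: 'Number of times to repeat the test'
        default: '1'
      python-version:
        description: 'Python version to use'
        default: '3.11'
      random-seed:
        description: 'Seed passed to pytest-randomly'
        required: false
      verbose:
        description: 'Turn on verbose logging'
        type: boolean
        default: false
```

Each input then becomes available to the job steps as `${{ inputs.<name> }}`, e.g. to build the pytest command line.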
Remove the test's dependence on the pytest randomly-seed value by passing a `random_state` argument to `calibrate_z_phases` and `z_phase_calibration_workflow`. The seed value was chosen from several trial runs for faster fit convergence and shorter execution time. Related to quantumlib#6906
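The idea of decoupling a test from the runner's global seed can be illustrated in miniature. The function below is a hypothetical stand-in (not Cirq's actual `calibrate_z_phases`): because it draws all randomness from an explicit `random_state`, reseeding the global generator, as pytest-randomly does, no longer changes its result.

```python
import numpy as np

def noisy_fidelity_estimate(random_state=None):
    """Hypothetical stand-in for a routine like calibrate_z_phases.

    All randomness comes from the explicit random_state argument,
    not from NumPy's global generator.
    """
    rng = np.random.RandomState(random_state)
    return rng.normal(loc=1.0, scale=0.01)

# Reseeding the global generator (as pytest-randomly would between
# runs) does not affect the result when random_state is explicit:
np.random.seed(3258636985)
a = noisy_fidelity_estimate(random_state=42)
np.random.seed(12345)
b = noisy_fidelity_estimate(random_state=42)
```

With this pattern the test's outcome depends only on the seed the test itself chooses, which is what makes it possible to pick a seed with fast, reliable fit convergence.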
Noting here how I reproduced it on a Linux system, using Python 3.11 and this command: `./check/pytest -v --randomly-seed=3258636985 cirq-core/cirq/experiments/z_phase_calibration_test.py::test_calibrate_z_phases_no_options`

Attached is a file containing the full output, for historical reference: curve-fit-error.log
The current pytest runs on Ubuntu often fail during PR checks. It seems that sometimes the optimization steps take too long and time out:
I'm seeing this kind of thing happen on more than 50% of my PRs.