-
Notifications
You must be signed in to change notification settings - Fork 312
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ci: automated testing of migration compatibility #4323
Comments
There's a spike on an overhaul of the smoke-test logic here #4324 which provides a nice shape to extend into migration-testing. |
Can you elaborate on this a little? Isn't our expectation that clients should work across upgrade boundaries? |
If you try to run a client from current main against a public testnet endpoint, you'll see in incompatibility message related to ongoing auction work:
Those same dependencies carry over into the integration tests. |
Pushed a draft PR with a spike on local testing of migration logic, that can be promoted to a CI job once it's solid. Got surprised by a proto incompat error that may be spurious, so I'm going to run through the upgrade process manually to sanity-check that the scripting order is sound. |
Got it. I was assuming we would run the smoke test script from the original tag and then, post-upgrade, run the smoke test script from the new HEAD. |
Made substantial progress on this front. There are notably two types of testing going on:
The former is potentially suitable for per-PR runs, although so far the runtime is quite long: ~20m or so. We recently shaved a ton of per-PR CI runtime off with #4324, so it'd be a shame to knock it back up again, but for assurance it'd be worth it. This type of testing is great for catching problems like #4430. The latter case is more intensive, and isn't yet end-to-end automated yet. Given that its setup reuses the same architecture as the public testnet, it's able to catch more subtle bugs, like #4443. For now, I'll continue to use this setup as part of pre-release QA. |
Adds intra-CI standalone testing of migration behavior. This can be run locally on developer workstations, and also in CI. The job is currently taking >20m, and requires a lot of disk space, but it's worth it for the assurance. Building on the smoke-test rewrite to use process-compose, let's script the migration process, so that we can test current HEAD of the monorepo against a prior tagged version, and validate that necessary migrations are in place. One possible approach is to fetch prebuilt binaries from uploaded artifacts on Github. That's fine for `pd`, but doesn't work for running the smoke tests, due to client/server incompatibility. Therefore we'll clone the entire repo in a git-ignored subdir, and build the old binaries there. Heavy, but reliable. Adds a new rust crate, strictly for running the migration-test suite of integration tests, which is very similar in nature to the already-existing network-integration tests AKA smoke tests. Copy/pastes a lot of code from the smokes, which we can always factor out into reusable utils, but not bothering with that right now. Refs #4323.
Adds intra-CI standalone testing of migration behavior. This can be run locally on developer workstations, and also in CI. The job is currently taking >20m, and requires a lot of disk space, but it's worth it for the assurance. Building on the smoke-test rewrite to use process-compose, let's script the migration process, so that we can test current HEAD of the monorepo against a prior tagged version, and validate that necessary migrations are in place. One possible approach is to fetch prebuilt binaries from uploaded artifacts on Github. That's fine for `pd`, but doesn't work for running the smoke tests, due to client/server incompatibility. Therefore we'll clone the entire repo in a git-ignored subdir, and build the old binaries there. Heavy, but reliable. Adds a new rust crate, strictly for running the migration-test suite of integration tests, which is very similar in nature to the already-existing network-integration tests AKA smoke tests. Copy/pastes a lot of code from the smokes, which we can always factor out into reusable utils, but not bothering with that right now. Refs #4323.
Adds intra-CI standalone testing of migration behavior. This can be run locally on developer workstations, and also in CI. The job is currently taking >20m, and requires a lot of disk space, but it's worth it for the assurance. Building on the smoke-test rewrite to use process-compose, let's script the migration process, so that we can test current HEAD of the monorepo against a prior tagged version, and validate that necessary migrations are in place. One possible approach is to fetch prebuilt binaries from uploaded artifacts on Github. That's fine for `pd`, but doesn't work for running the smoke tests, due to client/server incompatibility. Therefore we'll clone the entire repo in a git-ignored subdir, and build the old binaries there. Heavy, but reliable. Adds a new rust crate, strictly for running the migration-test suite of integration tests, which is very similar in nature to the already-existing network-integration tests AKA smoke tests. Copy/pastes a lot of code from the smokes, which we can always factor out into reusable utils, but not bothering with that right now. Refs #4323.
Is your feature request related to a problem? Please describe.
When preparing a chain upgrade, manual testing of upgrades is an arduous process.
Describe the solution you'd like
We should have an integration test that runs a devnet based on the currently-active testnet, via the most recently released tag, runs smoke tests against it to generate txs, then stops the network, runs the migration, restarts the network, and reruns the smoket ests.
Describe alternatives you've considered
Alternatives are manual testing, which is both slow and error-prone. Longer-term we want a capable "sudo mode" for testing upgrades, which is tracked in #4265.
Additional context
We'll need to be careful about endpoint compatibility: we cannot run an older version of
pd
and run the most recent smoke tests against it, because the view server implementations will not be compatible. We can sidestep this by running the smoke tests from the tagged release. Ideally, we'd be able to swap out the path to binaries within the tests via an env var to make it a bit more to test prior versions, without rebuilding from source every time.The text was updated successfully, but these errors were encountered: