feat(tests): automated migration testing
Building on the smoke-test rewrite to use process-compose,
let's script the migration process, so that we can test
current HEAD of the monorepo against a prior tagged version,
and validate that necessary migrations are in place.

One possible approach is to fetch prebuilt binaries from uploaded
artifacts on GitHub. That's fine for `pd`, but doesn't work for
running the smoke tests, due to client/server incompatibility.
Therefore we'll clone the entire repo in a git-ignored subdir,
and build the old binaries there. Heavy, but reliable.

Updated to use the concise `pd migrate` UX from #4339.
Previously, AuctionParams were missing; that was resolved by #4338.
It's imperative that we use the vote-to-halt mechanism to prepare
for the upgrade, otherwise the priv validator state munging will not
work. That logic is still TK (to come).
conorsch committed May 8, 2024
1 parent 0ece706 commit 5b08436
Showing 8 changed files with 332 additions and 3 deletions.
38 changes: 38 additions & 0 deletions .github/workflows/migration.yml
@@ -0,0 +1,38 @@
---
name: Migration test
on:
  pull_request:

jobs:
  smoke_test:
    runs-on: buildjet-16vcpu-ubuntu-2204
    concurrency:
      group: ${{ github.workflow }}-${{ github.ref }}
      cancel-in-progress: true
    steps:
      - uses: actions/checkout@v4
        with:
          lfs: true
          fetch-tags: true
          fetch-depth: 0

      - name: Load rust cache
        uses: astriaorg/[email protected]
        with:
          # Cache the git worktree for faster builds
          workspaces: |-
            .
            ./deployments/worktrees/v0.73.1
      - name: Install cometbft binary
        run: ./deployments/scripts/install-cometbft

      - name: Install process-compose
        run: >-
          sh -c "$(curl --location https://raw.githubusercontent.com/F1bonacc1/process-compose/main/scripts/get-pc.sh)" --
          -d -b ~/bin
      - name: Run migration test
        # The script requires the initial network version as its first argument.
        run: |
          export PATH="$HOME/bin:$PATH"
          ./deployments/scripts/migration-test v0.73.1
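
The workflow installs `cometbft` and `process-compose` before invoking the script, so a local run needs the same tools on `PATH`. A minimal preflight sketch follows; `require_tools` is a hypothetical helper name and is not part of this commit.

```shell
# Hypothetical preflight helper (not part of the commit): report every
# required tool that is missing from PATH, and fail if any is absent.
require_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            >&2 echo "ERROR: required tool not found on PATH: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

A local run could then start with `require_tools git cargo cometbft process-compose || exit 1`, so that a missing binary is reported up front rather than partway through phase 1.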
3 changes: 3 additions & 0 deletions .gitignore
@@ -50,8 +50,11 @@ deployments/relayer/configs/penumbra-local.json

# Logs, and other files from smoke tests
deployments/logs/
deployments/worktrees/
deployments/bin/
crates/bin/pcli/proposal.toml


# Memory profiler, via bytehound or otherwise
*.dat

59 changes: 59 additions & 0 deletions deployments/compose/process-compose-migration-test-1.yml
@@ -0,0 +1,59 @@
---
# A process-compose configuration for running penumbra migration-tests.

# Interleave logs from all services in single file, so it's greppable.
log_location: deployments/logs/migration-test-1-combined.log
is_strict: true

processes:
  build-code:
    working_dir: deployments/worktrees/v0.73.1

  # Create network configuration, for running a pd validator locally.
  network-generate:
    working_dir: deployments/worktrees/v0.73.1
    command: >
      cargo run --quiet --release --bin pd -- testnet generate
      --unbonding-delay 50
      --epoch-duration 50
      --proposal-voting-blocks 50
      --timeout-commit 500ms
  # Run pd validator based on generated network.
  pd:
    working_dir: deployments/worktrees/v0.73.1

  # Run `pclientd` integration tests.
  test-pclientd:
    working_dir: deployments/worktrees/v0.73.1
    log_location: deployments/logs/migration-test-1-pclientd.log

  # Run `pcli` integration tests.
  test-pcli:
    working_dir: deployments/worktrees/v0.73.1
    log_location: deployments/logs/migration-test-1-pcli.log
    # We add `--skip delegate_and_undelegate` because the old smoke-test has an
    # unreliable regex, which has since been fixed.
    command: >-
      cargo test --release --features sct-divergence-check,download-proving-keys --package pcli --
      --ignored --test-threads 1 --nocapture
      --skip delegate_and_undelegate
    depends_on:
      pd:
        condition: process_healthy
      cometbft:
        condition: process_started
      test-pclientd:
        condition: process_completed
    availability:
      restart: exit_on_failure

  test-pd:
    working_dir: deployments/worktrees/v0.73.1

  # Finalizer task, which will wait until all test suites have finished.
  # This ensures that every suite ran to completion.
  summary:
    # The `command` only runs if all tests were successful,
    # otherwise the process exits due to dep failure.
    command: echo "migration tests phase 1 finished"
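
Every `working_dir` entry above hardcodes `v0.73.1`. One possible way to reuse the file for a different tag is to rewrite the version on the fly before handing the config to process-compose. This is only a sketch: `retarget_config` is a hypothetical helper, `v0.74.0` in the usage note is just an example tag, and the commit itself keeps the version hardcoded.

```shell
# Hypothetical helper (not part of the commit): rewrite the hardcoded
# worktree version in a process-compose config read from stdin and
# write the result to stdout.
retarget_config() {
    local new_version="$1"
    # Only paths under deployments/worktrees/ are touched. The version is
    # assumed to contain no '|' or '&', which are special in this sed call.
    sed "s|deployments/worktrees/v0\.73\.1|deployments/worktrees/${new_version}|g"
}
```

For example, `retarget_config v0.74.0 < deployments/compose/process-compose-migration-test-1.yml > /tmp/migration-test-1.yml` would produce a copy that points at a different worktree.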
28 changes: 28 additions & 0 deletions deployments/compose/process-compose-migration-test-2.yml
@@ -0,0 +1,28 @@
---
# A process-compose configuration for running penumbra migration-tests.
# This series of commands represents performing the actual migration,
# then starting up the network again.

# Interleave logs from all services in single file, so it's greppable.
log_location: deployments/logs/migration-test-2-combined.log
is_strict: true

processes:
  # Don't generate, since we already did that on the old tag.
  network-generate:
    command: echo "skipping network generation, deferring to migration..."

  # Run `pclientd` integration tests.
  test-pclientd:
    log_location: deployments/logs/migration-test-2-pclientd.log

  # Run `pcli` integration tests.
  test-pcli:
    log_location: deployments/logs/migration-test-2-pcli.log

  # Finalizer task, which will wait until all test suites have finished.
  # This ensures that every suite ran to completion.
  summary:
    # The `command` only runs if all tests were successful,
    # otherwise the process exits due to dep failure.
    command: echo "migration tests phase 2 finished"
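
Each phase file acts as an overlay: the migration-test script passes it as a second `--config` after `process-compose-smoke-test.yml`, and process-compose is expected to merge the files in that order, so the phase file only needs the keys it changes. A small sketch of deriving the per-phase arguments is shown here. `phase_config_args` is a hypothetical helper; the commit spells out both invocations by hand.

```shell
# Hypothetical helper (not part of the commit): print the process-compose
# arguments for a phase, one per line. Paths and the port are copied from
# the migration-test script.
phase_config_args() {
    local phase="$1"
    case "$phase" in
        1|2) ;;
        *)
            >&2 echo "ERROR: unknown phase: $phase"
            return 1
            ;;
    esac
    printf '%s\n' \
        --config deployments/compose/process-compose-smoke-test.yml \
        --config "deployments/compose/process-compose-migration-test-${phase}.yml" \
        --port 9191
}
```

In bash, the output can be collected with `mapfile -t pc_args < <(phase_config_args 1)` and then passed along as `process-compose "${pc_args[@]}" -t="$use_tui"`.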
8 changes: 5 additions & 3 deletions deployments/compose/process-compose-smoke-test.yml
@@ -37,9 +37,11 @@ processes:
   # Create network configuration, for running a pd validator locally.
   network-generate:
     command: >
-      cargo run --quiet --release --bin pd --
-      testnet generate --unbonding-delay 50
-      --epoch-duration 50 --timeout-commit 500ms --gas-price-simple 1000
+      cargo run --quiet --release --bin pd -- testnet generate
+      --unbonding-delay 50
+      --epoch-duration 50
+      --proposal-voting-blocks 50
+      --timeout-commit 500ms
     depends_on:
       build-code:
         condition: process_completed_successfully
70 changes: 70 additions & 0 deletions deployments/compose/process-compose.yml
@@ -0,0 +1,70 @@
---
# A process-compose configuration for running a local Penumbra devnet.
# This isn't used in scripts anywhere (yet?) but serves as a reference point.
# Potentially could be integrated with e.g. https://www.jetify.com/devbox later on.
#
version: "0.5"

# Env vars set here will be accessible to all processes.
environment:
  - "RUST_LOG=info,network_integration=debug,pclientd=debug,pcli=info,pd=info,penumbra=info"

log_level: info
is_strict: true
# Interleave logs from all services in single file, so it's greppable.
log_location: deployments/logs/dev-env-combined.log

# By default, build pd from the workspace. Support overriding via a deeper git-worktree,
# so that an older version of pd can be built and run. This helps when running older
# networks locally, to debug migrations.
vars:
  WORKING_DIR: .
  # WORKING_DIR: deployments/worktrees/v0.73.1

processes:
  # Build latest version of local code. We do this once, up front,
  # so that each test suite runs immediately when ready, without iterative building.
  build-code:
    working_dir: "{{ .WORKING_DIR }}"
    command: |-
      echo "Building source code..."
      cargo --quiet build --release --all-targets
  # Generate network from latest code, only if network does not already exist.
  generate-network-nice:
    working_dir: "{{ .WORKING_DIR }}"
    # The block below is a literal scalar, so newlines are preserved and the
    # multi-line cargo invocation needs explicit line continuations.
    command: |-
      if [[ -d ~/.penumbra/testnet_data/node0 ]] ; then
          >&2 echo "Testnet data exists locally, reusing it"
      else
          cargo run --quiet --release --bin pd -- testnet generate \
            --unbonding-delay 50 \
            --epoch-duration 50 \
            --proposal-voting-blocks 50 \
            --timeout-commit 500ms
      fi
    depends_on:
      build-code:
        condition: process_completed_successfully

  # Run pd validator based on generated network.
  pd:
    command: "cargo run --release --bin pd -- start"
    readiness_probe:
      http_get:
        host: 127.0.0.1
        scheme: http
        path: "/"
        port: 8080
      period_seconds: 5
    working_dir: "{{ .WORKING_DIR }}"
    depends_on:
      generate-network-nice:
        condition: process_completed_successfully

  # Run CometBFT for pd p2p.
  cometbft:
    command: "cometbft --home ~/.penumbra/testnet_data/node0/cometbft start"
    depends_on:
      pd:
        condition: process_healthy
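
The `readiness_probe` above marks `pd` as healthy once an HTTP GET against `http://127.0.0.1:8080/` succeeds, checking every 5 seconds, and `cometbft` waits on that state via `process_healthy`. Outside process-compose, roughly the same gate can be written as a polling loop. The following is a sketch under that assumption, with `wait_until` as a hypothetical helper name.

```shell
# Hypothetical helper (not part of the commit): retry a command until it
# succeeds or the attempt budget is spent.
# usage: wait_until <max_attempts> <delay_seconds> <command> [args...]
wait_until() {
    local max_attempts="$1"
    local delay="$2"
    shift 2
    local attempt=1
    while [ "$attempt" -le "$max_attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Sleep only when another attempt remains.
        if [ "$attempt" -lt "$max_attempts" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}
```

For instance, `wait_until 12 5 curl --silent --fail --output /dev/null http://127.0.0.1:8080/` would poll the same endpoint as the probe for up to about a minute before giving up.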
121 changes: 121 additions & 0 deletions deployments/scripts/migration-test
@@ -0,0 +1,121 @@
#!/usr/bin/env bash
# CI script to test migration compatibility. Orchestrates the following:
#
# * [x] checks out prior tag and builds its binaries and tests
# * [x] creates local devnet via that target version
# * [x] make sure devnet has very fast voting
# * [x] runs smoke tests against devnet, to generate txs
# * [ ] submit governance proposal for chain halt
# * [ ] vote for proposal to pass
# * [ ] wait for halt
# * [x] stops devnet ~ ends phase 1
# * [x] migrates devnet via latest (i.e. HEAD) pd version ~ begins phase 2
# * [x] restarts devnet
# * [ ] re-runs smoke tests, from latest version, to validate
#
# After all that's done, we only know that the migration didn't badly
# break things. We should also have dedicated integration tests
# for pre- and post-migration checks.

set -euo pipefail

# Ideally we'd look up the most recently released version of `pd` and use that to test against.
# The current test framework has a lot of `v0.73.1` hardcoded throughout, though, so let's
# mandate explicit versions for now.
target_version="${1:-}"
if [[ -z "$target_version" ]] ; then
    >&2 echo "ERROR: no version specified for initial network"
    >&2 echo "Usage: $0 <initial_network_version>"
    exit 1
fi
shift 1
>&2 echo "Beginning migration test from '$target_version' -> '$(git rev-parse HEAD) (HEAD)'"

# Check for interactive terminal session, enable TUI if yes.
if [[ -t 1 ]] ; then
    use_tui="true"
else
    use_tui="false"
fi

# Look up some git facts for logging and preparing work trees.
repo_root="$(git rev-parse --show-toplevel)"
worktree_dir="${repo_root}/deployments/worktrees/${target_version}"

# Create a local git-worktree so that we can check out a prior
# tag and build its version of tests and suchwhat.
prepare_local_worktree() {
    if [[ ! -d "$worktree_dir" ]] ; then
        >&2 echo "Creating new git worktree: $worktree_dir"
        git worktree add "$worktree_dir" "$target_version"
    fi
    git -C "$worktree_dir" checkout "$target_version"
}

# Run phase 1 of the migration test: create devnet on target version tag,
# run smoke tests against it from that checkout, then exit.
function run_migration_test_phase_1() {
    >&2 echo "Running smoke-tests against pre-migration devnet, phase 1..."
    # Override the pc API port 8080 -> 9191, to avoid conflict with pd.
    if ! process-compose \
        --config deployments/compose/process-compose-smoke-test.yml \
        --config deployments/compose/process-compose-migration-test-1.yml \
        --port 9191 \
        -t="$use_tui" \
        ; then
        >&2 echo "ERROR: migration tests phase 1 failed"
        >&2 echo "Review logs in: deployments/logs/migration-test-1*.log"
        find "${repo_root}/deployments/logs/migration-test-1"*".log" | sort >&2
        exit 1
    else
        echo "SUCCESS! Migration test phase 1 complete."
    fi
}

# Perform most recent migration against local state, according to steps in
# https://guide.penumbra.zone/main/node/pd/chain-upgrade.html
function perform_migration() {
    node0_dir="${HOME}/.penumbra/testnet_data/node0"

    # TEMPORARY: backup entire state during development so it's easy to recover
    tarball_backup="${node0_dir}/../node0-state-backup.tar"
    if [[ ! -e "$tarball_backup" ]] ; then
        tar -cf "$tarball_backup" "$node0_dir"
    fi
    # Ensure tar changes are synced to disk. This shouldn't be necessary,
    # but this script will run in a lot of contexts, so better to be sure.
    sync
    RUST_LOG="info,pd=debug,penumbra=debug" \
        cargo run --quiet --release --bin pd -- migrate
}

# Post-migration, restart the network, and rerun the smoke tests against it.
function run_migration_test_phase_2() {
    >&2 echo "Running smoke-tests against post-migration devnet, phase 2..."
    if ! process-compose \
        --config deployments/compose/process-compose-smoke-test.yml \
        --config deployments/compose/process-compose-migration-test-2.yml \
        --port 9191 \
        -t="$use_tui" \
        ; then
        >&2 echo "ERROR: migration tests phase 2 failed"
        >&2 echo "Review logs in: deployments/logs/migration-test-2*.log"
        find "${repo_root}/deployments/logs/migration-test-2"*".log" | sort >&2
        exit 1
    else
        echo "SUCCESS! Migration test phase 2 complete."
    fi
}

# The whole shebang, soup to nuts.
function main() {
    prepare_local_worktree
    run_migration_test_phase_1
    >&2 echo '===================='
    >&2 echo 'PERFORMING MIGRATION'
    >&2 echo '===================='
    perform_migration
    run_migration_test_phase_2
}

main
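
The script notes that it would ideally look up the most recently released version of `pd` rather than require one on the command line. A minimal sketch of that lookup follows, assuming release tags follow the `vX.Y.Z` pattern seen in `v0.73.1` and that a `sort` with `-V` (version sort, as in GNU coreutils) is available. `pick_latest_version` is a hypothetical name, not something this commit defines.

```shell
# Hypothetical helper (not part of the commit): read tag names on stdin,
# one per line, keep only plain vX.Y.Z tags, and print the highest one
# according to version-aware ordering.
pick_latest_version() {
    grep -E '^v[0-9]+\.[0-9]+\.[0-9]+$' | sort -V | tail -n 1
}
```

It could be wired in as `target_version="$(git tag --list | pick_latest_version)"`. Note that this picks the highest tag in the repository, which may be the tag HEAD itself carries on a release commit, so an explicit version remains the safer default for now.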
8 changes: 8 additions & 0 deletions justfile
@@ -1,4 +1,12 @@
migration-test:
    # resetting network state
    cargo run --release --bin pd -- testnet unsafe-reset-all || true
    ./deployments/scripts/migration-test v0.73.1

smoke:
    # resetting network state
    cargo run --release --bin pd -- testnet unsafe-reset-all || true
    ./deployments/scripts/smoke-test.sh

dev:
    process-compose up --port 9191 --config ./deployments/compose/process-compose.yml
