ref(workflow): move most scripts to their own executables #8005

Merged: 22 commits, Dec 12, 2023
Changes from 2 commits
Commits
74bf254
ref(workflow): move most scripts to their own executable
gustavovalverde Nov 23, 2023
3d09d78
Merge branch 'main' into ref-workflow-scripts
teor2345 Nov 27, 2023
775d72c
Merge branch 'main' into ref-workflow-scripts
oxarbitrage Nov 28, 2023
8c8fd20
debug: JSON value
gustavovalverde Dec 5, 2023
924e5f4
fix(scripts): move remaining script to its own file
gustavovalverde Dec 5, 2023
4aaec91
fix(script): revert to the correct disk search logic
gustavovalverde Dec 5, 2023
a5bd70d
fix(scripts)
gustavovalverde Dec 5, 2023
4ea5fb4
fix(scripts): use correct NETWORK with lowercase
gustavovalverde Dec 5, 2023
29719ca
fix: typo
gustavovalverde Dec 5, 2023
9fd7769
Merge branch 'main' into ref-workflow-scripts
oxarbitrage Dec 5, 2023
e8dfa7f
fix(script): wrong variable assignment
gustavovalverde Dec 5, 2023
f386beb
fix(script): use correct return values inside a function
gustavovalverde Dec 5, 2023
afddaaa
fix(script): fix value assigment
gustavovalverde Dec 5, 2023
e839b49
test: debug
gustavovalverde Dec 5, 2023
6457786
fix(script): make disk conditions simpler
gustavovalverde Dec 6, 2023
910c12a
fix(script): export variables to the `shell` executing the script
gustavovalverde Dec 6, 2023
b9c7551
fix(script): do not fail on expected unbound variables
gustavovalverde Dec 6, 2023
a96ae3d
test: output
gustavovalverde Dec 8, 2023
b519981
fix(scripts): do not `echo` a variable more than once
gustavovalverde Dec 8, 2023
516fe1c
fix(scripts): typo
gustavovalverde Dec 8, 2023
263f002
Merge branch 'main' into ref-workflow-scripts
gustavovalverde Dec 11, 2023
4978352
docs(workflow): adds a description at the top of each file (#8009)
gustavovalverde Dec 12, 2023
162 changes: 15 additions & 147 deletions .github/workflows/chore-delete-gcp-resources.yml
@@ -1,4 +1,13 @@
# TODO: rename this action name and filename to Delete infra resources
# This workflow is designed to delete old Google Cloud Platform (GCP) resources to save on costs.
#
# 1. Deletes specific instances in GCP older than a defined number of days.
# 2. Deletes instance templates older than a set number of days.
# 3. Deletes older disks not currently in use, with certain ones prefixed by commit hashes or "zebrad-".
# 4. Deletes cache images from GCP, retaining a specified number of the latest images for certain types like zebrad checkpoint cache, zebrad tip cache, and lightwalletd + zebrad tip cache.
# 5. Deletes unused artifacts from Google Artifact Registry older than a defined number of hours while retaining the latest few.
#
# It uses the gcloud CLI for most of its operations and also leverages specific GitHub Actions like the gcr-cleaner for deleting old images from the Google Artifact Registry.
# The workflow is scheduled to run daily at 0700 UTC.
name: Delete GCP resources

on:
@@ -56,29 +65,11 @@ jobs:
# so it can't be shell-quoted.
- name: Delete old instances
run: |
DELETE_BEFORE_DATE=$(date --date="$DELETE_INSTANCE_DAYS days ago" '+%Y%m%d')

IFS=$'\n'
INSTANCES=$(gcloud compute instances list --sort-by=creationTimestamp --filter="name~-[0-9a-f]{7,}$ AND creationTimestamp < $DELETE_BEFORE_DATE" --format='value(NAME,ZONE)' | \
sed 's/\(.*\)\t\(.*\)/\1 --zone=\2/')

for INSTANCE_AND_ZONE in $INSTANCES
do
IFS=$' '
gcloud compute instances delete --verbosity=info ${INSTANCE_AND_ZONE} --delete-disks=all || continue
IFS=$'\n'
done

./.github/workflows/scripts/gcp-delete-old-instances.sh
# Deletes all the instance templates older than $DELETE_AGE_DAYS days.
- name: Delete old instance templates
run: |
DELETE_BEFORE_DATE=$(date --date="$DELETE_AGE_DAYS days ago" '+%Y%m%d')
TEMPLATES=$(gcloud compute instance-templates list --sort-by=creationTimestamp --filter="name~-[0-9a-f]{7,}$ AND creationTimestamp < $DELETE_BEFORE_DATE" --format='value(NAME)')

for TEMPLATE in $TEMPLATES
do
gcloud compute instance-templates delete "${TEMPLATE}" || continue
done
./.github/workflows/scripts/gcp-delete-old-templates.sh

# Deletes all mainnet and testnet disks older than $DELETE_AGE_DAYS days.
#
@@ -89,31 +80,7 @@ jobs:
# so it can't be shell-quoted.
- name: Delete old disks
run: |
DELETE_BEFORE_DATE=$(date --date="$DELETE_AGE_DAYS days ago" '+%Y%m%d')

IFS=$'\n'
# Disks created by PR jobs, and other jobs that use a commit hash
COMMIT_DISKS=$(gcloud compute disks list --sort-by=creationTimestamp --filter="name~-[0-9a-f]{7,}$ AND creationTimestamp < $DELETE_BEFORE_DATE" --format='value(NAME,LOCATION,LOCATION_SCOPE)' | \
sed 's/\(.*\)\t\(.*\)\t\(.*\)/\1 --\3=\2/')

for DISK_AND_LOCATION in $COMMIT_DISKS
do
IFS=$' '
gcloud compute disks delete --verbosity=info ${DISK_AND_LOCATION} || continue
IFS=$'\n'
done

IFS=$'\n'
# Disks created by managed instance groups, and other jobs that start with "zebrad-"
ZEBRAD_DISKS=$(gcloud compute disks list --sort-by=creationTimestamp --filter="name~^zebrad- AND creationTimestamp < $DELETE_BEFORE_DATE" --format='value(NAME,LOCATION,LOCATION_SCOPE)' | \
sed 's/\(.*\)\t\(.*\)\t\(.*\)/\1 --\3=\2/')

for DISK_AND_LOCATION in $ZEBRAD_DISKS
do
IFS=$' '
gcloud compute disks delete --verbosity=info ${DISK_AND_LOCATION} || continue
IFS=$'\n'
done
./.github/workflows/scripts/gcp-delete-old-disks.sh

# Deletes mainnet and testnet cache images older than $DELETE_AGE_DAYS days.
#
@@ -125,108 +92,9 @@ jobs:
#
# TODO:
# - refactor out repeated shell script code
- name: Delete old cache disks
- name: Delete old cache images
run: |
DELETE_BEFORE_DATE=$(date --date="$DELETE_AGE_DAYS days ago" '+%Y%m%d')

# As of April 2023, these disk names look like:
# zebrad-cache-6039-merge-62c8ecc-v25-mainnet-checkpoint-053559
#
# Mainnet zebrad checkpoint
ZEBRAD_MAINNET_CHECKPOINT_IMAGES=$(gcloud compute images list --sort-by=~creationTimestamp --filter="name~^zebrad-cache-.*-mainnet-checkpoint AND creationTimestamp < $DELETE_BEFORE_DATE" --format='value(NAME)')
KEPT_IMAGES=0
for IMAGE in $ZEBRAD_MAINNET_CHECKPOINT_IMAGES
do
if [[ "$KEPT_IMAGES" -lt "$KEEP_LATEST_IMAGE_COUNT" ]];
then
KEPT_IMAGES=$((KEPT_IMAGES+1))
echo "Keeping image $KEPT_IMAGES named $IMAGE"
continue
fi

gcloud compute images delete "${IMAGE}" || continue
done

# Testnet zebrad checkpoint
ZEBRAD_TESTNET_CHECKPOINT_IMAGES=$(gcloud compute images list --sort-by=~creationTimestamp --filter="name~^zebrad-cache-.*-testnet-checkpoint AND creationTimestamp < $DELETE_BEFORE_DATE" --format='value(NAME)')
KEPT_IMAGES=0
for IMAGE in $ZEBRAD_TESTNET_CHECKPOINT_IMAGES
do
if [[ "$KEPT_IMAGES" -lt "$KEEP_LATEST_IMAGE_COUNT" ]];
then
KEPT_IMAGES=$((KEPT_IMAGES+1))
echo "Keeping image $KEPT_IMAGES named $IMAGE"
continue
fi

gcloud compute images delete "${IMAGE}" || continue
done

# As of April 2023, these disk names look like:
# zebrad-cache-6556-merge-a2ca4de-v25-mainnet-tip(-u)?-140654
#
# Mainnet zebrad tip
ZEBRAD_MAINNET_TIP_IMAGES=$(gcloud compute images list --sort-by=~creationTimestamp --filter="name~^zebrad-cache-.*-mainnet-tip AND creationTimestamp < $DELETE_BEFORE_DATE" --format='value(NAME)')
KEPT_IMAGES=0
for IMAGE in $ZEBRAD_MAINNET_TIP_IMAGES
do
if [[ "$KEPT_IMAGES" -lt "$KEEP_LATEST_IMAGE_COUNT" ]];
then
KEPT_IMAGES=$((KEPT_IMAGES+1))
echo "Keeping image $KEPT_IMAGES named $IMAGE"
continue
fi

gcloud compute images delete "${IMAGE}" || continue
done

# Testnet zebrad tip
ZEBRAD_TESTNET_TIP_IMAGES=$(gcloud compute images list --sort-by=~creationTimestamp --filter="name~^zebrad-cache-.*-testnet-tip AND creationTimestamp < $DELETE_BEFORE_DATE" --format='value(NAME)')
KEPT_IMAGES=0
for IMAGE in $ZEBRAD_TESTNET_TIP_IMAGES
do
if [[ "$KEPT_IMAGES" -lt "$KEEP_LATEST_IMAGE_COUNT" ]];
then
KEPT_IMAGES=$((KEPT_IMAGES+1))
echo "Keeping image $KEPT_IMAGES named $IMAGE"
continue
fi

gcloud compute images delete "${IMAGE}" || continue
done

# As of April 2023, these disk names look like:
# lwd-cache-main-fb3fec0-v25-mainnet-tip(-u)?-061314
#
# Mainnet lightwalletd tip
LWD_MAINNET_TIP_IMAGES=$(gcloud compute images list --sort-by=~creationTimestamp --filter="name~^lwd-cache-.*-mainnet-tip AND creationTimestamp < $DELETE_BEFORE_DATE" --format='value(NAME)')
KEPT_IMAGES=0
for IMAGE in $LWD_MAINNET_TIP_IMAGES
do
if [[ "$KEPT_IMAGES" -lt "$KEEP_LATEST_IMAGE_COUNT" ]];
then
KEPT_IMAGES=$((KEPT_IMAGES+1))
echo "Keeping image $KEPT_IMAGES named $IMAGE"
continue
fi

gcloud compute images delete "${IMAGE}" || continue
done

# Testnet lightwalletd tip
LWD_TESTNET_TIP_IMAGES=$(gcloud compute images list --sort-by=~creationTimestamp --filter="name~^lwd-cache-.*-testnet-tip AND creationTimestamp < $DELETE_BEFORE_DATE" --format='value(NAME)')
KEPT_IMAGES=0
for IMAGE in $LWD_TESTNET_TIP_IMAGES
do
if [[ "$KEPT_IMAGES" -lt "$KEEP_LATEST_IMAGE_COUNT" ]];
then
KEPT_IMAGES=$((KEPT_IMAGES+1))
echo "Keeping image $KEPT_IMAGES named $IMAGE"
continue
fi

gcloud compute images delete "${IMAGE}" || continue
done
./.github/workflows/scripts/gcp-delete-old-cache-images.sh

# We're using a generic approach here, which allows multiple registries to be included,
# even those not related to GCP. Enough reason to create a separate job.
19 changes: 1 addition & 18 deletions .github/workflows/release-crates-io.yml
@@ -93,26 +93,9 @@ jobs:
#
# These steps should be kept up to date with the release checklist.
#
# TODO: move these steps into a script which is run in the release checklist and CI
- name: Crate release dry run
run: |
set -ex
git config --global user.email "[email protected]"
git config --global user.name "Automated Release Test"
# This script must be the same as:
# https://github.com/ZcashFoundation/zebra/blob/main/.github/PULL_REQUEST_TEMPLATE/release-checklist.md#update-crate-versions
# with an extra `--no-confirm` argument for non-interactive testing.
cargo release version --verbose --execute --no-confirm --allow-branch '*' --workspace --exclude zebrad beta
cargo release version --verbose --execute --no-confirm --allow-branch '*' --package zebrad patch
cargo release replace --verbose --execute --no-confirm --allow-branch '*' --package zebrad
cargo release commit --verbose --execute --no-confirm --allow-branch '*'
# Check the release will work using a dry run
#
# Workaround unpublished dependency version errors by skipping those crates:
# https://github.com/crate-ci/cargo-release/issues/691
#
# TODO: check all crates after fixing these errors
cargo release publish --verbose --dry-run --allow-branch '*' --workspace --exclude zebra-consensus --exclude zebra-utils --exclude zebrad
./.github/workflows/scripts/release-crates-dry-run.sh

# TODO: actually do the release here
#release-crates:
45 changes: 45 additions & 0 deletions .github/workflows/scripts/gcp-delete-old-cache-images.sh
@@ -0,0 +1,45 @@
#!/usr/bin/env bash

# Function to handle image deletion logic
delete_images() {
    local image_type="$1"
    local filter="$2"
    local kept_images=0
    # Keep the loop variables local so repeated calls do not leak state
    local images image

    echo "Processing ${image_type} images"
    images=$(gcloud compute images list --sort-by=~creationTimestamp --filter="${filter} AND creationTimestamp < ${DELETE_BEFORE_DATE}" --format='value(NAME)')

    for image in ${images}; do
        if [[ "${kept_images}" -lt "${KEEP_LATEST_IMAGE_COUNT}" ]]; then
            # Plain assignment: ((kept_images++)) returns a non-zero status when the old value is 0
            kept_images=$((kept_images + 1))
            echo "Keeping image ${kept_images} named ${image}"
        else
            echo "Deleting image: ${image}"
            gcloud compute images delete "${image}" || echo "Failed to delete image: ${image}"
        fi
    done
}

# Check if necessary variables are set
if ! [[ "${DELETE_AGE_DAYS}" =~ ^[0-9]+$ && "${KEEP_LATEST_IMAGE_COUNT}" =~ ^[0-9]+$ ]]; then
    echo "ERROR: One or more required variables are not set or not numeric"
    exit 1
fi

# Set pipefail
set -o pipefail

# Calculate the date before which images should be deleted
DELETE_BEFORE_DATE=$(date --date="${DELETE_AGE_DAYS} days ago" '+%Y%m%d')

# Mainnet and Testnet zebrad checkpoint
delete_images "Mainnet zebrad checkpoint" "name~^zebrad-cache-.*-mainnet-checkpoint" # As of April 2023, these image names look like: zebrad-cache-6039-merge-62c8ecc-v25-mainnet-checkpoint-053559
delete_images "Testnet zebrad checkpoint" "name~^zebrad-cache-.*-testnet-checkpoint"

# Mainnet and Testnet zebrad tip
delete_images "Mainnet zebrad tip" "name~^zebrad-cache-.*-mainnet-tip"
delete_images "Testnet zebrad tip" "name~^zebrad-cache-.*-testnet-tip"

# Mainnet and Testnet lightwalletd tip
delete_images "Mainnet lightwalletd tip" "name~^lwd-cache-.*-mainnet-tip"
delete_images "Testnet lightwalletd tip" "name~^lwd-cache-.*-testnet-tip"
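Review note: the keep-the-newest-N rule in `delete_images` can be checked without touching GCP. The sketch below is an assumption-laden illustration, not code from this PR: `plan_image_cleanup` is a hypothetical helper, the image names are made up, and the list is assumed to arrive newest first (as `--sort-by=~creationTimestamp` is meant to provide).

```shell
#!/usr/bin/env bash

# Hypothetical helper, not part of the PR: classify image names, assumed to be
# ordered newest first, into the ones to keep and the ones to delete.
plan_image_cleanup() {
    local keep_count="$1"
    shift
    local kept=0
    local image
    for image in "$@"; do
        if [[ "${kept}" -lt "${keep_count}" ]]; then
            kept=$((kept + 1))
            echo "keep ${image}"
        else
            echo "delete ${image}"
        fi
    done
}

# Example with made-up image names, newest first: keeps the first two.
plan_image_cleanup 2 image-c image-b image-a
```

Separating the decision from the `gcloud compute images delete` call would make the retention rule easy to exercise in CI before any resource is removed.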
49 changes: 49 additions & 0 deletions .github/workflows/scripts/gcp-delete-old-disks.sh
@@ -0,0 +1,49 @@
#!/usr/bin/env bash

# Check if DELETE_AGE_DAYS is set and is a number
if ! [[ "${DELETE_AGE_DAYS}" =~ ^[0-9]+$ ]]; then
    echo "ERROR: DELETE_AGE_DAYS is not set or not a number"
    exit 1
fi

# Set pipefail to catch errors in pipelines
set -o pipefail

# Calculate the date before which disks should be deleted
DELETE_BEFORE_DATE=$(date --date="${DELETE_AGE_DAYS} days ago" '+%Y%m%d')

# Fetch disks created by PR jobs, and other jobs that use a commit hash
if ! COMMIT_DISKS=$(gcloud compute disks list --sort-by=creationTimestamp --filter="name~-[0-9a-f]{7,}$ AND creationTimestamp < ${DELETE_BEFORE_DATE}" --format='value(NAME,LOCATION,LOCATION_SCOPE)' | sed 's/\(.*\)\t\(.*\)\t\(.*\)/\1 --\3=\2/'); then
    echo "Error fetching COMMIT_DISKS."
    exit 1
fi

# Delete commit disks if any are found
IFS=$'\n'
for DISK_AND_LOCATION in ${COMMIT_DISKS}; do
    IFS=$' '
    echo "Deleting disk: ${DISK_AND_LOCATION}"
    # Intentionally unquoted: the value holds the disk name and a location flag,
    # which must be split into two separate arguments.
    if ! gcloud compute disks delete --verbosity=info ${DISK_AND_LOCATION}; then
        echo "Failed to delete disk: ${DISK_AND_LOCATION}"
    fi
    IFS=$'\n'
done
IFS=$' \t\n' # Reset IFS to its default value

# Fetch disks created by managed instance groups, and other jobs that start with "zebrad-"
if ! ZEBRAD_DISKS=$(gcloud compute disks list --sort-by=creationTimestamp --filter="name~^zebrad- AND creationTimestamp < ${DELETE_BEFORE_DATE}" --format='value(NAME,LOCATION,LOCATION_SCOPE)' | sed 's/\(.*\)\t\(.*\)\t\(.*\)/\1 --\3=\2/'); then
    echo "Error fetching ZEBRAD_DISKS."
    exit 1
fi

# Delete zebrad disks if any are found
IFS=$'\n'
for DISK_AND_LOCATION in ${ZEBRAD_DISKS}; do
    IFS=$' '
    echo "Deleting disk: ${DISK_AND_LOCATION}"
    # Intentionally unquoted: the value holds the disk name and a location flag,
    # which must be split into two separate arguments.
    if ! gcloud compute disks delete --verbosity=info ${DISK_AND_LOCATION}; then
        echo "Failed to delete disk: ${DISK_AND_LOCATION}"
    fi
    IFS=$'\n'
done
IFS=$' \t\n' # Reset IFS to its default value
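Review note: the `IFS` juggling exists only because each row carries both a disk name and a location flag. A possible alternative, sketched here under assumptions, is to split each tab-separated row into separate words up front and collect them in an array. `row_to_delete_args` is a hypothetical helper, the row values are made up, and the final command is only printed, never run.

```shell
#!/usr/bin/env bash

# Hypothetical helper, not part of the PR: split one
# "NAME<TAB>LOCATION<TAB>LOCATION_SCOPE" row into two output lines,
# the disk name and the matching "--SCOPE=LOCATION" flag.
row_to_delete_args() {
    local name location scope
    IFS=$'\t' read -r name location scope <<< "$1"
    printf '%s\n' "${name}" "--${scope}=${location}"
}

# Collect the two words into an array so they can be passed as separate,
# safely quoted arguments. The row below is made up for illustration.
args=()
while IFS= read -r arg; do
    args+=("${arg}")
done < <(row_to_delete_args $'example-disk-abc1234\texample-location\tzone')

# Print the command instead of running it.
echo "gcloud compute disks delete --verbosity=info ${args[*]}"
```

With an array, `"${args[@]}"` can stay quoted and the global `IFS` never has to change, at the cost of a few more lines per loop.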
42 changes: 42 additions & 0 deletions .github/workflows/scripts/gcp-delete-old-instances.sh
@@ -0,0 +1,42 @@
#!/usr/bin/env bash

# Check if DELETE_INSTANCE_DAYS is set and is a number
if ! [[ "${DELETE_INSTANCE_DAYS}" =~ ^[0-9]+$ ]]; then
    echo "ERROR: DELETE_INSTANCE_DAYS is not set or not a number"
    exit 1
fi

# Set pipefail to catch errors in pipelines
set -o pipefail

# Calculate the date before which instances should be deleted
DELETE_BEFORE_DATE=$(date --date="${DELETE_INSTANCE_DAYS} days ago" '+%Y%m%d')

# Check if gcloud command is available
if ! command -v gcloud &> /dev/null; then
    echo "ERROR: gcloud command not found"
    exit 1
fi

# Fetch the list of instances to delete
if ! INSTANCES=$(gcloud compute instances list --sort-by=creationTimestamp --filter="name~-[0-9a-f]{7,}$ AND creationTimestamp < ${DELETE_BEFORE_DATE}" --format='value(NAME,ZONE)' | sed 's/\(.*\)\t\(.*\)/\1 --zone=\2/'); then
    echo "Error fetching instances."
    exit 1
fi

# Delete instances if any are found
if [[ -n "${INSTANCES}" ]]; then
    IFS=$'\n'
    for INSTANCE_AND_ZONE in ${INSTANCES}; do
        IFS=$' '
        echo "Deleting instance: ${INSTANCE_AND_ZONE}"
        # Intentionally unquoted: the value holds the instance name and a --zone flag,
        # which must be split into two separate arguments.
        gcloud compute instances delete --verbosity=info ${INSTANCE_AND_ZONE} --delete-disks=all || {
            echo "Failed to delete instance: ${INSTANCE_AND_ZONE}"
            continue
        }
        IFS=$'\n'
    done
    IFS=$' \t\n' # Reset IFS to its default value
else
    echo "No instances to delete."
fi
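Review note: the `name~-[0-9a-f]{7,}$` filter decides which resources are eligible for deletion, so it is worth sanity-checking against sample names. The sketch below approximates it with bash's own regex matching, which is an assumption: gcloud's filter engine may not behave identically. `has_commit_suffix` is a hypothetical helper and the first name is made up; the second is the image name quoted in the workflow comments.

```shell
#!/usr/bin/env bash

# Hypothetical helper, not part of the PR: succeed when a resource name ends in
# a hyphen followed by at least seven lowercase hex characters.
has_commit_suffix() {
    [[ "$1" =~ -[0-9a-f]{7,}$ ]]
}

# The first name is made up; the second ends in a six-digit suffix, so the
# pattern should not select it.
for name in "example-instance-62c8ecc" "zebrad-cache-6039-merge-62c8ecc-v25-mainnet-checkpoint-053559"; do
    if has_commit_suffix "${name}"; then
        echo "match: ${name}"
    else
        echo "no match: ${name}"
    fi
done
```

Because the pattern is anchored only at the end, a short hash embedded in the middle of a name does not make the resource eligible.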
33 changes: 33 additions & 0 deletions .github/workflows/scripts/gcp-delete-old-templates.sh
@@ -0,0 +1,33 @@
#!/usr/bin/env bash

# Check if DELETE_AGE_DAYS is set and is a number
if ! [[ "${DELETE_AGE_DAYS}" =~ ^[0-9]+$ ]]; then
    echo "ERROR: DELETE_AGE_DAYS is not set or not a number"
    exit 1
fi

# Set pipefail to catch errors in pipelines
set -o pipefail

# Calculate the date before which templates should be deleted
DELETE_BEFORE_DATE=$(date --date="${DELETE_AGE_DAYS} days ago" '+%Y%m%d')

# Check if gcloud command is available
if ! command -v gcloud &> /dev/null; then
    echo "ERROR: gcloud command not found"
    exit 1
fi

# Fetch the list of instance templates to delete
if ! TEMPLATES=$(gcloud compute instance-templates list --sort-by=creationTimestamp --filter="name~-[0-9a-f]{7,}$ AND creationTimestamp < ${DELETE_BEFORE_DATE}" --format='value(NAME)'); then
    echo "Error fetching instance templates."
    exit 1
fi

# Delete templates if any are found
for TEMPLATE in ${TEMPLATES}; do
    echo "Deleting template: ${TEMPLATE}"
    if ! gcloud compute instance-templates delete "${TEMPLATE}"; then
        echo "Failed to delete template: ${TEMPLATE}"
    fi
done
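Review note: all four scripts repeat the same "is set and is a number" check. One way to share it, sketched here as an assumption rather than something this PR does, is a small function that takes the variable name and uses bash indirect expansion. `require_numeric` and `EXAMPLE_DAYS` are hypothetical names.

```shell
#!/usr/bin/env bash

# Hypothetical helper, not part of the PR: succeed only when the variable whose
# name is passed in is set to a non-negative integer.
require_numeric() {
    local name="$1"
    # Indirect expansion: read the value of the variable called ${name},
    # falling back to an empty string when it is unset.
    local value="${!name:-}"
    if ! [[ "${value}" =~ ^[0-9]+$ ]]; then
        echo "ERROR: ${name} is not set or not a number" >&2
        return 1
    fi
}

# Example with a made-up variable.
EXAMPLE_DAYS=2
require_numeric EXAMPLE_DAYS && echo "EXAMPLE_DAYS is valid"
```

A script could then start with `require_numeric DELETE_AGE_DAYS || exit 1`, keeping the error message format consistent across files.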