[CONFIG-329][CONFIG-330] End to end snapshot integrity check #124
370ddb7 to a09dc8d
Testing section in the PR description is updated with code changes.
Nice work 🔥
This is great! Super minor nit about CheckSum -> Checksum, otherwise, let's 🚢 🇮🇹
Description
Why
This change was prompted by a recent, previously unseen error from ctlstore-reflector:
exec dml statement error: database disk image is malformed
We suspect the culprit is s5cmd, which may misbehave in one of two ways. In both cases, k8s's retry mechanism for the init container does not get triggered. So we introduce an end-to-end integrity check that makes the container bail out with a failure exit code.
Testing
Testing completed successfully in stage-euw1:eu-west-1:centrifuge-destinations/ctlstore. When a false negative is forced, the init container exits with code 1. Note that if a Pod's init container fails, the kubelet repeatedly restarts that init container until it succeeds.