How to recover from failed deploy when using ASG lifecycle scripts? #42

Open
attekei opened this issue May 22, 2016 · 4 comments

attekei commented May 22, 2016

We just started using CodeDeploy with ASG and ELB, and we achieve zero downtime when we use register_with_elb.sh and deregister_from_elb.sh. Today I simulated a situation where a deploy step before deregister_from_elb.sh fails. In that case, the server instance in ASG stays in Standby mode and the capacity of ASG is decreased by one because deregister_from_elb.sh is never executed.

Currently, I have to manually increase the ASG capacity back to normal and move the server instance out of Standby. This isn't acceptable. Is there a way to do proper cleanup automatically after a failed deploy?

The only way that comes to mind is to add error-handling logic to every deploy script I run between the register_with_elb.sh and deregister_from_elb.sh scripts.
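
A minimal sketch of what such per-script error handling might look like, assuming a shell `EXIT` trap. The names `restore_instance`, `on_exit`, and `install_failure_handler` are hypothetical and do not come from the sample scripts, and the body of `restore_instance` is only a placeholder for whatever actually returns the instance to service.

```shell
#!/bin/bash
# Hypothetical error handling for a single deploy hook script.

# Placeholder (assumption): replace the echo with the real recovery step,
# for example the logic that moves the instance out of Standby.
restore_instance() {
    echo "restoring instance to service"
}

# EXIT handler: runs on every exit, but only acts on a non-zero status.
on_exit() {
    local status=$?
    if [ "$status" -ne 0 ]; then
        echo "deploy step failed with status $status" >&2
        restore_instance
    fi
}

# Call this near the top of a deploy script to arm the handler.
install_failure_handler() {
    trap on_exit EXIT
}
```

Each deploy script would call `install_failure_handler` before doing any work. The script still exits with its original status, so the deployment is still reported as failed. The drawback is the one already noted: the handler has to be wired into every script.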

yyolk commented May 23, 2016

Not sure if this helps, but I've also run into this issue in the past when a code revision was cancelled or failed.

attekei commented May 25, 2016

@yyolk Interesting! I think that CodeDeploy should have a more general mechanism for running a cleanup script after an error in a deploy.

Actually, that kind of functionality could be implemented in aws-codedeploy-agent. I could file a feature request there, or even open an experimental pull request, if that would help prioritize the feature.

One remotely similar mechanism is the rescue block in Ansible:
http://docs.ansible.com/ansible/playbooks_blocks.html
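
To make the comparison concrete, here is a rough shell sketch of rescue-style semantics: run a list of steps, call a rescue handler on the first failure, and always run a final handler. `run_block` and the handler names passed to it are hypothetical; this is not an existing CodeDeploy or agent feature.

```shell
#!/bin/bash
# Hypothetical rescue-style runner, loosely modelled on block/rescue/always.

# run_block RESCUE_FN ALWAYS_FN STEP...
# Runs each STEP in order. On the first failure it calls RESCUE_FN with the
# name of the failing step and skips the remaining steps. ALWAYS_FN runs in
# every case. Returns the status of the failing step, or 0 on success.
run_block() {
    local rescue_fn="$1"
    local always_fn="$2"
    shift 2
    local status=0
    local step
    for step in "$@"; do
        # "|| status=$?" keeps the loop alive even under "set -e".
        "$step" || status=$?
        if [ "$status" -ne 0 ]; then
            echo "step '$step' failed with status $status" >&2
            "$rescue_fn" "$step" || echo "rescue handler failed" >&2
            break
        fi
    done
    "$always_fn" || true
    return "$status"
}
```

A deploy wrapper could then call something like `run_block move_out_of_standby log_result stop_app install_app start_app`, where all five names are illustrative functions defined by the caller.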

amartyag commented

Hi,

I have added a cleanup hook to our feature requests.

Thanks,
Amartya

Jmcfar added the bug label on Jul 5, 2016

schmohlio commented Apr 19, 2017

Is there a reason you can't run deregister_from_elb.sh at the very start of your ApplicationStart hook, guaranteeing that it happens first? Did you mean register_with_elb.sh?

In the case of a rollback, can't you make your ApplicationStop and ApplicationStart scripts idempotent?
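
One possible shape for an idempotent step, sketched under the assumption that the AWS CLI is installed on the instance and has permission to call the Auto Scaling API. The function names `get_lifecycle_state` and `ensure_in_service` are hypothetical; the instance ID and group name are supplied by the caller.

```shell
#!/bin/bash
# Sketch of an idempotent "leave Standby" step: safe to run repeatedly.

# Print the Auto Scaling lifecycle state of an instance,
# for example "InService" or "Standby".
get_lifecycle_state() {
    local instance_id="$1"
    aws autoscaling describe-auto-scaling-instances \
        --instance-ids "$instance_id" \
        --query 'AutoScalingInstances[0].LifecycleState' \
        --output text
}

# ensure_in_service INSTANCE_ID ASG_NAME
# Does nothing if the instance is already InService, and calls exit-standby
# only when the instance is actually in Standby.
ensure_in_service() {
    local instance_id="$1"
    local asg_name="$2"
    local state
    state=$(get_lifecycle_state "$instance_id") || return 1
    case "$state" in
        InService)
            echo "$instance_id already InService; nothing to do"
            ;;
        Standby)
            echo "$instance_id is in Standby; moving it back into service"
            aws autoscaling exit-standby \
                --instance-ids "$instance_id" \
                --auto-scaling-group-name "$asg_name"
            ;;
        *)
            echo "unexpected lifecycle state '$state' for $instance_id" >&2
            return 1
            ;;
    esac
}
```

Because the function checks the current state before acting, it can run after a rollback, a retry, or a normal deploy without failing on an instance that is already in service.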
