Fix flakey TestManifestPulledDoesNotDependOnContainerOrdering integration test #4303

Merged
tinnywang merged 1 commit into aws:dev from the fix_flakey_integ_test branch on Aug 20, 2024

Conversation

tinnywang
Contributor

Summary

Fix flakey TestManifestPulledDoesNotDependOnContainerOrdering integration test.

--- FAIL: TestManifestPulledDoesNotDependOnContainerOrdering (23.82s)
    --- FAIL: TestManifestPulledDoesNotDependOnContainerOrdering/0 (6.14s)
        engine_integ_test.go:650: 
            	Error Trace:	/opt/amazon-ecs-agent/go/src/github.com/aws/amazon-ecs-agent/agent/engine/engine_integ_test.go:650
            	Error:      	Not equal: 
            	            	expected: 4
            	            	actual  : 6
            	Test:       	TestManifestPulledDoesNotDependOnContainerOrdering/0

The test runs a task with two containers and asserts that the first container is in the RUNNING state. Occasionally, the first container exits and enters the STOPPED state before the assertion is made, so the test fails.

first := createTestContainerWithImageAndName(testRegistryImage, "first")
first.Command = []string{"sh", "-c", "sleep 60"}

The command suggests that the container runs for 60 seconds, but it actually dies almost immediately after starting: it uses our netkitten image, which does not accept sh -c "sleep 60" as its command.

Reproducing this locally yields:

> make netkitten
> docker run amazon/amazon-ecs-netkitten:make sh -c "sleep 60"
2024/08/20 00:40:42 Error connecting to target: dial tcp: address sh: missing port in address
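The "missing port in address" text in the log above is the message Go's net package produces when asked to dial an address that has no port. The sketch below is not netkitten's code; it only assumes that a program treats its first argument as a TCP target, and dialTarget is a hypothetical helper used for illustration.

```go
package main

import (
	"fmt"
	"net"
)

// dialTarget is a hypothetical helper: it treats target as a TCP address,
// the way a program would if it read the address from its first argument.
func dialTarget(target string) error {
	conn, err := net.Dial("tcp", target)
	if err != nil {
		// The address is parsed before any connection attempt, so a
		// target such as "sh" fails immediately with a parse error.
		return fmt.Errorf("connecting to target: %w", err)
	}
	return conn.Close()
}

func main() {
	// "sh" is what such a program sees when the container command is
	// sh -c "sleep 60": it is not a host:port pair, so dialing fails.
	if err := dialTarget("sh"); err != nil {
		fmt.Println(err)
	}
}
```

A process that exits on this error would stop the container right away, which is consistent with the first container reaching STOPPED before the assertion runs.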

Implementation details

Change the container command to -loop=true, which will keep the container running indefinitely until it's told to stop.

// GetLongRunningCommand returns the command that keeps the container running for the container
// that uses the default integ test image (amazon/amazon-ecs-netkitten for unix)
func GetLongRunningCommand() []string {
	return []string{"-loop=true"}
}
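A minimal sketch of how the test setup might use this helper in place of the sleep command. The Container struct here is a simplified, hypothetical stand-in for the agent's container type, kept only so the example is self-contained; the image tag is the one from the local reproduction.

```go
package main

import "fmt"

// Container is a hypothetical, simplified stand-in for the agent's
// container type; only the fields needed for this sketch are included.
type Container struct {
	Name    string
	Image   string
	Command []string
}

// GetLongRunningCommand returns the command that keeps a netkitten-based
// container running until it is told to stop.
func GetLongRunningCommand() []string {
	return []string{"-loop=true"}
}

func main() {
	first := Container{Name: "first", Image: "amazon/amazon-ecs-netkitten:make"}

	// Previously: first.Command = []string{"sh", "-c", "sleep 60"},
	// which the netkitten image does not accept.
	first.Command = GetLongRunningCommand()

	fmt.Println(first.Name, first.Command)
}
```

Keeping the command in one helper means every test that needs a long-running container picks up the image-appropriate command from a single place.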

Testing

Ran the flaky test 50 times; every run passed.

go test -tags integration ./agent/engine -count 50 -v -run TestManifestPulledDoesNotDependOnContainerOrdering

New tests cover the changes: no

Licensing

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@tinnywang tinnywang marked this pull request as ready for review August 20, 2024 01:50
@tinnywang tinnywang requested a review from a team as a code owner August 20, 2024 01:50
@tinnywang tinnywang merged commit 68d868a into aws:dev Aug 20, 2024
40 checks passed
@tinnywang tinnywang deleted the fix_flakey_integ_test branch August 20, 2024 17:55