[Flaky Test]: <TestConnInfoConnCloseThenAnotherConn> – failed to start connection credentials listener: listen unix ...: bind: invalid argument #6977

Open
belimawr opened this issue Feb 21, 2025 · 2 comments
Labels
flaky-test Unstable or unreliable test cases. Team:Elastic-Agent Label for the Agent team

Comments

belimawr commented Feb 21, 2025

Failing test case

TestConnInfoConnCloseThenAnotherConn

Error message

conn_info_server_test.go:190: failed to start connection credentials listener: listen unix /var/folders/1w/pb98dgl15sd6jdcx2yy6j2500000gn/T/TestConnInfoConnCloseThenAnotherConn3117723250/001/.teaci.sock: bind: invalid argument

Build

Local run of the tests

OS

Linux, Mac

Stacktrace and notes

This test fails because it creates a unix socket at a path that can exceed the maximum length the OS allows for a unix socket address.

man unix on darwin says:

UNIX-domain addresses are variable-length filesystem pathnames of at most 104 characters.

On the listed example the path is 111 characters long.

The issue can also affect Linux:

$ cat /usr/include/linux/un.h | grep "define UNIX_PATH_MAX"
#define UNIX_PATH_MAX   108

The test usually passes on Linux because the generated path is shorter than 100 characters, for example: /tmp/TestConnInfoConnCloseThenAnotherConn233963498/001/.teaci.sock
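To make the arithmetic concrete, here is a small Go sketch that measures the two paths quoted in this issue against the limits above. The helper names (`sunPathSize`, `fitsSunPath`) are made up for illustration, and the check assumes one byte of the buffer is reserved for the trailing NUL, so a path has to be strictly shorter than the quoted size.

```go
package main

import "fmt"

// Paths copied verbatim from this issue.
const (
	darwinPath = "/var/folders/1w/pb98dgl15sd6jdcx2yy6j2500000gn/T/TestConnInfoConnCloseThenAnotherConn3117723250/001/.teaci.sock"
	linuxPath  = "/tmp/TestConnInfoConnCloseThenAnotherConn233963498/001/.teaci.sock"
)

// sunPathSize returns the socket path buffer size quoted in this issue:
// 108 on Linux, 104 on darwin. Using 104 for every other OS is an
// assumption (it is simply the smaller of the two values).
func sunPathSize(goos string) int {
	if goos == "linux" {
		return 108
	}
	return 104
}

// fitsSunPath reports whether path fits in the buffer for goos.
// Assumption: one byte is kept for the trailing NUL.
func fitsSunPath(path, goos string) bool {
	return len(path) < sunPathSize(goos)
}

func main() {
	fmt.Println(len(darwinPath), fitsSunPath(darwinPath, "darwin")) // prints: 111 false
	fmt.Println(len(linuxPath), fitsSunPath(linuxPath, "linux"))    // prints: 66 true
}
```

Note that the darwin path would not fit on Linux either, since 111 is above both limits.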

tiago@Not-A-Linux~/devel/elastic-agent/pkg/component/runtime % go test -run=TestConnInfoConnCloseThenAnotherConn -v -count=1
=== RUN   TestConnInfoConnCloseThenAnotherConn
=== RUN   TestConnInfoConnCloseThenAnotherConn/port
=== RUN   TestConnInfoConnCloseThenAnotherConn/local
    conn_info_server_test.go:190: failed to start connection credentials listener: listen unix /var/folders/1w/pb98dgl15sd6jdcx2yy6j2500000gn/T/TestConnInfoConnCloseThenAnotherConn3117723250/001/.teaci.sock: bind: invalid argument
--- FAIL: TestConnInfoConnCloseThenAnotherConn (0.04s)
    --- PASS: TestConnInfoConnCloseThenAnotherConn/port (0.00s)
    --- FAIL: TestConnInfoConnCloseThenAnotherConn/local (0.04s)
FAIL
exit status 1
FAIL    github.com/elastic/elastic-agent/pkg/component/runtime  0.456s
1:WARN tiago@Not-A-Linux~/devel/elastic-agent/pkg/component/runtime %

The problem comes from the getAddress function, called by runTests, which generates the socket address without validating its length.

func getAddress(dir string, isLocal bool) string {
	if isLocal {
		u := url.URL{}
		u.Path = "/"
		if runtime.GOOS == "windows" {
			u.Scheme = "npipe"
			return u.JoinPath("/", testSock).String()
		}
		u.Scheme = "unix"
		return u.JoinPath(dir, testSock).String()
	}
	return fmt.Sprintf("127.0.0.1:%d", testPort)
}

func runTests(t *testing.T, fn func(*testing.T, string)) {
	sockdir := t.TempDir()
	tests := []struct {
		name    string
		address string
	}{
		{
			name:    "port",
			address: getAddress("", false),
		},
		{
			name:    "local",
			address: getAddress(sockdir, true),
		},
	}
	for _, tc := range tests {
		t.Run(tc.name, func(t *testing.T) {
			fn(t, tc.address)
		})
	}
}
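One possible way to surface the problem earlier is to validate the length when the address is built, so the test fails with a clear message instead of an opaque `bind: invalid argument`. The sketch below is not code from the repository: `unixAddress`, `maxSockPathLen`, and `errSockPathTooLong` are hypothetical names, it only covers the unix branch of getAddress, and the limit of 103 assumes darwin's 104-byte buffer minus a trailing NUL.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"path/filepath"
)

const (
	// testSock matches the socket file name visible in the error message.
	testSock = ".teaci.sock"
	// maxSockPathLen is an assumption: the 104 bytes quoted for darwin,
	// minus one byte for the trailing NUL.
	maxSockPathLen = 103
)

// errSockPathTooLong is a hypothetical sentinel error for this sketch.
var errSockPathTooLong = errors.New("unix socket path too long")

// unixAddress builds the same kind of unix:// URL as the local branch of
// getAddress, but returns an error when the socket path would not fit.
func unixAddress(dir string) (string, error) {
	p := filepath.Join(dir, testSock)
	if len(p) > maxSockPathLen {
		return "", fmt.Errorf("%w: %d bytes, limit %d: %s",
			errSockPathTooLong, len(p), maxSockPathLen, p)
	}
	u := url.URL{Scheme: "unix", Path: "/"}
	return u.JoinPath(dir, testSock).String(), nil
}

func main() {
	// Directory taken from the Linux example in this issue.
	addr, err := unixAddress("/tmp/TestConnInfoConnCloseThenAnotherConn233963498/001")
	fmt.Println(addr, err)

	// Directory taken from the failing darwin run in this issue.
	_, err = unixAddress("/var/folders/1w/pb98dgl15sd6jdcx2yy6j2500000gn/T/TestConnInfoConnCloseThenAnotherConn3117723250/001")
	fmt.Println(errors.Is(err, errSockPathTooLong))
}
```

Validation alone only improves the error message; the test would still need a shorter directory to pass on systems with long temporary paths.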

@elasticmachine

Pinging @elastic/elastic-agent (Team:Elastic-Agent)


belimawr commented Feb 21, 2025

We already have some code to handle socket paths that are too long:

Using it in our tests should fix the problem.
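The existing code referred to in this comment is not reproduced here, so the following is only a rough illustration of the general idea, not the project's helper. It assumes a unix-like system with a writable `/tmp`, and `shortSockPath` and `maxSockPathLen` are hypothetical names: when the preferred directory would make the socket path too long, it falls back to a short temporary directory.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
)

// maxSockPathLen is an assumption: the 104 bytes quoted for darwin,
// minus one byte for the trailing NUL.
const maxSockPathLen = 103

// shortSockPath returns a socket path inside preferred when it fits.
// Otherwise it creates a short directory under /tmp (assumed to exist)
// and returns a path inside that, plus a cleanup function to remove it.
func shortSockPath(preferred, name string) (string, func(), error) {
	p := filepath.Join(preferred, name)
	if len(p) <= maxSockPathLen {
		return p, func() {}, nil
	}
	dir, err := os.MkdirTemp("/tmp", "s")
	if err != nil {
		return "", nil, err
	}
	cleanup := func() { _ = os.RemoveAll(dir) }
	p = filepath.Join(dir, name)
	if len(p) > maxSockPathLen {
		cleanup()
		return "", nil, fmt.Errorf("socket path still too long: %s", p)
	}
	return p, cleanup, nil
}

func main() {
	// Directory taken from the failing darwin run in this issue.
	long := "/var/folders/1w/pb98dgl15sd6jdcx2yy6j2500000gn/T/TestConnInfoConnCloseThenAnotherConn3117723250/001"
	path, cleanup, err := shortSockPath(long, ".teaci.sock")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer cleanup()

	ln, err := net.Listen("unix", path)
	if err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	defer ln.Close()
	fmt.Println("listening on", path, "length", len(path))
}
```

In a test, the cleanup function could be registered with `t.Cleanup` so the fallback directory is removed the same way `t.TempDir()` directories are.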
