-
Notifications
You must be signed in to change notification settings - Fork 617
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add read/write lock in the fault handler #4309
Conversation
af3afb4
to
5589a96
Compare
0f2cd58
to
02cb576
Compare
agent/handlers/v4/tmdsstate.go
Outdated
@@ -161,7 +161,7 @@ func (s *TMDSAgentState) getTaskMetadata(v3EndpointID string, includeTags bool) | |||
Path: task.GetNetworkNamespace(), | |||
NetworkInterfaces: []*tmdsv4.NetworkInterface{ | |||
{ | |||
DeviceName: "", | |||
DeviceName: "ethx", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some context:
The DeviceName
is required for network latency / network packet loss faults. We are exposing this information (and other task network config) via AgentState to facilitate the fault injection handler to start/stop/check network fault.
Why we need a non-empty value here which is fake: because in this PR, we introduce a new check (link) to see if the deviceName is empty. So if that's empty, these unit test will fail.
@@ -416,11 +473,17 @@ func (h *FaultHandler) CheckNetworkPacketLoss() func(http.ResponseWriter, *http. | |||
|
|||
// Obtain the task metadata via the endpoint container ID | |||
// TODO: Will be using the returned task metadata in a future PR | |||
_, err = validateTaskMetadata(w, h.AgentState, requestType, r) | |||
taskMetadata, err := validateTaskMetadata(w, h.AgentState, requestType, r) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need this in all check methods? Since we're only calling tc q
and check if "kind":"netem"
exists, we don't need to know the network namespace path
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for awsvpc mode, all commands should be running in the given task network namespace. So we also need this check here.
Summary
This PR is to add a field / RWMutex in the fault handler which can help to avoid race condition where multiple clients would manipulate the same network resource at the same time.
Implementation details
Testing
New tests cover the changes:
No. Existing unit test will cover this.
Description for the changelog
Feature: Add RWMutex in the fault handler.
Additional Information
Does this PR include breaking model changes? If so, Have you added transformation functions?
Does this PR include the addition of new environment variables in the README?
Licensing
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.