Bug fix for inter-node traffic #339
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## main #339 +/- ##
==========================================
- Coverage 72.87% 72.73% -0.14%
==========================================
Files 19 19
Lines 2853 2861 +8
==========================================
+ Hits 2079 2081 +2
- Misses 602 605 +3
- Partials 172 175 +3
Can you rebase on main?
pkg/intermediate/aggregate.go
Outdated
// In this case, the record from the conntrack table will be considered to do correlation job and the ReadyToSend
// is false until we receive another record from PacketIn from another node. However, the record From PacketIN
// is not correlationRequired, so we won't send this kind of connection due to this conflict.
aggregationRecord.ReadyToSend = true
I see similar logic later in the function. Can we unify the two?
Yes. I've moved it to the bottom of the function, and I now set ReadyToSend and areCorrelatedFieldsFilled to true if the flow is not correlationRequired.
pkg/intermediate/aggregate.go
Outdated
// For inter-node traffic with deny np or drop anp, we may receive the record from the conntrack table first.
// In this case, the record from the conntrack table will be considered to do correlation job and the ReadyToSend
// is false until we receive another record from PacketIn from another node. However, the record From PacketIN
// is not correlationRequired, so we won't send this kind of connection due to this conflict.
I don't fully understand what's going on and I'll probably defer to @heanlan and @dreamtalen, but I do not like that the comment is so specific to the Antrea FE implementation, with mentions of conntrack and PacketIn. While I know that this code is pretty specific to Antrea, this seems too specific.
I've updated the comment. Thanks.
11cda21 to 9d8d3ac
I don't fully understand the code changes. Based on my understanding of the PR description, the conntrack records without deny NP metadata should not be sent out by FA, right? However, their correlationRequired is true, and they will be marked as ReadyToSend in your code, which contradicts the purpose of the PR?
// For flows that do not need correlation, ReadyToSend should always be true.
if !correlationRequired {
	aggregationRecord.ReadyToSend = true
}
// For intra-node and external traffic, areCorrelatedFieldsFilled is always true.
// For inter-node with allow np/anp action, its areExternalFieldsFilled will be set to true once the correlation job is finished.
if flowType != registry.FlowTypeInterNode {
	aggregationRecord.areCorrelatedFieldsFilled = true
It seems like areCorrelatedFieldsFilled is only being used in UT? If that's the case, do we still want to keep it? Maybe I missed something.
It will be set to true at line #355. areCorrelatedFieldsFilled will also be used in flowaggregator.go in the Antrea repo.
No, for deny NP or drop ANP, we will receive two records: one from PacketIn (correlationRequired false) and the other from the conntrack table (correlationRequired true). I think the logic should be:
0ac4515 to 847c6bc
051c676 to f369d83
I think I made a mistake. For inter-node traffic with an egress/ingress np with action drop, we will receive records from both PacketIn and the conntrack table. For the ingress case, there is no issue, as we will receive the records from both nodes and both records are correlationRequired. For the egress case, we can overwrite the existing record if it is from the conntrack table. But I think it would be enough to change the order in which we append expired records before exporting them from the exporter. The chance that a record from the conntrack table will be exported earlier than a record from PacketIn is also very low. For these reasons, I think we can close this PR, as these changes might not be really helpful.
In this commit, we:
1. Set ReadyToSend to true for flows that don't need correlation.
2. Set areCorrelatedFieldsFilled to true for flows that are not inter-node traffic. For flows that need correlation, areCorrelatedFieldsFilled will be set to true once the correlation job is finished.
Signed-off-by: Yun-Tang Hsu <[email protected]>
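For readers following along, the two rules in the commit message can be sketched roughly as below. This is a minimal illustration, not the code in pkg/intermediate/aggregate.go: the `FlowType` constants, the `AggregationRecord` struct, and the `initFlags` helper are hypothetical stand-ins for the real types (the actual code compares against `registry.FlowTypeInterNode`).

```go
package main

// FlowType is a hypothetical stand-in for the flow type constants
// in the registry package; the numeric values are illustrative only.
type FlowType uint8

const (
	FlowTypeIntraNode FlowType = iota + 1
	FlowTypeInterNode
	FlowTypeToExternal
)

// AggregationRecord is a simplified, assumed shape that carries only
// the two flags discussed in this PR.
type AggregationRecord struct {
	ReadyToSend               bool
	areCorrelatedFieldsFilled bool
}

// initFlags (hypothetical helper) applies the two rules from the commit message:
//  1. Flows that do not need correlation are always ready to send.
//  2. Flows that are not inter-node have nothing to correlate, so their
//     correlated fields count as filled from the start.
//
// Inter-node flows that need correlation keep both flags false here; they are
// expected to be updated later, once the correlation job finishes.
func initFlags(rec *AggregationRecord, correlationRequired bool, flowType FlowType) {
	if !correlationRequired {
		rec.ReadyToSend = true
	}
	if flowType != FlowTypeInterNode {
		rec.areCorrelatedFieldsFilled = true
	}
}
```

Note that under these rules an inter-node record that is not correlationRequired (for example, a deny record reported through PacketIn) is ready to send immediately, while its areCorrelatedFieldsFilled stays false.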
@yuntanghsu I want to confirm my understanding with you, because I feel like I am missing something.
Let me know if these statements don't match your understanding or observations.
If the traffic is not Service traffic, that's correct. But I observed that we receive records from both PacketIn and the conntrack table for Service traffic. (I observed this behavior for the egress case only.)
I1218 21:30:37.187849 1 deny_connections.go:115] "New deny connection added" connection={"ID":0,"Timeout":0,"StartTime":"2023-12-18T21:30:37.187791036Z","StopTime":"2023-12-18T21:30:37.187791036Z","LastExportTime":"2023-12-18T21:30:37.187791036Z","IsActive":true,"IsPresent":false,"ReadyToDelete":false,"Zone":0,"Mark":19,"StatusFlag":0,"Labels":null,"LabelsMask":null,"FlowKey":{"SourceAddress":"10.244.0.7","DestinationAddress":"10.244.1.8","Protocol":6,"SourcePort":57252,"DestinationPort":5201},"OriginalPackets":1,"OriginalBytes":60,"SourcePodNamespace":"testflowaggregator-0e73pgnx","SourcePodName":"perftest-a","DestinationPodNamespace":"","DestinationPodName":"","DestinationServicePortName":"testflowaggregator-0e73pgnx/perftest-e:","OriginalDestinationAddress":"10.96.176.70","OriginalDestinationPort":9999,"IngressNetworkPolicyName":"","IngressNetworkPolicyNamespace":"","IngressNetworkPolicyType":0,"IngressNetworkPolicyRuleName":"","IngressNetworkPolicyRuleAction":0,"EgressNetworkPolicyName":"test-flow-aggregator-anp-egress-drop","EgressNetworkPolicyNamespace":"testflowaggregator-0e73pgnx","EgressNetworkPolicyType":2,"EgressNetworkPolicyRuleName":"test-egress-rule-name","EgressNetworkPolicyRuleAction":2,"PrevPackets":0,"PrevBytes":0,"ReversePackets":0,"ReverseBytes":0,"PrevReversePackets":0,"PrevReverseBytes":0,"TCPState":"","PrevTCPState":"","FlowType":0,"EgressName":"","EgressIP":""}
I1218 21:30:38.106123 1 conntrack_connections.go:284] "New Antrea flow added" connection={"ID":2467408996,"Timeout":119,"StartTime":"2023-12-18T21:30:37.187206191Z","StopTime":"2023-12-18T21:30:38.097239785Z","LastExportTime":"2023-12-18T21:30:37.187206191Z","IsActive":true,"IsPresent":true,"ReadyToDelete":false,"Zone":65520,"Mark":19,"StatusFlag":424,"Labels":null,"LabelsMask":null,"FlowKey":{"SourceAddress":"10.244.0.7","DestinationAddress":"10.244.1.8","Protocol":6,"SourcePort":57252,"DestinationPort":5201},"OriginalPackets":1,"OriginalBytes":60,"SourcePodNamespace":"testflowaggregator-0e73pgnx","SourcePodName":"perftest-a","DestinationPodNamespace":"","DestinationPodName":"","DestinationServicePortName":"testflowaggregator-0e73pgnx/perftest-e:","OriginalDestinationAddress":"10.96.176.70","OriginalDestinationPort":9999,"IngressNetworkPolicyName":"","IngressNetworkPolicyNamespace":"","IngressNetworkPolicyType":0,"IngressNetworkPolicyRuleName":"","IngressNetworkPolicyRuleAction":0,"EgressNetworkPolicyName":"","EgressNetworkPolicyNamespace":"","EgressNetworkPolicyType":0,"EgressNetworkPolicyRuleName":"","EgressNetworkPolicyRuleAction":0,"PrevPackets":0,"PrevBytes":0,"ReversePackets":0,"ReverseBytes":0,"PrevReversePackets":0,"PrevReverseBytes":0,"TCPState":"SYN_SENT","PrevTCPState":"","FlowType":0,"EgressName":"","EgressIP":""}
@yuntanghsu thanks, you are correct. For Service traffic, we do DNAT (which commits to CT) before policy enforcement in Antrea. I wonder if this is the best approach, though; maybe I will bring it up at the next community meeting. For ingress drop, I assume my understanding is correct? With regard to your PR, I don't know what the best approach is. To me it seems that we should avoid exporting 2 separate records from the same Node in the FlowExporter in the first place. It is not right IMO to have the same connection in both connection stores. Maybe @heanlan and @dreamtalen have some input here.
Yes, that's correct for both non-Service and Service traffic. I agree with you. I think the best approach is to avoid exporting the record from the conntrack table in this case. The other acceptable solution, I think, is to change the order in which we append expired records before exporting them from our exporter. We should prioritize the record in the deny connection store, and I think the chance that a record from the conntrack table will be exported earlier than a record from PacketIn is very low.
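To make the "change the order" idea concrete, here is a rough sketch of putting deny-store records ahead of conntrack records before export. All names here (`recordSource`, `expiredRecord`, `prioritizeDenyRecords`) are hypothetical and do not come from the Antrea FlowExporter; this only illustrates the ordering, using a stable sort from the Go standard library so that records from the same store keep their relative order.

```go
package main

import "sort"

// recordSource (hypothetical) identifies which connection store a record came from.
type recordSource int

const (
	fromDenyStore recordSource = iota // record created from PacketIn (deny connection store)
	fromConntrack                     // record polled from the conntrack table
)

// expiredRecord (hypothetical) is a minimal stand-in for an expired connection
// that is waiting to be exported.
type expiredRecord struct {
	flowKey string
	source  recordSource
}

// prioritizeDenyRecords returns a copy of records in which every deny-store
// record precedes every conntrack record. The sort is stable, so the original
// order within each store is preserved.
func prioritizeDenyRecords(records []expiredRecord) []expiredRecord {
	ordered := make([]expiredRecord, len(records))
	copy(ordered, records)
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].source < ordered[j].source
	})
	return ordered
}
```

With this ordering, when both stores hold a record for the same connection, the one carrying the deny policy metadata would be exported first.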
I think it's better to keep it? This kind of connection is still useful for detecting DoS attacks?
@heanlan could you release vmware/go-ipfix 0.9.0, with Yun-Tang's improvements to the "reference" IPFIX collector?
Anlan is on PTO this week, I can help do a release.
Salvatore already released vmware/go-ipfix 0.8.2.
@dreamtalen can you do the changelog PR? @yuntanghsu should we close this PR then?
Yanjun has finished that :)
Closing this PR, as we made an alternative fix in antrea-io/antrea#5770.
In this PR, we do: