-
Notifications
You must be signed in to change notification settings - Fork 489
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
EVPN Multi Home Support #2084
EVPN Multi Home Support #2084
Conversation
- SAI_OBJECT_TYPE_ISOLATION_GROUP_MEMBER isogrp_mbr_oid_31 corresponding to isogrp_oid_3+bp_lag_oid_2 ( Traffic from VTEP-3 will be dropped on LAG-2 ) | ||
- SAI_OBJECT_TYPE_ISOLATION_GROUP_MEMBER isogrp_mbr_oid_41 corresponding to isogrp_oid_4+bp_lag_oid_1 ( Traffic from VTEP-4 will be dropped on LAG-1 ) | ||
- SAI_OBJECT_TYPE_ISOLATION_GROUP_MEMBER isogrp_mbr_oid_42 corresponding to isogrp_oid_4+bp_lag_oid_2 ( Traffic from VTEP-4 will be dropped on LAG-2 ) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please double check the naming of isogrp_mbr_oid. I think the first two can be renamed to isogrp_mbr_oid_22 and isogrp_mbr_oid_32 respectively.
nit: swap lines 3 and 4, so 3 lines for LAG-2 are together.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
vector<sai_attribute_t> attrs; | ||
|
||
attr.id = SAI_FDB_ENTRY_ATTR_TYPE; | ||
attr.value.s32 = SAI_FDB_ENTRY_TYPE_DYNAMIC; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be or TYPE STATIC ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it should be STATIC. Changed the doc.
attr.value.oid = bp_nh_grp_1_oid; | ||
attrs.push_back(attr); | ||
|
||
attr.id = SAI_FDB_ENTRY_ATTR_ALLOW_MAC_MOVE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this operation have any relation with saihost "SAI_HOSTIF_TRAP_TYPE_STATIC_FDB_MOVE". Default command is set to "drop". Does application need to configure explicit trap object with action "forward" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that ALLOW_MAC_MOVE is set to true there will be no need to
trap/drop the packets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that ALLOW_MAC_MOVE is set to true there will be no need to trap/drop the packets.
So the assumption here is that "ALLOW_MAC_MOVE = true" on specific fdb entry overrides the packet command configured on SAI_HOSTIF_TRAP_TYPE_STATIC_FDB_MOVE ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check the discussion on the above in the below PR where ALLOC_MAC_MOVE was introduced.
inc/saivlan.h
Outdated
* @brief Indicates if the bridge port is set to drop the broadcast, unknown unicast and multicast traffic | ||
* | ||
* When set to true, ingress BUM traffic will be dropped | ||
* Valid only when the SAI_VLAN_MEMBER_ATTR_BRIDGE_PORT_ID is of type PORT. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For these new attributes, can we indicate the conditions using "@Validonly" flag ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SAI_VLAN_MEMBER_ATTR_BRIDGE_PORT_ID is an oid. So I am not clear how one can formalise this in a @validonly statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SAI_VLAN_MEMBER_ATTR_BRIDGE_PORT_ID is an oid. So I am not clear how one can formalise this in a @validonly statement.
Looks like SAI infra has no way to mention this. We can go by comment description. Thanks.
@@ -85,6 +85,8 @@ typedef enum _sai_bridge_port_type_t | |||
/** Bridge tunnel port */ | |||
SAI_BRIDGE_PORT_TYPE_TUNNEL, | |||
|
|||
/** Bridge port next hop group */ | |||
SAI_BRIDGE_PORT_TYPE_BRIDGE_PORT_NEXT_HOP_GROUP, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we have this named as SAI_BRIDGE_PORT_TYPE_NEXT_HOP_GROUP ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is specific to NHG of type bridgeport and hence this name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JaiOCP Please comment if the extra BRIDGE_PORT in the name can be dropped ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rational for naming it like is to be explicit in intent that the NHG type is BRIDGE_PORT. Though the attributes are for bridge port object and it may seem redundant but the NHG are of many type and one used here is of type bridge port.
It would be good to clarify this in comments to avoid such confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated comments.
* @objects SAI_OBJECT_TYPE_NEXT_HOP_GROUP | ||
* @condition SAI_BRIDGE_PORT_ATTR_TYPE == SAI_BRIDGE_PORT_TYPE_BRIDGE_PORT_NEXT_HOP_GROUP | ||
*/ | ||
SAI_BRIDGE_PORT_ATTR_BRIDGE_PORT_NEXT_HOP_GROUP_ID, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above comment, can we name it as SAI_BRIDGE_PORT_ATTR_NEXT_HOP_GROUP_ID ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same response as above.
* | ||
* When set to true, egress BUM traffic will be dropped | ||
* Valid only when the SAI_VLAN_MEMBER_ATTR_BRIDGE_PORT_ID is of type PORT. | ||
* Valid only for .1Q bridge ports. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the comment be made explicit and state it as below to include both port and lag ?
- Valid only when the SAI_VLAN_MEMBER_ATTR_BRIDGE_PORT_ID is of type SAI_BRIDGE_PORT_TYPE_PORT
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
|
||
- Tunnels of peer mode P2P are only considered here. The isolation group object is used to achieve the split horizon functionality. | ||
- There is no change to the Isolation group and group member definition. | ||
- Bridgeport of type TUNNEL will also now have the isolation group attribute set. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we make the comment more explicit, stating only 'SAI_ISOLATION_GROUP_TYPE_BRIDGE_PORT' is applicable to realize split horizon filtering - with applicable bridge port types - SAI_BRIDGE_PORT_TYPE_PORT and SAI_BRIDGE_PORT_TYPE_TUNNEL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
* @allownull true | ||
* @default SAI_NULL_OBJECT_ID | ||
*/ | ||
SAI_BRIDGE_PORT_ATTR_BRIDGE_PORT_PROTECTION_NEXT_HOP_GROUP_ID, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we avoid repeating 'BRIDGE_PORT' in the attribute names ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Responded earlier.
|
||
### 3.2.4 Fast Failover support | ||
|
||
A new Bridge port attribute (for type PORT) is introduced to specify the protection nexthop group ID. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please be explicit and use 'SAI_BRIDGE_PORT_TYPE_PORT' when you say PORT. As all these cases are applicable to port and lag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack.
* | ||
* @type sai_object_id_t | ||
* @flags CREATE_AND_SET | ||
* @objects SAI_OBJECT_TYPE_NEXT_HOP_GROUP |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the next hop group type be explicitly stated as SAI_NEXT_HOP_GROUP_TYPE_PROTECTION ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The NH group type should be of type BRIDGE_PORT. Will mention this
* @type bool | ||
* @flags CREATE_AND_SET | ||
* @default false | ||
* @validonly SAI_BRIDGE_PORT_ATTR_TYPE == SAI_BRIDGE_PORT_TYPE_PORT or SAI_BRIDGE_PORT_ATTR_TYPE == SAI_BRIDGE_PORT_TYPE_SUB_PORT or SAI_BRIDGE_PORT_ATTR_TYPE == SAI_BRIDGE_PORT_TYPE_TUNNEL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need bridge port of type "SAI_BRIDGE_PORT_TYPE_TUNNEL" in DF election ? Any use case?
It would be helpful if we can extend "4.3 DF workflow" with bridge port type as SAI_BRIDGE_PORT_TYPE_TUNNEL.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are for "Interconnect ES" (RFC 9014) wherein the ES could be a Tunnel.
### 3.2.5 Single Active Redundancy Mode support | ||
|
||
Single Active redundancy mode requires unicast and BUM traffic to be dropped in the ingress and egress directions | ||
on the client port. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we shall use "multi-homed host/server" or "Multihomed ES port" instead of client, applicable for all places wherever client port is being used in this document
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case of single active redundancy the client is still single homed whereas in other scenarios it is dual homed client. Since the diagrams make it explicit I don't think it needs to be elaborated.
* @flags MANDATORY_ON_CREATE | CREATE_ONLY | ||
* @condition SAI_NEXT_HOP_ATTR_TYPE == SAI_NEXT_HOP_TYPE_IP or SAI_NEXT_HOP_ATTR_TYPE == SAI_NEXT_HOP_TYPE_MPLS or SAI_NEXT_HOP_ATTR_TYPE == SAI_NEXT_HOP_TYPE_TUNNEL_ENCAP or SAI_NEXT_HOP_ATTR_TYPE == SAI_NEXT_HOP_TYPE_BRIDGE_PORT | ||
*/ | ||
SAI_NEXT_HOP_ATTR_IP, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need attr SAI_NEXT_HOP_ATTR_IP for SAI_NEXT_HOP_TYPE_BRIDGE_PORT? Is this for SAI_TUNNEL_PEER_MODE_P2MP?
if yes, please mention this under @brief .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes for P2MP. Same behaviour as existing SAI_NEXT_HOP_TYPE_TUNNEL_ENCAP and hence not mentioning it explicitly.
nhg_attr.value.s32 = SAI_NEXT_HOP_GROUP_TYPE_BRIDGE_PORT; | ||
nhg_attrs.push_back(nhg_attr); | ||
|
||
sai_object_id_t next_hop_group_id; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we shall use "sai_object_id_t nh_grp_oid_1"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ack
* @allownull true | ||
* @default SAI_NULL_OBJECT_ID | ||
*/ | ||
SAI_BRIDGE_PORT_ATTR_BRIDGE_PORT_PROTECTION_NEXT_HOP_GROUP_ID, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, instead of having the protectin at nexthop group level, having protection for each nexthop(NHG member) might helps.
Assume, VTEP-5, had a nexthop group with members VTEP-{1,2,3,4}. what will be the backup members for this NHG. instead VTEP-1, can be protected with one member from VTEP-{2,3,4} and same way, VTEP-2 can be protected with one of the VTEP from VTEP-{1,3,4}.
Generally, not all members in a give nexthop group will not go down, so under what condition, switchover to backup Nexthop group is Initiated. if backup NHG also constituted from same VTEP list{1,2,3,4} by skipping one member. how does NOS knowns which member will go down
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you look at section 4.4 Fast Failover workflow and see if it clarifies ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it didn't explain how the backup members are constructed. SONIC HLD mentions that we can remove the, member from L2-NHG when VTEP advertises ES_ID Down notification, so that all FDBs learnt on the Ethernet segment will be updated with a single forwading update(NHG member delete). but for that we don't need backup protection next-hop group, simple member removal from Primary NHG, when VTEP advertises ES_ID down notification.
Still not clear why the backup NHG is needed and how it will be constructed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
4.4 has no VTEP-5 in the diagram.
As mentioned in 4.4,
Step1 - Traffic from Host1 received on LAG1 and fwded to Host2 over LAG2.
This is due to Local bias.
Step2 - When LAG2 fails, Traffic will be sent to a backup NHG comprising of
the remaining members sharing the ES.
This is the reason why the backup exists and is not for the scenario you mention.
This is a different flow from 4.1 which you seem to be referring to.
As to how the backup NHG is constructed and for more queries on the need for this usecase please raise this query in the SONiC EVPN-MH HLD PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explaination, it addressed my query
mem_attr[0].id = SAI_ISOLATION_GROUP_MEMBER_ATTR_ISOLATION_GROUP_ID; | ||
mem_attr[0].value.oid = isogrp_oid_2; | ||
mem_attr[1].id = SAI_ISOLATION_GROUP_MEMBER_ATTR_ISOLATION_OBJECT; | ||
mem_attr[1].value.oid = bp_po_oid_2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is bp_po_oid_2 ? No other reference to this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can attaching the isolation group also be covered in the workflow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can workflow for realizing local-bias be included ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
bp_po_oid_2 should be renamed as bp_lag_oid_2. Will do the change.
Following lines show isolation group being attached to the bridgeport.
505 attr.id = SAI_BRIDGE_PORT_ATTR_ISOLATION_GROUP;
506 attr.value.oid = isogrp_oid_2;
507 status = sai_bridge_api->set_bridge_port_attribute(bp_tnl_oid_2, &attr);
Local bias is basically MAC being HW learnt locally or MAC learnt over BGP from the peers being made to point to the local ES. These are existing constructs and fairly basic and hence did not cover them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@srj102 , regarding local bias, Assume this scenario. In section "4.3 DF workflow".
a) BUM traffic originates from a single homed server (say HOST-3) connected to VTEP-4.
b) This BUM packet reaches VTEP-3, which is the DF-node. It should not forward the packet to HOST-2.
c) At VTEP-3, this filtering should be based on 'ES and SIP of VTEP-4'.
How is this expected to be realized in your proposal ? I understand you are not considering P2MP now and the split horizon explained in 4.2 is expected to take effect. But is this proposal future extensible to support P2MP later ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This proposal does not add anything specific for Split Horizon+Local bias.
It just suggests a mechanism using existing SAI spec to realise split horizon + Local bias with P2P as an example.
For P2MP, ACLs could be used and this would work for P2P as well.
SAI_ACL_ENTRY_ATTR_FIELD_TUNNEL_TERMINATED
SAI_ACL_ENTRY_ATTR_FIELD_SRC_IP
SAI_ACL_ENTRY_ATTR_ACTION_EGRESS_BLOCK_PORT_LIST
Note however that the EGRESS_BLOCK_PORT_LIST has the below comment.
This action would be deprecated in future release. To achieve this
* functionality use isolation group
So whenever this is deprecated, the isolation group will have to be enhanced to support P2MP tunnels.
please fix build errors, you will need to squash your commits and force push, since validator is checking each commit separately |
cd78928
to
cb197ed
Compare
Signed-off-by: JaiOCP <[email protected]> Updated EVPN MH workflow document Signed-off-by: Rajesh Sankaran <[email protected]>
Can all the reviewers signoff if there are no outstanding comments to be handled ? |
@helloanandhi @sathk-marvell @skumar041 @kcudnik The proposal has been reviewed in the community call, if there is no further comments, please signoff and approve |
Signed-off-by: Rajesh Sankaran <[email protected]>
@tjchadaga could you please advise why the below check is stuck ? "opencomputeproject.SAIExpected — Waiting for status to be reported" |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
@kcudnik - could you please help sign off? |
Host may be reachable via multiple VTEPs in the EVPN topology. In this case load balancing need to happen across VTEPs.
This is achieved using a new bridge port type SAI_BRIDGE_PORT_TYPE_BRIDGE_PORT_NEXT_HOP_GROUP that is going to use SAI_BRIDGE_PORT_ATTR_BRIDGE_PORT_NEXT_HOP_GROUP_ID for nexthop selection. Nexthop wll be the one of the available VTEP tunnels.
There can be an associated protection NHG configured SAI_BRIDGE_PORT_ATTR_BRIDGE_PORT_PROTECTION_NEXT_HOP_GROUP_ID, as a backup path in case the bridge port goes down.
This PR is open in lieu of
#2058