Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a new attribute SAI_BFD_SESSION_ATTR_NEXT_HOP_ID to saibfd.h #2117

Closed
wants to merge 0 commits into from

Conversation

baorliu
Copy link

@baorliu baorliu commented Dec 4, 2024

Adding a new attribute SAI_BFD_SESSION_ATTR_NEXT_HOP_ID to saibfd.h to support forwarding single hop bfd packet to specific nexthop.

The proposed usage is:
1, this attribute can be provided both in create_bfd_session and set_bfd_session_attribute.
2, if SAI_BFD_SESSION_ATTR_USE_NEXT_HOP (optional, default if false) is set to true, BFD session will get next hop from SAI_BFD_SESSION_ATTR_NEXT_HOP_ID value and forward the bfd packet to the next hop. If SAI_BFD_SESSION_ATTR_USE_NEXT_HOP is false, attribute SAI_BFD_SESSION_ATTR_NEXT_HOP_ID will be ignored.
3, when both SAI_BFD_SESSION_ATTR_HW_LOOKUP_VALID and SAI_BFD_SESSION_ATTR_USE_NEXT_HOP, the implementation is vender dependent.

@baorliu baorliu force-pushed the master branch 2 times, most recently from 5a8814a to 78a2f39 Compare December 4, 2024 22:25
@baorliu baorliu marked this pull request as ready for review December 10, 2024 00:41
@anamehra
Copy link

Hi @abdosi , @kcudnik , Please review. Thanks

meta/parse.pl Outdated Show resolved Hide resolved
inc/saibfd.h Outdated
* @flags CREATE_ONLY
* @default false
*/
SAI_BFD_SESSION_ATTR_USE_NEXT_HOP,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed? If next hop is null it will not be used, does make sense to make only 1 attribute next hop without second one which acts flag ?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @kcudnik ,
SAI_NULL_OBJECT_ID is valid for SAI_BFD_SESSION_ATTR_NEXT_HOP_ID here.
The reason is that, when interface becomes DOWN, the corresponding nexthop will be removed. In this case, orchagent update SAI_BFD_SESSION_ATTR_NEXT_HOP_ID SAI attribute with value SAI_NULL_OBJECT_ID to that single hop BFD session, and update NPU/ASIC eventually, make NPU/ASIC drop the egress BFD packet on purpose.
When the interface becomes UP, orchagent update BFD session with a normal nexthop_id, so the BFD TX packet resume.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From your description I see no need for bool flag you described exactly how this can be done without that flag

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bool flag SAI_BFD_SESSION_ATTR_USE_NEXT_HOP is trying to direct SAI to use next_hop approach or to use "SAI_BFD_SESSION_ATTR_HW_LOOKUP_VALID == false" + SAI_BFD_SESSION_ATTR_DST_MAC_ADDRESS approach.

because 0 is a valid mac address (even it might not be valid for SAI_BFD_SESSION_ATTR_DST_MAC_ADDRESS), with SAI_BFD_SESSION_ATTR_USE_NEXT_HOP=true, we can avoid checking SAI_BFD_SESSION_ATTR_HW_LOOKUP_VALID==false && SAI_BFD_SESSION_ATTR_DST_MAC_ADDRESS==0

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got confused here, what mac has to do wiht hthat ?

for me you have 4 options here:

  1. use_next_hop = true, && next_hop = valid_oid
  2. use_next_hop = true, && next_hop = NULL
  3. use_next_hop = false, && next_hop = valid_oid
  4. use_next_hop = false, && next_hop = NULL

in 2 3 and 4 you will not use next hop, since next hop is ether NULL or use_next_hop is false
so the only scenario when possible to use is scenario 1
thus this logic points that having use_next_hop is unnecessary flag, setting it to true when next hop is NULL have no effect, so actual value of next_hop NULL/valid_oid is the same as flag false/true

Copy link
Author

@baorliu baorliu Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the confusion here, let's forget SAI_BFD_SESSION_ATTR_HW_LOOKUP_VALID and SAI_BFD_SESSION_ATTR_DST_MAC_ADDRESS related things, and focus on NEXT_HOP only here.

  1. use_next_hop = true, && next_hop = valid_oid --> it tells HW to construct BFD packet using the nexthop info directly and skips HW_LOOKUP.
  2. use_next_hop = true, && next_hop = NULL --> it tells HW to use nexthop, but because of null value, HW is not able to construct TX BFD packet. The system stops BFD TX in this way. use_next_hop = true, && next_hop = NULL is a valid combination, NOS stops BFD session TX on purpose in the skip HW_LOOKUP case.
    The use case in second case, for portchannel, even portchannel is DOWN, part of its members might be still UP and packet can get through physically. if nexthop is used and BFD packet is sent directly to TX port, single hop BFD packet can reach peer system which is not expected when portchannel is DOWN.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im even more confused XD i dont know the workflow of pipeline here XD

@@ -3133,11 +3133,11 @@ void check_attr_condition_met(
* SAI_PORT_ATTR_MEDIA_TYPE.
*/

attrs[idx].id ^= (uint32_t)(-1);
attrs[idx].value.s32 ^= (int32_t)(-1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong, idea was that we chznde id of attribute to bie invalid so default value would be used in condition compare, please revert this change

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please read comments. In this entire function

Copy link
Author

@baorliu baorliu Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic here should be similar to line 3226:
else if (md->validonlytype == SAI_ATTR_CONDITION_TYPE_AND)
The original logic is not correct. It is a false assert META_ASSERT_FALSE here, so if change attr id only, the default value of the attribute could make the condition true and fails this assertion.
so to make sure the assertion pass, need to change attr value here, not attr id.

I hit this issue when I add the following condition for SAI_BFD_SESSION_ATTR_VIRTUAL_ROUTER attribute:

  • @condition SAI_BFD_SESSION_ATTR_HW_LOOKUP_VALID == true and
    SAI_BFD_SESSION_ATTR_USE_NEXT_HOP == false

Copy link
Collaborator

@kcudnik kcudnik Jan 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the logic here is to remove each of condition attributes (make id of attribute invalid) to see if default value will be used to get that attribute value to met condition, this technique was used to make attribute invalid, instead of removing it from list and then create new shorter list, this solution is more elegant

Copy link
Author

@baorliu baorliu Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original logic is not correct here, it should change value, not the id.
It copies condition to a attr_list, and inverts attrs id try to make it invalid (not the md->conditions).

   for (idx = 0; idx < count; ++idx)
    {
        attrs[idx].id = md->conditions[idx]->attrid;
        attrs[idx].value = md->conditions[idx]->condition; /* copy */
    }

and then assert the condition check to be false. The problem is, the default value will be used if a condition id cannot be found in attrs list. The default value of that attribute can make the condition TRUE. so the META_ASSERT_FALSE fails.

Use the above AND condition (SAI_BFD_SESSION_ATTR_HW_LOOKUP_VALID == true and
SAI_BFD_SESSION_ATTR_USE_NEXT_HOP == false) as an example, the following log (by adding some printout in original code -- without the bug fix here) shows why assertion fails:

check attr SAI_BFD_SESSION_ATTR_VIRTUAL_ROUTER for SAI_ATTR_CONDITION_TYPE_AND here 
saimetadatautils.c:549 sai_metadata_is_condition_met: :- BFD DEBUG: condition type 2 on SAI_BFD_SESSION_ATTR_VIRTUAL_ROUTER 
saimetadatautils.c:448 sai_metadata_is_and_condition_list_met: :- BFD DEBUG: sai_metadata_is_and_condition_list_met, condition idx 0
saimetadatautils.c:407 sai_metadata_is_single_condition_met: :- BFD DEBUG sai_metadata_is_single_condition_met condition meta data SAI_BFD_SESSION_ATTR_HW_LOOKUP_VALID: 
saimetadatautils.c:301 sai_metadata_get_attr_by_id: :- BFD DEBUG attr_list[0] id -2, target id 1
saimetadatautils.c:301 sai_metadata_get_attr_by_id: :- BFD DEBUG attr_list[1] id 41, target id 1
saimetadatautils.c:420 sai_metadata_is_single_condition_met: :- BFD DEBUG sai_metadata_is_single_condition_met condition meta data SAI_BFD_SESSION_ATTR_HW_LOOKUP_VALID, use default value 1 , return value 1
saimetadatautils.c:448 sai_metadata_is_and_condition_list_met: :- BFD DEBUG: sai_metadata_is_and_condition_list_met, condition idx 1
saimetadatautils.c:407 sai_metadata_is_single_condition_met: :- BFD DEBUG sai_metadata_is_single_condition_met condition meta data SAI_BFD_SESSION_ATTR_USE_NEXT_HOP: 
saimetadatautils.c:301 sai_metadata_get_attr_by_id: :- BFD DEBUG attr_list[0] id -2, target id 41
saimetadatautils.c:301 sai_metadata_get_attr_by_id: :- BFD DEBUG attr_list[1] id 41, target id 41
saimetadatautils.c:428 sai_metadata_is_single_condition_met: :- BFD DEBUG sai_metadata_is_single_condition_met condition meta data SAI_BFD_SESSION_ATTR_USE_NEXT_HOP, use attr value 0 , return value 1
 ASSERT FAILED (on line 3032): expected false 'sai_metadata_is_condition_met(md, count, attrs)': condition should be met

I added the function name to the above log, please check the call stack and logic for details.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

look also at OR condition in here:
https://github.com/opencomputeproject/SAI/pull/2117/files#diff-cea899955f033063bbb1dbec061d9713c1c91e8dd809b5181ebf1355f3488781L3114

notice also that the condition don't check type there, and jus changes id, so actual condition can be bool/s32/u32/u64/s8 etc, and chanigng s32 is not correct for all attributes, so it cannot be changed, that's why id is changed to invalid, so default value would be uses to check condition when calling sai_metadata_is_condition_metm, even if condition is false

if you get error by adding your check maybe some ohter part is broken, i think we didn;t have condition == false so far,

also same logic applies to validonly condition

Copy link
Author

@baorliu baorliu Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you walkthrough this call stack including attribute value and returned condition check result (return value) for the above AND logic? you can have a local run to see details.
The attr id -2 is the one after id invalidated. it was SAI_BFD_SESSION_ATTR_HW_LOOKUP_VALID, and its default value is 1 (true). The attr id 41 is new added attr id SAI_BFD_SESSION_ATTR_USE_NEXT_HOP. you can see that both single condition checking returns 1 (true), so the AND condition result is 1 (true).
but the assertion expects false. so the assertion failed by just inverting attr id.

check attr SAI_BFD_SESSION_ATTR_VIRTUAL_ROUTER for SAI_ATTR_CONDITION_TYPE_AND here 
saimetadatautils.c:549 sai_metadata_is_condition_met: :- BFD DEBUG: condition type 2 on SAI_BFD_SESSION_ATTR_VIRTUAL_ROUTER 
saimetadatautils.c:448 sai_metadata_is_and_condition_list_met: :- BFD DEBUG: sai_metadata_is_and_condition_list_met, condition idx 0
saimetadatautils.c:407 sai_metadata_is_single_condition_met: :- BFD DEBUG sai_metadata_is_single_condition_met condition meta data SAI_BFD_SESSION_ATTR_HW_LOOKUP_VALID: 
saimetadatautils.c:301 sai_metadata_get_attr_by_id: :- BFD DEBUG attr_list[0] id -2, target id 1
saimetadatautils.c:301 sai_metadata_get_attr_by_id: :- BFD DEBUG attr_list[1] id 41, target id 1
saimetadatautils.c:420 sai_metadata_is_single_condition_met: :- BFD DEBUG sai_metadata_is_single_condition_met condition meta data SAI_BFD_SESSION_ATTR_HW_LOOKUP_VALID, use default value 1 , return value 1
saimetadatautils.c:448 sai_metadata_is_and_condition_list_met: :- BFD DEBUG: sai_metadata_is_and_condition_list_met, condition idx 1
saimetadatautils.c:407 sai_metadata_is_single_condition_met: :- BFD DEBUG sai_metadata_is_single_condition_met condition meta data SAI_BFD_SESSION_ATTR_USE_NEXT_HOP: 
saimetadatautils.c:301 sai_metadata_get_attr_by_id: :- BFD DEBUG attr_list[0] id -2, target id 41
saimetadatautils.c:301 sai_metadata_get_attr_by_id: :- BFD DEBUG attr_list[1] id 41, target id 41
saimetadatautils.c:428 sai_metadata_is_single_condition_met: :- BFD DEBUG sai_metadata_is_single_condition_met condition meta data SAI_BFD_SESSION_ATTR_USE_NEXT_HOP, use attr value 0 , return value 1
 ASSERT FAILED (on line 3032): expected false 'sai_metadata_is_condition_met(md, count, attrs)': condition should be met

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as mentioned before, we didn't have == false attribute previously maybe that's the case, i will cherry pick this and test locally

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, i figured this out, the problem is that in condition case we didn't have so far condition that was based on default value and "AND" condition, so far all attributes were mandatory on create, your change is adding default value condition SAI_BFD_SESSION_ATTR_USE_NEXT_HOP, this was mentioned already in line 3130 and very similar case is handled in validonly case in line 3243, this is actually good that this case was catched, fix here: #2126, with that change your PR will pass, and you don't need to touch sanitycheck, please update your PR

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The above run log, it fails at SAI_BFD_SESSION_ATTR_HW_LOOKUP_VALID default value first. I see that your fix changes the attr value if the the condition attr has a default value (the original design changing attr id is wrong in this case).
with this fix in saisainitycheck.c, I will remove the change in saisanitycheck.c in this PR.

@baorliu
Copy link
Author

baorliu commented Jan 13, 2025

replaced by the PR #2127.
Because this PR from a forked master branch, when sync the fork with conflicts in master branch, the forked commit history were removed, so I created a different branch in the forked repo for the PR.

@kcudnik
Copy link
Collaborator

kcudnik commented Jan 14, 2025

You could force push here

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants