DAOS-16936 dtx: rank range based DTX hints for coll_punch - b26 #15975
+196
−50
When handling a collective punch RPC, the leader generates a hints array for the related DTX RPC (abort or commit). Each involved engine has one element in this array. To simplify the RPC logic, the hints array is sparse and indexed by the engine's rank#. Originally, we did not properly handle the case of non-contiguous rank# values in the pool map: when updating a hints element for a large rank#, the write could go out of bounds, overwrite memory that belongs to other data, and cause various kinds of DRAM corruption.
A similar situation can occur when handling collective DTX resync and cleanup.
This patch fixes the issue by building the sparse hints array based on the rank# range of the related engines instead of the rank count in the pool map, and by using the relative rank difference (real rank# - base rank#) as the index. That avoids out-of-bounds access.
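The indexing scheme described above can be illustrated with a minimal, standalone sketch. It is not the code from this patch: `struct hint_table`, `hint_table_init`, `hint_table_set`, and `hint_table_fini` are hypothetical names, and a hint is assumed to be a single byte purely for illustration. The sketch sizes the array from the rank range (max rank# - base rank# + 1) and rejects any rank outside that range.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container (not a DAOS structure): a sparse hints array
 * covering the closed rank range [base_rank, base_rank + count - 1]. */
struct hint_table {
	uint32_t  base_rank; /* lowest rank# among the involved engines */
	uint32_t  count;     /* max rank# - base rank# + 1 */
	uint8_t  *hints;     /* one slot per rank in the range, 0 = unused */
};

/* Size the table from the rank range, not from the number of ranks. */
static int
hint_table_init(struct hint_table *t, const uint32_t *ranks, size_t nr)
{
	uint32_t min, max;
	size_t   i;

	if (nr == 0)
		return -1;

	min = max = ranks[0];
	for (i = 1; i < nr; i++) {
		if (ranks[i] < min)
			min = ranks[i];
		if (ranks[i] > max)
			max = ranks[i];
	}

	/* Guard against wrap-around when computing the slot count. */
	if (max - min == UINT32_MAX)
		return -1;

	t->base_rank = min;
	t->count = max - min + 1;
	t->hints = calloc(t->count, sizeof(*t->hints));
	return t->hints != NULL ? 0 : -1;
}

/* Store a hint at index (rank - base_rank); refuse out-of-range ranks. */
static int
hint_table_set(struct hint_table *t, uint32_t rank, uint8_t hint)
{
	assert(t->hints != NULL);
	if (rank < t->base_rank || rank - t->base_rank >= t->count)
		return -1;
	t->hints[rank - t->base_rank] = hint;
	return 0;
}

static void
hint_table_fini(struct hint_table *t)
{
	free(t->hints);
	t->hints = NULL;
	t->count = 0;
}
```

For example, with involved ranks 3, 7 and 12, an array sized by rank count would have only 3 slots, so writing at index 12 would land outside the allocation. Sized by range, the array has 10 slots and rank 12 maps to index 9.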
Before requesting gatekeeper:
Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
Gatekeeper: