-
Notifications
You must be signed in to change notification settings - Fork 306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DAOS-14013 rebuild: add a basic UT for incremental reint #15782
base: master
Are you sure you want to change the base?
Conversation
c471068
to
708f77f
Compare
Ticket title is 'Validate functionality of incremental reintegration' |
708f77f
to
ec96e40
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ftest LGTM
cont_root = tmp_iov.iov_buf; | ||
*cont_toh = cont_root->tcr_root_hdl; | ||
return 0; | ||
rc = dbtree_open_inplace(&cont_root->tcr_btr_root, &uma, &migrated_toh); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to check if (daos_handle_is_inval(&cont_root->tcr_root_hdl))
before open?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the above code at L3952 the container lookup succeed, it means the container must with OID being migrated, then the container's migrated obj tree should have already been created.
Can add an assertion the handle is valid although, I'll add it.
Fix a migrated cont tree open handle bug. Signed-off-by: Xuezhao Liu <[email protected]>
ec96e40
to
ac949d4
Compare
Signed-off-by: Xuezhao Liu <[email protected]>
@@ -3962,6 +3962,7 @@ obj_tree_lookup_cont(daos_handle_t toh, uuid_t co_uuid, daos_handle_t *cont_toh) | |||
memset(&uma, 0, sizeof(uma)); | |||
uma.uma_id = UMEM_CLASS_VMEM; | |||
cont_root = tmp_iov.iov_buf; | |||
D_ASSERT(daos_handle_is_valid(cont_root->tcr_root_hdl)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some confused, since it is already opened, why need to open again?
@kccain could you prioritize review of this patch? It is a 2.8 blocker, thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a look at src/tests/suite code and have some questions, nothing blocking. Adding my +1 on the patch now in parallel to the discussion of the questions that shouldn't cause any blocking issue from my point of view.
@@ -436,6 +437,11 @@ rebuild_io_obj_internal(struct ioreq *req, bool validate, int index) | |||
int k; | |||
int l; | |||
|
|||
if (dt_no_punch) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this intended to also skip the record punch on line 483 below under the condition (l == rec_punch_idx) ? It seems this logic will not prevent it. Maybe it is ultimately not too important if that punch_single() call is made. Just curious/asking.
@@ -395,6 +402,18 @@ test_setup(void **state, unsigned int step, bool multi_rank, | |||
arg->pool.destroyed = false; | |||
} | |||
|
|||
/** Look at variables set by test arguments and setup pool props */ | |||
if (dt_incr_reint) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dt_incr_reint isn't set anywhere that I can see. Is this meant for future use when there might be a specific setup function for incremental reintegration test cases?
update_oids[i] = daos_test_oid_gen(arg->coh, OC_RP_3GX, 0, 0, arg->myrank); | ||
rebuild_io(arg, update_oids, OBJ_NR); | ||
|
||
reintegrate_single_pool_rank(arg, 5, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: what is the expectation for the global minimum stable epoch of the container when the reintegration happens. Is it expected that none of the oids data in engine rank 5 storage will be wiped during this reintegration? Or, is it expected that maybe some of rank 5 data will be wiped as part of the reintegration, because maybe it falls later than the global minimum stable epoch?
Fix a migrated cont tree open handle bug.
Before requesting gatekeeper:
Features:
(orTest-tag*
) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.Gatekeeper: