DAOS-16876 vos: remove DTX record after partial commit #15857
Ticket title is 'LRZ: m02r01s07dao engine coredumps with vos EMRG src/vos/ilog.c:411 ilog_open() Assertion'
9d4612b to d2c61ff
Test stage Unit Test with memcheck on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-15857/4/testReport/
Test stage Unit Test bdev with memcheck on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-15857/4/testReport/
@@ -685,50 +685,64 @@ dtx_rec_release(struct vos_container *cont, struct vos_dtx_act_ent *dae, bool ab
It looks to me like dtx_rec_release() changes both pmem and DRAM data. If something fails in the middle, do we need to roll back the DRAM changes to keep the pmem and DRAM data consistent?
Yes. We change the on-disk layout first, then apply the DRAM changes during post_handle(). Please check the latest commit.
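The ordering described in this reply can be sketched roughly as below. This is only an illustration under stated assumptions: `struct demo_store`, `demo_release_persistent()`, `demo_post_handle()` and `demo_release()` are hypothetical stand-ins, not the actual VOS structures or the code in this PR. The point is that when the persistent step fails, the DRAM state has not been touched yet, so there is nothing to roll back.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the two views of the same data. */
struct demo_store {
	int pmem_records; /* models the on-disk (pmem) record count */
	int dram_records; /* models the in-memory (DRAM) view */
};

/* Phase 1: persistent change only. May fail; never touches DRAM state. */
static int
demo_release_persistent(struct demo_store *s, int count, bool inject_failure)
{
	if (inject_failure)
		return -1; /* models an aborted transaction: pmem unchanged */

	assert(count <= s->pmem_records);
	s->pmem_records -= count;
	return 0;
}

/* Phase 2: DRAM update, run only after phase 1 has succeeded. */
static void
demo_post_handle(struct demo_store *s, int count)
{
	s->dram_records -= count;
}

/* Persist first, then apply the volatile changes. */
static int
demo_release(struct demo_store *s, int count, bool inject_failure)
{
	int rc = demo_release_persistent(s, count, inject_failure);

	if (rc != 0)
		return rc; /* DRAM was never modified, nothing to roll back */

	demo_post_handle(s, count);
	return 0;
}
```

With this ordering, a failed call leaves both counters as they were, and a successful call changes both.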
src/vos/vos_dtx.c
Outdated
}

if (keep_act) {
if (count > 0) {
It looks to me like count must be > 0 here. Would it be better to add a D_ASSERT(count > 0) before line 176?
When re-committing a formerly partially committed DTX, the count will be zero.
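A minimal sketch of why a zero count has to be tolerated rather than asserted against, assuming a simplified model: `struct demo_dtx_entry` and `demo_rec_release()` are hypothetical and do not mirror the real `struct vos_dtx_act_ent` or `dtx_rec_release()`. The first (partial) commit releases the attached records but keeps the entry, so a later re-commit of the same entry finds nothing left to release.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_RECS 4

/* Hypothetical model of an active DTX entry. */
struct demo_dtx_entry {
	int    recs[DEMO_MAX_RECS]; /* record handles still attached */
	size_t count;               /* number of valid handles in recs[] */
	bool   active;              /* entry is still in the active table */
};

/*
 * Release the attached records and, unless keep_act is set, drop the
 * entry from the active table. Returns how many records this call released.
 */
static size_t
demo_rec_release(struct demo_dtx_entry *e, bool keep_act)
{
	size_t released = 0;

	assert(e->count <= DEMO_MAX_RECS);

	if (e->count > 0) {
		released = e->count;
		e->count = 0; /* records are gone; no dangling handles remain */
	}
	/* count == 0 is legal here: an earlier partial commit released them. */

	if (!keep_act)
		e->active = false;

	return released;
}
```

In this model, a first call with `keep_act` set releases the records and keeps the entry, and a second call on the same entry releases nothing, which is the case an unconditional `count > 0` assertion would reject.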
d2c61ff to ee8797a
c954472 to 01d3be3
Otherwise, the partially committed DTX entry will be re-committed when the container is reopened. Accessing the related dangling DTX record(s) may then trigger an assertion and cause corruption.

Signed-off-by: Fan Yong <[email protected]>
01d3be3 to 2c0c480
Otherwise, the partially committed DTX entry will be re-committed when the container is reopened. Accessing the related dangling DTX record(s) may then trigger an assertion and cause corruption.
Before requesting gatekeeper:
Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
Gatekeeper: