DAOS-14969 container: retry IV might cause deadlock
The OID IV entry lock might be acquired again in the retry case,
which deadlocks if the earlier attempt did not release it.

Test-repeat: 10
Test-tag: test_daos_oid_allocator test_daos_management

Required-githooks: true

Signed-off-by: Di Wang <[email protected]>
wangdi1 committed Jan 18, 2024
1 parent da658ad commit be67921
Showing 2 changed files with 10 additions and 2 deletions.
10 changes: 9 additions & 1 deletion src/container/oid_iv.c
@@ -31,6 +31,7 @@ struct oid_iv_entry {
 	struct oid_iv_range rg;
 	/** protect the entry */
 	ABT_mutex lock;
+	void *current_req;
 };

 /** Priv data in the iv layer */
@@ -130,7 +131,14 @@ oid_iv_ent_update(struct ds_iv_entry *ns_entry, struct ds_iv_key *iv_key,
 	D_ASSERT(priv != NULL);

 	entry = ns_entry->iv_value.sg_iovs[0].iov_buf;
-	ABT_mutex_lock(entry->lock);
+	rc = ABT_mutex_trylock(entry->lock);
+	/* For retry requests, from _iv_op(), the lock may not be released
+	 * in some cases.
+	 */
+	if (rc == ABT_ERR_MUTEX_LOCKED && entry->current_req != src)
+		return -DER_BUSY;
+
+	entry->current_req = src;
 	avail = &entry->rg;

 	oids = src->sg_iovs[0].iov_buf;
2 changes: 1 addition & 1 deletion src/engine/server_iv.c
@@ -1053,7 +1053,7 @@ _iv_op(struct ds_iv_ns *ns, struct ds_iv_key *key, d_sg_list_t *value,
 retry:
 	rc = iv_op_internal(ns, key, value, sync, shortcut, opc);
 	if (retry && !ns->iv_stop &&
-	    (daos_rpc_retryable_rc(rc) || rc == -DER_NOTLEADER)) {
+	    (daos_rpc_retryable_rc(rc) || rc == -DER_NOTLEADER || rc == -DER_BUSY)) {
 		if (rc == -DER_NOTLEADER && key->rank != (d_rank_t)(-1) &&
 		    sync && (sync->ivs_mode == CRT_IV_SYNC_LAZY ||
 			     sync->ivs_mode == CRT_IV_SYNC_EAGER)) {
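For readers following the two hunks together, the pattern can be sketched outside DAOS roughly as below. This is only an illustration under stated assumptions, not the patched code: it uses POSIX `pthread_mutex_trylock` as a stand-in for `ABT_mutex_trylock`, and `SKETCH_BUSY`, `struct sketch_entry`, `sketch_entry_update`, `sketch_op_retry`, the `keep_locked` flag, the bounded retry count, and the clearing of `current_req` on release are all hypothetical names and choices made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for -DER_BUSY; the real code comes from DAOS headers. */
#define SKETCH_BUSY (-1)

/* Simplified, hypothetical stand-in for struct oid_iv_entry. */
struct sketch_entry {
    pthread_mutex_t lock;        /* protects the entry */
    void           *current_req; /* request that last took the lock */
    int             updates;     /* number of updates applied (illustration only) */
};

void sketch_entry_init(struct sketch_entry *e)
{
    pthread_mutex_init(&e->lock, NULL);
    e->current_req = NULL;
    e->updates = 0;
}

/*
 * Apply one update on behalf of request `src`.
 * `keep_locked` simulates the case from the commit comment where an
 * attempt returns without releasing the lock and is retried later.
 */
int sketch_entry_update(struct sketch_entry *entry, void *src, int keep_locked)
{
    int rc;

    assert(entry != NULL);

    /* Non-blocking attempt: a blocking lock here would hang on a retry. */
    rc = pthread_mutex_trylock(&entry->lock);

    /* Locked by a different request: report busy so the caller can retry. */
    if (rc == EBUSY && entry->current_req != src)
        return SKETCH_BUSY;

    /* Either the lock was just acquired, or this is a retry of the
     * request that already holds it. */
    entry->current_req = src;
    entry->updates++;

    if (!keep_locked) {
        /* Assumption of this sketch: clear the owner before releasing. */
        entry->current_req = NULL;
        pthread_mutex_unlock(&entry->lock);
    }
    return 0;
}

/*
 * Caller-side retry, loosely mirroring the new -DER_BUSY case in _iv_op().
 * The bounded loop is an assumption of this sketch.
 */
int sketch_op_retry(struct sketch_entry *entry, void *src, int max_tries)
{
    int rc = SKETCH_BUSY;

    for (int i = 0; i < max_tries && rc == SKETCH_BUSY; i++)
        rc = sketch_entry_update(entry, src, 0);
    return rc;
}
```

In this sketch, a first call with `keep_locked` set leaves the lock held; a second call with a different `src` gets `SKETCH_BUSY` instead of blocking, while a retry with the same `src` proceeds and releases the lock.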
