Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DAOS-14826 client: reserve low fds #13532

Merged
merged 27 commits into from
Jan 29, 2024
Merged
Changes from 2 commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
3dfafc8
DAOS-14826 client: avoid low fd in daos_init()
wiliamhuang Dec 21, 2023
b04f4da
fix clang format issue
wiliamhuang Dec 21, 2023
4aec3a1
add call_daos_init() as a wrapper for daos_init().
wiliamhuang Jan 2, 2024
6614ce3
block low fd permanently to avoid daos uses low fd.
wiliamhuang Jan 4, 2024
c57faf6
fix a spelling issue
wiliamhuang Jan 4, 2024
fe93bac
fix clang format issue
wiliamhuang Jan 4, 2024
5b8e966
enable lowfd test in dfuse unit test
wiliamhuang Jan 5, 2024
6e94096
disable lowfd unit test for libioil or dfuse is not started
wiliamhuang Jan 5, 2024
c8ed7de
address reviewers' comments
wiliamhuang Jan 9, 2024
48918a7
Add debug info and fix format issue
wiliamhuang Jan 9, 2024
b757c30
comment out check_fd_to_close and rename lock
wiliamhuang Jan 9, 2024
87f0aa2
port the patch in new_fstatat to new_fxstatat to fix failed test on EL8.
wiliamhuang Jan 10, 2024
96a4cb8
fix glibc version related issue
wiliamhuang Jan 10, 2024
68c5880
rearrange the code in consume_low_fd to avoid possible mismatching lo…
wiliamhuang Jan 16, 2024
80f0dd6
empty commit to trigger rebuild
wiliamhuang Jan 17, 2024
abf91cd
Merge branch 'master' into lei/DAOS-14826
wiliamhuang Jan 18, 2024
160d04f
install git hook after merging master
wiliamhuang Jan 18, 2024
d134f7b
remove some commented code
wiliamhuang Jan 23, 2024
b1bf1e9
call glibc chdir() to let kernel update cwd. Extend dup2() support.
wiliamhuang Jan 24, 2024
9836e22
Merge branch 'master' into lei/DAOS-14826
wiliamhuang Jan 24, 2024
645092e
install git hook after merging master
wiliamhuang Jan 24, 2024
ebc81e8
to fix the compiling issue on EL9
wiliamhuang Jan 24, 2024
5bccea2
to fix compiling issue on EL9
wiliamhuang Jan 24, 2024
af89d48
fix cwd bug introduced by previous commit in this PR
wiliamhuang Jan 24, 2024
938936e
fix format issues
wiliamhuang Jan 24, 2024
7f2cc7d
introduce fd duplication refence count to handle complex situation
wiliamhuang Jan 26, 2024
1030d67
remove one line unused code
wiliamhuang Jan 26, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
50 changes: 44 additions & 6 deletions src/client/dfuse/pil4dfs/int_dfs.c
Original file line number Diff line number Diff line change
Expand Up @@ -913,6 +913,9 @@ child_hdlr(void)
context_reset = true;
}

/* avoid using fd smaller than this number in daos_init() */
#define DAOS_MIN_FD 10

/** determine whether a path (both relative and absolute) is on DAOS or not. If yes,
* returns parent object, item name, full path of parent dir, full absolute path, and
* the pointer to struct dfs_mt.
Expand Down Expand Up @@ -1007,6 +1010,19 @@ query_path(const char *szInput, int *is_target_path, dfs_obj_t **parent, char *i
/* trying to avoid lock as much as possible */
if (!daos_inited) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this existing code it doesn't appear thread-safe in the case of two threads entering this code-path concurrently, not a regression but it would be good to add locking to this at the same time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We were aware of this possible issue. daos_init() uses lock already. query_path() is called in many intercepted functions. We hoped to avoid lock if possible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lock here isn't going to make any difference to performance and should only be needed in the contended case anyway. Without one there's potential for daos_init() to be called multiple times but daos_fini() only being called once leading to resource leaks and incorrect tidyup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case daos_init() is called more than one time, daos_init_cnt has the count. daos_fini() will be called daos_init_cnt times in destruction phase.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you've re-worked this code since yesterday but it's still not thread safe although as you daos_fini() does look like it'll be called the correct number of times.

pthread_atfork() wants to be called precisely once, and if you're using the daos_inited boolean in this manner then you need to access it atomically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Moved pthread_atfork() into the locked region. Could you please check again?

/* daos_init() is expensive to call. We call it only when necessary. */
int low_fd_list[DAOS_MIN_FD], low_fd_count = 0, fd_kernel, idx;

fd_kernel = open("/proc/self/maps", O_RDONLY);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's wrong with opening "/"?

Presumably this code can only get here after the hooks have been installed so perhaps it would be better to call the redirected open directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. "/" should work too.
You are right. libc_open() should work here. I will change it in next commit. Thank you!

while (fd_kernel >= 0) {
if (fd_kernel >= DAOS_MIN_FD) {
close(fd_kernel);
break;
} else {
low_fd_list[low_fd_count] = fd_kernel;
low_fd_count++;
}
fd_kernel = open("/proc/self/maps", O_RDONLY);
}
rc = daos_init();
if (rc) {
DL_ERROR(rc, "daos_init() failed");
Expand All @@ -1024,6 +1040,8 @@ query_path(const char *szInput, int *is_target_path, dfs_obj_t **parent, char *i

daos_inited = true;
atomic_fetch_add_relaxed(&daos_init_cnt, 1);
for (idx = 0; idx < low_fd_count; idx++)
close(low_fd_list[idx]);
}

/* dfs info can be set up after daos has been initialized. */
Expand Down Expand Up @@ -2540,6 +2558,10 @@ new_fstatat(int dirfd, const char *__restrict path, struct stat *__restrict stat
return new_xstat(1, path, stat_buf);
}

if (dirfd >= FD_FILE_BASE && dirfd < FD_DIR_BASE && path[0] == 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im not sure i understand what you are doing here, nor i understand this commit message:
"Also fix a corner case in fstatat() to mimic libc. Rely on libc dup2() to get desired fd to avoid error
"failed to get the desired fd in dup2()"."

what is the corner case being fixed? can you be more clear. if this corner case is fixed, can you add a test for the corner case? thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dirfd is allowed to be a fd of a regular file. I did not expect such situation. ok. I will add a unit test for this corner case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am trying to add the corner case in dfuse unit test. Looks libioil has the same issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. False alarm. Used wrong flag in fstatat().

/* same as fstat for a file. May need further work to handle flags */
return fstat(dirfd, stat_buf);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this related to the commit message? Same applies for the rest of the code changes in this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a corner case of fstatat(). dirfd is allowed to be a fd of a regular file.
Yes. There are probably a few places where we need to handle flags. I added a comment here to make sure I will not forget it in future.

idx_dfs = check_path_with_dirfd(dirfd, &full_path, path, &error);
if (error)
goto out_err;
Expand Down Expand Up @@ -4979,7 +5001,7 @@ dup(int oldfd)
int
dup2(int oldfd, int newfd)
{
int fd, fd_directed, idx, rc, errno_save;
int fd, oldfd_directed, fd_directed, idx, rc, errno_save;

/* Need more work later. */
if (next_dup2 == NULL) {
Expand All @@ -4995,9 +5017,13 @@ dup2(int oldfd, int newfd)
else
return newfd;
}
if ((oldfd < FD_FILE_BASE) && (newfd < FD_FILE_BASE))
oldfd_directed = query_fd_forward_dest(oldfd);
if ((oldfd_directed < FD_FILE_BASE) && (oldfd < FD_FILE_BASE) && (newfd < FD_FILE_BASE))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't see the relevance of this change? Is it to do with low fds and required for the new test to pass or is it addressing another issue?

Copy link
Contributor Author

@wiliamhuang wiliamhuang Jan 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fixing a bug I recently identified. This change is needed to pass the new test of running bash script that uses low fd directly. The new test is added in the PR for exec() interception. bash opens a file on DFS then calls dup2(fake_fd, 5) to get a low fd 5, then call dup2(5, 1) to let fd 1 point to the file on DFS. After that, bash calls exec() to run "cat" to write to the file on DFS.

#! /bin/sh

exec 5>config.log {
cat <<EOF

EOF
} >&5

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should that test be in this PR or does this PR depend on the exec PR then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. The above is added in the exec PR. It requires the new code there.

return next_dup2(oldfd, newfd);

if (oldfd_directed >= FD_FILE_BASE && oldfd < FD_FILE_BASE)
oldfd = oldfd_directed;

if (newfd >= FD_FILE_BASE) {
DS_ERROR(ENOTSUP, "unimplemented yet for newfd >= FD_FILE_BASE");
errno = ENOTSUP;
Expand All @@ -5015,13 +5041,22 @@ dup2(int oldfd, int newfd)
else
fd_directed = query_fd_forward_dest(oldfd);
if (fd_directed >= FD_FILE_BASE) {
rc = close(newfd);
if (rc != 0 && errno != EBADF)
return -1;
fd = allocate_a_fd_from_kernel();
int fd_tmp;

fd_tmp = allocate_a_fd_from_kernel();
if (fd_tmp < 0) {
/* failed to allocate an fd from kernel */
errno_save = errno;
DS_ERROR(errno_save, "failed to get a fd from kernel");
errno = errno_save;
return (-1);
}
/* rely on dup2() to get the desired fd */
fd = next_dup2(fd_tmp, newfd);
if (fd < 0) {
/* failed to allocate an fd from kernel */
errno_save = errno;
close(fd_tmp);
DS_ERROR(errno_save, "failed to get a fd from kernel");
errno = errno_save;
return (-1);
Expand All @@ -5031,6 +5066,9 @@ dup2(int oldfd, int newfd)
errno = EBUSY;
return (-1);
}
rc = close(fd_tmp);
if (rc != 0)
return -1;
idx = allocate_dup2ed_fd(fd, fd_directed);
if (idx >= 0)
return fd;
Expand Down