Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DAOS-14826 client: reserve low fds #13532

Merged
merged 27 commits into from
Jan 29, 2024
Merged

DAOS-14826 client: reserve low fds #13532

merged 27 commits into from
Jan 29, 2024

Conversation

wiliamhuang
Copy link
Contributor

@wiliamhuang wiliamhuang commented Dec 21, 2023

Features: pil4dfs

Reserve low fd to avoid daos network contexts use them.
Fix a corner case in fstatat() to mimic libc and add unit test.
Rely on libc dup2() to get desired fd to avoid error "failed to get the desired fd in dup2()".
Use atomic APIs to access daos_inited.
Move pthread_atfork() inside locked region to avoid multiple execution.
Call glibc chdir() to update process's cwd in kernel state.
Introduce fd duplication reference count to handle complex situation

Add a unit test to make sure daos does not use low fd.

Before requesting gatekeeper:

  • Two review approvals and any prior change requests have been resolved.
  • Testing is complete and all tests passed or there is a reason documented in the PR why it should be force landed and forced-landing tag is set.
  • Features: (or Test-tag*) commit pragma was used or there is a reason documented that there are no appropriate tags for this PR.
  • Commit messages follows the guidelines outlined here.
  • Any tests skipped by the ticket being addressed have been run and passed in the PR.

Gatekeeper:

  • You are the appropriate gatekeeper to be landing the patch.
  • The PR has 2 reviews by people familiar with the code, including appropriate watchers.
  • Githooks were used. If not, request that user install them and check copyright dates.
  • Checkpatch issues are resolved. Pay particular attention to ones that will show up on future PRs.
  • All builds have passed. Check non-required builds for any new compiler warnings.
  • Sufficient testing is done. Check feature pragmas and test tags and that tests skipped for the ticket are run and now pass with the changes.
  • If applicable, the PR has addressed any potential version compatibility issues.
  • Check the target branch. If it is master branch, should the PR go to a feature branch? If it is a release branch, does it have merge approval in the JIRA ticket.
  • Extra checks if forced landing is requested
    • Review comments are sufficiently resolved, particularly by prior reviewers that requested changes.
    • No new NLT or valgrind warnings. Check the classic view.
    • Quick-build or Quick-functional is not used.
  • Fix the commit message upon landing. Check the standard here. Edit it to create a single commit. If necessary, ask submitter for a new summary.

Features: pil4dfs

also fix a corner case in fstatat() to mimic libc.
rely on dup2() to get desired fd to avoid error
"failed to get the desired fd in dup2()".

Signed-off-by: Lei Huang <[email protected]>
Copy link

github-actions bot commented Dec 21, 2023

Bug-tracker data:
Ticket title is 'hang in configure with libpil4dfs.so loaded'
Status is 'In Review'
Labels: 'intercept_lib'
https://daosio.atlassian.net/browse/DAOS-14826

Features: pil4dfs

Signed-off-by: Lei Huang <[email protected]>
@wiliamhuang wiliamhuang marked this pull request as ready for review January 2, 2024 13:15
@wiliamhuang wiliamhuang requested a review from a team as a code owner January 2, 2024 13:15
Copy link
Contributor

@ashleypittman ashleypittman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't you say you had a working test for the low fd issue? It would be good to see that included alongside the code.

@@ -1007,6 +1010,19 @@ query_path(const char *szInput, int *is_target_path, dfs_obj_t **parent, char *i
/* trying to avoid lock as much as possible */
if (!daos_inited) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this existing code it doesn't appear thread-safe in the case of two threads entering this code-path concurrently, not a regression but it would be good to add locking to this at the same time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. We were aware of this possible issue. daos_init() uses lock already. query_path() is called in many intercepted functions. We hoped to avoid lock if possible.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A lock here isn't going to make any difference to performance and should only be needed in the contended case anyway. Without one there's potential for daos_init() to be called multiple times but daos_fini() only being called once leading to resource leaks and incorrect tidyup.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case daos_init() is called more than one time, daos_init_cnt has the count. daos_fini() will be called daos_init_cnt times in destruction phase.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you've re-worked this code since yesterday but it's still not thread safe although as you daos_fini() does look like it'll be called the correct number of times.

pthread_atfork() wants to be called precisely once, and if you're using the daos_inited boolean in this manner then you need to access it atomically.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Moved pthread_atfork() into the locked region. Could you please check again?

@@ -1007,6 +1010,19 @@ query_path(const char *szInput, int *is_target_path, dfs_obj_t **parent, char *i
/* trying to avoid lock as much as possible */
if (!daos_inited) {
/* daos_init() is expensive to call. We call it only when necessary. */
int low_fd_list[DAOS_MIN_FD], low_fd_count = 0, fd_kernel, idx;

fd_kernel = open("/proc/self/maps", O_RDONLY);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's wrong with opening "/"?

Presumably this code can only get here after the hooks have been installed so perhaps it would be better to call the redirected open directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. "/" should work too.
You are right. libc_open() should work here. I will change it in next commit. Thank you!

Comment on lines 2561 to 2564
if (dirfd >= FD_FILE_BASE && dirfd < FD_DIR_BASE && path[0] == 0)
/* same as fstat for a file. May need further work to handle flags */
return fstat(dirfd, stat_buf);

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this related to the commit message? Same applies for the rest of the code changes in this file.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a corner case of fstatat(). dirfd is allowed to be a fd of a regular file.
Yes. There are probably a few places where we need to handle flags. I added a comment here to make sure I will not forget it in future.

@wiliamhuang
Copy link
Contributor Author

Didn't you say you had a working test for the low fd issue? It would be good to see that included alongside the code.

Yes. The test bash script needs some code in exec interception to pass. I will put the test case over there. fd 0, 1, and 2 are preserved after exec(). We need to handle such issue in #13519. We also probably need to close all fds higher than 2.

Features: pil4dfs

also use a lock before init main_eqh in query_path().

Required-githooks: true

Signed-off-by: Lei Huang <[email protected]>
@ashleypittman
Copy link
Contributor

Didn't you say you had a working test for the low fd issue? It would be good to see that included alongside the code.

Yes. The test bash script needs some code in exec interception to pass. I will put the test case over there. fd 0, 1, and 2 are preserved after exec(). We need to handle such issue in #13519. We also probably need to close all fds higher than 2.

That link and some of the code in this PR is for the execv issue, but most of the code in this PR and the commit message relate to the low FD issue. Do you have a test for the low FD issue and can you include it in this PR.

Features: pil4dfs

also added a unit test to make sure daos does not use low fd.

Required-githooks: true

Signed-off-by: Lei Huang <[email protected]>
Features: pil4dfs

Required-githooks: true

Signed-off-by: Lei Huang <[email protected]>
@wiliamhuang
Copy link
Contributor Author

That link and some of the code in this PR is for the execv issue, but most of the code in this PR and the commit message relate to the low FD issue. Do you have a test for the low FD issue and can you include it in this PR.

Just added a unit test to make sure daos does not use low fd. Thank you!

Features: pil4dfs

Required-githooks: true

Signed-off-by: Lei Huang <[email protected]>
Features: pil4dfs

Required-githooks: true

Signed-off-by: Lei Huang <[email protected]>
@wiliamhuang wiliamhuang requested a review from a team as a code owner January 5, 2024 04:05
@daosbuild1
Copy link
Collaborator

Test stage Functional on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-13532/7/testReport/

Features: pil4dfs

Required-githooks: true

Signed-off-by: Lei Huang <[email protected]>
@wiliamhuang
Copy link
Contributor Author

@ashleypittman @mlawsonca Could you please review? Thank you very much!

@@ -5565,6 +5676,20 @@ init_myhook(void)
eq_count_max = MAX_EQ;
}

/* D_IL_DAOS_MIN_FD=3 or smaller value will disable blocking low fds */
rc = d_getenv_uint64_t("D_IL_DAOS_MIN_FD", &daos_min_fd);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this env documented somewhere?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not in our document. Normally users do not need to set it. Default value 10 will be used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok well if users don't need to set it then i do not see why it needs to be an env variable..
if it needs to, then we should document it in the IL readme, and also consider changing it's name D_ and DAOS are repetitive.
maybe call it DAOS_RESERVE_MIN_FD? other suggestion welcome

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. We can just use a hard-coded value for now. We add an env later if necessary. Thank you!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, I am not against making this an env variable. im just saying that an env variable needs to be documented somewhere if we need to ask users to play with it. otherwise only the dev (you) knows about it and what it does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! Understood. We can use hardcoded value to make our code simpler and decrease the number of env users need to deal with. Originally, I thought bash script can use arbitrary fd. An env may be helpful if a larger value is needed. However, it could be very hard for a regular user to figure out the situation. I will add a warning in dup2() if a fd pointing to "socket:[xxxx]" or "anon_inode:[eventpoll]" is closed, so it would help for debugging related issues.

DL_WARN(rc, "daos_eq_create() failed");
main_eqh = td_eqh;
rc = pthread_atfork(NULL, NULL, &child_hdlr);
D_MUTEX_LOCK(&lock_eqh);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't understand why we need nested locks now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. I will rearrange the code to avoid nested locks. Thank you!

@@ -1007,23 +1075,41 @@ query_path(const char *szInput, int *is_target_path, dfs_obj_t **parent, char *i
/* trying to avoid lock as much as possible */
if (!daos_inited) {
/* daos_init() is expensive to call. We call it only when necessary. */

D_MUTEX_LOCK(&lock_global);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this needed for the entire block? isn't this just needed internally for consume_low_fd()?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. I will move the lock inside consume_low_fd(). Thank you!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well my point was just that if you already added a lock on top (not sure if that one is needed), do you need this one too?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually nevemind that comment was for the next lock not the lock_global

@@ -2540,6 +2626,10 @@ new_fstatat(int dirfd, const char *__restrict path, struct stat *__restrict stat
return new_xstat(1, path, stat_buf);
}

if (dirfd >= FD_FILE_BASE && dirfd < FD_DIR_BASE && path[0] == 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im not sure i understand what you are doing here, nor i understand this commit message:
"Also fix a corner case in fstatat() to mimic libc. Rely on libc dup2() to get desired fd to avoid error
"failed to get the desired fd in dup2()"."

what is the corner case being fixed? can you be more clear. if this corner case is fixed, can you add a test for the corner case? thanks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dirfd is allowed to be a fd of a regular file. I did not expect such situation. ok. I will add a unit test for this corner case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am trying to add the corner case in dfuse unit test. Looks libioil has the same issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry. False alarm. Used wrong flag in fstatat().

/* the number of low fd blocked */
static uint16_t low_fd_count;
/* upper limit of low fd that will be blocked */
static uint16_t max_low_fd_blocked;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it blocked or reserved?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe "reserved" is a little better.

Features: pil4dfs

1. add unit test for fstatat corner cases
2. rename blocked as reserved
3. remove the env to set the minimum fd daos can use
4. atomic APIs to access daos_inited
5. apply lock_global only for consume_low_fd()
6. Move pthread_atfork() inside the locked regime
7. check fd before closing it in dup2()

Required-githooks: true

Signed-off-by: Lei Huang <[email protected]>
@wiliamhuang wiliamhuang changed the title DAOS-14826 client: avoid low fd in daos_init() DAOS-14826 client: reserve low fds Jan 9, 2024
@wiliamhuang wiliamhuang requested a review from mchaarawi January 9, 2024 19:21
Features: pil4dfs

Required-githooks: true

Signed-off-by: Lei Huang <[email protected]>
mchaarawi
mchaarawi previously approved these changes Jan 9, 2024
@@ -241,6 +249,7 @@ struct sigaction old_segv;
/* the flag to indicate whether initlization is finished or not */
static bool hook_enabled;
static bool hook_enabled_bak;
static pthread_mutex_t lock_global;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not a big deal, but if a repush is needed, better to rename that to lock_reserve_fd

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you! I will fix it if I need a repush.

}
path[len] = 0;
if (strncmp(path, "socket:", 7) == 0 || strncmp(path, "anon_inode:[eventpoll]", 22) == 0) {
D_WARN("dup2/fcntl is closing fd %d (%s) which may be used by DAOS!", fd, path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i guess i don't understand what a warning is supposed to do. how can we end up in this situation?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This message is mainly for debugging. Not sure whether this is always related to real issues, especially for an application uses network. Shall I use D_DEBUG() instead?
If a bash script uses larger fd (e.g., 15) directly, this piece of code will be triggered.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For what it's worth the ioil code is at https://github.com/daos-stack/daos/blob/master/src/client/dfuse/il/int_posix.c#L741-L797

It just uses a hard-coded value of 10 and does no checking. It also uses dup for all but the first fd and does it all in one function rather than using globals.

return 0;

low_fd_count = 0;
low_fd_list[low_fd_count] = libc_open("/", O_RDONLY);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd of expected O_PATH|O_DIRECTORY here but maybe it's not required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. "O_PATH|O_DIRECTORY" works too. I can use it to match the code in other places. Thank you!

Comment on lines 4906 to 4912
/* check the fd to be closed. If the fd pointing to something like
* 'socket:[xxxx]' or 'anon_inode:[eventpoll]', print out a warning
* since DAOS network contexts may be using such fds.
*/
static void
check_fd_to_close(int fd)
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like this and it will be expensive, particularly on something like dfuse where readlink has to do a full path walk.

The low fds issue is one we need to work around but we don't want to compromise the performance of the whole library to do so. I also don't see that if there ever was an error here that it would be restricted to the dup2 call so I think this check is expensive and not helpful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. I can remove this check for now. Thank you!

Comment on lines 5117 to 5118
oldfd_directed = query_fd_forward_dest(oldfd);
if ((oldfd_directed < FD_FILE_BASE) && (oldfd < FD_FILE_BASE) && (newfd < FD_FILE_BASE))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't see the relevance of this change? Is it to do with low fds and required for the new test to pass or is it addressing another issue?

Copy link
Contributor Author

@wiliamhuang wiliamhuang Jan 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fixing a bug I recently identified. This change is needed to pass the new test of running bash script that uses low fd directly. The new test is added in the PR for exec() interception. bash opens a file on DFS then calls dup2(fake_fd, 5) to get a low fd 5, then call dup2(5, 1) to let fd 1 point to the file on DFS. After that, bash calls exec() to run "cat" to write to the file on DFS.

#! /bin/sh

exec 5>config.log {
cat <<EOF

EOF
} >&5

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should that test be in this PR or does this PR depend on the exec PR then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right. The above is added in the exec PR. It requires the new code there.

Features: pil4dfs

Required-githooks: true

Signed-off-by: Lei Huang <[email protected]>
@daosbuild1
Copy link
Collaborator

Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13532/18/execution/node/283/log

@daosbuild1
Copy link
Collaborator

Test stage Build RPM on EL 9 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13532/19/execution/node/356/log

@daosbuild1
Copy link
Collaborator

Test stage Build RPM on Leap 15.5 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13532/19/execution/node/286/log

@daosbuild1
Copy link
Collaborator

Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13532/19/execution/node/371/log

@daosbuild1
Copy link
Collaborator

Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13532/19/execution/node/272/log

@daosbuild1
Copy link
Collaborator

Test stage NLT on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-13532/18/testReport/

Features: pil4dfs

Required-githooks: true

Signed-off-by: Lei Huang <[email protected]>
@daosbuild1
Copy link
Collaborator

Test stage NLT on EL 8.8 completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-13532/20/testReport/

Features: pil4dfs

Required-githooks: true

Signed-off-by: Lei Huang <[email protected]>
Features: pil4dfs

Required-githooks: true

Signed-off-by: Lei Huang <[email protected]>
@ashleypittman
Copy link
Contributor

@ashleypittman With the latest commit, b1bf1e9, your bash test seems showing same behavior with and without libpil4dfs. Please try it and let me know. Thank you!

I can confirm that my test now runs and passes, It looks like the remaining issues were in the dup() code from your change?

There is one error I'm getting from pil4dfs during this operation which you should look at but once that's resolved then I think this PR might be complete.

src/client/dfuse/pil4dfs/int_dfs.c:1525 free_fd() dfs_release() failed: 22 (Invalid argument)

There are some IO requests to dfuse during the test, I assume this is down to the getcwd handling although I also see two opens and two reads - we can deal with that elsewhere though.

@wiliamhuang
Copy link
Contributor Author

I can confirm that my test now runs and passes, It looks like the remaining issues were in the dup() code from your change?

Yes. One situation in dup2() was not implemented previously. Also glibc chdir() is needed to update the current working directory before exec(), so we always call glibc chdir inside chdir() and fchdir() now.

There is one error I'm getting from pil4dfs during this operation which you should look at but once that's resolved then I think this PR might be complete.

src/client/dfuse/pil4dfs/int_dfs.c:1525 free_fd() dfs_release() failed: 22 (Invalid argument)

Thank you! I will look into this issue.

There are some IO requests to dfuse during the test, I assume this is down to the getcwd handling although I also see two opens and two reads - we can deal with that elsewhere though.

chdir()/fchdir() will be passed through to dfuse. exec() also needs dfuse.

@daosbuild1
Copy link
Collaborator

Test stage Functional Hardware Medium Verbs Provider completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13532/22/execution/node/1503/log

Features: pil4dfs
Priority: 2

Required-githooks: true

Signed-off-by: Lei Huang <[email protected]>
Features: pil4dfs
Priority: 2

Required-githooks: true

Signed-off-by: Lei Huang <[email protected]>
@wiliamhuang
Copy link
Contributor Author

src/client/dfuse/pil4dfs/int_dfs.c:1525 free_fd() dfs_release() failed: 22 (Invalid argument)

I just pushed a commit to introduce fd duplication reference count to handle complex situations. This issue is resolved in my local test. Could you please give it a try? Thank you!

@ashleypittman
Copy link
Contributor

src/client/dfuse/pil4dfs/int_dfs.c:1525 free_fd() dfs_release() failed: 22 (Invalid argument)

I just pushed a commit to introduce fd duplication reference count to handle complex situations. This issue is resolved in my local test. Could you please give it a try? Thank you!

Much better thank you. I updated the test to perform a write between the first and second close of the duplicated fd and was seeing a segfault before this latest revision, not it's passing successfully.

@wiliamhuang
Copy link
Contributor Author

Much better thank you. I updated the test to perform a write between the first and second close of the duplicated fd and was seeing a segfault before this latest revision, not it's passing successfully.

@ashleypittman Thanks a lot for your testing and update!

@wiliamhuang wiliamhuang requested a review from mchaarawi January 29, 2024 13:44
@wiliamhuang
Copy link
Contributor Author

@mchaarawi Could you please review the latest changes? Thank you!

@daosbuild1
Copy link
Collaborator

Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-13532/24/execution/node/1638/log

Copy link
Contributor

@mchaarawi mchaarawi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my comment can be addressed in the next PR.

Comment on lines +1922 to +1930
rc = dfs_open(dfs_mt->dfs, parent, item_name, mode | S_IFREG, oflags & (~O_APPEND),
0, 0, NULL, &dfs_obj);
mode_query = S_IFREG;
} else if (!parent && (strncmp(item_name, "/", 2) == 0)) {
rc = dfs_lookup(dfs_mt->dfs, "/", oflags, &dfs_obj, &mode_query, NULL);
rc =
dfs_lookup(dfs_mt->dfs, "/", oflags & (~O_APPEND), &dfs_obj, &mode_query, NULL);
} else {
rc = dfs_lookup_rel(dfs_mt->dfs, parent, item_name, oflags, &dfs_obj, &mode_query,
NULL);
rc = dfs_lookup_rel(dfs_mt->dfs, parent, item_name, oflags & (~O_APPEND), &dfs_obj,
&mode_query, NULL);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some comments need to be added here as to how O_APPEND is being handled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. I will add comments in my another PR. Thank you!

@wiliamhuang wiliamhuang requested a review from a team January 29, 2024 14:46
@wiliamhuang wiliamhuang added the forced-landing The PR has known failures or has intentionally reduced testing, but should still be landed. label Jan 29, 2024
@wiliamhuang
Copy link
Contributor Author

The failure in hardware tests are not related to libpil4dfs.

@mchaarawi mchaarawi merged commit c897d29 into master Jan 29, 2024
45 of 48 checks passed
@mchaarawi mchaarawi deleted the lei/DAOS-14826 branch January 29, 2024 14:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
forced-landing The PR has known failures or has intentionally reduced testing, but should still be landed.
Development

Successfully merging this pull request may close these issues.

4 participants