Remove temporary slurm resume file at the end of resume_program #2865

Merged 1 commit into aws:develop on Jan 22, 2025

hanwen-cluster
Contributor

Prior to this commit, the code only had trap "rm -f ${SLURM_RESUME_FILE_TMP}" EXIT, which works for the script up to the point where sudo -u runs. sudo -u runs its command in a new process under a different user context, and the trap set in the parent shell does not carry over to that process. As a result, when the script ends through the sudo command, the EXIT trap in the original shell is never executed.

Therefore, this commit adds another removal of the temporary file at the end of the script.
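
The pattern described above can be sketched roughly as follows. This is an illustration only, not the actual resume_program code: the function name resume_with_cleanup and the file contents are hypothetical, and the sudo -u step is replaced by a plain cat so the sketch runs without privileges. Only SLURM_RESUME_FILE_TMP and the trap line are taken from the description.

```shell
#!/bin/bash
# Sketch only: everything except SLURM_RESUME_FILE_TMP and the trap is a hypothetical stand-in.

# The function body is a subshell, so the EXIT trap is scoped to one call.
resume_with_cleanup() (
    SLURM_RESUME_FILE_TMP="$(mktemp)"

    # Safety net: covers exits that happen before the final step is reached.
    trap 'rm -f "${SLURM_RESUME_FILE_TMP}"' EXIT

    # Hypothetical payload written for the next step to read.
    printf 'node-1\n' > "${SLURM_RESUME_FILE_TMP}"

    # Stand-in for the real `sudo -u <user> <command>` step that consumes the file.
    cat "${SLURM_RESUME_FILE_TMP}" > /dev/null

    # Explicit removal at the end, so cleanup does not depend on the trap firing.
    rm -f "${SLURM_RESUME_FILE_TMP}"

    # Report the path so a caller can confirm the file is gone.
    printf '%s\n' "${SLURM_RESUME_FILE_TMP}"
)
```

Because rm -f succeeds silently when the file is already gone, keeping both the trap and the explicit removal is harmless: whichever runs second is a no-op.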

Tests

  • Manually tested job submissions. The temporary files are cleaned up properly.

References

Checklist

  • Make sure you are pointing to the right branch.
  • If you're creating a patch for a branch other than develop add the branch name as prefix in the PR title (e.g. [release-3.6]).
  • Check all commits' messages are clear, describing what and why vs how.
  • Make sure to have added unit tests or integration tests to cover the new/modified code.
  • Check if documentation is impacted by this change.

Please review the guidelines for contributing and Pull Request Instructions.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

Prior to this commit, the code only had `trap "rm -f ${SLURM_RESUME_FILE_TMP}" EXIT`, which works for the script up to the point where `sudo -u` runs. `sudo -u` runs its command in a new process under a different user context, and the trap set in the parent shell does not carry over to that process. As a result, when the script ends through the sudo command, the EXIT trap in the original shell is never executed.

Therefore, this commit adds another removal of the temporary file at the end of the script.

Signed-off-by: Hanwen <[email protected]>
@hanwen-cluster hanwen-cluster merged commit a21a33e into aws:develop Jan 22, 2025
31 of 36 checks passed