Limit the number of files deleted in a single tarsnap call to 500 #57

Merged
merged 1 commit into from
Jan 16, 2019
Conversation

jeffwidman
Contributor

I had a misconfiguration that prevented `tarsnapper` from deleting
files for several years. When I realized it and fixed it, I discovered I
now had 200K+ tarsnap archives to delete. However, when `tarsnapper`
tries to call `tarsnap` with 200K+ arguments, it fails with
`OSError: [Errno 7] Argument list too long`. The root cause is bash
complaining that the byte length of the entire command exceeds
`ARG_MAX`.

So this limits the maximum number of files deleted in a single call to
`tarsnap` to 500.

500 is a somewhat arbitrary batch size. The actual restriction is
typically a byte limit on the size of the command that the shell will
accept, which can vary widely across OS flavors. Because it's a byte
limit, it also depends on the length of the backup filenames. So rather
than deal with all this complexity, 500 was chosen as a reasonable
balance between safety and speed.
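The batching described above can be sketched as follows. This is a minimal illustration, not tarsnapper's actual code; the `delete_commands` helper name is an assumption, and the 500 default comes from this PR. It splits the archive list into fixed-size batches and builds one `tarsnap -d` invocation per batch, so no single command's argument list can grow past the cap:

```python
# Hypothetical sketch of batched deletion; not tarsnapper's real implementation.
BATCH_SIZE = 500  # conservative cap; the real limit (ARG_MAX) varies by OS


def batched(archives, batch_size=BATCH_SIZE):
    """Yield successive slices of at most `batch_size` archive names."""
    for i in range(0, len(archives), batch_size):
        yield archives[i:i + batch_size]


def delete_commands(archives, batch_size=BATCH_SIZE):
    """Build one `tarsnap -d` argument list per batch of archives.

    tarsnap deletes archives with `tarsnap -d -f NAME [-f NAME ...]`,
    so each archive name is preceded by its own -f flag.
    """
    commands = []
    for batch in batched(archives, batch_size):
        cmd = ["tarsnap", "-d"]
        for name in batch:
            cmd += ["-f", name]
        commands.append(cmd)
    return commands
```

Each returned list could then be passed to `subprocess.check_call`; 200K archives would become roughly 400 separate `tarsnap` invocations instead of one oversized command.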
@jeffwidman
Contributor Author

An alternative strategy for dealing with this issue would be to run a loop that optimistically tries the full command, catches the Python exception, removes N files from it, and then tries the command again. While this would produce the optimal result, it could be quite slow; for example, in my case, dropping from 100K files down to 1K or so would require 99K failures, which makes for a really slow delete call. Doing batches of 500 (or another reasonable number) still seems simplest.

@miracle2k
Owner

Oh my. Is there a UX improvement that can be done to prevent this kind of misconfiguration?

As for the PR, thanks. I don't think there is a need for testing the length, a fixed value is fine.

@miracle2k miracle2k merged commit fea42d1 into miracle2k:master Jan 16, 2019
miracle2k added a commit that referenced this pull request Jan 16, 2019
…s-per-call

Limit the number of files deleted in a single `tarsnap` call to 500
@jeffwidman
Contributor Author

jeffwidman commented Jan 16, 2019

Thanks for merging!

Is there a UX improvement that can be done to prevent this kind of misconfiguration?

My problem was #18, and it was entirely my fault for not checking the logs for errors; the only improvement possible is #52.

@jeffwidman jeffwidman deleted the limit-mass-deletes-to-500-files-per-call branch January 16, 2019 23:49