Sync s3 to file system target for backup #31
Comments
I believe I may have a solution. I can use the id-logging filter to log which artifacts are processed. I can then compare that list after each run against what is on the filesystem. Files that are on the filesystem but not in that list have been deleted from the source.
This should work in a cron so long as none of the files end in .deleted or .log:
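A minimal sketch of that comparison in Python (not the script originally posted). It assumes the id-logging output has already been reduced to a plain text file with one target-relative path per line; that file format and the function names `load_processed_ids` and `find_orphans` are hypothetical, chosen only for illustration.

```python
import os


def load_processed_ids(log_path):
    """Read a (hypothetical) processed-ID list: one relative path per line."""
    with open(log_path, encoding="utf-8") as fh:
        return {line.strip() for line in fh if line.strip()}


def find_orphans(target_root, processed_ids):
    """Return relative paths found under target_root but missing from processed_ids."""
    orphans = []
    for dirpath, _dirnames, filenames in os.walk(target_root):
        for name in filenames:
            rel = os.path.relpath(os.path.join(dirpath, name), target_root)
            # Normalize to forward slashes so paths compare against S3-style keys.
            rel = rel.replace(os.sep, "/")
            if rel not in processed_ids:
                orphans.append(rel)
    return sorted(orphans)
```

The sketch only reports candidates; whether to delete, rename, or quarantine them is left to the caller. The processed-ID list should live outside the target directory, otherwise it would be reported as an orphan itself.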
Unfortunately, it's impossible for ecs-sync to know what was deleted on the source. The only way we can sync deletes is, as you have outlined above, by comparing the lists of both storage locations. Given the nature and design of ecs-sync, we have not added this as an option (yet). We are still tinkering with ideas about efficient ways to accomplish this, but there are many caveats and considerations when deleting data that have curtailed development. The best option I've heard so far is to use the sync database to identify files that definitely were on the source system, were synced at one point, and are now gone. This at least attempts to ensure we don't delete 3rd party data on the target storage. This feature is in the backlog, but not on the roadmap as of now. As always, we welcome suggestions for improvement.
I'm looking to back up some s3 buckets to a filesystem. I have been able to successfully sync from s3 to the filesystem, but I can't find a way to clean up unreferenced files on the target. What I would like to do is delete a file on the target if it has not been referenced by the source s3 for over 30 days.
There is a --delete-older-than flag but this only appears to be for source objects.
Is this possible (without using force-sync)? I was thinking that if there was an easy way to know when each file was last checked for syncing, it could be done. Files could be purged if their last sync time was > 30 days ago (so long as the sync ran more frequently than every 30 days). It could also be done if the target filesystem syncer had an option to always touch a file as a "liveness" indicator (without always redownloading it). Files with a timestamp older than X days could be purged.
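To make the "liveness" idea concrete, here is a rough Python sketch of both halves: refreshing a file's modification time without rewriting it, and listing files whose timestamp is older than a cutoff. It is not part of ecs-sync; the function names `touch` and `find_stale_files` are hypothetical, and it assumes the target file's mtime is free to use as a liveness mark, which may not hold if the sync preserves source modification times on the target.

```python
import os
import time

SECONDS_PER_DAY = 86400


def touch(path, now=None):
    """Refresh atime/mtime as a liveness mark without rewriting file content."""
    os.utime(path, None if now is None else (now, now))


def find_stale_files(target_root, max_age_days, now=None):
    """Return paths under target_root whose mtime is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    stale = []
    for dirpath, _dirnames, filenames in os.walk(target_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                stale.append(path)
    return sorted(stale)
```

For example, `find_stale_files("/backup/bucket", 30)` would list purge candidates for a 30-day window (the path is a placeholder). As noted above, this is only safe if the sync, and therefore the touch, runs more often than the purge window.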
A plugin injection point of "no-op" or something along those lines could be added to SyncStorage
and called here: https://github.com/EMCECS/ecs-sync/blob/master/src/main/java/com/emc/ecs/sync/TargetFilter.java#L78
Plugins could then perform custom logic if a sync was not performed. In this case the file system plugin could have an option to force touch a file.