Retry batch delete blob on 503 #1277

Open

maingoh opened this issue May 21, 2024 · 1 comment
Labels
api: storage · priority: p3 · type: feature request

Comments


maingoh commented May 21, 2024

Is your feature request related to a problem? Please describe.

When deleting a lot of blobs using the batch API, it sometimes raises `ServiceUnavailable: 503 BATCH contentid://None: We encountered an internal error. Please try again.`. This is undesirable, as it aborts in the middle of a big deletion job.

Describe the solution you'd like

I tried setting the retry parameter at the client level (`client.get_bucket(bucket_name, retry=retry, timeout=600)`) and at the blob level (`blob.delete(retry=retry, timeout=600)`), and even forced `if_generation_match=blob.generation`. No retry seems to be performed. The class does not seem to use any retry here:

response = self._client._base_connection._make_request(

Either the client should support retries here, or at the very least the batch object should expose the blobs (sub-requests) that could not be deleted so that we can retry them manually.
A manual retry of the full batch (in a for loop) does not work, because some of the blobs in the batch were already deleted on the first attempt, which raises a 404 on the second attempt.


Retry automatically, or give the user the ability to retry only the requests that failed.
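
For reference, a minimal workaround sketch under the current behavior (the bucket and object names below are placeholders, not taken from the issue): issue the deletes in a batch, and if the batch raises a 503, fall back to deleting the remaining blobs one by one, treating a 404 as "already deleted on the first attempt".

```python
from google.api_core.exceptions import NotFound, ServiceUnavailable
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-bucket")            # placeholder bucket name
blob_names = ["shard-0001", "shard-0002"]      # placeholder object names

try:
    # Queue all deletes and send them as a single batch request.
    with client.batch():
        for name in blob_names:
            bucket.delete_blob(name)
except ServiceUnavailable:
    # The batch does not report which sub-requests failed, so retry each blob
    # individually and ignore the ones that were already deleted.
    for name in blob_names:
        try:
            bucket.delete_blob(name)
        except NotFound:
            pass
```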

product-auto-label bot added the api: storage label May 21, 2024
cojenco added the type: feature request and priority: p3 labels May 21, 2024
@StrtCoding

Hi, has anyone found a solution for this?

sadovnychyi added a commit to sadovnychyi/beam that referenced this issue Jan 8, 2025
A transient error might occur when writing a lot of shards to GCS, and right now
the GCS IO does not have any retry logic in place:

https://github.com/apache/beam/blob/a06454a2/sdks/python/apache_beam/io/gcp/gcsio.py#L269

It means that in such cases the entire bundle of elements fails, and then Beam
itself will attempt to retry the entire bundle, and will fail the job if it
exceeds the number of retries.

This change adds new logic to retry only failed requests, and uses the typical
exponential backoff strategy.

Note that this change accesses a private method (`_predicate`) of the retry
object, which we could avoid by basically copying the logic over here. But the
existing code already accesses the `_responses` property, so maybe it's not a
big deal.

https://github.com/apache/beam/blob/b4c3a4ff/sdks/python/apache_beam/io/gcp/gcsio.py#L297

Existing (unresolved) issue in the GCS client library:

googleapis/python-storage#1277
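
For illustration, a rough sketch of the retry-only-the-failed-requests idea described above (this is not the Beam implementation; `delete_batch` and its `(name, exception_or_None)` result format are hypothetical):

```python
import random
import time


def delete_batch_with_retry(delete_batch, blob_names, attempts=5, base_delay=1.0):
    """Call `delete_batch(names) -> list of (name, exception_or_None)`, retrying only failures."""
    pending = list(blob_names)
    for attempt in range(attempts):
        results = delete_batch(pending)
        # Keep only the names whose sub-request failed.
        pending = [name for name, exc in results if exc is not None]
        if not pending:
            return
        # Exponential backoff with jitter before retrying just the failed names.
        time.sleep(base_delay * (2 ** attempt) * (0.5 + random.random()))
    raise RuntimeError(f"{len(pending)} deletes still failing after {attempts} attempts")
```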
Abacn pushed a commit to apache/beam that referenced this issue Jan 10, 2025
* Add retry logic to each batch method of the GCS IO


* Catch correct exception type in `_batch_with_retry`

The `RetryError` would always be raised, since the retry decorator catches all
HTTP-related exceptions.

* Update changelog with GCSIO retry logic fix
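
As a minimal sketch of the exception-handling point in the second bullet (assuming the batched call is wrapped with `google.api_core.retry.Retry`, which is an assumption here): once retries are exhausted, api_core raises `RetryError` with the last HTTP error as its cause, so that is the type the caller has to catch.

```python
from google.api_core.exceptions import RetryError, ServiceUnavailable
from google.api_core.retry import Retry, if_exception_type

# Retry only on 503s, with a short overall timeout so the example finishes quickly.
retryer = Retry(predicate=if_exception_type(ServiceUnavailable),
                initial=0.1, maximum=1.0, timeout=5.0)


def flaky_delete():
    # Placeholder for the batched delete request that keeps returning 503.
    raise ServiceUnavailable("503 BATCH contentid://None")


try:
    retryer(flaky_delete)()
except RetryError as exc:
    # Retries exhausted; the original ServiceUnavailable is available as the cause.
    print("gave up:", exc.cause)
```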