I want to use autoscaling, and more specifically the scale-down feature.
I understand that I need to set the pods' restartPolicy to Never or OnFailure so that my worker pods shut down normally and are not restarted automatically.
But when I redeploy my working DaskCluster with restartPolicy: Never added, the workers fail to deploy. The operator fails to update the replicas (see logs below).
---
apiVersion: kubernetes.dask.org/v1
kind: DaskCluster
metadata:
  name: dask-cluster
spec:
  worker:
    replicas: 2
    spec:
      restartPolicy: Never
      ...
---
apiVersion: kubernetes.dask.org/v1
kind: DaskAutoscaler
metadata:
  name: dask-cluster-autoscaler
spec:
  cluster: dask-cluster
  minimum: 2  # minimum number of workers to create
  maximum: 4  # maximum number of workers to create
Environment:
Dask kubernetes version: 2024.1.0
K8s : 1.28
Dask Operator logs:
[2024-01-18 14:45:48,174] kopf.objects [ERROR ] [*****/dask-cluster-default] Handler 'daskworkergroup_replica_update/spec.worker.replicas' failed with an exception. Will retry.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/kopf/_core/actions/execution.py", line 276, in execute_handler_once
    result = await invoke_handler(
  File "/usr/local/lib/python3.10/site-packages/kopf/_core/actions/execution.py", line 371, in invoke_handler
    result = await invocation.invoke(
  File "/usr/local/lib/python3.10/site-packages/kopf/_core/actions/invocation.py", line 116, in invoke
    result = await fn(**kwargs) # type: ignore
  File "/usr/local/lib/python3.10/site-packages/dask_kubernetes/operator/controller/controller.py", line 591, in daskworkergroup_replica_update
    await worker_deployment.create()
  File "/usr/local/lib/python3.10/site-packages/kr8s/_objects.py", line 232, in create
    async with self.api.call_api(
  File "/usr/local/lib/python3.10/contextlib.py", line 199, in __aenter__
    return await anext(self.gen)
  File "/usr/local/lib/python3.10/site-packages/kr8s/_api.py", line 134, in call_api
    response.raise_for_status()
  File "/usr/local/lib/python3.10/site-packages/httpx/_models.py", line 759, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Client error '422 Unprocessable Entity' for url 'https://10.32.0.1/apis/apps/v1/namespaces/*****/deployments'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/422
[2024-01-18 14:45:48,196] kopf.objects [WARNING ] [*****/dask-cluster-default] Patching failed with inconsistencies: (('remove', ('status', 'kopf'), {'progress': {'daskworkergroup_replica_update/spec.worker.replicas': {'started': '2024-01-18T14:45:47.030812', 'stopped': None, 'delayed': '2024-01-18T14:46:48.175474', 'purpose': 'create', 'retries': 1, 'success': False, 'failure': False, 'message': "Client error '422 Unprocessable Entity' for url 'https://10.32.0.1/apis/apps/v1/namespaces/*****/deployments'\nFor more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/422", 'subrefs': None}}}, None),)
[2024-01-18 14:45:49,647] kopf.activities.prob [INFO ] Activity 'now' succeeded.
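A possible reading of the logs above: the failing request is the creation of an apps/v1 Deployment, and Kubernetes only accepts restartPolicy: Always in a Deployment's pod template, which would explain the 422. The sketch below is a hypothetical pre-flight check, not part of dask-kubernetes. It assumes the operator copies spec.worker.spec into the worker Deployment's pod template (suggested by the traceback, not confirmed here), and the function name `find_restart_policy_problems` is made up for illustration.

```python
from typing import Any, Dict, List

# Kubernetes only allows "Always" in a Deployment's pod template.
DEPLOYMENT_ALLOWED_RESTART_POLICY = "Always"


def find_restart_policy_problems(cluster: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in a DaskCluster manifest dict.

    Hypothetical helper: assumes spec.worker.spec ends up as the pod
    template of a Deployment created by the operator.
    """
    problems: List[str] = []
    worker_pod_spec = cluster.get("spec", {}).get("worker", {}).get("spec", {})
    # An unset restartPolicy is defaulted to "Always" by the API server.
    policy = worker_pod_spec.get("restartPolicy", DEPLOYMENT_ALLOWED_RESTART_POLICY)
    if policy != DEPLOYMENT_ALLOWED_RESTART_POLICY:
        problems.append(
            f"spec.worker.spec.restartPolicy is {policy!r}; a Deployment "
            f"only accepts {DEPLOYMENT_ALLOWED_RESTART_POLICY!r}"
        )
    return problems


# The manifest from this issue, reduced to the relevant fields.
manifest = {
    "apiVersion": "kubernetes.dask.org/v1",
    "kind": "DaskCluster",
    "metadata": {"name": "dask-cluster"},
    "spec": {"worker": {"replicas": 2, "spec": {"restartPolicy": "Never"}}},
}

for problem in find_restart_policy_problems(manifest):
    print(problem)
```

If that assumption holds, the check flags the manifest from this report before it reaches the API server, while a manifest that leaves restartPolicy unset passes.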