[enhancement]: Add idle timeout flag when running with "once" enabled #4806

Nohac · 2024-05-22T09:47:21Z

Describe your feature request here

I'm managing a Kubernetes cluster that uses KEDA to dispatch agents on demand based on the pool queue on DevOps. The agents run with --once so that they shut down after each job, which allows the cluster to scale down its nodes when no jobs are running. This works fine most of the time.

The issue arises if, for whatever reason, the new agent does not receive a job (for example, if someone cancels a job or something else unexpected happens). This is usually fine in a busy pool, since the agent will receive a job within a short time. However, when it happens at the end of the day or the end of the work week, unnecessary infrastructure can keep running over the weekend, which dramatically increases the cost, especially if the infrastructure includes GPUs or other expensive hardware.

I think this could be easily fixed by adding an "idle timeout" flag to the agent. This flag should allow specifying how long an agent is allowed to run while idle.

./run-agent.sh --timeout 5m --once

The above command would ensure that the agent times out after 5 minutes unless it receives a job within that time frame.
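To illustrate the proposed behaviour, here is a minimal Python sketch of an idle-timeout wait loop. It is not the agent's actual implementation: `parse_duration`, `wait_for_job`, the `poll_job` callback, and the `5m`-style duration format are all hypothetical names and assumptions made for this example.

```python
import re
import time

# Assumed duration suffixes for the hypothetical --timeout flag.
_UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_duration(text: str) -> int:
    """Convert a string such as '5m' into seconds (assumed format)."""
    match = re.fullmatch(r"(\d+)([smh])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def wait_for_job(poll_job, idle_timeout, poll_interval=1.0,
                 clock=time.monotonic, sleep=time.sleep):
    """Poll until a job arrives or idle_timeout seconds pass.

    poll_job is a hypothetical callback returning a job or None.
    Returns the job, or None if the agent stayed idle for the whole window.
    """
    deadline = clock() + idle_timeout
    while True:
        job = poll_job()
        if job is not None:
            return job
        remaining = deadline - clock()
        if remaining <= 0:
            return None  # idle for too long: the caller should shut down
        sleep(min(poll_interval, remaining))
```

With --once, the caller would run the returned job and then exit, or exit immediately when `None` comes back, so the pod terminates either way.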

I could work around this issue by using the DevOps API to fetch idle agents and telling Kubernetes to stop the corresponding pods, but this seems like a lot of work that could easily be avoided with this proposal.
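For comparison, the filtering step of that workaround might look roughly like the sketch below. It assumes the agent list has already been fetched from the DevOps API with assigned requests included, and that each record carries `name`, `status`, `enabled`, and `assignedRequest` fields; those field names are assumptions to check against the API reference, and `find_idle_agents` is a hypothetical helper. Stopping the matching pods is left out.

```python
def find_idle_agents(agents):
    """Return names of agents that are online, enabled and not running a job.

    agents: list of dicts shaped like the (assumed) agent records returned
    by the DevOps pool agents endpoint.
    """
    idle = []
    for agent in agents:
        online = agent.get("status") == "online"
        enabled = agent.get("enabled", True)
        busy = bool(agent.get("assignedRequest"))  # present only while a job runs
        if online and enabled and not busy:
            idle.append(agent["name"])
    return idle
```

Each returned name would then have to be mapped to its pod and deleted, which is the extra moving part an idle timeout flag would make unnecessary.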

DmitriiBobreshev · 2024-05-22T15:47:29Z

Hi @Nohac, thank you for the idea. We're working on higher-priority issues at the moment, but we'll try to implement it as soon as we can.

Nohac · 2024-05-24T09:32:27Z

I'm willing to try implementing this feature if someone can point me in the right direction.

github-actions · 2024-11-20T10:03:01Z

This issue has had no activity in 180 days. Please comment if it is not actually stale.

Nohac · 2024-11-27T12:17:11Z

It would still be nice to have this feature. Please re-open the issue.

Nohac added the enhancement label May 22, 2024

github-actions bot added the Area: Agent and triage labels May 22, 2024

DmitriiBobreshev removed the triage label May 22, 2024

github-actions bot added the stale label Nov 20, 2024

github-actions bot closed this as completed Nov 27, 2024
