Is there a way to cap the resources (e.g. CPU cores, CUDA MPS threads) assigned to each model in a multi-model TensorFlow server?

The only straightforward way I can think of to allocate resources to microservices like model servers (not counting lower-level tools like CPU limits) is containerization or VMs, so I suspect there is no such option. Is that true?
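For what it's worth, here is a minimal sketch of the containerization approach, running one CPU-capped `tensorflow/serving` container per model via Docker's `--cpus` flag. The model names, paths, ports, and CPU quotas below are hypothetical placeholders; adjust them to your setup:

```python
import subprocess

# Hypothetical models and per-container CPU caps.
models = [
    {"name": "model_a", "path": "/abs/path/models/model_a", "port": 8501, "cpus": "2.0"},
    {"name": "model_b", "path": "/abs/path/models/model_b", "port": 8502, "cpus": "1.0"},
]

for m in models:
    # Docker's --cpus flag enforces a hard CPU quota on the container,
    # so each model server gets its own resource cap.
    subprocess.run([
        "docker", "run", "-d",
        "--name", f"tf_serving_{m['name']}",
        "--cpus", m["cpus"],
        "-p", f"{m['port']}:8501",                 # expose the REST API port
        "-v", f"{m['path']}:/models/{m['name']}",  # mount the SavedModel directory
        "-e", f"MODEL_NAME={m['name']}",           # tensorflow/serving serves this model
        "tensorflow/serving",
    ], check=True)
```

The trade-off is that each model then runs in its own server process, so you lose the single multi-model endpoint.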
A similar feature request, #2097, is already in progress. Please follow and +1 that thread for updates.
Yes, there is currently no option to configure resources per model in a multi-model serving setup. However, you can try the `--rest_api_num_threads` flag, as mentioned here, if that helps.
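For reference, a minimal sketch of launching the server with that flag (the `models.config` path is a hypothetical placeholder). Note that the flag caps the REST request-processing thread pool for the whole server, not for individual models:

```python
import subprocess

# Launch tensorflow_model_server with a capped REST thread pool,
# assuming the binary is on PATH.
subprocess.run([
    "tensorflow_model_server",
    "--model_config_file=/path/to/models.config",  # multi-model configuration
    "--rest_api_port=8501",
    "--rest_api_num_threads=8",  # cap REST worker threads server-wide
])
```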
Limiting CPU usage/cores per model in a multi-model setup is not currently on our roadmap, but it sounds like a good feature to implement. I will keep this as a feature request and discuss implementation internally with the team. Once we have an update, we will post it in this thread.

Thank you for bringing this to our attention.