Set adaptive resources to Low to allow ML to scale down to zero allocations when there are no active inference requests.
When using the inference API for Elasticsearch or ELSER, enable adaptive_allocations, which allows ML to scale the models down to zero allocations when there are no active inference requests.
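For context, here is a rough sketch of what enabling adaptive allocations through the inference API might look like. It only builds the request and does not send it. The endpoint name `my-elser-endpoint`, the helper `build_adaptive_endpoint_request`, the model ID, and the `max_allocations` default are illustrative assumptions. The field names follow my understanding of the create inference endpoint API and should be checked against the current docs, in particular whether `min_number_of_allocations` may be set to 0 explicitly.

```python
import json


def build_adaptive_endpoint_request(endpoint_id, max_allocations=4):
    """Build (method, path, body) for creating an ELSER inference endpoint.

    Hypothetical helper for illustration only; nothing is sent to a cluster.
    """
    body = {
        "service": "elasticsearch",
        "service_settings": {
            # Assumed model ID; use whichever ELSER model is deployed.
            "model_id": ".elser_model_2",
            "num_threads": 1,
            "adaptive_allocations": {
                # Lets ML scale the number of allocations with load.
                "enabled": True,
                # Assumption: 0 permits scaling down to no allocations when idle.
                "min_number_of_allocations": 0,
                "max_number_of_allocations": max_allocations,
            },
        },
    }
    path = f"/_inference/sparse_embedding/{endpoint_id}"
    return "PUT", path, body


method, path, body = build_adaptive_endpoint_request("my-elser-endpoint")
print(method, path)
print(json.dumps(body, indent=2))
```

The printed method, path, and JSON body can be pasted into the Dev Tools console or sent with any HTTP client. Capping `max_number_of_allocations` is a second lever on VCU usage, since it bounds how far the model can scale up under load.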
Serverless Docs
Welcome to Elastic Serverless
Description
It would be helpful to add another bullet point under the section https://www.elastic.co/guide/en/serverless/current/elasticsearch-billing.html#elasticsearch-billing-managing-elasticsearch-costs that covers the two ways to control ML VCU costs: setting adaptive resources to Low, and enabling adaptive_allocations when using the inference API.
Resources and additional context
https://www.elastic.co/guide/en/serverless/current/elasticsearch-billing.html#elasticsearch-billing-managing-elasticsearch-costs