Any way to rate limit embedding calls? #6122
Unanswered
jimmyland22
asked this question in
Q&A
Replies: 0 comments
Is there any way to rate limit or "slow down" the RAG API embedding process? I'm using Azure OpenAI embeddings and keep getting the error message below.
It doesn't seem like I can increase my quota on Azure any further, since it maxes out at 350k tokens per minute. The PDF I'm uploading is only 12 MB.
openai.RateLimitError: Error code: 429 - {'error': {'code': '429', 'message': 'Requests to the Embeddings_Create Operation under Azure OpenAI API version 2023-05-15 have exceeded call rate limit of your current OpenAI S0 pricing tier. Please retry after 86400 seconds. Please go here: https://aka.ms/oai/quotaincrease if you would like to further increase the default rate limit.'}}
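For illustration, the kind of client-side "slow down" being asked about could look like the minimal Python sketch below. It is not something the RAG API is known to provide. `TokensPerMinuteLimiter`, `embed_with_throttle`, `embed_fn`, and `count_tokens` are all hypothetical names: `embed_fn` stands in for whatever function sends one batch of text chunks to the Azure OpenAI embeddings deployment, and `count_tokens` for any token estimator. The budget passed to the limiter is assumed to be set somewhat below the deployment's tokens-per-minute quota.

```python
import time
from typing import Callable, List, Sequence, Tuple


class TokensPerMinuteLimiter:
    """Sliding-window budget: blocks until `cost` tokens fit in the last 60 s."""

    def __init__(self, tokens_per_minute: int,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.capacity = tokens_per_minute
        self.clock = clock
        self.sleep = sleep
        self.events: List[Tuple[float, int]] = []  # (timestamp, tokens sent)

    def acquire(self, cost: int) -> None:
        if cost > self.capacity:
            raise ValueError("a single batch exceeds the per-minute budget")
        while True:
            now = self.clock()
            # Forget requests that have left the 60-second window.
            self.events = [(t, c) for t, c in self.events if now - t < 60.0]
            used = sum(c for _, c in self.events)
            if used + cost <= self.capacity:
                self.events.append((now, cost))
                return
            # Otherwise wait until the oldest recorded request expires.
            oldest = self.events[0][0]
            self.sleep(max(0.01, oldest + 60.0 - now))


def embed_with_throttle(chunks: Sequence[str],
                        embed_fn: Callable[[Sequence[str]], List[List[float]]],
                        limiter: TokensPerMinuteLimiter,
                        count_tokens: Callable[[str], int],
                        batch_size: int = 16) -> List[List[float]]:
    """Embed chunks in batches, pausing whenever the token budget is spent."""
    vectors: List[List[float]] = []
    for start in range(0, len(chunks), batch_size):
        batch = chunks[start:start + batch_size]
        limiter.acquire(sum(count_tokens(c) for c in batch))
        vectors.extend(embed_fn(batch))  # hypothetical call to the embeddings API
    return vectors
```

A limiter like this only spreads requests out over time. It would not help with a "retry after 86400 seconds" response, which reads like a longer-lived limit than the per-minute quota.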