
[Feature]: Does Litellm Support Multi-Processing to Improve Throughput? (Potential Impact of GIL on Multi-Threaded Performance) #7579

Open
jts250 opened this issue Jan 6, 2025 · 2 comments
Labels
enhancement New feature or request

Comments

@jts250

jts250 commented Jan 6, 2025

The Feature

I would like to express my sincere thanks to the Litellm project for providing such a great unified interface for managing large models. It's a fantastic tool that simplifies many aspects of model management. I have been using the Litellm Docker image to manage large models, and it has been working well overall.

Motivation, pitch

However, I’ve noticed that the throughput when using Litellm is significantly lower than when using Nginx. During these lower throughput periods, the CPU usage remains stable at around 110%, which made me wonder if the performance bottleneck is due to Python's Global Interpreter Lock (GIL) limiting the multi-threaded throughput of Litellm. I’m considering using multi-processing to improve the throughput, but I would like to confirm if this approach is suitable for Litellm, or if there might be a more optimal solution. Could you kindly advise if Litellm supports multi-process execution, or if there is something in my current usage that might be causing this issue?
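For illustration, here is a minimal sketch (not LiteLLM code, just a generic Python example) of the effect I suspect: CPU-bound work run on threads serializes on the GIL, while the same work run on separate processes scales across cores.

# Illustrative only: compares CPU-bound throughput with threads (GIL-bound)
# versus processes.
import time
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def burn(n: int) -> int:
    # CPU-bound busy loop; threads serialize on the GIL here.
    total = 0
    for i in range(n):
        total += i * i
    return total

def timed(executor_cls, workers: int = 4, n: int = 2_000_000) -> float:
    start = time.perf_counter()
    with executor_cls(max_workers=workers) as pool:
        list(pool.map(burn, [n] * workers))
    return time.perf_counter() - start

if __name__ == "__main__":
    print(f"threads:   {timed(ThreadPoolExecutor):.2f}s")   # limited by the GIL
    print(f"processes: {timed(ProcessPoolExecutor):.2f}s")  # scales across cores

The ~110% CPU I observe looks like a single Python process saturating roughly one core, which is why I suspect a multi-process setup would help.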

Are you a ML Ops Team?

No

Twitter / LinkedIn details

No response

jts250 added the enhancement (New feature or request) label Jan 6, 2025
@jts250
Author

jts250 commented Jan 6, 2025

To clarify, I am using the Docker deployment of Litellm, and the startup command is:
docker run -v /vdb/configs/llm_32b.yaml:/app/config.yaml -p 38085:4000 --log-driver json-file --log-opt max-size=1g --restart always --name test_litellm ghcr.io/berriai/litellm:main-latest --config /app/config.yaml

@ishaan-jaff
Contributor

What RPS are you trying to reach @jts250?

We see 1K RPS with this update
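
For reference, a minimal load-test sketch one could use to measure the RPS actually achieved against the proxy. It assumes the container from the command above (localhost:38085), the OpenAI-compatible /chat/completions route, a model alias "my-model", and a proxy key "sk-1234" — all of these are placeholders for your own config.

# Hypothetical load-test sketch: measures achieved RPS against a running proxy.
import asyncio
import time
import httpx

URL = "http://localhost:38085/chat/completions"   # assumed endpoint/port
PAYLOAD = {"model": "my-model", "messages": [{"role": "user", "content": "hi"}]}
HEADERS = {"Authorization": "Bearer sk-1234"}      # assumed proxy key

async def worker(client: httpx.AsyncClient, n: int) -> int:
    ok = 0
    for _ in range(n):
        resp = await client.post(URL, json=PAYLOAD, headers=HEADERS)
        ok += resp.status_code == 200
    return ok

async def main(concurrency: int = 50, requests_each: int = 20) -> None:
    async with httpx.AsyncClient(timeout=60) as client:
        start = time.perf_counter()
        results = await asyncio.gather(
            *(worker(client, requests_each) for _ in range(concurrency))
        )
        elapsed = time.perf_counter() - start
    total = concurrency * requests_each
    print(f"{sum(results)}/{total} ok, {total / elapsed:.1f} RPS")

if __name__ == "__main__":
    asyncio.run(main())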
