feat(server): Use system packages for execution #1252
base: main
Conversation
I could not understand the precise motivation despite your description. Could you perhaps describe the problem people faced before, so I understand the solution?
Absolutely. The current approach to running the server is somewhat complex: a Python interpreter calls a Bash script, which in turn calls another Python interpreter. The process looks like this: llama stack run -> start_env.sh -> python -m server...
The current stack does not handle signals correctly, causing the main "run" command to exit abruptly upon receiving SIGINT or SIGTERM. Additionally, if all required packages are available - such as in a container - we can run `llama stack run ...` instead of invoking the "server" module directly, which provides a more user- and admin-friendly experience. I don't think we want to encourage users to run the server by calling the module directly. Hope this clarifies.
Force-pushed from 95ddd2b to 86d7c37.
In addition to the signal-handling benefits that Sebastien described, this will be needed in production, where tools like virtualenvs / conda / uv / pip are not desired or are forbidden.
Thank you both for the details. Reviewing now...
I haven't run the code (I don't have an environment with all packages installed as system packages at hand at the moment), so this review is just from my reading of the code. Please ignore if I'm missing something.
Force-pushed from 895abbf to ff58776.
If you have a venv activated, that would do :)
I'm testing this locally, and I see this behavior. When killing the server, I get:

With this change, I get the following:

When I use ...
I only see this behavior on Python < 3.12.
"""Start the LlamaStack server.""" | ||
parser = argparse.ArgumentParser(description="Start the LlamaStack server.") | ||
parser.add_argument( | ||
"--yaml-config", | ||
"--config", | ||
dest="config", |
ah, much nicer :) wonder if we could instead add a `--config` argument separately, and add a deprecation warning for `--yaml-config`?
Done!
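One common way to implement the deprecation path suggested above can be sketched like this. This is an assumption about the approach, not the exact code merged in the PR; the flag names come from the diff, but the structure is illustrative:

```python
import argparse
import warnings

parser = argparse.ArgumentParser(description="Start the LlamaStack server.")
parser.add_argument("--config", help="Path to the server config file")
# Keep the old flag working for now, but hide it from --help output.
parser.add_argument("--yaml-config", dest="yaml_config", help=argparse.SUPPRESS)

args = parser.parse_args(["--yaml-config", "run.yaml"])
if args.yaml_config is not None:
    # Warn users still on the old spelling, then fall through to the new dest.
    warnings.warn("--yaml-config is deprecated, use --config instead",
                  DeprecationWarning)
    args.config = args.yaml_config
```

Keeping separate `dest`s makes it possible to detect which spelling was used and emit the warning only for the deprecated one.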
`llama_stack/cli/stack/run.py` (Outdated)
```python
try:
    from llama_stack.distribution.server.server import main as server_main
except ImportError as e:
    self.parser.error(f"Failed to import server module: {e}")
```
I don't think this is a parser error. Why trap the error and not let the exception bubble up? For better UX?
`llama_stack/cli/stack/run.py` (Outdated)
```python
server_args = argparse.Namespace()
for arg in vars(args):
    # if this is a function, avoid passing it
    if callable(getattr(args, arg)):
```
what is this bit for? seems fishy
This is how `args` looks:

```python
Namespace(func=<bound method StackRun._run_stack_run_cmd of <llama_stack.cli.stack.run.StackRun object at 0x10484b010>>, config='./llama_stack/templates/ollama/run.yaml', port=8321, image_name=None, disable_ipv6=True, env=None, tls_keyfile=None, tls_certfile=None, image_type=None)
```

So I think we want to avoid passing `func=<bound method StackRun._run_stack_run_cmd of <llama_stack.cli.stack.run.StackRun object at 0x10484b010>>`.
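The filtering being discussed can be sketched in isolation like this. The handler here is a hypothetical stub standing in for `StackRun._run_stack_run_cmd`; only the filtering logic mirrors the diff:

```python
import argparse

def stub_handler(args):
    """Hypothetical stand-in for the bound subcommand handler."""
    pass

# A Namespace as produced when set_defaults(func=...) attaches the
# subcommand handler alongside the real options.
args = argparse.Namespace(func=stub_handler, config="run.yaml", port=8321)

# Copy everything except callables into a fresh Namespace for the server,
# so the bound method never reaches server_main().
server_args = argparse.Namespace()
for name in vars(args):
    value = getattr(args, name)
    if not callable(value):
        setattr(server_args, name, value)
```

The `func` attribute is an artifact of how argparse subcommands dispatch; it is meaningless to the server, which only expects plain option values.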
`llama_stack/cli/stack/run.py` (Outdated)
```python
    # Run the server
    server_main(server_args)
except Exception as e:
    self.parser.error(f"Failed to run server: {e}")
```
Again, not a parser error at this point. We are way beyond the point of things being the responsibility of the argparser.
`llama_stack/cli/stack/run.py` (Outdated)

```python
    # Run the server
    server_main(server_args)
except Exception as e:
```
I'd rather not catch these exceptions and let them bubble. Or, if you do catch them, ensure you always print some backtrace.
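If the CLI does keep a catch-all, printing the full traceback before exiting preserves debuggability. A minimal sketch, using a hypothetical stub in place of the real server entry point:

```python
import traceback

def server_main_stub(args):
    """Hypothetical stand-in for server_main; raises to show the handler."""
    raise RuntimeError("boom")

def main(args=None):
    try:
        server_main_stub(args)
    except Exception:
        # Print the full backtrace instead of collapsing it into a
        # one-line parser.error() message, then signal failure.
        traceback.print_exc()
        return 1
    return 0

exit_code = main()
```

Compared to `parser.error(f"Failed to run server: {e}")`, this keeps the stack trace, which is what you need when the failure is deep inside the server rather than in argument parsing.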
Force-pushed from ff58776 to 8db9aef.
Force-pushed from 8db9aef to 67d0c0a.
What does this PR do?
Users prefer to rely on the main CLI rather than invoking the server through a Python module: they interact with a high-level CLI without needing to know internal module structures.

Now, when running `llama stack run <path-to-config>`, the server will attempt to use the system packages, or a virtual environment if one is active.

This also eliminates the current process dependency chain when running from a virtual environment:

-> llama stack run
-> start_env.sh
-> python -m server...
Signed-off-by: Sébastien Han [email protected]
Test Plan
Run:
Notice that the server starts and shuts down normally.