LLM-on-Ray introduces a Web UI, allowing users to easily finetune and deploy LLMs through a user-friendly interface. Additionally, the UI includes a chatbot application, enabling users to immediately test and refine the models.
Please follow setup.md to set up the environment first.
```bash
python -u ui/start_ui.py --node_user_name $user --conda_env_name $conda_env --master_ip_port "$node_ip:6379"

# Get URLs from the log
# Running on local URL:  http://0.0.0.0:8080
# Running on public URL: https://180cd5f7c31a1cfd3c.gradio.live
```
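The command above assumes a Ray cluster head is already listening at `$node_ip:6379`. If you have not started one yet, a minimal sketch using the standard Ray CLI (not an LLM-on-Ray-specific command) looks like this:

```bash
# Assumption: the UI connects to an existing Ray head node on port 6379.
# These are standard Ray CLI commands; adjust ports/hosts for your cluster.
ray start --head --port=6379 --dashboard-host=0.0.0.0

# Optionally, join additional worker nodes to the cluster:
ray start --address="$node_ip:6379"
```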
On the **Finetune** tab, you can configure the base model, finetuning parameters, the dataset path, and the new model name. Click **Start To Finetune** to start finetuning.
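The dataset path must point to data the finetuning backend can read. As a hedged illustration only, instruction-tuning data is commonly stored as JSONL with instruction/context/response records; the field names below are an assumption, so check the project's finetuning documentation for the exact schema:

```bash
# Hypothetical example dataset; the field names are illustrative assumptions,
# not confirmed by this README.
cat > /tmp/my_dataset.jsonl <<'EOF'
{"instruction": "Summarize the text.", "context": "Ray is a distributed compute framework.", "response": "Ray distributes Python workloads across a cluster."}
{"instruction": "Translate to French.", "context": "Good morning.", "response": "Bonjour."}
EOF
```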
On the **Deployment** tab, you can choose a model to deploy and configure the parameters **Model Replica Number**, **Cpus per Worker**, and **Gpus per Worker**. Click **Deploy** and you will get a model endpoint.
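Once deployed, the endpoint can also be exercised outside the UI. Here is a minimal sketch with `curl`, assuming a hypothetical endpoint URL and a simple JSON request body; the actual route and schema depend on how LLM-on-Ray serves your model, so copy the real endpoint shown on the Deployment tab:

```bash
# Hypothetical endpoint URL and request schema, for illustration only.
curl -X POST http://127.0.0.1:8000/my-model \
     -H "Content-Type: application/json" \
     -d '{"text": "What is Ray?", "config": {"max_new_tokens": 128}}'
```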
On the **Inference** tab, you can now test the model by asking questions.