Can't infer Qwen2-1.5B with a LoRA #1186
Comments
The documentation for this is pretty bad, so I had the same issue. You have to convert the base model using Olive. Try this command:
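(Presumably an `olive auto-opt` invocation along the lines of the one quoted in full in the reply below; the model and adapter paths here are placeholders:)

```bash
python3 -m olive auto-opt \
    --model_name_or_path <base model, e.g. Qwen2-1.5B-Instruct> \
    --adapter_path <path to your LoRA adapter> \
    --device cpu \
    --provider CPUExecutionProvider \
    --use_ort_genai \
    --output_path ./output \
    --precision int4 \
    --use_model_builder
```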
I have the same problem.
No one replied to me.
@busishengui were you able to give the above suggestion a try? In addition to converting the LoRA adapters to ONNX format, you also need to convert the base model using Olive so it has the necessary LoRA nodes. Let us know if you're still having issues.
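For the adapter half of that, Olive's `convert-adapters` command produces the `.onnx_adapter` file that onnxruntime-genai consumes; a minimal sketch, assuming a recent Olive release (paths below are placeholders):

```bash
# Convert PEFT-format LoRA weights into the .onnx_adapter format
# used by onnxruntime-genai. Paths are placeholders.
olive convert-adapters \
    --adapter_path ./my_lora_adapter \
    --output_path ./adapter_weights \
    --dtype float32
```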
Thanks very much, that works, but there is another problem. I run:

```bash
python3 -m olive auto-opt \
    --model_name_or_path Qwen2-1.5B-Instruct \
    --adapter_path ./prc_slm_v1170_best_lora \
    --device cpu \
    --provider CPUExecutionProvider \
    --use_ort_genai \
    --output_path ./release \
    --precision int4 \
    --use_model_builder \
    --log_level 4
```

then I get these files:

```
release/
├── model
│ ├── adapter_weights.onnx_adapter
│ ├── added_tokens.json
│ ├── config.json
│ ├── genai_config.json
│ ├── generation_config.json
│ ├── merges.txt
│ ├── model.onnx
│ ├── release
│ ├── special_tokens_map.json
│ ├── tokenizer_config.json
│ ├── tokenizer.json
│ └── vocab.json
└── model_config.json
```

Then I use the `model` folder as my base LLM and `adapter_weights.onnx_adapter` as my LoRA model, but it doesn't work.
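For anyone stuck at the same point: the converted base model and the `.onnx_adapter` file are loaded through onnxruntime-genai's `Adapters` API. A minimal sketch, assuming onnxruntime-genai >= 0.5; the adapter name `qwen_lora` and the prompt are placeholders:

```python
import onnxruntime_genai as og

# Load the Olive-converted base model (the "model" folder above).
model = og.Model("./release/model")

# Register the converted LoRA weights under a name of our choosing.
adapters = og.Adapters(model)
adapters.load("./release/model/adapter_weights.onnx_adapter", "qwen_lora")

tokenizer = og.Tokenizer(model)
params = og.GeneratorParams(model)
params.set_search_options(max_length=256)

generator = og.Generator(model, params)
# Activate the LoRA weights for this generator.
generator.set_active_adapter(adapters, "qwen_lora")

generator.append_tokens(tokenizer.encode("Hello, who are you?"))
while not generator.is_done():
    generator.generate_next_token()
print(tokenizer.decode(generator.get_sequence(0)))
```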
I use the Qwen2-1.5B model and a LoRA:

[Infer code]

but it crashes and I don't know why.