📚 The doc issue

Hi! I just wanted to ask: how would I go about running Qwen using ExecuTorch? I was able to create the .pte file for Qwen. The Llama example has a step called 'Create a llama runner for Android'; do we have to do something similar for Qwen by creating a custom runner? Also, the Qwen repository on the Hugging Face Hub does not have a 'tokenizer.model' file, but the Llama example requires one for running inference via the adb shell. How do I work around this? (See the tokenizer sketch below.)
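For context on the tokenizer question: Qwen checkpoints ship a Hugging Face fast tokenizer (tokenizer.json plus vocab/merges files) rather than a sentencepiece tokenizer.model. A minimal sketch of working with it from Python, assuming the `Qwen/Qwen2-0.5B-Instruct` model id as a stand-in for whichever checkpoint was exported:

```python
# Hypothetical sketch: Qwen has no tokenizer.model; it uses a Hugging Face
# fast tokenizer (tokenizer.json), which AutoTokenizer loads directly.
from transformers import AutoTokenizer

# The model id here is an assumption; substitute the checkpoint you exported.
tok = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B-Instruct")

ids = tok.encode("Hello")   # token ids of the kind the .pte will consume
text = tok.decode(ids)      # round-trip sanity check

# Writes tokenizer.json and friends locally, e.g. to push to the device.
tok.save_pretrained("./qwen_tokenizer")
```

Whether a given ExecuTorch runner can consume this format directly depends on the runner and ExecuTorch version, so treat this as a way to obtain and verify the tokenizer artifacts, not as a drop-in replacement for tokenizer.model.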
Suggest a potential alternative/fix
No response
@Arya-Hari for some more context, the llama_runner binary used in our examples is heavily tailored to the llama model architecture. So, as Kimish mentioned, depending on how Qwen's interface compares to llama's, you may not be able to reuse the llama_runner binary. If you are familiar with the model's interface, the best approach would be to fork or modify the llama_runner binary for the Qwen model, essentially creating a custom runner as you mentioned.
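Before porting a custom runner to C++, it can help to sanity-check the exported .pte and the decode loop from Python. The sketch below uses ExecuTorch's Python bindings and *assumes* the model was exported with a kv-cache and takes `(tokens, input_pos)` inputs like the llama example; the file name `qwen.pte`, the model id, and the logits shape are all assumptions to adjust for your export:

```python
# A host-side sketch of the loop a custom runner implements: prefill the
# prompt, then greedily decode token by token. Not a definitive runner,
# just a sanity check under the stated export assumptions.
import torch
from transformers import AutoTokenizer
from executorch.extension.pybindings.portable_lib import _load_for_executorch

tok = AutoTokenizer.from_pretrained("Qwen/Qwen2-0.5B-Instruct")  # assumption
module = _load_for_executorch("qwen.pte")  # path is an assumption

ids = tok.encode("What is ExecuTorch?")
pos = 0

# Prefill the prompt one token at a time (the simplest schedule, and what a
# seq-len-1 kv-cache export expects).
for t in ids:
    logits = module.forward([
        torch.tensor([[t]], dtype=torch.long),   # token, shape [1, 1]
        torch.tensor([pos], dtype=torch.long),   # input_pos, shape [1]
    ])[0]
    pos += 1

# Greedy decode until EOS or a token budget is hit.
out = []
for _ in range(64):
    next_id = int(torch.argmax(logits[0, -1]))   # assumes [1, seq, vocab]
    if next_id == tok.eos_token_id:
        break
    out.append(next_id)
    logits = module.forward([
        torch.tensor([[next_id]], dtype=torch.long),
        torch.tensor([pos], dtype=torch.long),
    ])[0]
    pos += 1

print(tok.decode(out))
```

Once this loop produces sensible text, the same structure (tokenize, prefill, sample, feed back) is what the forked C++ runner needs to reproduce.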