Evaluate on Jamba-1.5-Mini #69
Comments
I managed to get it running, but I am getting the following error:
If you setup
I remove
I manage to run with
@coranholmes Can you try vLLM with the latest container
@hsiehjackson, I tried I tried different versions of transformers, it didn't help. Any suggestions?
@dawenxi-007 Can you try pulling the Docker image again? I updated it to be compatible with HF and vLLM.
@hsiehjackson If in the If in the
@dawenxi-007 Can you check whether you can see the GPUs inside the Docker container?
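One way to run the GPU check asked about above is a small script executed inside the container. This is only a sketch: it assumes the NVIDIA driver tools (`nvidia-smi`) are exposed in the container, which is normally the case only when the container was started with GPU access (for example `docker run --gpus all`). The function name `visible_gpus` is made up for illustration and is not part of the project.

```python
import shutil
import subprocess


def visible_gpus():
    """Return one line per GPU reported by `nvidia-smi -L`, or [] if none are visible."""
    # Without GPU access, nvidia-smi is usually not available in the container.
    if shutil.which("nvidia-smi") is None:
        return []
    try:
        result = subprocess.run(
            ["nvidia-smi", "-L"], capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    # Keep only the top-level "GPU <index>: ..." lines.
    return [line for line in result.stdout.splitlines() if line.startswith("GPU ")]


if __name__ == "__main__":
    gpus = visible_gpus()
    print(f"{len(gpus)} GPU(s) visible")
    for line in gpus:
        print(line)
```

An empty list suggests the container cannot see any GPU, so the model would fail to load onto the devices.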
@hsiehjackson, yes, I forgot to enable the gpus argument. Now I can see the model loading onto the GPUs; however, I got the following OOM error:
Do you use HF or vLLM? If you are using vLLM, maybe you can first reduce
Thanks @hsiehjackson! I switched to HF and was able to run it on 4xH100s. The results are a little confusing to me. In the
However, the results only show 8 of them, for example, for 128K,
This made the average score worse than the official number you posted. Do you have any idea what the issue could be?
@dawenxi-007 Can you check whether you have all 13 prediction jsonl files under folder
Only 8.
You can also check
Oh, yes, there are 13 files under the data folder.
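The comparison above (13 tasks prepared, only 8 prediction files written) can be automated with a short check. This is a sketch under assumptions: it supposes each task writes its predictions to `<task>.jsonl` inside a single prediction folder, and the helper name `missing_predictions` is hypothetical. The caller supplies both the task names and the folder, since neither is spelled out in this thread.

```python
from pathlib import Path


def missing_predictions(expected_tasks, pred_dir):
    """Return the tasks that have no non-empty `<task>.jsonl` file in pred_dir."""
    pred_dir = Path(pred_dir)
    missing = []
    for task in expected_tasks:
        pred_file = pred_dir / f"{task}.jsonl"
        # A missing or zero-byte file means the task produced no predictions.
        if not pred_file.is_file() or pred_file.stat().st_size == 0:
            missing.append(task)
    return missing
```

Any task returned here is worth re-running on its own so that its error message is visible, rather than being averaged away in the summary.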
It seems that some
The root cause is that some tests rely on the nltk data packages, which are missing from the Docker image. After I downloaded the packages, it seems to run fine now. I will post an update if any new issue pops up. Thanks for your help, @hsiehjackson!
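A minimal sketch of the fix described above: check for the nltk data before the evaluation starts and download it only if it is absent. The only package this thread names is `punkt`; any others the tests need would have to be added to `resources`. The helper `ensure_nltk_data` and its `nltk_module` parameter are hypothetical and exist only to keep the example testable without network access.

```python
def ensure_nltk_data(resources=(("tokenizers/punkt", "punkt"),), nltk_module=None):
    """Download each nltk package whose resource path cannot be found locally.

    Returns the list of package names that had to be downloaded.
    """
    if nltk_module is None:
        import nltk as nltk_module  # imported lazily so the module loads without nltk

    downloaded = []
    for resource_path, package in resources:
        try:
            # nltk.data.find raises LookupError when the resource is not installed.
            nltk_module.data.find(resource_path)
        except LookupError:
            nltk_module.download(package, quiet=True)
            downloaded.append(package)
    return downloaded


if __name__ == "__main__":
    print("Downloaded:", ensure_nltk_data())
```

Running this once inside the container, or while building the image, avoids tasks failing later because the tokenizer data is missing.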
@hsiehjackson I was able to get all the results. One more question: do we have detailed descriptions of all 13 tests? I noticed that the paper only shows 8 tests with examples in Table 2. Among the tests, QA_2 gave much worse results than the others, and I would like to know a bit more about why.
You can find the descriptions of the 13 tasks in Appendix B, Table 5 :)
The script gets stuck at:
[nltk_data] Downloading package punkt to /root/nltk_data...
But it works fine when I evaluate llama3-8b-instruct. I am wondering whether there is any setting I need to configure for Jamba. I have already added the model in MODEL_SELCT
template.py