
IPEX v2.5.10: Fail to run inference with quantized Phi-3 #756

Open · shira-g opened this issue Dec 22, 2024 · 6 comments
shira-g commented Dec 22, 2024

Describe the issue

I tried the example in: https://github.com/intel/intel-extension-for-pytorch/tree/release/xpu/2.5.10/examples/gpu/llm/inference#learn-to-quantize-llm-and-save-quantized-model-then-run-inference-with-quantized-model , using microsoft/Phi-3-mini-4k-instruct model.

It fails with:

  File "C:\Users\sdp\.cache\huggingface\modules\transformers_modules\0a67737cc96d2554230f90338b163bc6380a2a85\modeling_phi3.py", line 1305, in prepare_inputs_for_generation
    elif past_length < input_ids.shape[1]:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: '<' not supported between instances of 'NoneType' and 'int'
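The comparison fails because past_length is None when it reaches the `<` check. A minimal, self-contained sketch of the failing branch and the guard it is missing (illustrative only, not the actual IPEX or transformers code; function and parameter names here are hypothetical):

```python
# In the Hub-downloaded modeling_phi3.py, past_length is derived from the
# KV cache and can be None when the cache object used by the quantized
# path does not report a sequence length.

def prepare_inputs(past_length, num_input_ids):
    """Mimics the branch in prepare_inputs_for_generation that crashes."""
    if past_length is None:
        # Guard the failing code lacks: treat a missing length as 0,
        # i.e. no tokens have been cached yet.
        past_length = 0
    if past_length < num_input_ids:
        return num_input_ids - past_length  # tokens still to process
    return 0

# Without the guard, past_length=None reproduces the crash:
# None < 4 raises TypeError: '<' not supported between 'NoneType' and 'int'
```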

xiguiw commented Dec 23, 2024

@shira-g

Let me check it.

Could you provide some more detail about your environment: the GPU driver version and the transformers version you used?
How did you set up your environment?

Thanks!

@xiguiw xiguiw self-assigned this Dec 23, 2024

shira-g commented Dec 23, 2024

I'm running on Windows 11, Intel(R) Core(TM) Ultra 5.
transformers version: 4.44.2
GPU: Intel(R) Arc(TM) 130V (16 GB), driver version 32.0.101.6325

I set up a fresh conda environment with Python 3.12 and ran:
conda install libuv
python -m pip install torch==2.5.1+cxx11.abi torchvision==0.20.1+cxx11.abi torchaudio==2.5.1+cxx11.abi intel-extension-for-pytorch==2.5.10+xpu --extra-index-url https://pytorch-extension.intel.com/release-whl/stable/lnl/cn/

Then I ran the sample from the link above, replacing model_id with "microsoft/Phi-3-mini-4k-instruct".

Thank you


xiguiw commented Dec 30, 2024

@shira-g

What's the script you used to run the model?

Please run the model with this script:


shira-g commented Dec 31, 2024

Hi, thank you for the reference.

I used the following script: https://github.com/intel/intel-extension-for-pytorch/tree/release/xpu/2.5.10/examples/gpu/llm/inference#learn-to-quantize-llm-and-save-quantized-model-then-run-inference-with-quantized-model, and at line 10 it sets 'use_hf_code = True'.

That setting is the cause of the failure I saw with Phi-3.
If I set use_hf_code = False, inference runs successfully.

You might want to debug the 'use_hf_code = True' path.
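For context, a hedged sketch of what the flag is assumed to control (the real script may wire it differently): 'use_hf_code = True' typically maps to transformers' trust_remote_code=True, which downloads and executes the model repo's own modeling_phi3.py instead of the Phi-3 implementation bundled with transformers. That would explain why the two settings hit different code paths.

```python
# Illustrative only: assumes the example script forwards 'use_hf_code'
# to from_pretrained(trust_remote_code=...). Names are hypothetical.
def build_load_kwargs(use_hf_code: bool) -> dict:
    # trust_remote_code=True runs modeling_phi3.py fetched from the Hub;
    # False uses the Phi-3 implementation shipped inside transformers.
    return {"trust_remote_code": use_hf_code}

# e.g. AutoModelForCausalLM.from_pretrained(model_id, **build_load_kwargs(False))
```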


xiguiw commented Jan 2, 2025

> I used the following script: https://github.com/intel/intel-extension-for-pytorch/tree/release/xpu/2.5.10/examples/gpu/llm/inference#learn-to-quantize-llm-and-save-quantized-model-then-run-inference-with-quantized-model, and at line 10 it sets 'use_hf_code = True'.
>
> That setting is the cause of the failure I saw with Phi-3. If I set use_hf_code = False, inference runs successfully.
>
> You might want to debug the 'use_hf_code = True' path.

That's interesting.
Could you please provide more details (logs, etc.) for the 'use_hf_code = True' case?
Thanks!


shira-g commented Jan 7, 2025

Which logs do you need?
Following is the error I get when setting 'use_hf_code = True':

  File "C:\Users\sdp\.cache\huggingface\modules\transformers_modules\0a67737cc96d2554230f90338b163bc6380a2a85\modeling_phi3.py", line 1305, in prepare_inputs_for_generation
    elif past_length < input_ids.shape[1]:
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: '<' not supported between instances of 'NoneType' and 'int'
