
[BUG] Gate/grouped_topk scoring func dtype issue (BF16 vs FP32) #696

Open

jikunshang opened this issue Feb 21, 2025 · 1 comment

jikunshang commented Feb 21, 2025

Describe the bug

In the Hugging Face implementation, the MoE gate (`MoEGate`) casts to FP32 for the gating linear layer and all subsequent computation. See https://huggingface.co/deepseek-ai/DeepSeek-V3/blob/main/modeling_deepseek.py#L427-L429

In the GitHub implementation, the gate weight is BF16, so the linear layer, the scoring function (sigmoid), and the downstream computation all run in BF16. See https://github.com/deepseek-ai/DeepSeek-V3/blob/main/inference/model.py#L573-L577

Would this cause an accuracy issue? Is there a recommended/reference implementation?
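For concreteness, a minimal sketch of the two gating paths (tensor shapes, variable names, and values are illustrative, not taken from either repository):

```python
import torch
import torch.nn.functional as F

# Illustrative shapes only: a small token batch and a router weight.
hidden = torch.randn(4, 7168, dtype=torch.bfloat16)
gate_weight = torch.randn(256, 7168, dtype=torch.bfloat16)

# Hugging Face modeling_deepseek.py style: cast inputs and weight to
# FP32 before the gating linear, so scoring runs in full precision.
scores_fp32 = torch.sigmoid(F.linear(hidden.float(), gate_weight.float()))

# GitHub inference/model.py style: the linear and the sigmoid scoring
# function both stay in BF16.
scores_bf16 = torch.sigmoid(F.linear(hidden, gate_weight))

# With only ~8 mantissa bits in BF16, the two paths can disagree on
# which experts win the top-k selection when scores are nearly tied.
print((scores_fp32 - scores_bf16.float()).abs().max())
```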


jikunshang (Author) commented:

@GeeeekExplorer @mowentian Please take a look, thanks!
