Commit
[Bugfix] fused_experts_impl wrong compute type for float32 (vllm-project#11921)

Signed-off-by: shaochangxu.scx <[email protected]>
Co-authored-by: shaochangxu.scx <[email protected]>
shaochangxu and shaochangxu.scx authored Jan 11, 2025
1 parent 2118d05 commit c32a7c7
Showing 1 changed file with 8 additions and 2 deletions.
10 changes: 8 additions & 2 deletions vllm/model_executor/layers/fused_moe/fused_moe.py
@@ -701,8 +701,14 @@ def fused_experts_impl(hidden_states: torch.Tensor,
                               device=hidden_states.device,
                               dtype=hidden_states.dtype)

-    compute_type = (tl.bfloat16
-                    if hidden_states.dtype == torch.bfloat16 else tl.float16)
+    if hidden_states.dtype == torch.bfloat16:
+        compute_type = tl.bfloat16
+    elif hidden_states.dtype == torch.float16:
+        compute_type = tl.float16
+    elif hidden_states.dtype == torch.float32:
+        compute_type = tl.float32
+    else:
+        raise ValueError(f"Unsupported compute_type: {hidden_states.dtype}")

     if inplace:
         out_hidden_states = hidden_states
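Before this fix, any input dtype other than bfloat16 fell through to tl.float16, so float32 hidden states were silently computed at half precision. The new explicit dispatch can be sketched as follows — a standalone mock, not the vLLM code itself, that uses dtype names as strings in place of the real torch.dtype → triton.language type mapping so the control flow can be shown without either dependency:

```python
# Hypothetical standalone sketch of the dtype dispatch introduced by the fix.
# Keys stand in for torch dtypes; values stand in for triton.language types.
_COMPUTE_TYPES = {
    "bfloat16": "tl.bfloat16",
    "float16": "tl.float16",
    "float32": "tl.float32",  # the case this bugfix adds
}

def select_compute_type(dtype_name: str) -> str:
    """Return the Triton compute-type name for an input dtype name.

    Mirrors the fixed logic: each supported dtype maps explicitly, and
    anything else raises instead of silently defaulting to float16.
    """
    try:
        return _COMPUTE_TYPES[dtype_name]
    except KeyError:
        raise ValueError(f"Unsupported compute_type: {dtype_name}") from None
```

The key design change is failing loudly on unsupported dtypes: the old two-way conditional had no error path, so a float32 tensor was downcast to a float16 compute type without warning.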
