flashinfer shrink vs cutlass #25
Thanks for taking a close look! We'll deprecate the cutlass implementation in the future. See the discussion here: #2
Makes sense, thanks! So it seems the recommendation is to use the hand-written version for shrink (https://github.com/punica-ai/punica/blob/master/csrc/sgmv_flashinfer/sgmv_flashinfer.cuh) and, in the meantime, the cutlass-based version for expand.
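For anyone else following along, the shrink/expand split comes from LoRA: shrink projects the hidden dimension d down to the adapter rank r, and expand projects r back up to d, with each request's segment of rows routed to its own adapter. A minimal NumPy sketch of the SGMV semantics (a hypothetical reference, not Punica's actual CUDA API; `seg_starts` and the list-of-matrices layout are assumptions for illustration):

```python
import numpy as np

# Hypothetical reference semantics of SGMV (segmented gather matrix-vector):
# each contiguous segment of rows in x is multiplied by its own adapter matrix.

def sgmv_shrink(x, A, seg_starts):
    # x: (n, d); A: list of per-adapter (d, r) matrices;
    # seg_starts: len(A) + 1 segment boundaries into the rows of x.
    r = A[0].shape[1]
    y = np.zeros((x.shape[0], r), dtype=x.dtype)
    for i in range(len(A)):
        s, e = seg_starts[i], seg_starts[i + 1]
        y[s:e] = x[s:e] @ A[i]  # project d -> r for this segment
    return y

def sgmv_expand(v, B, seg_starts):
    # v: (n, r); B: list of per-adapter (r, d) matrices.
    d = B[0].shape[1]
    y = np.zeros((v.shape[0], d), dtype=v.dtype)
    for i in range(len(B)):
        s, e = seg_starts[i], seg_starts[i + 1]
        y[s:e] = v[s:e] @ B[i]  # project r -> d for this segment
    return y
```

The full LoRA delta for a batch is then `sgmv_expand(sgmv_shrink(x, A, seg), B, seg)`, added onto the base model's output. The CUDA kernels discussed in this thread fuse and parallelize exactly this per-segment loop.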
Correct. Once we get time to push out the custom expand kernel, we'll deprecate cutlass. Related: #11
Sounds great, thanks!
@abcdabcd987 Can't wait for the customized version. So far we use the current version in production, and performance seems good for multi-LoRA deployment.
@jcao-ai Glad that Punica got deployed and serves your usage :) |
Hi, I really enjoyed learning about SGMV.
I was grokking through the code and wanted to check my understanding. It seems there are two implementations of SGMV: one based on cutlass grouped GEMM and a hand-written one (using some utilities from flashinfer). Just wondering, how do the two compare in performance?
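One way to answer the performance question yourself is a small micro-benchmark that times both implementations on the same segmented workload. A hedged CPU-side sketch of the pattern (NumPy stand-ins for illustration only; the real comparison would time the two CUDA kernels with proper device synchronization):

```python
import time
import numpy as np

def run_segmented(x, mats, seg_starts):
    # Loop-based segmented matmul: one GEMM per adapter segment,
    # mimicking the workload shape an SGMV kernel would see.
    out = np.empty((x.shape[0], mats[0].shape[1]), dtype=x.dtype)
    for i, m in enumerate(mats):
        s, e = seg_starts[i], seg_starts[i + 1]
        out[s:e] = x[s:e] @ m
    return out

def bench(fn, *args, iters=10):
    fn(*args)  # warm-up run before timing
    t0 = time.perf_counter()
    for _ in range(iters):
        fn(*args)
    return (time.perf_counter() - t0) / iters  # mean seconds per call

# Assumed problem sizes, for illustration: 4 adapters of rank 16,
# hidden dim 128, 256 total rows split into equal segments.
rng = np.random.default_rng(0)
x = rng.standard_normal((256, 128)).astype(np.float32)
mats = [rng.standard_normal((128, 16)).astype(np.float32) for _ in range(4)]
seg = [0, 64, 128, 192, 256]
print(f"segmented matmul: {bench(run_segmented, x, mats, seg) * 1e6:.1f} us/iter")
```

The same harness can time a second implementation of the identical operation and report both numbers, which is roughly what a kernel benchmark for the cutlass vs. flashinfer-based paths would do (with CUDA events instead of `perf_counter`).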