[CK_TILE] Sync fmha fwd splitkv minor optimizations #1785

Open
poyenc wants to merge 8 commits into develop from ck_tile/fmha-fwd-splitkv-minor-opt


poyenc
Contributor

@poyenc poyenc commented Jan 1, 2025

Proposed changes

We have made several optimizations on the branch ck_tile/support-vllm-kcache-layout. It's time to sync those changes back to develop (excluding the V column-major vector load).

  • Add kPadHeadDimQ=kPadHeadDimV=false fmha fwd splitkv instances (allow vector load key/value)
  • Enlarge tile size for group mode + paged-kvcache. Also update the num_splits heuristic for better estimation
  • Hide page-block table reading latency
  • Avoid unnecessary splitkv masking computations
  • Avoid launching splitkv combine kernel if num_splits=1
  • Update fmha fwd example
  • Update license year
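
As a rough illustration of the "avoid launching splitkv combine kernel if num_splits=1" item, the host-side dispatch could look something like the sketch below. This is not the code from this PR: `LaunchLog`, `OutputTarget`, `splitkv_output_target` and `launch_fwd_splitkv` are hypothetical names, and the launches are replaced by log entries so the control flow can be checked on the host. It also assumes that, with a single split, the splitkv kernel writes straight to the final output, so there is nothing left to reduce.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for real kernel launches: it only records which
// kernels would have been launched, in order.
struct LaunchLog {
    std::vector<std::string> kernels;
};

// Hypothetical description of where the splitkv kernel writes its result.
enum class OutputTarget { FinalOutput, PartialAccumulators };

// Assumption: with one split there are no partial results to merge, so the
// splitkv kernel can write the final output directly.
inline OutputTarget splitkv_output_target(int num_splits) {
    assert(num_splits >= 1);
    return num_splits == 1 ? OutputTarget::FinalOutput
                           : OutputTarget::PartialAccumulators;
}

// Always launch the splitkv kernel; launch the combine (reduction) kernel
// only when more than one split produced partial results.
inline void launch_fwd_splitkv(int num_splits, LaunchLog& log) {
    assert(num_splits >= 1);
    log.kernels.push_back("splitkv");
    if (num_splits > 1) {
        log.kernels.push_back("combine");
    }
}
```

With `num_splits == 1` the log contains only `"splitkv"`, which saves one kernel launch and a round trip through the partial-result buffers. For any larger value the combine step still runs.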

Checklist

  • Configure the project with the cmake-ck-dev.sh script and add the option -DFMHA_FWD_ENABLE_APIS="fwd_splitkv" to enable the fmha_fwd_splitkv() API
  • Run smoke_test_fwd.sh -s and check that all test cases pass

@poyenc poyenc self-assigned this Jan 1, 2025
@poyenc poyenc changed the title Ck tile/fmha fwd splitkv minor opt [CK_TILE] fmha fwd splitkv minor optimizations Jan 1, 2025
@poyenc poyenc marked this pull request as draft January 1, 2025 18:06
@poyenc poyenc marked this pull request as ready for review January 1, 2025 19:10
@poyenc poyenc force-pushed the ck_tile/fmha-fwd-splitkv-minor-opt branch from 6d29208 to 25e1015 Compare January 2, 2025 03:40
@poyenc poyenc changed the title [CK_TILE] fmha fwd splitkv minor optimizations [CK_TILE] Sync fmha fwd splitkv minor optimizations Jan 2, 2025
@poyenc poyenc requested a review from afagaj as a code owner January 3, 2025 17:55