forked from mlc-ai/mlc-llm
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Enable running PyTorch models (#207)
* refactor to separate TVM specific bits from paged_cache_model
* fix
* Remove engine config change for now
* make mypy happy with TextGenerator impl by Model
* stub
* wip
* wip
* wip
* PT model memory profiling works
* get rid of vllm prepare_inputs
* wip
* model runs but nan output
* mypy improvement
* runs e2e but the result is garbage
* working
* minor
* do sampling by mlc function
* merge fix
* wip parallel sampling
* fix test
* wip
* fix
* wip
* wip
* wip
* attach cache_blocks to model
* change get_num_cache_blocks signature
* wip
* wip
* wip
* refactor
* update for qwen
* merge fix
* clean
* KV cache refactor to decouple cache blocks and metadata about them
* update for KV refactor
* updated for the latest vllm
* qwen and phi supported
* Make num_shards configurable via engine config
* unify Model and ModelRpcClient classes
* support PT model in server
* properly allocate port
* refactor engine creation
* fix sync point
* do not create executor at each step
* remove dup obtain calls
* fix
* use sample_from_logits
* enable TCP NoDelay option to fix slow socket recv issue
* Replace TCP with Unix domain socket
* clean and add note on RPC overhead
* clean
* RPC process join works
* fix mypy
* merge fix
* wip test fix
* fix
* Properly verify sampling params in api handler
* Create model artifact config before module initialization
* fix engine start
* fix
* black
* properly handle import failure
* add tiktoken dep
* revert logprob change
* restored tokenizer.is_fast assert but commented out
* fix vocab size
* properly account for logits storage in memory profiling
* merge fix
* validate num_shards in engine creation
* replace print with structlog
* add peak memory log for tvm as well
* add tokenizer.is_fast warning on creation
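
The log records two successive changes to the RPC transport: first enabling the TCP NoDelay option, then replacing TCP with a Unix domain socket. The sketch below shows what those two options look like with Python's standard `socket` module. It is not the code from this commit; the function names (`make_tcp_nodelay_socket`, `echo_once`, `unix_socket_round_trip`) and the socket file name `rpc.sock` are hypothetical, and it assumes a platform that supports `AF_UNIX` (for example Linux or macOS).

```python
import os
import socket
import tempfile
import threading


def make_tcp_nodelay_socket() -> socket.socket:
    """Create a TCP socket with Nagle's algorithm disabled (TCP_NODELAY)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Send small writes immediately instead of coalescing them.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return sock


def echo_once(server: socket.socket) -> None:
    """Accept a single connection and echo back one small message."""
    conn, _ = server.accept()
    with conn:
        conn.sendall(conn.recv(1024))


def unix_socket_round_trip(payload: bytes) -> bytes:
    """Send a small payload over a Unix domain socket and return the echoed reply."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical socket path; a Unix domain socket is addressed by a file path.
        path = os.path.join(tmp, "rpc.sock")
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)
            server.listen(1)
            worker = threading.Thread(target=echo_once, args=(server,))
            worker.start()
            with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
                client.connect(path)
                client.sendall(payload)
                reply = client.recv(1024)
            worker.join()
    return reply
```

A Unix domain socket stays on the local machine and needs no port, which may be why the port-allocation and NoDelay entries are followed by this switch; the log itself does not state the reasoning.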
Showing 13 changed files with 850 additions and 64 deletions.