Commit ddb2cc9

Update MLX integration to use new generate_step function signature (l…

aliasaria authored Feb 9, 2024
1 parent 3f61c6e

Showing 2 changed files with 4 additions and 4 deletions.
6 changes: 3 additions & 3 deletions in docs/mlx_integration.md

@@ -13,11 +13,11 @@ Note that for Apple Silicon Macs with less memory, smaller models (or quantized
 1. Install MLX.

     ```
-    pip install mlx-lm
+    pip install "mlx-lm>=0.0.6"
     ```

-2. When you launch a model worker, replace the normal worker (`fastchat.serve.model_worker`) with the MLX worker (`fastchat.serve.mlx_worker`).
+2. When you launch a model worker, replace the normal worker (`fastchat.serve.model_worker`) with the MLX worker (`fastchat.serve.mlx_worker`). Remember to launch a model worker after you have launched the controller ([instructions](../README.md)).

     ```
-    python3 -m fastchat.serve.mlx_worker --model-path microsoft/phi-2
+    python3 -m fastchat.serve.mlx_worker --model-path TinyLlama/TinyLlama-1.1B-Chat-v1.0
     ```
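The docs change above pins `mlx-lm>=0.0.6` because older releases use the old `generate_step` signature. A worker launcher could guard against a stale install with a small stdlib-only version check; this is a sketch, not FastChat code, and `version_at_least` is a hypothetical helper (it assumes plain `X.Y.Z`-style version strings):

```python
# Sketch: verify the installed mlx-lm is new enough before launching the
# worker, since mlx-lm >= 0.0.6 changed the generate_step return type.
# `version_at_least` is a hypothetical helper, stdlib-only.
from importlib import metadata

def version_at_least(pkg: str, minimum: str) -> bool:
    """Return True if installed `pkg` has version >= `minimum` (X.Y.Z style)."""
    try:
        installed = metadata.version(pkg)
    except metadata.PackageNotFoundError:
        return False  # not installed at all
    # Compare the leading numeric components as integer tuples.
    parse = lambda v: tuple(int(part) for part in v.split(".")[:3])
    return parse(installed) >= parse(minimum)

if not version_at_least("mlx-lm", "0.0.6"):
    print('mlx-lm missing or too old; run: pip install "mlx-lm>=0.0.6"')
```

A check like this fails fast with an actionable message instead of crashing later inside the generation loop.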
2 changes: 1 addition & 1 deletion in fastchat/serve/mlx_worker.py

@@ -124,7 +124,7 @@ async def generate_stream(self, params):
         )

         for i in range(max_new_tokens):
-            token = await run_in_threadpool(next, iterator)
+            (token, _) = await run_in_threadpool(next, iterator)
             if token == self.mlx_tokenizer.eos_token_id:
                 finish_reason = "stop"
                 break
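The one-line worker change above exists because newer mlx-lm versions make `generate_step` yield `(token, prob)` pairs rather than bare tokens, so each item must be unpacked. The loop logic can be shown self-contained with a stub generator standing in for `mlx_lm.generate_step`; `EOS_TOKEN_ID`, `fake_generate_step`, and `collect_tokens` are all hypothetical names for illustration:

```python
# Sketch of the updated generation loop: the iterator now yields
# (token, prob) pairs, so the worker unpacks and discards the prob.
EOS_TOKEN_ID = 2  # hypothetical end-of-sequence token id

def fake_generate_step():
    """Stand-in for mlx_lm.generate_step: yields (token, prob) pairs."""
    for token, prob in [(5, 0.9), (7, 0.8), (EOS_TOKEN_ID, 0.99)]:
        yield token, prob

def collect_tokens(max_new_tokens=16):
    """Mirror the worker loop: collect tokens until EOS or the length cap."""
    iterator = fake_generate_step()
    tokens = []
    finish_reason = "length"
    for _ in range(max_new_tokens):
        (token, _) = next(iterator)  # new signature: unpack the pair
        if token == EOS_TOKEN_ID:
            finish_reason = "stop"
            break
        tokens.append(token)
    return tokens, finish_reason

print(collect_tokens())  # → ([5, 7], 'stop')
```

Under the old signature, `next(iterator)` returned the token itself, so comparing the yielded value against `eos_token_id` worked directly; after the API change that comparison would silently never match a tuple, which is why the unpacking fix was needed.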
