Commit

update readme
dhconnelly committed Dec 2, 2024
1 parent 94cd02e commit 4147ffd
Showing 1 changed file with 2 additions and 1 deletion.
README.md: 3 changes (2 additions, 1 deletion)
@@ -8,7 +8,7 @@ an inference server from scratch in c++
- [x] test framework
- [x] json parser
- [x] openai-compatible chat completion api
- [ ] model and weights loader
- [x] parse safetensors, params, tokenizer configs
- [ ] tokenizer
- [ ] llama3.2 in cuda
- [ ] profiling and optimization
@@ -19,6 +19,7 @@ other improvements:
- [ ] backpressure w/http 529
- [ ] streaming w/server-side events
- [ ] add /statusz with metrics etc.
- [ ] revisit concurrency

## prerequisites

