Skip to content

04. Chat Completions

Brian Dashore edited this page Sep 8, 2024 · 2 revisions

Chat Completions

TabbyAPI builds on top of the HuggingFace "chat templates" standard for OAI style chat completions (/v1/chat/completions).

If you'd like more detail, look at the autogenerated documentation.

Custom Templates

By default, TabbyAPI will try to pull the chat template from a model's chat_template key within a model's tokenizer_config.json, but you can also make a custom jinja file. To learn how to create a HuggingFace compatible jinja2 template, Please read Huggingface's documentation.

If you create a custom template for a model, consider PRing it to the templates repository

In addition, there's also support to specify stopping strings within the chat template. This can be achieved by adding {%- set stop_strings = ["string1"] -%} at the top of the jinja file. In this case, string1 will be appended to your completion as a stopping string.

Warning

Make sure to add {%- -%} for any top-level metadata. If this is not provided, the top of the rendered prompt will have extra whitespace. This does not apply for comments {# #}

To use a custom template, place it in the templates folder, and make sure to set the prompt_template field in config.yml (see model config) to the template's filename.

Template Variables

A chat completions request to TabbyAPI also supports custom template variables in the form of a key/value object in the JSON body. Here's an example:

"template_vars": {
    "test_var": "hello!"
}

Now let's pass the custom var in the following template:

I'm going to say {{ test_var }}

Running render on this template will now result in: I'm going to say hello!

Clone this wiki locally