Changes:

- Add `allowed_token_ids` to `SamplingParams` (configuration)
- Apply `allowed_token_ids` in `Sampler.forward` by squashing the logits of disallowed tokens

It allows the user to generate only specific tokens.
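To illustrate what "squashing the logits of disallowed tokens" means, here is a minimal pure-Python sketch. It is not the code from this PR: the real change operates on tensors inside `Sampler.forward`, and the helper names `squash_disallowed_logits` and `softmax` are hypothetical, chosen only for this example.

```python
import math


def squash_disallowed_logits(logits, allowed_token_ids):
    """Return a copy of `logits` with every disallowed token set to -inf.

    Hypothetical helper for illustration; the token id is the list index.
    """
    if not allowed_token_ids:
        # No restriction requested: leave the distribution untouched.
        return list(logits)
    allowed = set(allowed_token_ids)
    return [
        logit if token_id in allowed else -math.inf
        for token_id, logit in enumerate(logits)
    ]


def softmax(logits):
    """Numerically stable softmax; exp(-inf) evaluates to 0.0."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy vocabulary of four tokens; only ids 1 and 2 may be generated.
logits = [2.0, 0.5, 1.0, -1.0]
masked = squash_disallowed_logits(logits, allowed_token_ids=[1, 2])
probs = softmax(masked)
```

After masking, the disallowed tokens get zero probability, and the remaining probability mass is renormalized over the allowed tokens, so sampling can only pick from the allowed set.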
Note that it is the caller's responsibility to add the EOS token and any additional stop tokens to the list of allowed tokens. This is required only for open-ended generation (not limited to one or a few tokens). Doing this by hand may be error-prone, so we may want to add them to the allowed tokens list automatically.
The idea is to make a separate call for each segment of the generation that has a different `allowed_token_ids`. Tokens that are known for sure can be efficiently "skipped" by appending them to the end of the prompt for the next call (segment). The end of a segment can be detected by temporarily adding the segment-ending tokens to `stop_token_ids`, or by detecting them on the fly in a streaming generation. This gives the caller maximum control over the schema.

TODO:
Constrained generation libraries we may want to provide adapters for:
The adapters could be separate libraries or just examples. They should go into separate PRs (or repo).
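As one possible example of the segment-by-segment approach described above, the following sketch drives a schema made of fixed and constrained segments. Everything here is an assumption for illustration: `Fixed`, `Constrained`, `run_schema`, and the `generate` callable signature are hypothetical, and `fake_generate` is a deterministic stand-in for a real model call that would pass `allowed_token_ids` and `stop_token_ids` through the sampling configuration.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Union


@dataclass
class Fixed:
    """Tokens known for sure; appended to the prompt without a model call."""
    token_ids: List[int]


@dataclass
class Constrained:
    """A segment generated by the model from a restricted token set."""
    allowed_token_ids: List[int]  # must include the stop tokens (caller's job)
    stop_token_ids: List[int]
    max_tokens: int


Segment = Union[Fixed, Constrained]
# Hypothetical signature: (context, allowed, stop, max_tokens) -> new token ids
GenerateFn = Callable[[List[int], List[int], List[int], int], List[int]]


def run_schema(generate: GenerateFn, prompt_ids: Sequence[int],
               schema: Sequence[Segment]) -> List[int]:
    """Make one call per constrained segment and return the produced tokens."""
    context = list(prompt_ids)
    for segment in schema:
        if isinstance(segment, Fixed):
            # "Skip" known tokens by extending the prompt for the next call.
            context.extend(segment.token_ids)
        else:
            context.extend(generate(context, segment.allowed_token_ids,
                                    segment.stop_token_ids, segment.max_tokens))
    return context[len(prompt_ids):]


def fake_generate(context, allowed, stop, max_tokens):
    """Stand-in for a model: cycles through allowed tokens until a stop token."""
    out = []
    for i in range(max_tokens):
        token = allowed[(len(context) + i) % len(allowed)]
        out.append(token)
        if token in stop:
            break
    return out


schema = [
    Fixed([10]),
    Constrained(allowed_token_ids=[20, 21, 99], stop_token_ids=[99], max_tokens=5),
    Fixed([11]),
]
result = run_schema(fake_generate, prompt_ids=[1, 2, 3], schema=schema)
```

Keeping the fixed tokens out of the model calls is what makes the "skipping" cheap: they only extend the prompt of the next segment. An adapter for a constrained generation library could translate that library's schema into a list of such segments.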
Issue: #288