[Request] Constrained Generation #26
Comments
Hi @scottwey, thank you for bringing that up. Implementing methods which would affect sampling such as Constrained Generation should be doable. All one would need to do is inject code at the correct location to implement the Constrained Generation: perhaps an elegant
As you can see, I sample the logits and then add the result to the SequenceGroup. I am not familiar with the implementation of Constrained Generation, but after reading their README, I could imagine that you would add the implementation in this region. Please let me know if you would be interested in implementing this.
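To make the idea of hooking in just before sampling concrete, here is a minimal sketch of logit masking, a common way to implement constrained generation. It is not candle-vllm's or lm-format-enforcer's actual API: `apply_constraint` and `sample_greedy` are hypothetical names, the logits are assumed to be a plain `f32` slice indexed by token id, and the set of allowed token ids is assumed to come from whatever format enforcer is in use.

```rust
use std::collections::HashSet;

/// Hypothetical helper: mask every token id that is not in `allowed`
/// by setting its logit to negative infinity, so it can never be chosen.
fn apply_constraint(logits: &mut [f32], allowed: &HashSet<usize>) {
    for (id, logit) in logits.iter_mut().enumerate() {
        if !allowed.contains(&id) {
            *logit = f32::NEG_INFINITY;
        }
    }
}

/// Hypothetical greedy sampler: returns the token id with the largest
/// finite logit, or `None` if the slice is empty or fully masked.
fn sample_greedy(logits: &[f32]) -> Option<usize> {
    logits
        .iter()
        .enumerate()
        .filter(|(_, l)| l.is_finite())
        .max_by(|a, b| a.1.total_cmp(b.1))
        .map(|(id, _)| id)
}
```

In this sketch the mask would be applied to each sequence's logits right before the existing sampling call, and the allowed set would be recomputed after every generated token.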
@EricLBuehler I will try to take a crack at this as soon as I get some time. Thank you for the guidance. :)
@scottwey, I am currently working on mistral.rs. It has a simpler sampling API and overall file structure, so perhaps you could take a look there? Feel free to raise an issue for further guidance.
Please see EricLBuehler/mistral.rs#59 where we are developing model grammar support. If you have any questions, please feel free to reopen!
Given the current structure of candle-vllm, how difficult would it be to add constrained generation, similar to lm-format-enforcer? I'm happy to help here however I can.