Add `Outline`s #1364
8cb052d
to
26b14b9
Compare
We should allow functions that are not decorated, such as the one below, because we do not want to force the use of templates on users:

    def build_prompt(a: int) -> str:
        return f"What is {a} squared?"
Also you can deprecate the code in |
outlines/outline.py (outdated)

    prompt = self.template(*args)

    # Generate the response using the model
    response = self.model.generate(prompt)
This should use either `outlines.generate.json`, `outlines.generate.regex`, etc. The interface is necessarily going to be brittle for now. You can take a look at the v1 branch to see how this is going to be more robust with the ongoing refactor.
Hmm, okay, but which one of those should I use by default? And how should I let users switch to the other one? By adding a fourth optional argument, e.g. `generation_kind`, WDYT?
I just realized that maybe I misunderstood the issue, and this PR should infer the correct regex to match the user's required output type. E.g., if a user asks for an `int`, I should pass `outlines.generate.regex` something like `^[+-]?[0-9]+$`. IIUC, I'm unsure how to scale this to arbitrary return types. Should we define a list of supported basic return types? Using `ast.literal_eval` was the most flexible solution I came up with, but I'm not sure how well it handles custom data structures. In such cases (when the return type isn't in the list of straightforward Python builtins), I think we should check that users provide types that implement deserialization from JSON (and then use `outlines.generate.json`), WDYT?
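As a rough illustration of the idea in the comment above, the sketch below checks a string against the integer pattern and parses it only if it matches. The `INT_PATTERN` and `parse_int` names are hypothetical, and in the PR the regex would be handed to `outlines.generate.regex` to constrain generation rather than applied after the fact.

```python
import re

# Pattern quoted in the comment above: optional sign followed by digits.
# fullmatch is used below, so the ^ and $ anchors are not needed here.
INT_PATTERN = re.compile(r"[+-]?[0-9]+")


def parse_int(text: str) -> int:
    """Parse text as an int, rejecting anything the pattern does not match."""
    if INT_PATTERN.fullmatch(text) is None:
        raise ValueError(f"not an integer literal: {text!r}")
    # int() accepts a leading sign and leading zeros, both allowed by the pattern.
    return int(text)
```

Each supported return type would need its own pattern and parser, which is part of why the approach is hard to scale to arbitrary types.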
Let's only accept Pydantic models / JSON Schema strings for now and use `outlines.generate.json`. We'll be able to easily generalise after we push the refactor in the v1 branch.
I've updated the PR (and the usage example in the docstring). Does it still look like what we want?
26b14b9
to
e328144
Compare
That's already how this PR works, I just wanted to confirm that this was the intended design and not an omission!
Should I handle that in this PR or a separate one? By deprecate, do you mean adding a user-facing warning when it's used, rather than removing it entirely?
We can do it in this PR. Let's add a user-facing warning, and announce that it will be removed in favour of the |
9a2a79a
to
1faa76f
Compare
ddd5409
to
214a2f3
Compare
214a2f3
to
b2512e9
Compare
And the `output_type` should now be a Pydantic model or a JSON Schema str
b2512e9
to
9a9acba
Compare
@@ -10,6 +10,9 @@
    from outlines.generate.api import SequenceGenerator
    from outlines.prompts import Prompt

    # Raising a warning here caused all the tests to fail…
    print("The 'function' module is deprecated and will be removed in a future release.")
Maybe a better place for the warning would be the `__post_init__` of `Function`?
Moving my previous comment here for visibility:
It would be nice to specify which release (1.0.0) so people can pin their version to < this release?
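A minimal sketch of the two suggestions above, assuming `Function` is a dataclass: the warning moves from import time into `__post_init__` and names the removal release. The `fn` field and the exact message wording are illustrative, not taken from the PR.

```python
import warnings
from dataclasses import dataclass
from typing import Callable


@dataclass
class Function:
    """Hypothetical stand-in for the deprecated class; the field is illustrative."""

    fn: Callable[..., str]

    def __post_init__(self):
        # Warn when an instance is created, not when the module is imported.
        warnings.warn(
            "The 'function' module is deprecated and will be removed in "
            "release 1.0.0.",
            DeprecationWarning,
            # Skip __post_init__ and the generated __init__ so the warning
            # points at the caller's line.
            stacklevel=3,
        )
```

Warning on instantiation means a plain import of the module stays silent, which may sidestep test suites that treat import-time warnings as errors.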
    """

    def __init__(self, model, template, output_type):
        if not (isinstance(output_type, str) or issubclass(output_type, BaseModel)):
`isinstance(output_type, str)` seems like an oversimplified check for confirming that a `str` is a JSON Schema string. Even though v1 will handle this much more gracefully, perhaps it is worth verifying that the `str` is an actual JSON Schema, even for a temporary solution?
It's surprisingly difficult to find a tool that validates a Json Schema. Would be very happy if you found one!
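Short of full validation, a shallow check is possible with the standard library alone. The sketch below only confirms that the string parses as JSON and that the top level is an object or a boolean, the two forms a JSON Schema document can take. The helper name `looks_like_json_schema` is hypothetical, and a check like this does not verify keywords such as `type` or `properties`.

```python
import json


def looks_like_json_schema(text: str) -> bool:
    """Shallow check: the string parses as JSON and has a schema-shaped top level."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return False
    # A JSON Schema document is a JSON object, or the booleans true/false.
    return isinstance(parsed, (dict, bool))
```

For stricter checking, the third-party `jsonschema` package exposes a `check_schema` method on its validator classes. Whether that dependency is acceptable here is an open question that the thread does not settle.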
    )
    result = outline_instance(3)

    assert result["result"] == 6
I understand this is still a draft, but it is worth mentioning: tests for the various error flows are missing.
        self.generator = generate.json(model, output_type)

    def __call__(self, *args):
        prompt = self.template(*args)
Maybe worth supporting `**kwargs` as well?
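A sketch of what forwarding keyword arguments could look like. `OutlineSketch` is a hypothetical, simplified stand-in for the PR's class: it takes any callable as `generator` instead of building one with `generate.json`, so the example runs without a model.

```python
from typing import Any, Callable


class OutlineSketch:
    """Hypothetical, simplified stand-in for the class under review."""

    def __init__(self, generator: Callable[[str], Any], template: Callable[..., str]):
        self.generator = generator
        self.template = template

    def __call__(self, *args, **kwargs):
        # Forward both positional and keyword arguments to the template.
        prompt = self.template(*args, **kwargs)
        return self.generator(prompt)


def build_prompt(a: int) -> str:
    return f"What is {a} squared?"


# A fake generator that echoes the prompt, standing in for a real model call.
echo = OutlineSketch(generator=lambda prompt: prompt, template=build_prompt)
```

With this change, `echo(3)` and `echo(a=3)` build the same prompt, so templates with several parameters can be called by name.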
Fix #1346, WDYT about the integration with `outlines.prompt`? (Here, the template argument is just a function that returns a `str`, so it could be either an `@outlines.prompt`-decorated function or a regular one)