Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Outlines #1364

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Add Outlines #1364

wants to merge 2 commits into from

Conversation

yvan-sraka
Copy link
Contributor

Fix #1346, WDYT about the integration with outlines.prompt? (Here, the template argument is just a function that returns a str, so it could be either an @outlines.prompt-decorated function or a regular one)

@yvan-sraka yvan-sraka requested review from torymur and rlouf January 8, 2025 08:23
@yvan-sraka yvan-sraka self-assigned this Jan 8, 2025
@yvan-sraka yvan-sraka linked an issue Jan 8, 2025 that may be closed by this pull request
tests/test_outline.py Outdated Show resolved Hide resolved
@rlouf
Copy link
Member

rlouf commented Jan 8, 2025

Fix #1346, WDYT about the integration with outlines.prompt?

We should allow functions that are not decorated, such as

def build_prompt(a: int) -> str:
    return f"What is {a} squared?"

Because we do not want to force the use of templates on users.

@rlouf
Copy link
Member

rlouf commented Jan 8, 2025

Also you can deprecate the code in function.py as this is going to replace it.

outlines/outline.py Outdated Show resolved Hide resolved
outlines/outline.py Outdated Show resolved Hide resolved
prompt = self.template(*args)

# Generate the response using the model
response = self.model.generate(prompt)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should use either outlines.generate.json, outlines.generate.regex, etc. The interface is necessarily going to be brittle for now. You can take a look at the v1 branch to see how this is going to be more robust with the undergoing refactor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, okay, but which one of those should I use by default? And how should I let users switch to the other one, by adding a fourth optional argument, e.g. generation_kind, WDYT?

Copy link
Contributor Author

@yvan-sraka yvan-sraka Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized that maybe I misunderstood the issue, and this PR should infer the correct regex to match the user's required output type. E.g., if a user asks for an int, I should ask outlines.generate.regex something like ^[+-]?[0-9]+$. IIUC, I'm unsure how to scale this to arbitrary return types. Should we define a list of supported basic return types? Using ast.literal_eval was the most flexible solution I came up with, but I'm not sure how well it handles custom data structures. In such cases (when the return type isn't in the list of straightforward Python builtins), I think we should check that users provide types that implement deserialization from JSON (and then use outlines.generate.json), WDYT?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's only accept Pydantic models / JSON Schema strings for now and use outlines.generate.json. We'll be able to easily generalise after we push the refactor in the v1 branch.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the PR (and the usage example in the docstring). Does it still look like what we want?

@yvan-sraka
Copy link
Contributor Author

Fix #1346, WDYT about the integration with outlines.prompt?

We should allow functions that are not decorated, such as

def build_prompt(a: int) -> str:
    return f"What is {a} squared?"

Because we do not want to force the use of templates on users.

That's already how this PR works, I just wanted to confirm that this was the intended design and not an omission!

Also you can deprecate the code in function.py as this is going to replace it.

Should I handle that in this PR or a separate one? By deprecate, do you mean adding a user-facing warning when it's used, rather than removing it entirely?

@yvan-sraka yvan-sraka requested a review from rlouf January 8, 2025 13:53
@rlouf
Copy link
Member

rlouf commented Jan 8, 2025

Should I handle that in this PR or a separate one? By deprecate, do you mean adding a user-facing warning when it's used, rather than removing it entirely?

We can do it in this PR. Let's add a user-facing warning, and announce that it will be removed in favour of the Outline interface in v1.0

@yvan-sraka yvan-sraka force-pushed the 1346-add-outlines branch 3 times, most recently from 9a2a79a to 1faa76f Compare January 10, 2025 11:02
@yvan-sraka yvan-sraka force-pushed the 1346-add-outlines branch 2 times, most recently from ddd5409 to 214a2f3 Compare January 10, 2025 11:07
@yvan-sraka yvan-sraka marked this pull request as ready for review January 10, 2025 11:07
And the `output_type` should now be a Pydantic model or a JSON Schema str
@@ -10,6 +10,9 @@
from outlines.generate.api import SequenceGenerator
from outlines.prompts import Prompt

# Raising a warning here caused all the tests to fail…
print("The 'function' module is deprecated and will be removed in a future release.")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe better place would be putting warning in __post_init__ of Function?

Copy link
Member

@rlouf rlouf Jan 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moving my previous comment here for visibility:

It would be nice to specify which release (1.0.0) so people can pin their version to < this release?

"""

def __init__(self, model, template, output_type):
if not (isinstance(output_type, str) or issubclass(output_type, BaseModel)):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an oversimplified check to confirm str to be a json schema string: isinstance(output_type, str). Even though v1 will handle this much more gracefully, perhaps, even for a temp solution it worth to verify str being an actual json schema?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's surprisingly difficult to find a tool that validates a Json Schema. Would be very happy if you found one!

)
result = outline_instance(3)

assert result["result"] == 6
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understanding that this is a draft yet, but still to mention it: tests for misc errors flows are missing

outlines/function.py Outdated Show resolved Hide resolved
"""

def __init__(self, model, template, output_type):
if not (isinstance(output_type, str) or issubclass(output_type, BaseModel)):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's surprisingly difficult to find a tool that validates a Json Schema. Would be very happy if you found one!

self.generator = generate.json(model, output_type)

def __call__(self, *args):
prompt = self.template(*args)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe worth to support **kwargs as well?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Outlines
3 participants