Proposition integration LlamaIndex #380
Closed
The aim of this PR is not to be merged, but to propose a potential way of integrating outlines with LlamaIndex (issue: #345).
Using LlamaIndex from outlines is not straightforward, as the way LlamaIndex is built is not fully suited to our use case. To avoid replicating too much of LlamaIndex's logic, or interacting with its objects in a way that feels too hacky, I thought the easiest approach would be to create an outlines-specific version of their `LLM` object that gets called wherever LlamaIndex would otherwise call the LLM chosen by the user.

The idea is that by calling the `query` method of the LlamaIndex engine as a regular user would, we let LlamaIndex figure out which context elements to add to the original prompt (including how many calls to make, in the case of a more sophisticated type of response synthesiser). The outlines function initially selected by the user is then called downstream of the LlamaIndex process, through our custom `LLM` object, with the modified prompt. It returns the `CompletionResponse` object LlamaIndex expects, and LlamaIndex then decides, based on the context configuration, whether to make further queries or simply return the response.

There are definitely many possible configuration arguments that are not covered here, and I'm not sure it would already work with more complex cases, but the basic example included seems to work as intended. I'd be curious to know whether you think this looks like a promising direction to take.
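To make the control flow concrete, here is a minimal self-contained sketch of the shape described above. It deliberately does not import llama_index or outlines: `CompletionResponse`, `OutlinesLLM`, `QueryEngine`, and `constrained_generate` are illustrative stand-ins, not the real APIs. The point is only the inversion of control: the engine owns prompt construction, and the injected LLM object delegates completion to an outlines-style function.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CompletionResponse:
    """Stand-in for the response object the engine expects back
    (mimicking the role of LlamaIndex's CompletionResponse)."""
    text: str


class OutlinesLLM:
    """Hypothetical custom LLM object: wraps an outlines-style
    generator function so the engine can call it like any other LLM."""

    def __init__(self, generate_fn: Callable[[str], str]):
        self.generate_fn = generate_fn

    def complete(self, prompt: str) -> CompletionResponse:
        # Called downstream with the already context-augmented prompt.
        return CompletionResponse(text=self.generate_fn(prompt))


class QueryEngine:
    """Toy engine: augments the user's query with retrieved context,
    then delegates the completion to the injected LLM object."""

    def __init__(self, llm: OutlinesLLM, context: str):
        self.llm = llm
        self.context = context

    def query(self, question: str) -> CompletionResponse:
        # In the real integration, LlamaIndex decides what context to
        # prepend and how many LLM calls to make; here we do one call.
        prompt = f"Context: {self.context}\nQuestion: {question}\nAnswer:"
        return self.llm.complete(prompt)


def constrained_generate(prompt: str) -> str:
    # Placeholder for a constrained outlines generator, e.g. one that
    # only emits "yes" or "no".
    return "yes" if "Question:" in prompt else "no"


engine = QueryEngine(OutlinesLLM(constrained_generate), context="some docs")
response = engine.query("Does it work?")
print(response.text)  # -> yes
```

The design choice mirrored here is that the user never calls the outlines function directly on the final prompt; they call `query` as usual, and the custom LLM object is the single seam through which outlines is plugged in.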