Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add afterCompletion callback option for runTools to enable easily building multi-model / multi-agent flows #1064

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

granmoe
Copy link
Contributor

@granmoe granmoe commented Sep 12, 2024

  • I understand that this repository is auto-generated and my pull request may not be merged

Changes being requested

I'd like to propose adding an afterCompletion hook as an option to runTools. This enables building powerful multi-model and multi-agent functionality easily by providing a simple entrypoint at which you can analyze runner.messages and optionally carry out a separate LLM flow and then inject one or more messages (or modify messages) in order to, for example, inject some targeted web research to help your agent overcome a problem it's encountering.

I've updated tests, added an example, and ensured everything is correctly typed.

My loom linked below shows how I'm using this to great effect in Otto Engineer, an autonomous coding system.

Additional context & links

Note that I tried to get this to work via tool definitions and prompting but got very poor results. I could almost never get the model to perform web research, and when I did, the prompt and tools had to be shaped so aggressively around this goal that it degraded performance of everything else. So I started thinking of ways to decompose this into separate problems and entirely remove web research from the prompt/tools so the model would not have to think about it and could focus purely on the already complex task of implementing and testing code changes.

I came up with an idea to periodically inspect the chat thread during a runTools call, and I initially attempted to do it via tricky Promise resolution combined with runner.abort + starting a new runTools call and modifying messages, but it was painfully complex and near impossible to correctly update the chat messages without accidentally injecting my web research message in the wrong place (prior to tool call outputs). (Controlling promises via the emitted events is hazardous as, of course, they can happen out of sequence, and it's hard to make sure you've captured all in progress tool calls / messages before injecting a message.)

Here's a demo loom where I show how I use this in my app.

Thank you

I absolutely love having static types around your API and all the goodness this library gives me! I get a TON of mileage out of runTools and want to contribute and help this library thrive! Thank you 🙏🏻

@granmoe granmoe requested a review from a team as a code owner September 12, 2024 21:39
@@ -119,19 +120,19 @@ export class Completions extends APIResource {
runTools<
Params extends ChatCompletionToolRunnerParams<any>,
ParsedT = ExtractParsedContentFromParams<Params>,
>(body: Params, options?: Core.RequestOptions): ChatCompletionRunner<ParsedT>;
>(body: Params, options?: RunnerOptions): ChatCompletionRunner<ParsedT>;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This type has been wrong such that it causes a TS error when you try to pass maxCompletions. Will Stainless auto-gen remove my change? If so, what's the right way to fix this? It's been driving me nuts 😅

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah thanks for fixing! No the generator will leave this change in :)

Would you mind opening a separate PR with this change so we can get it merged faster?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call! Will update now...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@granmoe granmoe changed the title Add afterCompletion callback option for runTools to enable easily building multi-model / multi-agent flows feat: Add afterCompletion callback option for runTools to enable easily building multi-model / multi-agent flows Sep 13, 2024
@RobertCraigie
Copy link
Collaborator

Thanks so much for the detailed PR! I'll take this back to the team.

afterCompletion: async () => {
if (!shouldInjectMessage) {
runner._addMessage({
role: 'system',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is just an example, but I think we'd want this to be an assistant message also

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I'm a bit unclear on this: I've been using system for these kinds of messages where I "inject" important context or attempt to give the model a strong nudge to go in a different direction or update its instructions partway through a chat, and that seems to work well for me. (I also display all assistant messages in my FE by default and wouldn't want to display this message, but I could find a way around that if needed.) Could I potentially get better results in such cases by using the assistant role and changing up how I word the message content accordingly? I'm consistently getting the model to respond intelligently to inline web research using my current approach, but I'm curious and honestly I've probably neglected the assistant role a bit when it comes to manually inserted messages!

@kwhinnery-openai
Copy link
Contributor

I think this is an interesting idea, to build in the ability to incrementally add to the context window into the API. It's certainly possible to do this with the API alone today, but a bit more manual. I think we'd need to workshop the API design a bit, but you could see a feature like this being a nice DX optimization. To be transparent, it's unlikely we'll be able to analyze this in depth until later in the year, but happy to keep this conversation going.

You could also build this as a standalone module on npm that uses the SDK as a dependency - might be an easier way to iterate on the DX independently?

@granmoe
Copy link
Contributor Author

granmoe commented Sep 20, 2024

I think this is an interesting idea, to build in the ability to incrementally add to the context window into the API. It's certainly possible to do this with the API alone today, but a bit more manual. I think we'd need to workshop the API design a bit, but you could see a feature like this being a nice DX optimization. To be transparent, it's unlikely we'll be able to analyze this in depth until later in the year, but happy to keep this conversation going.

You could also build this as a standalone module on npm that uses the SDK as a dependency - might be an easier way to iterate on the DX independently?

Yes, good call to iterate on this as a separate npm package in the interim! I can just pass in my instance of OpenAI as an arg and build a custom runTools from lower level pieces of the SDK. That will also make it so I can get off my fork of the official repo, which is nice 😅

Also yes, totally possible by using the API directly, but I've become addicted to openai-node due to how much mileage I get out of runTools as well as the built in type safety and the fact that I can count on it to be up to date with your API. I tried creating my own library a while back (RIP, gpt-toolkit) but happily abandoned it when I saw that you incorporated type safety and easy zod integration into openai-node 🙂

Anyhow, runTools is just super ergonomic for my use case, and it's absolutely crushing it when combined with afterCompletion in terms of results. I've shared a couple (long 😅) loom videos in this PR that show consistent state of the art results for an autonomous coding system, all because of the multi-agent patterns I'm using, which are written using this library, and afterCompletion is a crucial piece 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants