Updates to Performance #346

Merged
merged 14 commits into from
Jan 22, 2025

Conversation

dillonalaird
Member

@dillonalaird dillonalaird commented Jan 21, 2025

Several updates:

  • Updated the coding prompt to return only the final answer, without additional information
  • If plan_context contains code, the coder does not write code; otherwise the coder writes code
  • get_tool_for_task can now take a dictionary of images so that different images can be told apart (helps with choosing flux as a tool)
  • Updated the vision_agent_v2 prompt so it does not try to fix code on the user's behalf
  • Parallelized temporal_localization
  • Added inpainting and temporal localization to the categorization request agent
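
For readers unfamiliar with the pattern, here is a minimal sketch of what parallelizing a chunked temporal localization could look like. It is not the code from this PR: `chunk_frames`, `localize_in_parallel`, and the `score_chunk` callback are hypothetical names, and the sketch assumes the per-chunk work is an independent, I/O-bound call (such as a request to a remote model), which is where a thread pool helps.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def chunk_frames(frames: Sequence, chunk_length: int) -> List[Sequence]:
    """Split frames into consecutive chunks of at most chunk_length items."""
    return [frames[i:i + chunk_length] for i in range(0, len(frames), chunk_length)]


def localize_in_parallel(
    frames: Sequence,
    score_chunk: Callable[[Sequence], float],  # hypothetical per-chunk scorer
    chunk_length: int = 2,
    max_workers: int = 4,
) -> List[float]:
    """Score every chunk concurrently and return one score per frame."""
    chunks = chunk_frames(frames, chunk_length)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so scores stay
        # aligned with their chunks even if calls finish out of order.
        chunk_scores = list(pool.map(score_chunk, chunks))
    # Repeat each chunk's score once for every frame in that chunk.
    return [score for chunk, score in zip(chunks, chunk_scores) for _ in chunk]
```

With `frames = list(range(5))` and a scorer that returns the largest frame index in a chunk, the call produces one score per frame, in the original frame order.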

Member

@camiloaz camiloaz left a comment

one question, but LGTM

Comment on lines +461 to +462
- "temporal localization" - localizing the time period an event occurs in a video.
- "inpainting" - filling in masked parts of an image.
Member

👍

@@ -248,6 +250,7 @@ def get_tool_for_task(
- VQA
- Depth and pose estimation
- Video object tracking
- Image inpainting
Member

what about temporal localization?

Member Author

Oh good catch

@@ -651,22 +653,24 @@ def process_florence2_sam2_video_tracking(frames):
"""

FINALIZE_PLAN = """
**Role**: You are an expert AI model that can understand the user request and construct plans to accomplish it.

**Task**: You are given a chain of thoughts, python executions and observations from a planning agent as it tries to construct a plan to solve a user request. Your task is to summarize the plan it found so that another programming agnet to write a program to accomplish the user request.
Member

nit: not related to the pr

Suggested change
**Task**: You are given a chain of thoughts, python executions and observations from a planning agent as it tries to construct a plan to solve a user request. Your task is to summarize the plan it found so that another programming agnet to write a program to accomplish the user request.
**Task**: You are given a chain of thoughts, python executions and observations from a planning agent as it tries to construct a plan to solve a user request. Your task is to summarize the plan it found so that another programming agent to write a program to accomplish the user request.

Member Author

Good catch!

@dillonalaird dillonalaird merged commit 8439354 into main Jan 22, 2025
8 checks passed
@dillonalaird dillonalaird deleted the update-performance branch January 22, 2025 03:17
4 participants