
feat: (Draft) File summary for large files #2800

Open
wants to merge 19 commits into main

Conversation

@caseymcc (Contributor) commented Jan 7, 2025

I have been working with some large JSON files recently that don't fit in the context window, so I added the ability to split a file into chunks and summarize it all into a single file, backed by a caching system. Curious whether this is something I should finish fleshing out and, if so, what direction you all might suggest.

Currently, if the file is larger than 200K it is sent to the SummaryCache, which (see the sketch after this list):

  • chunks the file while trying to keep the splits between code/format blocks
  • runs every chunk through the LLM (which might get costly) and summarizes each chunk while trying to hold on to any hierarchical info
  • combines the summaries using the LLM (recursively, if they don't fit within the token limit)
  • stores the file time, content size, and summary in the cache, so the summary is loaded if it exists and regenerated if it does not or the file has changed
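
To make the flow concrete, here is a minimal sketch of how such a cache might hang together. This is not the code from this PR: the `summarize` callable, `CHUNK_SIZE`, the blank-line splitting heuristic, and the group-of-4 recursive merge are all assumptions for illustration.

```python
import hashlib
import json
from pathlib import Path

CHUNK_SIZE = 50_000       # assumed chunk size in characters, not from the PR
SIZE_THRESHOLD = 200_000  # files above this go through the cache


class SummaryCache:
    """Caches per-file summaries keyed by path, mtime, and size."""

    def __init__(self, cache_dir: Path, summarize):
        self.cache_dir = cache_dir
        self.summarize = summarize  # callable str -> str wrapping the LLM
        cache_dir.mkdir(parents=True, exist_ok=True)

    def _cache_path(self, path: Path) -> Path:
        key = hashlib.sha256(str(path).encode()).hexdigest()
        return self.cache_dir / f"{key}.json"

    def get_summary(self, path: Path) -> str:
        stat = path.stat()
        cache_file = self._cache_path(path)
        # Reuse the cached summary only if mtime and size still match.
        if cache_file.exists():
            entry = json.loads(cache_file.read_text())
            if entry["mtime"] == stat.st_mtime and entry["size"] == stat.st_size:
                return entry["summary"]
        summary = self._summarize_file(path.read_text())
        cache_file.write_text(json.dumps(
            {"mtime": stat.st_mtime, "size": stat.st_size, "summary": summary}))
        return summary

    def _summarize_file(self, text: str) -> str:
        # Naive split on blank lines as a stand-in for the PR's smarter
        # splitting between code/format blocks.
        chunks, current = [], []
        for block in text.split("\n\n"):
            current.append(block)
            if sum(len(b) for b in current) >= CHUNK_SIZE:
                chunks.append("\n\n".join(current))
                current = []
        if current:
            chunks.append("\n\n".join(current))
        summaries = [self.summarize(c) for c in chunks]  # one LLM call per chunk
        # Recursively combine until the merged summaries fit in one call.
        while len(summaries) > 1:
            combined = "\n".join(summaries)
            if len(combined) <= CHUNK_SIZE:
                return self.summarize(combined)
            summaries = [self.summarize("\n".join(summaries[i:i + 4]))
                         for i in range(0, len(summaries), 4)]
        return summaries[0]
```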
