
added a folder for v1-single-agent with modified codes to run with groq api and nomic embeddings with task specific embedding query. #22

Open
wants to merge 1 commit into main
Conversation

hasanulkarim

No description provided.

@hasanulkarim
Author

Added a folder with modified code to run with the Groq API and Nomic embeddings, using task-specific embedding queries. Here are the README details:

Open-source models with the Groq API and Nomic embeddings (local)

If you do not have the hardware to run Ollama with open-source models, Groq's free tier is a good way to experiment.
The code structure is almost identical to the OpenAI setup in the v1-single-agent parent directory, modified to work with the Groq API (for open-source LLMs with fast inference and more model choices).

Additional Features

  • Options to choose different models for summarization and for the regular LLM
    • LLM_MODEL="groq:deepseek-r1-distill-qwen-32b" (used for all agent decisions and code generation)
    • SUMMARIZATION_MODEL="llama-3.3-70b-versatile" (used for summarizing the chunks)
    • (change these in the .env file if you need to use different models)
    • note: pydantic_ai supports only a subset of models for the LLM model, so not all Groq models are supported
  • Added Nomic embeddings (runs on GPU, in local mode; note that the embedding does not run asynchronously)
    • uses search_document when generating embeddings from the pydantic docs
    • uses search_query when embedding the query from the user
    • task-specific embeddings are found to perform better
  • Added retry with an exponentially backing-off delay when the Groq free-tier rate limit is reached
    • recommend ingesting a few documents at a time to avoid hitting the daily rate limit and getting stuck in an infinite retry loop
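The task-specific embedding queries above follow Nomic's convention of marking each input text with its task type (search_document for indexed chunks, search_query for user questions) before embedding. A minimal sketch of that step, using plain string prefixes; the helper name is hypothetical, and the PR's actual code may instead pass a task_type parameter to the nomic SDK:

```python
# Nomic embedding models distinguish tasks by a prefix on each input text:
# documents being indexed get "search_document: ", user queries get
# "search_query: ". with_task_prefix is an illustrative helper name.

def with_task_prefix(texts, task):
    """Prepend a Nomic task prefix ('search_document' or 'search_query')."""
    if task not in ("search_document", "search_query"):
        raise ValueError(f"unsupported task type: {task}")
    return [f"{task}: {t}" for t in texts]

# Index-time: chunks from the pydantic docs are embedded as documents.
doc_inputs = with_task_prefix(["Agents are defined via pydantic_ai.Agent"],
                              "search_document")

# Query-time: the user's question is embedded as a search query.
query_inputs = with_task_prefix(["how do I define an agent?"], "search_query")

print(doc_inputs[0])  # "search_document: Agents are defined via pydantic_ai.Agent"
```

Using matching task types on both sides keeps document and query vectors in the same retrieval-tuned space, which is what the README means by task-specific embeddings performing better.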
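The retry-with-backoff behaviour can be sketched as a generic wrapper; the function and parameter names below are illustrative, not the PR's actual code, and RuntimeError stands in for the Groq client's rate-limit exception:

```python
import time

def with_backoff(fn, max_retries=5, base_delay=1.0, backoff_factor=2.0,
                 sleep=time.sleep):
    """Call fn(); on a rate-limit error, wait and retry with a growing delay."""
    delay = base_delay
    for attempt in range(max_retries):
        try:
            return fn()
        except RuntimeError:  # stand-in for the Groq rate-limit error
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            sleep(delay)
            delay *= backoff_factor  # back off: 1s, 2s, 4s, ...
```

Capping max_retries is what prevents the infinite retry loop the README warns about when the daily rate limit (rather than a transient per-minute limit) has been exhausted.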

Prerequisites

  • Python 3.11+
  • groq (Python client for the Groq API)
  • nomic (for local embedding generation)
  • Supabase account and database
  • Groq API key
  • Streamlit (for web interface)

@hasanulkarim hasanulkarim changed the title added a folder with modified codes to run with groq api and nomic embeddings with task specific embedding query. added a folder for v1-single-agent with modified codes to run with groq api and nomic embeddings with task specific embedding query. Feb 23, 2025
@coleam00 coleam00 added the enhancement New feature or request label Feb 24, 2025
@coleam00
Owner

Thank you for this PR @hasanulkarim! Could you please share your motivation for implementing Groq instead of using OpenRouter when you want open-source LLMs but don't want to run them locally? Is it just the speed?

I love Groq but also want to avoid adding too many different providers to this project since it makes the documentation and .env.example file longer!

@coleam00 coleam00 added the question Further information is requested label Feb 24, 2025
@hasanulkarim
Author

Hi Cole, yes, my main motivation was the speed of Groq (for users who don't have access to a high-end GPU for large LLMs). It also has more generous rate limits on the free tier than OpenRouter, AFAIK. I thought it's a good option for anyone who wants to try the app for free with the latest open-source models, just to get familiar with agent building.

I agree with your point that it can get out of hand if we start adding other providers like Gemini, but hopefully we can keep one completely free local setup (Ollama with Nomic) and one decent free tier with good speed (Groq) to provide alternatives to OpenAI. Thank you very much for looking into the PR, and I'd appreciate any feedback! Good luck with the project, and thanks for your amazing YouTube content.

@niandran

niandran commented Mar 4, 2025

thanks

@coleam00
Owner

coleam00 commented Mar 9, 2025

Makes sense, thanks @hasanulkarim! I plan on adding more providers in bulk in a future version of Archon so I will use this PR then!

@coleam00 coleam00 added future Important but for later and removed question Further information is requested labels Mar 9, 2025