Docsite search: Add a new docsite search RAG #172

Open
hanna-paasivirta opened this issue Feb 17, 2025 · 1 comment · May be fixed by #176
Comments

@hanna-paasivirta
Contributor

hanna-paasivirta commented Feb 17, 2025

Add a new docsite search RAG to replace the experimental Search service. Use the Search service as a starting point, but rewrite it to match the structure of our new embeddings services.

  • Leverage the Embeddings service and its SearchResult and VectorStore classes, and add a module to connect to the docs embeddings
  • Add a separate service to embed the docs
  • Replace the Search service (it doesn't seem to be in use) with a new docsite search service that searches by connecting to the Embeddings service
  • Use the same vector store and embeddings APIs as the new services (Pinecone, OpenAI)
  • Analyse and optimise the chunking of the docsite and adaptor APIs
  • Include metadata with each returned document so it can be presented to the user
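
As a rough illustration of the chunking and metadata points above, here is a minimal sketch that splits a markdown docs page on headings and attaches source metadata to each chunk. Everything in it is an assumption for discussion, not existing code: `DocChunk`, `chunk_markdown`, the `max_chars` default, and the metadata keys are hypothetical names, and the right chunk size would need to come out of the analysis.

```python
from dataclasses import dataclass, field


@dataclass
class DocChunk:
    """Hypothetical container for one chunk of a docs page."""
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_markdown(doc_text: str, source_url: str, max_chars: int = 800) -> list[DocChunk]:
    """Split a markdown page on headings, then cap each section at max_chars.

    Known limitation of this sketch: a line starting with '#' inside a fenced
    code block is also treated as a heading.
    """
    chunks: list[DocChunk] = []
    title = ""
    buffer: list[str] = []

    def flush() -> None:
        # Emit the buffered section under the heading seen before it.
        body = "\n".join(buffer).strip()
        buffer.clear()
        for start in range(0, len(body), max_chars):
            chunks.append(
                DocChunk(
                    text=body[start:start + max_chars],
                    metadata={"source": source_url, "section": title},
                )
            )

    for line in doc_text.splitlines():
        if line.startswith("#"):
            flush()
            title = line.lstrip("#").strip()
        else:
            buffer.append(line)
    flush()
    return chunks
```

Keeping `source` and `section` on every chunk means the search service can show the user where a result came from without a second lookup.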
hanna-paasivirta self-assigned this Feb 17, 2025
hanna-paasivirta changed the title from "Add a new docsite search RAG" to "Docsite search: Add a new docsite search RAG" Feb 17, 2025
@josephjclark
Collaborator

josephjclark commented Feb 17, 2025

Can confirm that the existing search is not in use and we can freely rename it.

To consider (and let's call Elias): is the VectorStore abstraction helping at all or should we just use langchain directly?

Update: let's drop vector store and just use langchain

Structure:

embeddings/
    -docs_store.py <-- this will 
docs_search.py   <-- this is a connected service which will search the docsite and return useful document chunks (it might be very lightweight)
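
To make the "very lightweight" idea concrete, here is one possible shape for docs_search.py. It is only a sketch under stated assumptions: `DocsSearchResult`, `search_docs`, `top_k`, and `min_score` are hypothetical names, and the store is typed as a small protocol that mirrors the `similarity_search_with_score` method on langchain vector stores (which returns `(document, score)` pairs, where each document has `page_content` and `metadata`). It also assumes a higher score means a closer match, which depends on the store and metric in use.

```python
from dataclasses import dataclass
from typing import Any, Protocol


class SupportsSimilaritySearch(Protocol):
    """Assumed store interface, modelled on langchain's vector store method."""

    def similarity_search_with_score(self, query: str, k: int = 4) -> list[tuple[Any, float]]:
        ...


@dataclass
class DocsSearchResult:
    """Hypothetical result shape: chunk text plus metadata to show the user."""
    text: str
    score: float
    metadata: dict


def search_docs(
    store: SupportsSimilaritySearch,
    query: str,
    top_k: int = 5,
    min_score: float = 0.0,
) -> list[DocsSearchResult]:
    """Query the docs store and return the best-scoring chunks first."""
    hits = store.similarity_search_with_score(query, k=top_k)
    results = [
        DocsSearchResult(text=doc.page_content, score=score, metadata=dict(doc.metadata))
        for doc, score in hits
        if score >= min_score  # assumes higher score = more similar
    ]
    return sorted(results, key=lambda r: r.score, reverse=True)
```

Passing the store in as an argument keeps the search module free of any Pinecone or OpenAI setup, which would live in embeddings/docs_store.py under this layout.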

hanna-paasivirta linked a pull request Feb 21, 2025 that will close this issue
7 tasks