-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ADR: Vector store for RAG #168
Conversation
What about moving adrs/ at the root folder, with README and the template files, and have a rag/ subfolder? |
That would probably need org-wide approval and we don't have that. I'll bring it up with the wider team next year to advocate for adoption of the format. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the decomposition into Context, Decision, Status, and Consequences. It seems much smoother and more natural than the approach used in #164 which feels awkward and cumbersome to me. I have some minor change requests here, but mostly I like the direction of both the ADR template and the proposed Milvus ADR.
I'm glad the approach resonates. Within reason, people should take priority over processes and not the other way around. This is a core tenet of the Agile Manifesto which, for better or for worse, has been a solid philosophical framework for ways of working for software teams since 2004:
|
docs/rag/adrs/adr-vectordb.md
Outdated
|
||
## Context | ||
|
||
One of the first choices to make in implementing RAG is to choose an initial vector store to develop against. Though the usage of frameworks like LangChain or Haystack make it easy to swap vector databases, we need a working end to end implementation for RAG that is tested against and available to install with InstructLab. There are many options (see [here](https://docs.haystack.deepset.ai/docs/choosing-a-document-store)). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do I disable the below linting bot comment? It seems to me that this is a useless nitpick for markdown which will be rendered and in which whitespace does not matter. @nathan-weinberg maybe you can help here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also be fine with shutting off the linting, which I also find kind of tedious. However, I don't feel strongly. The markdownlint plugin in VSCode has made it a lot easier for me to comply with the linting in my ADRs since I can see the issues before submitting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can run make md-lint
locally to catch things locally so you don't need to wait for the GitHub CI to run
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added a couple more comments. Also, I do think there is a lot of controversy around the Milvus dependencies specifically, but I am OK with waiting to see if someone else wants to challenge this decision on those grounds.
I see that this PR will need an update to the spellcheck dictionary and a couple of spelling corrections to pass the spell check. Also, I see that it is not passing DCO. I tried using the DCO guidance to bring my last dev-doc into compliance with DCO and it didn't work, but I do think it can be done with enough github expertise (just more than I have). So I don't think it is ready to merge. However, the only open item from me is the title line of the template.
docs/rag/adrs/adr-vectordb.md
Outdated
|
||
## Context | ||
|
||
One of the first choices to make in implementing RAG is to choose an initial vector store to develop against. Though the usage of frameworks like LangChain or Haystack make it easy to swap vector databases, we need a working end to end implementation for RAG that is tested against and available to install with InstructLab. There are many options (see [here](https://docs.haystack.deepset.ai/docs/choosing-a-document-store)). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would also be fine with shutting off the linting, which I also find kind of tedious. However, I don't feel strongly. The markdownlint plugin in VSCode has made it a lot easier for me to comply with the linting in my ADRs since I can see the issues before submitting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issues I raised earlier are all resolved now, so I am happy with this in its current form.
Signed-off-by: Anastas Stoyanovsky <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this all looks really good to me. I really appreciate the detail you put in here about ADRs in general.
@cdoern @nathan-weinberg thanks both! I don't have a merge button, so, if you would be so kind, that would get the template and guidance merged, and a subsequent ADR will record any changes regarding vector stores based on situational / informational changes. |
Signed-off-by: Anastas Stoyanovsky <[email protected]>
#2869