generated from kyma-project/template-repository
Create Script to Pull, Clean, Embed, and Store Kyma Documentation in Hana Vector Database #199
Closed
6 tasks done
This was linked to pull requests on Oct 30, 2024.
This was linked to pull requests on Nov 4, 2024.
Description
The goal of this task is to develop a script that automates pulling the Kyma BTP and Kyma Open-Source documentation (in `.md` format), filtering for relevant documents, embedding them with a suitable model, and storing the resulting embeddings in the Hana Vector Database. The embedding model should be selected carefully; a suggested starting point is to explore OpenAI models, given their success in previous PoC experiments. An appropriate chunking strategy for breaking the documentation into manageable parts must also be implemented. A plan for triggering this script will be discussed with the team as a follow-up task.

This task can be parallelized: two people can work on it and split the subtasks however they decide (strongly recommended).
Subtasks
- Pull Kyma Documentation: [...] `.md` format from their respective sources.
- Filter Relevant Documentation Files:
- Choose an Embedding Model:
- Implement Chunking Strategy:
- Store Embeddings in Hana Vector Database:
- Propose Triggering Mechanism:
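One possible chunking strategy, sketched here only as an illustration and not as the approach the team chose: split each Markdown file at headings, then fall back to fixed-size windows for sections that are still too long. The function name `chunk_markdown` and the `max_chars` default are assumptions.

```python
import re

HEADING = re.compile(r"^#{1,6}\s")


def chunk_markdown(text: str, max_chars: int = 1000) -> list[str]:
    """Split Markdown at headings, then cap each section at max_chars."""
    sections: list[list[str]] = [[]]
    in_fence = False
    for line in text.splitlines():
        # Track fenced code blocks so "# comment" lines inside them do not split.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        if not in_fence and HEADING.match(line) and sections[-1]:
            sections.append([])
        sections[-1].append(line)

    chunks: list[str] = []
    for lines in sections:
        section = "\n".join(lines).strip()
        # Fall back to fixed-size windows for sections that are too long.
        for start in range(0, len(section), max_chars):
            piece = section[start:start + max_chars].strip()
            if piece:
                chunks.append(piece)
    return chunks
```

Heading-based splitting keeps each chunk topically coherent, which tends to help retrieval; the fixed-size fallback is crude and may cut mid-sentence, so an overlap or sentence-aware splitter could replace it once an embedding model and its input limits are chosen.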
Acceptance Criteria
[...] `.md` format. (Assignee: @mfaizanse)