Home
Alexander Zuev edited this page Oct 19, 2024 · 14 revisions
Roadmap is subject to change.
Version 0.2.0 aims to give users an easy way to chat with web content (primarily documentation) through a simple web UI based on Chainlit. It will be offered in two options:
- an open-source solution for DIY enthusiasts, free to use and set up on your own
- a showcase web app with a set of indexed documents focused on AI agents, AI, and LLM frameworks, covering the full lifecycle
- Self-hosted: clone the repo and set it up yourself
- Web app: hosted solution with 30+ AI / LLM / AI-agent focused libraries to work with
- Index web content via the @docs command
-
Automate ingestion pipeline
- automate the pipeline from URL to embedding and loading
- simplify db reset
- index and do not reload documents that exist
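One way the "index and do not reload documents that exist" item could work is to key each page by a hash of its URL and content and skip keys already present. This is a minimal sketch under that assumption: `content_key`, `ingest`, and the plain-dict `store` are hypothetical names, and the dict stands in for the real embedding and loading step.

```python
import hashlib


def content_key(url: str, text: str) -> str:
    """Derive a stable key from URL and content, so an unchanged page maps to the same key."""
    return hashlib.sha256(f"{url}\n{text}".encode("utf-8")).hexdigest()


def ingest(pages: dict[str, str], store: dict[str, str]) -> list[str]:
    """Add only pages whose key is not in the store; return the URLs that were added."""
    added = []
    for url, text in pages.items():
        key = content_key(url, text)
        if key in store:
            continue  # already indexed: skip re-embedding and re-loading
        store[key] = text  # placeholder for the real "embed and load" step
        added.append(url)
    return added
```

Because the key includes the content, a page whose text changed is treated as new, while an identical re-crawl is a no-op.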
-
Command support
- @docs -> parse and load the parsed docs into db
- @get -> get, parse and store a single URL
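The command handling above might be dispatched from chat input roughly as follows. This is only an illustrative sketch: `Command`, `KNOWN_COMMANDS`, and `parse_command` are hypothetical names, and the roadmap does not specify the actual argument syntax.

```python
from typing import NamedTuple, Optional


class Command(NamedTuple):
    name: str      # command name without the leading "@"
    argument: str  # everything after the command name, e.g. a URL


# Assumed command set, taken from the two commands listed in the roadmap.
KNOWN_COMMANDS = {"docs", "get"}


def parse_command(message: str) -> Optional[Command]:
    """Return a Command if the message starts with a known @command, else None."""
    text = message.strip()
    if not text.startswith("@"):
        return None  # ordinary chat message
    name, _, argument = text[1:].partition(" ")
    if name not in KNOWN_COMMANDS:
        return None  # unknown command: treat as plain chat
    return Command(name, argument.strip())
```

Messages that return `None` would simply be passed to the chat model as usual.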
-
Chunking improvements:
- Add chunk summary generated by LLM
- Improve chunking quality (token, header preservation)
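Header preservation could mean that every chunk carries the trail of Markdown headings it sits under. The sketch below illustrates that idea only; `chunk_markdown` is a hypothetical helper, it does not enforce a token limit, and it does not special-case `#` lines inside fenced code blocks.

```python
import re

# Matches ATX-style Markdown headings such as "## Setup".
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def chunk_markdown(text: str) -> list[dict]:
    """Split Markdown at headings; each chunk records its full heading trail."""
    chunks: list[dict] = []
    path: list[tuple[int, str]] = []  # stack of (level, title) for the current section
    body: list[str] = []

    def flush() -> None:
        content = "\n".join(body).strip()
        if content:
            chunks.append({"headers": [title for _, title in path], "text": content})
        body.clear()

    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            flush()  # close the previous section before the trail changes
            level = len(match.group(1))
            # Drop headings at the same or a deeper level than the new one.
            while path and path[-1][0] >= level:
                path.pop()
            path.append((level, match.group(2).strip()))
        else:
            body.append(line)
    flush()
    return chunks
```

Storing the heading trail as metadata lets a retrieved chunk be shown with its context, for example "Guide > Setup".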
- Try out Unstructured.io
-
Git sync
- Allow syncing of a Git repo, à la Claude Enterprise
- @repo -> set up indexing of a GitHub repo
-
Improve stability of RAG calls
- The LLM should decide when RAG is actually needed
- Retry search with a different query when the first attempt fails
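Retrying search with a different query could follow a loop like the one below. It is a sketch under assumptions: `search` and `rewrite` are hypothetical callables (in practice the rewrite step would likely be an LLM call), and "too few results" is used as a stand-in for whatever failure signal the project adopts.

```python
from typing import Callable


def search_with_retries(
    query: str,
    search: Callable[[str], list],
    rewrite: Callable[[str, list], str],
    min_results: int = 1,
    max_attempts: int = 3,
) -> tuple[list, list]:
    """Search, and on too few results retry with a rewritten query.

    Returns (results, queries_tried); results is empty if every attempt failed.
    """
    tried: list = []
    current = query
    for _ in range(max_attempts):
        tried.append(current)
        results = search(current)
        if len(results) >= min_results:
            return results, tried
        # Ask for a new formulation, given the original query and what was already tried.
        current = rewrite(query, tried)
    return [], tried
```

Returning the list of attempted queries makes it easier to log or display how the final result was reached.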
-
Set up a basic eval suite for RAG
- Scope: (a) Anthropic docs, (b) Supabase docs, (c) LlamaIndex docs
- Evaluation metrics defined (e.g., accuracy, relevance, speed)
- Automated test suite implemented
- Baseline performance established for each doc set
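A baseline run for one doc set might be computed along these lines. The metric choices here (hit rate in the top `k` results and mean latency) are assumptions for illustration, not the metrics the project has defined, and `evaluate` and the `retriever` callable are hypothetical.

```python
import time
from statistics import mean
from typing import Callable


def evaluate(retriever: Callable[[str], list], cases: list, k: int = 5) -> dict:
    """Run (question, expected_doc_id) cases and report hit rate and mean latency."""
    if not cases:
        return {"hit_rate": 0.0, "mean_latency_s": 0.0}
    hits = []
    latencies = []
    for question, expected_id in cases:
        start = time.perf_counter()
        retrieved = retriever(question)[:k]  # only the top-k results count
        latencies.append(time.perf_counter() - start)
        hits.append(expected_id in retrieved)
    return {"hit_rate": sum(hits) / len(hits), "mean_latency_s": mean(latencies)}
```

Running the same cases before and after a retrieval change gives a simple regression check per doc set.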
- Output streaming
- Tool result streaming / output
- Improved terminal output (using colorama)
- Chainlit UI implementation
- Research the need to transition to LangChain [WON'T DO -> LCEL, WHY RE-INVENT THE WHEEL?]
- Refactor code to use LlamaIndex for search & retrieval
- Set up basic telemetry for the open-source app with public dashboards
- V0.3.0 - Chat with a GitHub repo via a web interface
- V0.4.0 - Confluence agent to chat with company docs
- V0.5.0 - Open-source plugin that can be installed on any website to instantly give it LLM + RAG over site content, securely and privately
Initial release of Kollektiv (called OmniClaude back then) with the following features:
- crawling of documentation with FireCrawl
- custom markdown chunking
- embedding and storage with ChromaDB
- custom retrieval with multi-query expansion and re-ranking
- chat with Claude 3.5 Sonnet using a RAG search tool
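To illustrate how results from multi-query expansion can be merged into a single ranking, the sketch below uses reciprocal rank fusion. This is a swapped-in, generic technique chosen for illustration; the project's custom retrieval and re-ranking may work differently, and `fuse_rankings` is a hypothetical helper.

```python
from collections import defaultdict


def fuse_rankings(rankings: list, k: int = 60) -> list:
    """Merge several ranked lists of document IDs with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    so documents ranked highly by several query variants rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; ties are broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# One ranked list per expanded query variant (IDs are made up for the example).
merged = fuse_rankings([["a", "b", "c"], ["b", "a"], ["b", "d"]])
print(merged)
```

Here each inner list would come from running one expanded query against the vector store; a dedicated re-ranking model could then be applied to the fused list.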