Potential Risk of Hitting Milvus Collection Limit in Dify Design #12894

Open
4 of 5 tasks
ducanh997 opened this issue Jan 21, 2025 · 0 comments
Labels
👻 feat:rag Embedding related issue, like qdrant, weaviate, milvus, vector database.

ducanh997 commented Jan 21, 2025

Self Checks

  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

1. Is this request related to a challenge you're experiencing? Tell me about your story.

Background

We are using Milvus as the vector database for Dify, specifically for internal chatbot development. Currently, every time we upload a data file, Dify creates a new collection within the configured Milvus database. However, Milvus has a recommended limit of 10,000 collections.

According to the Milvus 2.3.4 release blog post: the recommended limit for the number of collections/partitions is 10,000, as exceeding this limit may impact failure recovery and resource usage. Source: https://milvus.io/blog/milvus-2-3-4-faster-searches-expanded-data-support-improved-monitoring-and-more.md

Problem

Given the current behavior of Dify’s Milvus integration, where each uploaded dataset leads to the creation of a new collection, there is a significant risk of hitting this limit over time.

Proposed idea

Instead of creating a new collection for every dataset, Dify should use a partition key within a single shared collection to separate data logically. This approach aligns with Milvus’ recommended practices and avoids the risk of hitting the collection limit.
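
To make the idea concrete, here is a minimal sketch of what a shared collection with a partition key might look like using pymilvus. It is not Dify's actual implementation: the field names (`dataset_id`, `page_content`, `vector`), the helper names, and the assumption that dataset IDs are UUIDs are all hypothetical and chosen only for illustration.

```python
import uuid

# Hypothetical name for the partition key field; not taken from Dify's code.
PARTITION_KEY_FIELD = "dataset_id"


def dataset_filter(dataset_id: str) -> str:
    """Build a Milvus boolean filter that restricts a query to one dataset."""
    # Assumption: dataset IDs are UUIDs. Parsing rejects any value that could
    # break out of the quoted string literal in the filter expression.
    normalized = str(uuid.UUID(dataset_id))
    return f'{PARTITION_KEY_FIELD} == "{normalized}"'


def build_shared_schema(dim: int):
    """Schema for a single collection shared by all datasets (sketch)."""
    # Imported lazily so dataset_filter() works without pymilvus installed.
    from pymilvus import CollectionSchema, DataType, FieldSchema

    fields = [
        FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=True),
        # Milvus routes rows to partitions based on this field's value.
        FieldSchema(
            name=PARTITION_KEY_FIELD,
            dtype=DataType.VARCHAR,
            max_length=36,
            is_partition_key=True,
        ),
        FieldSchema(name="page_content", dtype=DataType.VARCHAR, max_length=65535),
        FieldSchema(name="vector", dtype=DataType.FLOAT_VECTOR, dim=dim),
    ]
    return CollectionSchema(fields, description="Shared collection for all datasets")
```

With a schema like this, every dataset writes into the same collection, and a search for one dataset would pass the string from `dataset_filter(...)` as the `expr` argument of `Collection.search`, so the number of collections no longer grows with the number of uploads.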

2. Additional context or comments

No response

3. Can you help us with this feature?

  • I am interested in contributing to this feature.
@dosubot dosubot bot added the 👻 feat:rag Embedding related issue, like qdrant, weaviate, milvus, vector database. label Jan 21, 2025