refactor(userSchema): Remove Sparse Unique Indexes from Social Login Fields to Fix CosmosDB Duplicate Key Errors #1814

Closed
danny-avila wants to merge 3 commits


danny-avila
Owner

Addresses CosmosDB compatibility issue

strayer commented Feb 19, 2024

I'm having a bit of trouble testing this. I checked out the branch and created a serverless RU-based CosmosDB for MongoDB on Azure (4.2).

When running the api container this happens:

❯ docker compose -f my-compose.yaml up --force-recreate api
[+] Running 1/0
 ✔ Container librechat-cosmosdbtest-2-api-1  Recreated                                                                                                                                                      0.1s
Attaching to api-1
api-1  |
api-1  | > [email protected] backend
api-1  | > cross-env NODE_ENV=production node api/server/index.js
api-1  |
api-1  | 2024-02-19 08:48:02 info: [Optional] Redis not initialized. Note: Redis support is experimental.
api-1  | 2024-02-19 08:48:03 info: Connected to MongoDB
api-1  | 2024-02-19 08:48:05 error: There was an uncaught error: a collection 'LibreChat.logs' already exists
api-1 exited with code 1

Running it again shows this:

❯ docker compose -f my-compose.yaml up --force-recreate api
[+] Running 1/0
 ✔ Container librechat-cosmosdbtest-2-api-1  Recreated                                                                                                                                                      0.0s
Attaching to api-1
api-1  |
api-1  | > [email protected] backend
api-1  | > cross-env NODE_ENV=production node api/server/index.js
api-1  |
api-1  | 2024-02-19 08:48:32 info: [Optional] Redis not initialized. Note: Redis support is experimental.
api-1  | 2024-02-19 08:48:33 info: Connected to MongoDB
api-1  | 2024-02-19 08:48:33 error: There was an uncaught error: Error=13, Details='Response status code does not indicate success: Forbidden (403); Substatus: 0; ActivityId: 6b5b046e-77e0-4798-b737-8a9a1ba1089e; Reason: (Message: {"Errors":["The unique index cannot be modified. To change the unique index, remove the collection and re-create a new one."]}
api-1  | ActivityId: 6b5b046e-77e0-4798-b737-8a9a1ba1089e, Request URI: /apps/3ff79e64-935b-4fa1-bba6-1ec7a0023582/services/8606c4a4-8694-49f9-a212-14e5f2fb0433/partitions/dd1db87c-77cf-43d8-8e8a-94682b9339bd/replicas/133453239069532763p, RequestStats:
api-1  | RequestStartTime: 2024-02-19T08:48:33.8531873Z, RequestEndTime: 2024-02-19T08:48:33.8610585Z,  Number of regions attempted:1
api-1  | {"systemHistory":[{"dateUtc":"2024-02-19T08:47:42.7050197Z","cpu":1.978,"memory":473539156.000,"threadInfo":{"isThreadStarving":"False","threadWaitIntervalInMs":0.0719,"availableThreads":32764,"minThreads":48,"maxThreads":32767},"numberOfOpenTcpConnection":3574},{"dateUtc":"2024-02-19T08:47:52.7156998Z","cpu":1.151,"memory":473475100.000,"threadInfo":{"isThreadStarving":"False","threadWaitIntervalInMs":0.1417,"availableThreads":32764,"minThreads":48,"maxThreads":32767},"numberOfOpenTcpConnection":3539},{"dateUtc":"2024-02-19T08:48:02.7261881Z","cpu":0.605,"memory":473457576.000,"threadInfo":{"isThreadStarving":"False","threadWaitIntervalInMs":0.0915,"availableThreads":32764,"minThreads":48,"maxThreads":32767},"numberOfOpenTcpConnection":3541},{"dateUtc":"2024-02-19T08:48:12.7367860Z","cpu":1.174,"memory":473480820.000,"threadInfo":{"isThreadStarving":"False","threadWaitIntervalInMs":0.0776,"availableThreads":32764,"minThreads":48,"maxThreads":32767},"numberOfOpenTcpConnection":3549},{"dateUtc":"2024-02-19T08:48:22.7474597Z","cpu":1.905,"memory":473846128.000,"threadInfo":{"isThreadStarving":"False","threadWaitIntervalInMs":0.0682,"availableThreads":32764,"minThreads":48,"maxThreads":32767},"numberOfOpenTcpConnection":3501},{"dateUtc":"2024-02-19T08:48:32.7580479Z","cpu":1.437,"memory":473858636.000,"threadInfo":{"isThreadStarving":"False","threadWaitIntervalInMs":0.0811,"availableThreads":32764,"minThreads":48,"maxThreads":32767},"numberOfOpenTcpConnection":3511}]}
api-1  | RequestStart: 2024-02-19T08:48:33.8532173Z; ResponseTime: 2024-02-19T08:48:33.8610585Z; StoreResult: StorePhysicalAddress: rntbd://10.0.0.19:11300/apps/3ff79e64-935b-4fa1-bba6-1ec7a0023582/services/8606c4a4-8694-49f9-a212-14e5f2fb0433/partitions/dd1db87c-77cf-43d8-8e8a-94682b9339bd/replicas/133453239069532763p, LSN: 181, GlobalCommittedLsn: 181, PartitionKeyRangeId: , IsValid: True, StatusCode: 403, SubStatusCode: 0, RequestCharge: 1.57, ItemLSN: -1, SessionToken: -1#181, UsingLocalLSN: False, TransportException: null, BELatencyMs: 6.75, ActivityId: 6b5b046e-77e0-4798-b737-8a9a1ba1089e, RetryAfterInMs: , ReplicaHealthStatuses: [(port: 11300 | status: Connected | lkt: 2/19/2024 8:48:33 AM)], TransportRequestTimeline: {"requestTimeline":[{"event": "Created", "startTimeUtc": "2024-02-19T08:48:33.8532183Z", "durationInMs": 0.01},{"event": "ChannelAcquisitionStarted", "startTimeUtc": "2024-02-19T08:48:33.8532283Z", "durationInMs": 0.0025},{"event": "Pipelined", "startTimeUtc": "2024-02-19T08:48:33.8532308Z", "durationInMs": 0.0922},{"event": "Transit Time", "startTimeUtc": "2024-02-19T08:48:33.8533230Z", "durationInMs": 7.4728},{"event": "Received", "startTimeUtc": "2024-02-19T08:48:33.8607958Z", "durationInMs": 0.0873},{"event": "Completed", "startTimeUtc": "2024-02-19T08:48:33.8608831Z", "durationInMs": 0}],"serviceEndpointStats":{"inflightRequests":1,"openConnections":1},"connectionStats":{"waitforConnectionInit":"False","callsPendingReceive":0,"lastSendAttempt":"2024-02-19T08:48:33.8503062Z","lastSend":"2024-02-19T08:48:33.8503438Z","lastReceive":"2024-02-19T08:48:33.8510444Z"},"requestSizeInBytes":2277,"requestBodySizeInBytes":1746,"responseMetadataSizeInBytes":172,"responseBodySizeInBytes":126};
api-1  |  ResourceType: Collection, OperationType: Replace
api-1  | , SDK: Microsoft.Azure.Documents.Common/2.14.0, Microsoft.Azure.Cosmos.Tracing.TraceData.ClientSideRequestStatisticsTraceDatum, Windows/10.0.20348 cosmos-netstandard-sdk/3.18.0);
api-1 exited with code 1

Weirdly enough, the same happens on the main branch. I don't think I'm doing anything wrong here, but I'm also a bit confused about what is happening. When the container starts, the CosmosDB is definitely empty. After the container has started, some schemas exist:

[Screenshot of the schemas that exist after starting the container]

strayer commented Feb 19, 2024

Some progress: I managed to resolve the duplicate collection error by commenting out the keyVMongo object creation here: https://github.com/danny-avila/LibreChat/blob/partial-filter-index/api/cache/keyvMongo.js#L6

It looks like something goes wrong when instantiating that class, and that breaks migrations or something similar. I added a Redis cache to avoid MongoDB here, and both errors from my previous comment disappeared. I was able to create two users successfully.

cosmosuser@SandboxHost-638439284691316803:~$ MongoDB shell version v3.6.8
connecting to: mongodb://foobar.mongo.cosmos.azure.com:10255/
MongoDB server version: 4.2.0
WARNING: shell and server versions do not match
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
        http://docs.mongodb.org/
Questions? Try the support group
        http://groups.google.com/group/mongodb-user
globaldb:PRIMARY> use test;
switched to db test
globaldb:PRIMARY> db.users.find()
{ "_id" : ObjectId("65d320e2702a7e9cf2c04ae2"), "name" : "test", "username" : "test", "email" : "[email protected]", "emailVerified" : false, "password" : "$2a$10$DJGHCCUUvDAESPKeDD7JFeT4IKLVBkwOAuMcLSxIweD2QAAD/sraO", "avatar" : null, "provider" : "local", "role" : "ADMIN", "plugins" : [ ], "refreshToken" : [ ], "createdAt" : ISODate("2024-02-19T09:35:30.799Z"), "updatedAt" : ISODate("2024-02-19T09:35:30.799Z"), "__v" : 0 }
{ "_id" : ObjectId("65d320ff702a7e9cf2c04b54"), "name" : "test2", "username" : "test2", "email" : "[email protected]", "emailVerified" : false, "password" : "$2a$10$q.2jYldTj16yAhaMdHckkeUGnqWvC7pli.Tg5bl/Gh3hd8eq9fYUa", "avatar" : null, "provider" : "local", "role" : "USER", "plugins" : [ ], "refreshToken" : [ ], "createdAt" : ISODate("2024-02-19T09:35:59.702Z"), "updatedAt" : ISODate("2024-02-19T09:35:59.702Z"), "__v" : 0 }

I'm now trying to figure out the Azure OpenID authentication to test that as well.

Edit: Looks like I'm running into this issue: #1521

Don't think I can progress until that weird MongoDB logs collection error is solved :/

@danny-avila danny-avila force-pushed the partial-filter-index branch from 8ec3ea9 to a1c7110 on March 7, 2024 16:06
@FabianHertwig

Linking this comment so that anyone else hoping for a resolution knows:

This PR does not actually fix the issue; it instead surfaces another compatibility issue with Azure CosmosDB, which is why it hasn't been merged.

#1778 (reply in thread)

@rubentalstra
Collaborator

please don't close. working on a fix.

@rubentalstra
Collaborator

rubentalstra commented Feb 12, 2025

@danny-avila

I’ve updated this PR to remove unique: true and sparse: true from the optional social login fields (googleId, facebookId, etc.) and switch them to simple indexes (index: true). This resolves the Cosmos DB duplicate key errors caused by multiple null values and ensures compatibility with both Cosmos DB and MongoDB. The email field remains required and unique, preserving overall identity uniqueness.

  • tested on MongoDB locally
  • tested on Azure CosmosDB for MongoDB
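
A minimal sketch of the change described in this comment, written as plain option objects in the shape Mongoose accepts for schema fields (`type`, `unique`, `sparse`, `index`, `required`). The helper `socialLoginFields` is hypothetical and only googleId and facebookId are shown; the actual user schema in the repository may be laid out differently.

```javascript
// Sketch only: field options as plain objects. No Mongoose import or
// database connection is needed to run this.

// Before: a unique index that only covers documents containing the field.
const sparseUnique = { type: String, unique: true, sparse: true };

// After (this PR update): an ordinary, non-unique index.
const plainIndex = { type: String, index: true };

// Hypothetical helper: builds the social login part of a schema definition.
// Each field gets its own copy of the options object.
function socialLoginFields(options) {
  return {
    googleId: { ...options },
    facebookId: { ...options },
  };
}

const before = socialLoginFields(sparseUnique);
const after = socialLoginFields(plainIndex);

// The email field keeps its constraints in both versions, as stated above.
const email = { type: String, required: true, unique: true };
```

With the `after` definition the database no longer rejects two users that carry the same googleId; only the email constraint remains.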

@rubentalstra rubentalstra changed the title from "refactor(userSchema): unique index definitions using partialFilterExpression instead of sparse" to "refactor(userSchema): Remove Sparse Unique Indexes from Social Login Fields to Fix CosmosDB Duplicate Key Errors" on Feb 12, 2025
@danny-avila
Owner Author

This is not an acceptable fix.

  1. unique: true:

    • Ensures that no two users can have the same ID for a given authentication provider
    • Prevents duplicate accounts linking to the same third-party service
    • Maintains data integrity by ensuring one-to-one relationships between your users and their social accounts
  2. sparse: true:

    • Allows the field to be optional (documents may omit it)
    • Only enforces uniqueness for documents where the field exists
    • Important because not every user will have all these IDs (e.g., some users might sign in with Facebook but not GitHub)
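
The two options above can be illustrated with a toy model in plain JavaScript. This is not MongoDB's implementation; it only mirrors the documented behaviour that a non-sparse unique index stores a missing field as null, while a sparse index skips documents that lack the field. A field explicitly set to null still counts as present. The function name `findUniqueViolations` and the sample users are made up for illustration.

```javascript
// Toy model of a unique index over one field.
// Returns the names of documents that would be rejected as duplicates.
function findUniqueViolations(docs, field, { sparse }) {
  const seen = new Set();
  const violations = [];
  for (const doc of docs) {
    const hasField = Object.prototype.hasOwnProperty.call(doc, field);
    if (sparse && !hasField) continue; // sparse: document is not indexed
    const key = hasField ? doc[field] : null; // missing field is indexed as null
    if (seen.has(key)) violations.push(doc.name);
    seen.add(key);
  }
  return violations;
}

const users = [
  { name: 'user-a', googleId: 'g-1' },
  { name: 'user-b' }, // local account, no googleId
  { name: 'user-c' }, // another local account
  { name: 'user-d', googleId: 'g-1' }, // same Google account linked again
];

// Sparse + unique: only the repeated Google account is rejected.
const withSparse = findUniqueViolations(users, 'googleId', { sparse: true });

// Unique without sparse: the second account lacking googleId is rejected too.
const withoutSparse = findUniqueViolations(users, 'googleId', { sparse: false });

console.log(withSparse, withoutSparse);
```

In this model, dropping `unique` altogether would let user-d through, which is the duplicate-link case the first option guards against.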

@danny-avila
Owner Author

Also, it already works without issue on vCore-based CosmosDB, as opposed to RU-based CosmosDB:

https://learn.microsoft.com/en-us/azure/cosmos-db/mongodb/choose-model

@rubentalstra
Collaborator

Okay, got it. But... it's all good.

We already enforce this through the email field, which is required and unique, so it would not even be possible to create a local account or another social login with the same email. I tested that before pushing this. But it's all good.

@rubentalstra rubentalstra removed the request for review from berry-13 February 12, 2025 18:41
@danny-avila
Owner Author

@rubentalstra I already tried it; other incompatibilities surface once you get past the user schema. I'd rather not focus on maintaining a replacement for Mongo, especially when there is a better-suited alternative from the same cloud provider. AWS DocumentDB also has no issues.

@rubentalstra
Collaborator

Okay, then that's the answer, not that it won't work. Thank you @danny-avila.
