SOLR phantom duplicate discovery #3943
Comments
Moving back to backlog. Hopefully this is no longer an issue once the two tickets above have been addressed.
Monitoring the scheduled db-solr-sync job: 10/12, 10/13, 10/14
10/17, 10/18
Marking as done.
This is awesome! This is the fruits of the daily This is indeed fixed right now, but if the
A duplicate dataset exists in SOLR but not in the DB.
Note that the two records have different names, so this is not a SOLR bug producing the duplicate. For some reason CKAN is creating the record twice, but only in SOLR.
How to reproduce
Expected behavior
If a dataset doesn't exist in the DB, it shouldn't exist in SOLR either.
Actual behavior
The duplicate record exists only in SOLR.
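To make the expected/actual gap concrete, here is a minimal sketch of the check in Python. It assumes the dataset IDs have already been fetched from the DB and from SOLR by some other means; `compare_db_and_solr`, `SyncReport`, and the sample IDs are hypothetical names for illustration, not part of CKAN or the db-solr-sync job.

```python
from typing import Iterable, NamedTuple, Set


class SyncReport(NamedTuple):
    """Result of comparing dataset IDs in the DB against those in SOLR."""
    phantom_in_solr: Set[str]    # indexed in SOLR but absent from the DB
    missing_from_solr: Set[str]  # present in the DB but not indexed


def compare_db_and_solr(db_ids: Iterable[str], solr_ids: Iterable[str]) -> SyncReport:
    """Hypothetical helper: plain set difference in both directions."""
    db, solr = set(db_ids), set(solr_ids)
    return SyncReport(phantom_in_solr=solr - db, missing_from_solr=db - solr)


# Made-up IDs: SOLR holds one record that the DB has never seen.
report = compare_db_and_solr(
    db_ids=["pkg-a", "pkg-b"],
    solr_ids=["pkg-a", "pkg-b", "pkg-b-copy"],
)
print(report.phantom_in_solr)
```

Any ID reported in `phantom_in_solr` violates the expected behavior above. Because the phantom record has a different name from the original, a comparison keyed on name alone would not pair it with its source dataset.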
Sketch
Since this came from CKAN, we expect it to be a logic issue. It doesn't seem to be reproducible: it didn't occur in dev, and no further duplicates are created when re-harvesting.
This will be mitigated by #2213, but that won't address how the duplicate was created in the first place.
One possibility is that a restart interrupted CKAN partway through an operation and left SOLR in an inconsistent state, but this is unconfirmed. It could in theory be validated by examining the logs.
The goal of this ticket is to solve the problem (code, infrastructure, restarts, whatever it is) and stop this from occurring.
There should be a follow up ticket to this to
This may be done by utilizing Test ckan-db-solr-sync job #2213.