Improve index and group indexing #76

Closed
jrcastro2 opened this issue May 17, 2023 · 1 comment · Fixed by #73, inveniosoftware/invenio-rdm-records#1307 or inveniosoftware/docs-invenio-rdm#549
Comments

@jrcastro2
Contributor

jrcastro2 commented May 17, 2023

No description provided.

@ppanero
Member

ppanero commented Jun 7, 2023

@jrcastro2 Does this take into account that users_service.rebuild_index calls query.all()? The problem with that approach is that every entry is loaded into memory at once. In the Zenodo migration, for example, this crashes because it requires around 300k entries to be held in memory.

An alternative is to do something similar to what is done in records resources:

users = db.session.query(User.id).yield_per(1000)
users_service.indexer.bulk_index(user.id for user in users)
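
For illustration, here is a minimal, self-contained sketch of the same idea: stream ids from the database in fixed-size batches and hand them to a bulk indexer as a generator, so the full result set is never materialised. It uses the standard-library sqlite3 module instead of SQLAlchemy, with cursor.fetchmany standing in for yield_per. The names iter_user_ids and bulk_index are hypothetical helpers for this sketch, not the invenio-users-resources API.

```python
import sqlite3
from itertools import islice


def iter_user_ids(conn, batch_size=1000):
    """Yield user ids, fetching at most `batch_size` rows per round trip."""
    cursor = conn.execute("SELECT id FROM users ORDER BY id")
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        for (user_id,) in rows:
            yield user_id


def bulk_index(ids, chunk_size=500):
    """Hypothetical stand-in for a bulk indexer.

    Consumes any iterable of ids in chunks and returns
    (number_of_chunks, number_of_ids) instead of talking to a search cluster.
    """
    ids = iter(ids)
    chunks = 0
    total = 0
    while True:
        chunk = list(islice(ids, chunk_size))
        if not chunk:
            break
        # A real indexer would send `chunk` to the search backend here.
        chunks += 1
        total += len(chunk)
    return chunks, total


# Assumed toy schema: a `users` table with an integer primary key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
conn.executemany(
    "INSERT INTO users (id) VALUES (?)", ((i,) for i in range(1, 2501))
)

# The generator is passed straight through, so ids are pulled lazily.
chunks, total = bulk_index(iter_user_ids(conn, batch_size=1000), chunk_size=500)
```

Because bulk_index only ever holds one chunk and iter_user_ids only ever holds one batch, peak memory depends on the batch and chunk sizes rather than on the number of users.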

TLGINO added a commit to TLGINO/invenio-users-resources that referenced this issue Jun 16, 2023
 * changed number of workers per celery task based on task type
 * closes inveniosoftware#76
TLGINO added a commit to TLGINO/invenio-rdm-records that referenced this issue Jun 16, 2023
TLGINO added a commit to TLGINO/invenio-users-resources that referenced this issue Jun 17, 2023
 * use bulk indexing instead of iterative single indexing
 * closes inveniosoftware#76
TLGINO added a commit to TLGINO/invenio-users-resources that referenced this issue Jun 23, 2023
 * use bulk indexing instead of iterative single indexing
 * closes inveniosoftware#76
TLGINO added a commit to TLGINO/invenio-users-resources that referenced this issue Jun 26, 2023
 * use bulk indexing instead of iterative single indexing
 * closes inveniosoftware#76
TLGINO added a commit to TLGINO/invenio-users-resources that referenced this issue Jun 26, 2023
 * fixed tests
 * use bulk indexing instead of iterative single indexing
 * closes inveniosoftware#76
@TLGINO TLGINO removed their assignment Jun 26, 2023
@slint slint assigned slint and TLGINO and unassigned slint Jun 26, 2023
TLGINO added a commit to TLGINO/invenio-rdm-records that referenced this issue Jun 27, 2023
TLGINO added a commit to TLGINO/invenio-users-resources that referenced this issue Jun 27, 2023
 * fixed tests
 * use bulk indexing instead of iterative single indexing
 * closes inveniosoftware#76
@TLGINO TLGINO removed their assignment Jun 27, 2023
TLGINO added a commit to TLGINO/invenio-users-resources that referenced this issue Jun 30, 2023
 * fixed tests
 * use bulk indexing instead of iterative single indexing
 * closes inveniosoftware#76
@TLGINO TLGINO removed their assignment Jun 30, 2023
TLGINO added a commit to TLGINO/invenio-users-resources that referenced this issue Jun 30, 2023
 * fixed tests
 * use bulk indexing instead of iterative single indexing
 * closes inveniosoftware#76

Co-authored-by: Zacharias Zacharodimos <[email protected]>
zzacharo added a commit that referenced this issue Jun 30, 2023
 * fixed tests
 * use bulk indexing instead of iterative single indexing
 * closes #76

Co-authored-by: Zacharias Zacharodimos <[email protected]>
@zzacharo zzacharo self-assigned this Jun 30, 2023
zzacharo pushed a commit to inveniosoftware/invenio-rdm-records that referenced this issue Jun 30, 2023