Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use Elasticsearch to fetch applications for exports #34266

Merged
merged 9 commits into from
Oct 13, 2024

Conversation

proteusvacuum
Copy link
Contributor

@proteusvacuum proteusvacuum commented Mar 12, 2024

Product Description

Fetching all apps when there are a large number of deleted apps was extremely slow, and was timing out. This fetches apps from elasticsearch instead of re-writing the couch view, since I think that was going to require a reindex of the whole index (which I'm guessing is undesirable).

This is currently behind an internal feature flag, and also a domain setting. To enable it, first enable the EXPORTS_APPS_USE_ELASTICSEARCH feature flag, then on the domain settings screen enable the Use elasticsearch when fetching apps for exports setting.

The list of possible apps to export will be different than when pulling data from couch: It will not include forms from "Unknown Apps" (mostly because there are apps that are malformed in the DB currently and it was causing 500 errors when fetching these. For example: https://staging.commcarehq.org/a/qa-automation/apps/view/934af150f0746ff291335b7161d2c076/).

Technical Summary

https://dimagi.atlassian.net/browse/SAAS-15314
Pulling the list of exports on the qa-automation domain from couch was extremely slow, likely due to the volume of deleted apps on that domain. Since those are baked into the couch view, I opted to fetch the apps from elasticsearch instead of creating a new couch view, or modifying the existing one.

This has sped up the request from ~3-5 minutes to < 2 seconds.

Feature Flag

export_apps_use_elasticsearch

Safety Assurance

Safety story

This is only fetching data, and is behind a flag.

Automated test coverage

I've added a new unit test, along with updating existing tests to validate existing behaviour.

QA Plan

This is actually for the QA team to use during testing on staging, as their domain has a large number of deleted apps.

Rollback instructions

  • This PR can be reverted after deploy with no further considerations

Labels & Review

  • Risk label is set correctly
  • The set of people pinged as reviewers is appropriate for the level of risk of the change

@proteusvacuum proteusvacuum force-pushed the fr/deleted-apps-2 branch 4 times, most recently from 8118925 to 9cdc525 Compare March 12, 2024 12:27
@proteusvacuum proteusvacuum changed the title Fr/deleted apps 2 Use Elasticsearch to fetch applications for exports Mar 24, 2024
@proteusvacuum proteusvacuum force-pushed the fr/deleted-apps-2 branch 2 times, most recently from 3dddf8a to 6c46862 Compare March 24, 2024 20:17
@proteusvacuum proteusvacuum marked this pull request as ready for review June 19, 2024 13:15
Copy link
Contributor

@millerdev millerdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like there is a fair amount of experimentation going on in the commit history of this PR. What do you think of cleaning that up with a rebase, removing things like f395ae1 and 45ea357 and other things that were added and then changed?

Comment on lines +288 to +290
def __init__(self, domain, project, user, as_dict=True):
self.domain = domain
self.domain_object = project
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it work to only pass in the domain object?

Suggested change
def __init__(self, domain, project, user, as_dict=True):
self.domain = domain
self.domain_object = project
def __init__(self, project, user, as_dict=True):
self.domain = project.name
self.domain_object = project

self.domain could also be changed to a property that returns self.domain_object.name

@@ -111,6 +111,8 @@ def test_get_form_analytics_metadata__app(self):

@flaky_slow
def test_get_exports_by_form(self):
# Call this twice, since the couchdb `update_after` to force couchdb to return the right results
get_exports_by_form(self.domain)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does calling it twice actually force it, or does it just take a bit more time, making the second call a bit more likely to return the correct results? In other words, is this possibly a flaky test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it actually forces it. This is also a flaky test, but it was before I added this change (and is annotated as such)

'submissions': 1,
},
'key': ['exports_forms_analytics_domain', self.app_id_3,
'my://crazy.xmlns/deleted-app']
Copy link
Contributor

@millerdev millerdev Jun 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is confusing in the context of exclude_deleted_apps=True. Can you say a bit more about why this makes sense?

Edit: maybe this is irrelevant. Looks like it was later removed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, sorry about that - there was a lot of experimentation going on here

form = f['value']
if form.get('app_deleted') and not form.get('submissions'):
continue
if 'app' in form:
form['has_app'] = True
forms.append(form)
else:
elif not self.domain_object.exports_use_elasticsearch:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Farid, can you explain what does it mean when app is not in form? I would prefer not to exclude those forms, because when I run in a shell, your _get_export_forms_by_app_es
is already quick enough to solve the current issue. 500 error seems like another issue. Loading those either unknown or delete apps used to work without 500 error, at least until 2 months ago. If there is some changes that resulted in those 500 errors, we should fix that, instead of just exclude those forms.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Offlining a response to this)

@jingcheng16
Copy link
Contributor

Btw not sure if you can open this, but this is the Sentry issue I found when access your example

Add show_deleted_apps_exports to domain model and settings
Add deleted app to test_get_exports_by_form

Optionally get export forms through ES

Make test not flaky

Use flag to use elasticsearch to get fields

Remove deleted apps when fetching from elasticsearch

Revert "Remove deleted apps when fetching from elasticsearch"

This reverts commit 7ae476e.

Update toggle to be switching on using the elasticsearch getter

Don't error when there are no modules / forms

Only return no-copy-of apps
@proteusvacuum
Copy link
Contributor Author

@millerdev I rebased and squashed out a bunch of the experimental commits. Sorry about that.

Copy link
Contributor

@jingcheng16 jingcheng16 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@proteusvacuum proteusvacuum force-pushed the fr/deleted-apps-2 branch 2 times, most recently from a5c3e79 to caefff8 Compare September 3, 2024 13:18
@jingcheng16 jingcheng16 merged commit 266d4f6 into dimagi:master Oct 13, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants