-
-
Notifications
You must be signed in to change notification settings - Fork 221
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Elasticsearch to fetch applications for exports #34266
Conversation
8118925
to
9cdc525
Compare
f2df7ef
to
3dad7a1
Compare
3dddf8a
to
6c46862
Compare
5d635b9
to
5fd1eaa
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def __init__(self, domain, project, user, as_dict=True): | ||
self.domain = domain | ||
self.domain_object = project |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it work to only pass in the domain object?
def __init__(self, domain, project, user, as_dict=True): | |
self.domain = domain | |
self.domain_object = project | |
def __init__(self, project, user, as_dict=True): | |
self.domain = project.name | |
self.domain_object = project |
self.domain
could also be changed to a property that returns self.domain_object.name
@@ -111,6 +111,8 @@ def test_get_form_analytics_metadata__app(self): | |||
|
|||
@flaky_slow | |||
def test_get_exports_by_form(self): | |||
# Call this twice, since the couchdb `update_after` to force couchdb to return the right results | |||
get_exports_by_form(self.domain) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does calling it twice actually force it, or does it just take a bit more time, making the second call a bit more likely to return the correct results? In other words, is this possibly a flaky test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it actually forces it. This is also a flaky test, but it was before I added this change (and is annotated as such)
'submissions': 1, | ||
}, | ||
'key': ['exports_forms_analytics_domain', self.app_id_3, | ||
'my://crazy.xmlns/deleted-app'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is confusing in the context of exclude_deleted_apps=True
. Can you say a bit more about why this makes sense?
Edit: maybe this is irrelevant. Looks like it was later removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, sorry about that - there was a lot of experimentation going on here
form = f['value'] | ||
if form.get('app_deleted') and not form.get('submissions'): | ||
continue | ||
if 'app' in form: | ||
form['has_app'] = True | ||
forms.append(form) | ||
else: | ||
elif not self.domain_object.exports_use_elasticsearch: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Farid, can you explain what does it mean when app
is not in form
? I would prefer not to exclude those forms, because when I run in a shell, your _get_export_forms_by_app_es
is already quick enough to solve the current issue. 500 error seems like another issue. Loading those either unknown or delete apps used to work without 500 error, at least until 2 months ago. If there is some changes that resulted in those 500 errors, we should fix that, instead of just exclude those forms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Offlining a response to this)
Btw not sure if you can open this, but this is the Sentry issue I found when access your example |
Add show_deleted_apps_exports to domain model and settings
Add deleted app to test_get_exports_by_form Optionally get export forms through ES Make test not flaky Use flag to use elasticsearch to get fields Remove deleted apps when fetching from elasticsearch Revert "Remove deleted apps when fetching from elasticsearch" This reverts commit 7ae476e. Update toggle to be switching on using the elasticsearch getter Don't error when there are no modules / forms Only return no-copy-of apps
7571fcf
to
5985875
Compare
@millerdev I rebased and squashed out a bunch of the experimental commits. Sorry about that. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!
a5c3e79
to
caefff8
Compare
Product Description
Fetching all apps when there are a large number of deleted apps was extremely slow, and was timing out. This fetches apps from elasticsearch instead of re-writing the couch view, since I think that was going to require a reindex of the whole index (which I'm guessing is undesirable).
This is currently behind an internal feature flag, and also a domain setting. To enable it, first enable the
EXPORTS_APPS_USE_ELASTICSEARCH
feature flag, then on the domain settings screen enable theUse elasticsearch when fetching apps for exports
setting.The list of possible apps to export will be different than when pulling data from couch: It will not include forms from "Unknown Apps" (mostly because there are apps that are malformed in the DB currently and it was causing 500 errors when fetching these. For example: https://staging.commcarehq.org/a/qa-automation/apps/view/934af150f0746ff291335b7161d2c076/).
Technical Summary
https://dimagi.atlassian.net/browse/SAAS-15314
Pulling the list of exports on the
qa-automation
domain from couch was extremely slow, likely due to the volume of deleted apps on that domain. Since those are baked into the couch view, I opted to fetch the apps from elasticsearch instead of creating a new couch view, or modifying the existing one.This has sped up the request from ~3-5 minutes to < 2 seconds.
Feature Flag
export_apps_use_elasticsearch
Safety Assurance
Safety story
This is only fetching data, and is behind a flag.
Automated test coverage
I've added a new unit test, along with updating existing tests to validate existing behaviour.
QA Plan
This is actually for the QA team to use during testing on staging, as their domain has a large number of deleted apps.
Rollback instructions
Labels & Review