Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SQL Repeat Record Report & Case Data Report #29029

Merged
merged 21 commits into from
Apr 14, 2021
Merged

SQL Repeat Record Report & Case Data Report #29029

merged 21 commits into from
Apr 14, 2021

Conversation

kaapstorm
Copy link
Contributor

Summary

This PR is the next in a series to implement [CEP] Migrate RepeatRecord to SQL.

Previous PR: process_repeater_stub() task

no-obligation fyi: @orangejenny @millerdev @snopoke

This PR creates a repeat record report for SQL repeat records. HQ determines which report to use based on whether the domain has SQL repeat records. (See the "Test are_repeat_records_migrated" commit.)

Two important things to note from the CEP:

  1. In an upcoming PR, SQLRepeatRecords are created by soft-migrating the RepeatRecord instances that are about to be forwarded.

    • Only domains actively forwarding data will have SQLRepeatRecord instances, and will see the SQL repeat record report.
    • Domains whose repeaters are paused, or who are not generating payloads for them, will only have Couch RepeatRecord instances, and will see the current repeat record report.
    • Domains with a mix of paused and unpaused repeaters will see the SQL repeat record report. In this scenario, the repeat records of paused repeaters will not appear in the repeat record report unless/until they are migrated.
  2. It will take up to 7 days to soft-migrate all active Couch RepeatRecord instances. Old instances that have either succeeded or been cancelled will not be soft-migrated. We can choose whether to migrate them or not.

An upcoming PR includes a management command for migrating repeat records to SQL.

🐟 🐡 🐠

Safety Assurance

  • Risk label is set correctly
  • The set of people pinged as reviewers is appropriate for the level of risk of the change
  • If QA is part of the safety story, the "Awaiting QA" label is used
  • I am certain that this PR will not introduce a regression for the reasons below

Automated test coverage

PR includes tests that cover changes. (Please comment if you spot changes that are not covered by automated tests and you think ought to be.)

QA Plan

The code that makes this PR reachable will be staged and QAed. We have two more PRs to go before we get to the one that makes these changes reachable.

Safety story

This code is not reachable yet.

Rollback instructions

  • This PR can be reverted after deploy with no further considerations

@kaapstorm kaapstorm added the product/invisible Change has no end-user visible impact label Jan 24, 2021
def task_operate_on_payloads(
record_ids: List[str],
domain: str,
action, # type: Literal['resend', 'cancel', 'requeue'] # 3.8+

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F821 undefined name 'Literal'

@@ -234,24 +241,34 @@ def task_generate_ids_and_operate_on_payloads(
payload_id: Optional[str],
repeater_id: Optional[str],
domain: str,
action: str = '',
action, # type: Literal['resend', 'cancel', 'requeue'] # 3.8+

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F821 undefined name 'Literal'

def operate_on_payloads(
repeat_record_ids: List[str],
domain: str,
action, # type: Literal['resend', 'cancel', 'requeue'] # 3.8+

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F821 undefined name 'Literal'

corehq/apps/data_interfaces/utils.py Outdated Show resolved Hide resolved
corehq/apps/data_interfaces/utils.py Outdated Show resolved Hide resolved
def _schedule_task_with_flag(
request: HttpRequest,
domain: str,
action, # type: Literal['resend', 'cancel', 'requeue'] # 3.8+

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F821 undefined name 'Literal'

def _schedule_task_without_flag(
request: HttpRequest,
domain: str,
action, # type: Literal['resend', 'cancel', 'requeue'] # 3.8+

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F821 undefined name 'Literal'

@kaapstorm
Copy link
Contributor Author

ping @dannyroberts

Base automatically changed from nh/rep/process_repeater_stub to master January 28, 2021 19:57
@kaapstorm kaapstorm marked this pull request as ready for review January 28, 2021 20:00
queryset = queryset.filter(repeater_stub__repeater_id=repeater_id)
if state:
queryset = queryset.filter(state=state)
return queryset.count()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it important for this count to be exact? If not, would it be good enough to estimate it using the query planner (much faster)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

queryset = queryset.filter(repeater_stub__repeater_id=repeater_id)
if state:
queryset = queryset.filter(state=state)
return (queryset.order_by('-registered_at')[skip:skip + limit]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Offset is inefficient for large offsets and can result in records being skipped in some cases. Would it work to use paginate_query or possibly the technique used in iter_large?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I share your concern.

Thanks for the links. I also looked at Ethan's recent PR along the same lines. But all of them seem to me to be appropriate for iterating rows or models, but not for paginating a report.

I think we have two options here:

  • Use Elasticsearch instead of Postgres for the Repeat Record Report.
  • Hope that users don't care about what happens more than a couple of pages into the report.

I don't really love either of those options. Are my concerns valid? Are there other options I'm not considering?

@dannyroberts @gherceg do you have opinions on whether the Repeat Records Report should be backed by Elasticsearch or Postgres? And if Postgres, how we should paginate it?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder, doesn't couch (which currently backs the report) have the same pagination issue? And doesn't elasticsearch? Maybe I'm missing some special magic that Couch and/or Elasticsearch do, but my guess is that they also just internally do the query for skip + limit items and return the last limit of them.

At some point it also becomes more than is feasible for a person to actually click through. So having more granular filters and an upper limit (10k? 100k?) to the number of items we're willing to show in the results through pagination seems like the best longer term option to me

corehq/motech/repeaters/models.py Outdated Show resolved Hide resolved
corehq/motech/repeaters/models.py Outdated Show resolved Hide resolved
corehq/apps/data_interfaces/tasks.py Outdated Show resolved Hide resolved
corehq/apps/data_interfaces/utils.py Outdated Show resolved Hide resolved
Co-authored-by: Daniel Miller <[email protected]>
@@ -1185,7 +1185,8 @@ def failure_reason(self):
def last_message(self):
# Uses `list(queryset)[-1]` instead of `queryset.last()` to use
# prefetched attempts, if available.
return list(self.attempts)[-1].message
attempts = list(self.attempts)
return attempts[-1].message if attempts else None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: could this return an empty string rather than None in the case of no attempts? That would make the return type always be the same (requiring less if x is None checks elsewhere) without breaking process(message) if message else ignore.

Copy link
Contributor Author

@kaapstorm kaapstorm Jan 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this, and those are good points.

I leaned towards None because attempts[-1].message could be None anyway:

    message = models.TextField(null=True, blank=True)

I could return a string always:

    def last_message(self):
        attempts = list(self.attempts)
        if attempts
            return attempts[-1].message or ''
        return ''

Or I could warn future devs to take None into consideration -> 😉 :

    def last_message(self) -> Optional[str]:
        attempts = list(self.attempts)
        return attempts[-1].message if attempts else None

I don't really love the idea of changing the model. Semantically None means that message has not been assigned a value. I think that's useful to know, and maybe last_message() should convey that. What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think up to you. I'd personally go with a NOT NULL field on the model and make it always return a string since that tends to make many things easier and less error prone in my experience. Is there ever a time when it's important to know that a message has not been assigned (value is None) vs there is no message (value is empty string)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there ever a time when it's important to know that a message has not been assigned

An excellent question. e686887

@@ -1185,7 +1185,8 @@ def failure_reason(self):
def last_message(self):
# Uses `list(queryset)[-1]` instead of `queryset.last()` to use
# prefetched attempts, if available.
return list(self.attempts)[-1].message
attempts = list(self.attempts)
return attempts[-1].message if attempts else None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think up to you. I'd personally go with a NOT NULL field on the model and make it always return a string since that tends to make many things easier and less error prone in my experience. Is there ever a time when it's important to know that a message has not been assigned (value is None) vs there is no message (value is empty string)?

else:
return [r['id'] for r in iter_repeat_records_by_repeater(
domain, repeater_id)]
records = iter_repeat_records_by_repeater(domain, repeater_id)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this query use include_docs: true (couch view option)? Would the performance improvement of doing the query with include_docs: false be worth it? Same applies to above query with payload_id.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 9c6c06f

@dimagimon dimagimon added the reindex/migration Reindex or migration will be required during or before deploy label Apr 10, 2021
@kaapstorm kaapstorm merged commit 49953a1 into master Apr 14, 2021
@kaapstorm kaapstorm deleted the nh/rep/report branch April 14, 2021 13:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
product/invisible Change has no end-user visible impact reindex/migration Reindex or migration will be required during or before deploy
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants