Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Create RepeaterStubs #29599

Draft
wants to merge 9 commits into
base: master
Choose a base branch
from
Draft

Create RepeaterStubs #29599

wants to merge 9 commits into from

Conversation

kaapstorm
Copy link
Contributor

@kaapstorm kaapstorm commented Apr 23, 2021

Summary

Repeat Records Couch-to-SQL migration PR 4 of 6

I'm opening this as a draft PR to give context to the conversation around this migration. Following on for a conversation with Danny, this and the two PRs that build on it will need some changes, in order to roll out this migration with more caution.

The changes to these PRs that we discussed are:

  • Mirroring repeat record creation and updates across both Couch and SQL (following migration best practices) to allow us to switch from one backend to the other as easily as possible.
  • Repeat records and RepeaterStub to live in their own database.
  • Splitting out the change in behavior from the migration.

Currently this series of PRs introduces a change in behavior enabled by the switch to SQL; instead of iterating repeat records we can iterate repeaters, send payloads in the order they were created, and handle offline services more intelligently. But to reduce the risk inherent in behavior changes, we should deploy the behavior change after the switch to SQL, instead of with it.

Safety Assurance

  • Risk label is set correctly
  • All migrations are backwards compatible and won't block deploy
  • The set of people pinged as reviewers is appropriate for the level of risk of the change
  • If QA is part of the safety story, the "Awaiting QA" label is used
  • I have confidence that this PR will not introduce a regression for the reasons below

Automated test coverage

PR includes test coverage. If there are aspects that are not covered by tests, please point them out.

QA Plan

Left to the discretion of the SaaS team.

Safety story

This PR will be subject to change. It is not safe to merge.

Rollback instructions

  • This PR can be reverted after deploy with no further considerations

@kaapstorm kaapstorm added the product/invisible Change has no end-user visible impact label Apr 23, 2021
@dimagimon dimagimon added the reindex/migration Reindex or migration will be required during or before deploy label Apr 23, 2021
@@ -402,3 +400,122 @@ def test_yes(self):
with make_repeat_record(self.repeater_stub, RECORD_PENDING_STATE):
is_migrated = are_repeat_records_migrated(DOMAIN)
self.assertTrue(is_migrated)


class RepeaterStubOneToOneRepeaterTests(RepeaterFixtureMixin, TestCase):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F821 undefined name 'RepeaterFixtureMixin'

).count(), 0)


class TestMigrationCantDuplicate(RepeaterFixtureMixin, TestCase):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F821 undefined name 'RepeaterFixtureMixin'

).count(), 1)


class PauseResumeRetireRepeaterTests(RepeaterFixtureMixin, TestCase):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F821 undefined name 'RepeaterFixtureMixin'

@kaapstorm kaapstorm requested a review from mjriley April 23, 2021 10:01
This was referenced Apr 23, 2021
Copy link
Contributor

@mjriley mjriley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good, although I'm concerned with how the two systems handle downtime. I'd like to see some test changes, and, if possible, RepeaterStub to be called anything other than Stub

Comment on lines +25 to +27
# It's faster to violate uniqueness constraint and ask
# forgiveness than to use `.get()` or `.exists()` first.
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't get_or_create be the canonical way to do this?


def create_repeaterstubs(apps, schema_editor):
for repeater in iter_repeaters():
with transaction.atomic():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be a transaction? Unless something will be added later, this is only performing a single create, which is going to be auto-committed anyhow


def test_only_one_in_domain(self):
with transaction.atomic():
# (Run in its own transaction so as not to mess with tearDown)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What error was this giving you? Failing to create a duplicate object should not affect tearDown

Comment on lines +405 to +410
class RepeaterStubOneToOneRepeaterTests(RepeaterFixtureMixin, TestCase):

def test_one_in_domain(self):
self.assertEqual(RepeaterStub.objects.filter(
domain=DOMAIN,
repeater_id=self.repeater.get_id,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are difficult to follow. They are deceptively short, because all of the work is done in the setup methods. Ideally, tests should follow the AAA pattern (Arrange, Act, Assert). I want to be able to see what is being constructed, what you are doing on those constructed objects, and what ultimately is the expected result we're asserting against. In this case, all of our arrange blocks are hidden behind multiple layers -- they don't exist in the test body itself, they don't exist in the class, so they're rather cryptic to find in RepeaterFixtureMixin. Does this behavior warrant 4 tests? It seems like we're just testing Django's models.UniqueConstraint.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. Test mix-ins are awful, and I'm not sure how practical these tests are. I'll rework this.

Comment on lines +439 to +442
self.repeater = FormRepeater(
domain=DOMAIN,
url='https://www.example.com/api/',
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FormRepeater is still a couch model, right? Do our tests get stuck if this repeater already exists?

@@ -185,6 +185,17 @@ def _iter_repeat_records_by_repeater(domain, repeater_id, chunk_size,
yield doc['id']


def iter_repeaters():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does iter convey any useful information? Judging by how it's used, get_all_repeaters() makes its usage in create_repeaterstubs a bit easier for me to understand. It was not obvious to me that it was fetching all repeaters

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Developers have been frustrated in the past when they discover that get_foo() returns a generator, instead of a list or tuple that can be iterated more than once. So I use iter_foo() to make callers aware that they aren't getting a list.

Maybe a type hint would check both boxes?

def get_all_repeaters() -> Generator:
    ...


def create_repeaterstubs(apps, schema_editor):
for repeater in iter_repeaters():
with transaction.atomic():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thought given to doing this process via bulk_create? It should be substantially faster

Comment on lines +221 to +224
constraints = [
models.UniqueConstraint(fields=['repeater_id'],
name='one_to_one_repeater')
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why isn't this done using unique?

Comment on lines +322 to +325
def delete(self):
if self.repeater_stub.id:
self.repeater_stub.delete()
super().delete()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these are two separate databases, it seems possible that deleting the stub could succeed, while deleting from the SQL database could fail. Is that a situation that needs to be avoided? If so, it seems like it would be slightly safer to wrap this in a transaction so we can rollback the stub's deletion in the event that the couch deletion fails.

Comment on lines +483 to +488
# Deleting RepeaterStub will cascade-delete SQLRepeatRecords
# and SQLRepeatRecordAttempts too. (RequestLog will still keep a
# record of all send attempts.)
self.repeater_stub.delete()
# NOTE: Undeleting a Repeater needs to include creating a
# RepeaterStub for it.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this, pause, and resume, are we concerned about what happens if couch or SQL are not both simultaneously available? Again, using SQL transactions might be useful to mitigate that here. It might also be useful to have at least one test that illustrates how we handle when one of the systems is not available

@kaapstorm
Copy link
Contributor Author

Thanks for the review @mjriley

I'll be spending some time changing the way this migration is rolled out, and I'll take the opportunity to implement the feedback you've given here and on the "Switch over" PR.

I am looking for suggestions for what to rename "RepeaterStub". Its purpose is to link a Repeater's repeat records, and allow them to be managed collectively (which we aren't doing with Couch repeat records). I don't like the current name either, but I couldn't think of a good name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
product/invisible Change has no end-user visible impact reindex/migration Reindex or migration will be required during or before deploy
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants