# Versioning for Nimbus ExperimentSchema

- Status: Draft
- Author: Kate Hudson
- Deciders: Nimbus team, Cirrus team
- Date: October 2020

## Context and Problem Statement

As per ADR-0002, we decided to use
[JSON Schema Draft-07](https://tools.ietf.org/html/draft-handrews-json-schema-01) for the Nimbus
Experiment Schema. Besides having some way of validating data across these applications, we also
have to manage the additional complexity that at any given time experiments may be live that are
only compatible with a subset of clients.

In order to prevent clients from receiving experiment definitions that could cause breakage while
preserving the flexibility to update the Nimbus Experiment Schema when necessary, we need:

- A versioning strategy that can be used to express compatibility ranges between experiments and
  clients
- A better process for making changes to the schema and coordinating those changes across
  applications

## Decision Drivers

- Is the versioning strategy clearly understood and relatively simple to implement across all Nimbus
  applications?
- How much complexity does the versioning strategy introduce to manage backwards compatibility? While
  we occasionally need to target older versions of clients, our primary use case is to be able to
  launch experiments to the most recent version of clients without causing breakage to old ones.
- Is the versioning strategy appropriate for our current early phase of development, in which we
  expect to make changes to the schema relatively frequently?

## Decision Outcome

Option A was chosen, and each record contains the version of the schema used to validate it.

## Recommendation

### Option A - Use per-record versioning

Because at any given time a Remote Settings Collection could be read by clients with multiple
versions of the SDK, using per-record versioning would allow us to update to a new schema without
breaking older clients or disrupting existing experiments. We would need to:

- assign a version number to each revision of the schema
- add a `schemaVersion` field to each record
- add some logic to the client to ignore new experiments of a different major version

#### Example

Let's say the current latest version of the Nimbus Experiment schema is `2.1.2`. Applications are in
the following state:

- Experimenter publishes new experiments with `schemaVersion` = `2.1.2`
- The latest version of the SDK uses schema version `2.1.2`
- Clients in the Fenix release are on `2.1.2` (it has the latest version of the SDK)
- Clients in the latest Desktop Firefox 84 are on `2.0.4` (it has an older version of the SDK)
- Clients in Desktop Firefox 83 are on `2.0.0`, and clients in Firefox 82 are on `1.0.1`

In this example, new experiments would be accepted by Fenix, Firefox Desktop 84 and 83, but not on
Firefox 82.

#### SemVer-ish?

- Could we use
  [SchemaVer](https://snowplowanalytics.com/blog/2014/05/13/introducing-schemaver-for-semantic-versioning-of-schemas/)
  to make version updates more relevant to data models/schema changes?

#### Pros

- Allows us to introduce breaking changes without the overhead of a new RS bucket
- SemVer is relatively familiar, and SemVer comparison implementations are widely available

#### Cons

- SemVer can get confusing - it is hard to decide what constitutes a breaking change
- Fairly coarse-grained, especially for something that changes often like targeting attributes; not
  as flexible as a capabilities-based system
- Requires specific client logic to check the major version

## Other options considered

### Option B - Use major version on API + application version ranges on records

This is what we currently do for other services in Fx Desktop like the Messaging System, including
Snippets. Applications are tied to a single major version of the service (e.g. Snippets version 5)
that is very infrequently changed. If the major version is bumped, the client throws out all cached
data and re-queries the new endpoint.

Individual records have a version range (e.g. version >= Firefox 81) to avoid potential breakage on
older versions and need to be manually tested. Breaking changes require a new remote settings
collection.
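
Option B's per-record range check can be sketched roughly as follows. The field names (`min_app_version`, `max_app_version`) and record shape are hypothetical illustrations of "version >= Firefox 81" style ranges, not the actual Messaging System schema.

```python
# Hypothetical sketch of Option B's per-record filtering: each record
# declares an application-version range, and the client drops records
# whose range does not include its own version. Field names are illustrative.

def record_applies(record: dict, app_version: int) -> bool:
    """Keep a record only if the client's version falls inside the record's range."""
    low = record.get("min_app_version", 0)
    high = record.get("max_app_version", float("inf"))
    return low <= app_version <= high

records = [
    {"id": "exp-1", "min_app_version": 81},                          # version >= Firefox 81
    {"id": "exp-2", "min_app_version": 70, "max_app_version": 80},   # Firefox 70-80 only
]

# A Firefox 81 client keeps exp-1 and drops exp-2.
visible = [r["id"] for r in records if record_applies(r, 81)]
print(visible)  # ['exp-1']
```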

#### Pros

- No additional client-side versioning logic
- Simpler to implement for Experimenter

#### Cons

- The one-version-for-all-records model doesn't work well if clients on multiple versions have to
  read the same remote settings collection.
- Having to change the remote settings collection for every breaking change adds a lot of complexity
  on both the client and the server.

### Option C - Use capabilities-based versioning

See the Capabilities Proposal in [1].

## Links

- [1] [Versioning for Nimbus proposal](https://docs.google.com/document/d/1dmO94BjFtdSzsN9z9tuRrc-QIUKC9DLQdk-66f9V3Dg/edit#)
- [Proposed release process for Nimbus](https://docs.google.com/document/d/1gI2jFBAq8fDWFeL8xCSfA7_nnJj5jR4pDaao4rZpk0I/edit#)
- [Understanding how Nimbus SDK interacts with the Nimbus schema](https://docs.google.com/document/d/1_9X6ozvGXNqbd8e5xIoF845nb3bHExLsD8fbeFSP7DA/edit)