
Add versioning adr #135

Open · wants to merge 3 commits into `main`
docs/adr/0003-version-strategy.md (new file, 115 additions, 0 deletions)
# Versioning for the Nimbus Experiment Schema

- Status: Draft
- Author: Kate Hudson
- Deciders: Nimbus team, Cirrus team
- Date: October 2020

## Context and Problem Statement

As per ADR-0002, we decided to use
[JSON Schema Draft-07](https://tools.ietf.org/html/draft-handrews-json-schema-01) for the Nimbus
> **Review comment:** The JSON schema contains several stringly-typed JEXL fields which AFAICT are not validated, such as the targeting expression. Would the versioning discussed here cover changes to the semantics of those fields as well? I assume so, but it may be worth calling out explicitly that a change in the version number might not map directly to an observable change in the JSON Schema definitions.

Experiment Schema. Besides having some way of validating data across these applications, we also
have to manage the additional complexity that at any given time experiments may be live that are only
> **Review comment:** nit: "that at any"

compatible with a subset of clients.

In order to prevent clients from receiving experiment definitions that could cause breakage while
preserving the flexibility to update the Nimbus Experiment Schema when necessary, we need:
> **Review comment:** I don't have a lot of historical context here, so I'm curious: what sort of breakage have bad experiment definitions caused in the past? Clients not enrolling in experiments that should, or enrolling in ones they shouldn't? Crashers or other user-visible bugs?

> **Review comment:** This feeds into a broader point that I raised with @k88hudson in Slack, but would welcome further opinions on: I think it's hard to define a meaningful versioning policy without understanding what you're trying to communicate or prevent; IOW you need to define what "breaking" means before you can determine whether something is a "breaking change".
>
> For SemVer, "breaking" means "if a consumer updates to this version, there is some chance that their code will no longer compile/run correctly". What does "breaking" mean for the consumers of these experiment definitions?
>
> For the Client SDK I can think of several:
>
> 1. A client that we want to have enrolled in the experiment fails to do so, potentially skewing the data.
> 2. A client that we didn't want to enroll in the experiment does so, potentially skewing the data.
> 3. A client trying to interpret this experiment info will crash or otherwise behave badly.
>
> I don't think versioning is a good mechanism to prevent (3), which seems better served by having clients aggressively ignore experiment records that don't validate against their expected schema. But there does seem to be value in using a version number to manage (1) and (2).
>
> I'm skeptical of needing three levels of versioning, at least from the client SDK perspective. A MAJOR.MINOR scheme (or perhaps MODEL.REVISION in the style of SchemaVer) seems sufficient. Something like:
>
> - Is the MAJOR number on this record larger than the one we support? Ignore it; we might do the wrong thing even if we manage to parse it correctly.
> - Is the MINOR number on this record larger than the one we support? Attempt to validate/parse/execute it, but log and ignore if that fails.
> - Otherwise, attempt to validate/parse/execute it and treat failure to do so as a noteworthy error for telemetry etc.
>
> The third level of versioning may be useful for other consumers though, in which case I've no objection to it.
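
As a concrete illustration of the MAJOR.MINOR handling suggested in the comment above, here is a minimal sketch of how a client might decide what to do with a record, assuming the record carries a version string and the client knows the highest version it supports (the function and type names are illustrative, not part of any existing schema or SDK):

```python
from enum import Enum, auto


class Action(Enum):
    IGNORE = auto()            # silently skip the record
    TRY_BEST_EFFORT = auto()   # attempt to use it; log and ignore on failure
    MUST_PARSE = auto()        # failure to parse is a noteworthy error


def plan_for_record(record_version: str, supported_version: str) -> Action:
    """Decide how to treat a record, per the MAJOR.MINOR policy sketched above."""
    rec_major, rec_minor = (int(p) for p in record_version.split(".")[:2])
    sup_major, sup_minor = (int(p) for p in supported_version.split(".")[:2])

    if rec_major > sup_major:
        # Newer major version: we might do the wrong thing even if it parses.
        return Action.IGNORE
    if rec_minor > sup_minor:
        # Newer minor version: try it, but tolerate failure quietly.
        return Action.TRY_BEST_EFFORT
    # A version we fully support: a failure here is worth reporting.
    return Action.MUST_PARSE


assert plan_for_record("3.0", "2.1") is Action.IGNORE
assert plan_for_record("2.2", "2.1") is Action.TRY_BEST_EFFORT
assert plan_for_record("2.0", "2.1") is Action.MUST_PARSE
```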


- A versioning strategy that can be used to express compatibility ranges between experiments and
clients
- A better process for making changes to the schema and coordinating those changes across
applications

## Decision Drivers

- Is the versioning strategy clearly understood and relatively simple to implement across all Nimbus
applications?
- How much complexity does the versioning strategy introduce to manage backwards compatibility? While
we occasionally need to target older versions of clients, our primary use case is to be able to
launch experiments to the most recent version of clients without causing breakage to old ones.
- Is the versioning strategy appropriate for our current early phase of development, in which we
expect to make changes to the schema relatively frequently?

## Decision Outcome

Option A was chosen and each record contains the version of the schema used to validate it.

## Recommendation

### Option A - Use per-record versioning

Because at any given time a Remote Settings Collection could be read by clients with multiple
versions of the SDK, using per-record versioning would allow us to update to a new schema without
breaking older clients or disrupting existing experiments. We would need to:

- associate each version of the schema with a version
> **Review comment:** FWIW I'm not sure I properly parsed the distinction between the two uses of "version" in this sentence. Do you mean that every time we change the schema we need to give it a new version number?

- add a `schemaVersion` field to each record.
- add some logic to the client to ignore new experiments of a different major version
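
For illustration, a record carrying the proposed `schemaVersion` field might look something like the sketch below; only `schemaVersion` comes from this ADR, and every other field is a hypothetical placeholder rather than part of the actual Nimbus schema:

```python
# A hypothetical experiment record carrying the proposed schemaVersion field.
# Only "schemaVersion" is taken from this ADR; the other fields are placeholders.
experiment_record = {
    "schemaVersion": "2.1.2",
    "slug": "example-experiment",
    "branches": [
        {"slug": "control", "ratio": 1},
        {"slug": "treatment", "ratio": 1},
    ],
}
```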

#### Example

Let's say the current latest version of the Nimbus Experiment schema is `2.1.2`. Applications are in
the following state:

- Experimenter publishes new experiments with `schemaVersion` = `2.1.2`
- The latest version of the SDK uses schema version `2.1.2`
- Clients on the Fenix release are on `2.1.2` (it has the latest version of the SDK)
- Clients on the latest Desktop Firefox (84) are on `2.0.4` (it has an older version of the SDK)
- Clients on Desktop Firefox 83 are on `2.0.0`, and clients on Firefox 82 are on `1.0.1`.

In this example, new experiments would be accepted by Fenix and by Firefox Desktop 84 and 83, but not
by Firefox 82.
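
A minimal sketch of the client-side check this example implies (accept a record when its major version matches the client's, regardless of minor or patch differences), applied to the versions above; the helper names are illustrative:

```python
def major(version: str) -> int:
    """Extract the major component from a 'MAJOR.MINOR.PATCH' string."""
    return int(version.split(".")[0])


def accepts(client_schema_version: str, record_schema_version: str) -> bool:
    """Option A rule: ignore records whose major version differs from ours."""
    return major(client_schema_version) == major(record_schema_version)


published = "2.1.2"  # schemaVersion on newly published experiments
clients = {
    "Fenix (latest SDK)": "2.1.2",
    "Firefox Desktop 84": "2.0.4",
    "Firefox Desktop 83": "2.0.0",
    "Firefox 82": "1.0.1",
}

for name, version in clients.items():
    verdict = "accepts" if accepts(version, published) else "ignores"
    print(f"{name} ({version}) {verdict} experiments at {published}")
# Fenix, Desktop 84, and Desktop 83 accept; Firefox 82 ignores the record.
```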

#### SemVer-ish?

- Could we use
[SchemaVer](https://snowplowanalytics.com/blog/2014/05/13/introducing-schemaver-for-semantic-versioning-of-schemas/)
to make version updates more relevant to data models/schema changes?

#### Pros

- Allows us to introduce breaking changes without the overhead of a new Remote Settings bucket
- SemVer is relatively familiar, and SemVer comparison implementations are widely available
> **Review comment:** nit: relatively


#### Cons

- SemVer can get confusing; it is hard to decide what constitutes a breaking change
- Fairly coarse-grained, especially for something that changes often like targeting attributes; not
as flexible as a capabilities-based system
- Requires specific client logic to check major version

## Other options considered

### Option B - Use major version on API + application version ranges on records

This is what we currently do for other services in Firefox Desktop, like the Messaging System, including
> **Review comment:** nit: Messaging

Snippets. Applications are tied to a single major version of the service (e.g. Snippets version 5)
that is very infrequently changed. If the major version is bumped, the client throws out all cached
data and re-queries the new endpoint.

Individual records have a version range (e.g. version >= Firefox 81) to avoid potential breakage on
older versions and need to be manually tested. Breaking changes require a new remote settings
collection.
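
A rough sketch of the per-record application-version filtering described above, assuming each record carries a minimum (and optionally a maximum) Firefox version; the field and function names are illustrative, not the actual Messaging System format:

```python
from typing import Optional


def record_applies(app_version: int,
                   min_version: int,
                   max_version: Optional[int] = None) -> bool:
    """Option B rule: filter records by the client's application version range."""
    if app_version < min_version:
        return False
    if max_version is not None and app_version > max_version:
        return False
    return True


# A record targeted at "version >= Firefox 81", as in the example above.
record = {"id": "example-snippet", "min_version": 81}

print(record_applies(84, record["min_version"]))  # True: Firefox 84 is in range
print(record_applies(78, record["min_version"]))  # False: Firefox 78 is too old
```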

#### Pros

- No additional client-side versioning logic
- Simpler to implement for Experimenter
> **Review comment (@rfk, Oct 7, 2020):** One pro of this option is that there are no choices we can make right now that would prevent us from exercising this option in the future :-)


#### Cons

- The one-version-for-all-records model doesn't work well if clients on multiple versions have to
read the same remote settings collection.
- Having to change the remote settings collection for every breaking change adds a lot of complexity
on both the client and the server.

### Option C - Use capabilities-based versioning

See the Capabilities Proposal in [1].

## Links

- [1] [Versioning for Nimbus proposal](https://docs.google.com/document/d/1dmO94BjFtdSzsN9z9tuRrc-QIUKC9DLQdk-66f9V3Dg/edit#)
- [Proposed release process for Nimbus](https://docs.google.com/document/d/1gI2jFBAq8fDWFeL8xCSfA7_nnJj5jR4pDaao4rZpk0I/edit#)
- [Understanding how Nimbus SDK interacts with the Nimbus schema](https://docs.google.com/document/d/1_9X6ozvGXNqbd8e5xIoF845nb3bHExLsD8fbeFSP7DA/edit)