Testing Practices and Standards #284
Replies: 1 comment 1 reply
-
Looks good, a few things: More emphasis on "no black-box testing, only white-box!" (Maybe I missed somewhere) Separate unit and integration tests There shouldn't be a Addition to "Testing behaviour, not lines" If a given line is not covered in any of the test cases and we can see no reasons to add extra cases to test it, we can ask "can we remove that line?" If somewhere is there and we can't remove, there has to be a reason why it is there, therefore a behavior exists we can test. |
Beta Was this translation helpful? Give feedback.
-
Testing Practices and Standards
Status: approved
Date: 2021-11-23
Authors: @Callisto13
Deciders: @Callisto13 @jmickey @richardcase @yitsushi
Context
The codebase is moving from POC to a production alpha release. Part of getting
production ready is increasing confidence in and quality of the code via automated
tests. As it stands, a fair amount needs to be backfilled, and as the product
grows more will need to be added.
This proposal establishes some base rules and practices which all maintainers
and contributors agree to follow when writing tests.
Scope
The scope is quite broad, and the topic one which inevitably invites Opinions.
This doc establishes a baseline of what testing elements would constitute an
acceptable PR submission.
This ADR is more capturing what we currently do and giving space to expand,
consider and alter, than me prescribing a new way of doing things.
Topics (include but not limited to):
Decision
When to write tests
Always. A PR which adds new or changes existing Go code must always come with
corresponding tests which cover all paths.
Coverage
All PRs currently run
test-with-cov
which generates a coverage report.Coverage is a useless metric (unless it is zero) and it is disturbingly easy
to get to 100% without actually testing anything at all.
Furthermore, our report is notoriously inaccurate (coverage appears to change on
pure docs PRs) although this has recently improved.
We have discussed the value of having this report at all, given few of us read
it and it is mostly noise. We did concede that it is useful to have something
on PRs purely to check that tests have been added, if for no other reason than
it saves the maintainers/reviewers having to read and nudge.
While again the coverage report should be taken with a bucket of salt, a
variation in the red of more than 5% should raise flags: the submitter should
review their both tests and code to discover gaps.
True coverage
Generated coverage is not true coverage*: it indicates that code paths have
been hit... and that's it. Automation will not know, for example, whether
anything is verified after a method call: it will not know whether something
is tested well.
* Lol yeh true coverage is not a "thing", as really it is subjective and
circumstantial. In this case I mean a shorthand for "coverage of what we
care about", which is; "does my stuff behave the way it should?"
Tests should be reviewed as critically as any other code, and it is up to the
reviewer to determine whether coverage is sufficient.
Those who T/BDD will naturally get to 100% true coverage with very little
thought or effort.
Those who don't should actively try to break their code and tests after they
have written them.
Testing behaviour, not lines
In order to get true coverage we test against behaviour rather than
implementation. In other words; we don't test that various internal pathways
have been traversed, we test that the correct set of outcomes have occurred
based on a given set of inputs.
Eg:
are as expected given the inputs.
exists at the location.
met, verify the error and message that is returned.
Testing against behaviour also has the benefit of influencing the design of the
code, making it easier to change (and test) in the future. \o/
If we notice that a given line is not covered in any of the test cases and we can see no reason
to add a case for it, we can reasonably ask if that line can be removed. Why is it there?
If it turns out that the line can't be removed, there has to be a reason why it is there,
therefore a behaviour exists which we can test.
Separate packages
In order to facilitate testing against behaviour, and to ensure that each 'unit'
is actually that (a neatly encapsulated package of code which fulfils a
contract with no sprawling hidden interdependencies), all tests are created in
a separate
_test
package.Black-box vs White-box
On a similar theme to the above, we are strongly opposed to black-box testing; none
should be added, in any circumstances. If you are tempted, resist. It always causes pain at the end.
Mocking
We currently use gomock to generate interface mocks,
for our own internal interfaces.
BUT gomock is really crap, so we are making a conscious effort to move
over to counterfeiter.
We shouldn't mock what we don't own. Instead we should create a light shim
around the external component; we integration test against the real thing and
unit test against the mocks of our delegating wrapper.
We should be conscious of overmocking: we should use mocks sparingly when calling
other internal interfaces which are not instrumental to the thing we are testing.
If the interface has a profound effect on the test scenario, we should favour
fakes, stubs and (more sparingly still) the real thing.
Refactoring
Refactoring tests should be done with a lighter hand than one would refactor
the actual code. While it may make sense to DRY out some tests, if it comes
at a heavy expense of readability or accessibility, then it is not worth it.
Ginkgo/Gomega
We use the standard Go
testing
package for tests.We also use Gomega matchers to make assertions.
Eg:
See the docs linked above for more examples.
Currently we do not use Ginkgo.
Table tests
Table tests are simultaneously very useful and the devil's work.
When used incorrectly they can lead to hard-to-read, hard-to-extend, complex
and increasingly brittle tests. A good rule of thumb is: if your test has the
same amount of or more logic than your actual code, then table tests are off
the table (heh heh).
Another one is: is this table being added because it
makes sense to group a bunch of identical tests together, or because the tests
don't feel DRY enough? As noted above, we prioritise readability and stability
over refactoring in tests.
Separate test functions are preferable over complex tables.
Tests as documentation
A bonus feature of tests is that (when done correctly) they can serve as low
level documentation on how a package/interface/unit behaves.
This makes them a good starting point for new people coming to the codebase.
Human readable tests are also just nicer and easier to maintain.
All test functions must be clearly named, eg:
In table tests, the cases struct should have a
"message"
or"name"
field,eg:
Test separation
Tests are grouped together, or separated, based on purpose.
For example, each
foo.go
unit of code should be accompanied bya corresponding
foo_test.go
unit test file.foo.go
may not require an integration test or an e2e test, but if any behaviourin
foo.go
does require those, then they will live in a separate location away from theunit. Eg. our e2es currently live in
test/e2e/
.Unit tests
All code changes must come with a corresponding unit test addition or change.
Units should aim only to test their corresponding unit. While sometimes the
lesser evil is to use real package_Y in package_Z_test, there is nothing
more annoying than making a change in package_X, or package_W, and seeing that
test fail inexplicably.
Integration tests
Integration tests should be added to test a thin layer of interactions with 3rd
party services. Ours include Firecracker, Containerd and Netlink, as well as any
IO operations.
Not all code changes will require an integration test addition, but changes to
or more of those 3rd party components will.
End to end tests
E2es mimic the top-level user experience of the product. They are happy-path
CRUD (technically CRD) only and do not check finer variances of behaviour.
Not all code changes will require an E2E addition, but they should be when there
is a change to the end user experience.
Consequences
not zero)
Beta Was this translation helpful? Give feedback.
All reactions