-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
enhancement(tests): Kubernetes E2E test framework #2702
Changes from 61 commits
d2ea174
8eff64c
656df58
e2f03f2
fd97475
eff8b30
3b4fd8e
ff65e04
17d6af9
81fc716
ef47617
c1c36ac
ab5703d
0783690
0d5a541
291e513
b067258
e241e62
5459179
2289834
c96c14c
b22a26d
ea926d7
3a23d85
d82e44b
1716843
1b3631c
c3656b9
ae62815
e27ade7
b34f78f
a619453
4b56580
6aaa21d
d0d02cb
945ac52
977d5fc
3f2e1d0
c793ab3
9bcd7a9
e9548cc
2691e10
71690f3
7cb5dc7
db531ca
de903fd
97af531
0ca655d
3ae989f
3110bb8
5b7f17b
bf37696
dba975f
a173f5d
ff84036
f29e56f
3f4fdd7
f1bd433
b418c47
949cae4
b995cbc
c73c968
a9b63c8
ec3f9c5
c418e3a
7f16dce
1e702ba
ae93ca3
ce7284c
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -265,21 +265,49 @@ jobs: | |
- run: make slim-builds | ||
- run: make test-integration-splunk | ||
|
||
test-integration-kubernetes: | ||
name: Integration - Linux, Kubernetes, flaky | ||
# This is an experimental test. Allow it to fail without failing the whole | ||
# workflow, but keep it executing on every build to gather stats. | ||
continue-on-error: true | ||
test-e2e-kubernetes: | ||
name: E2E - K8s ${{ matrix.kubernetes_version }} / ${{ matrix.container_runtime }} | ||
runs-on: ubuntu-latest | ||
strategy: | ||
matrix: | ||
kubernetes: | ||
- v1.18.2 | ||
- v1.17.5 | ||
- v1.16.9 | ||
- v1.15.11 | ||
- v1.14.10 | ||
minikube_version: | ||
- 'v1.11.0' # https://github.com/kubernetes/minikube/issues/8799 | ||
kubernetes_version: | ||
- 'v1.18.6' | ||
- 'v1.17.9' | ||
- 'v1.16.13' | ||
- 'v1.15.12' | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why do we test these versions? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. According to the RFC, we test all versions from MSKV to the latest released version. MSKV is 1.15. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We don't really have to test 1.14, but we do it cause people asked if we'll support it, and our code works for 1.14 too. We might want to reduce MSKV to 1.14. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So over time as k8s releases grow, so will this list? Meaning more and more of our CI jobs will be just k8s tests? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm afraid so, at least that's the plan. We'll probably bump MSKV along. K8s unfortunately have plenty of edge cases, and we'd better test as many variants as we can. |
||
- 'v1.14.10' | ||
container_runtime: | ||
- docker | ||
- containerd | ||
- crio | ||
fail-fast: false | ||
steps: | ||
- name: Temporarily off | ||
run: "true" | ||
- name: Setup Minikube | ||
run: | | ||
set -xeuo pipefail | ||
|
||
curl -Lo kubectl \ | ||
'https://storage.googleapis.com/kubernetes-release/release/${{ matrix.kubernetes_version }}/bin/linux/amd64/kubectl' | ||
sudo install kubectl /usr/local/bin/ | ||
|
||
curl -Lo minikube \ | ||
'https://storage.googleapis.com/minikube/releases/${{ matrix.minikube_version }}/minikube-linux-amd64' | ||
sudo install minikube /usr/local/bin/ | ||
|
||
minikube config set profile minikube | ||
minikube config set vm-driver docker | ||
minikube config set kubernetes-version '${{ matrix.kubernetes_version }}' | ||
minikube config set container-runtime '${{ matrix.container_runtime }}' | ||
# Start minikube, try again once if fails and print logs if the second | ||
# attempt fails too. | ||
minikube start || minikube delete && minikube start || minikube logs | ||
kubectl cluster-info | ||
- name: Checkout | ||
uses: actions/checkout@v1 | ||
- run: USE_CONTAINER=none make slim-builds | ||
- run: make test-e2e-kubernetes | ||
env: | ||
USE_MINIKUBE_CACHE: "true" | ||
PACKAGE_DEB_USE_CONTAINER: docker |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -37,6 +37,8 @@ expanding into more specifics. | |
1. [Benchmarking](#benchmarking) | ||
1. [Profiling](#profiling) | ||
1. [Kubernetes](#kubernetes) | ||
1. [Dev flow](#kubernetes-dev-flow) | ||
1. [E2E tests](#kubernetes-e2e-tests) | ||
1. [Humans](#humans) | ||
1. [Documentation](#documentation) | ||
1. [Changelog](#changelog) | ||
|
@@ -550,13 +552,15 @@ navigated in your favorite web browser. | |
|
||
### Kubernetes | ||
|
||
#### Kubernetes Dev Flow | ||
|
||
There is a special flow for when you develop portions of Vector that are | ||
designed to work with Kubernetes, like `kubernetes_logs` source or the | ||
`deployment/kubernetes/*.yaml` configs. | ||
|
||
This flow facilitates building Vector and deploying it into a cluster. | ||
|
||
#### Requirements | ||
##### Requirements | ||
|
||
There are some extra requirements besides what you'd normally need to work on | ||
Vector: | ||
|
@@ -570,7 +574,7 @@ Vector: | |
* [`minikube`](https://minikube.sigs.k8s.io/)-powered or other k8s cluster | ||
* [`cargo watch`](https://github.com/passcod/cargo-watch) | ||
|
||
#### The dev flow | ||
##### The dev flow | ||
|
||
Once you have the requirements, use the `scripts/skaffold.sh dev` command. | ||
|
||
|
@@ -596,7 +600,7 @@ the cluster state and exit. | |
`scripts/skaffold.sh` wraps `skaffold`, you can use other `skaffold` subcommands | ||
if it fits you better. | ||
|
||
#### Troubleshooting | ||
##### Troubleshooting | ||
|
||
You might need to tweak `skaffold`, here are some hints: | ||
|
||
|
@@ -614,7 +618,7 @@ You might need to tweak `skaffold`, here are some hints: | |
* For the rest of the `skaffold` tweaks you might want to apply check out | ||
[this page](https://skaffold.dev/docs/environment/). | ||
|
||
#### Going through the dev flow manually | ||
##### Going through the dev flow manually | ||
|
||
Is some cases `skaffold` may not work. It's possible to go through the dev flow | ||
manually, without `skaffold`. | ||
|
@@ -627,6 +631,79 @@ required. | |
Essentially, the steps you have to take to deploy manually are the same that | ||
`skaffold` will perform, and they're outlined at the previous section. | ||
|
||
#### Kubernetes E2E tests | ||
|
||
Kubernetes integration has a lot of parts that can go wrong. | ||
|
||
To cope with the complexity and ensure we maintain high quality, we use | ||
E2E (end-to-end) tests. | ||
|
||
> E2E tests normally run at CI, so there's typically no need to run them | ||
> manually. | ||
|
||
##### Requirements | ||
|
||
* `kubernetes` cluster (`minikube` has special support, but any cluster should | ||
work) | ||
* `docker` | ||
* `kubectl` | ||
* `bash` | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So these tests can't work from a Windows machine? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Technically, you should be able to run tests on Windows via git bash. I didn't test it though... There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah I have bash through busybox so I think this might work. I think this is fine for now. We'll need to support windows fully at some point though. We do support users developing on windows. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yeah, that's a bit tricky to do for k8s, but we'll get there. The dev flow (the one that uses skaffold) is currently limited to linux (on the contrary to e2e tests, what can be launched from Linux/macOS/Windows in theory). We can add support for using Windows/macOS there if we cross-compile There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Tried it on Windows - got stuck at the building vector into release artifacts. This should be easy to fix. |
||
|
||
Vector release artifacts are prepared for E2E tests, so the ability to do that | ||
is required too, see Vector [docs](https://vector.dev) for more details. | ||
|
||
##### Running the E2E tests | ||
|
||
To run the E2E tests, use the following command: | ||
|
||
```shell | ||
CONTAINER_IMAGE_REPO=<your name>/vector-test make test-e2e-kubernetes | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why not use There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Users don't have push access to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That sounds reasonable. :) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added the default when There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You still have to pass
|
||
``` | ||
|
||
Where `CONTAINER_IMAGE_REPO` is the docker image repo name to use, without part | ||
after the `:`. Replace `<your name>` with your Docker Hub username. | ||
|
||
You can also pass additional parameters to adjust the behavior of the test: | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I suggest you namespace the k8s specific variables to like There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Prefixing all of them with Given that those are used in a very specific command, I don't think it's big issue. It's easy to type and remember though - and this is a big plus IMO. |
||
|
||
* `QUICK_BUILD=true` - use development build and a skaffold image from the dev | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How about There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm pretty sure There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It doesn't yet! :( You've probably noticed we use separate make jobs for this distinction. Maybe that's appropriate here too? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You're thinking about I understand what you mean with There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I do not think |
||
flow instead of a production docker image. Significantly speeds up the | ||
preparation process, but doesn't guarantee the correctness in the release | ||
build. Useful for development of the tests or Vector code to speed up the | ||
iteration cycles. | ||
|
||
* `USE_MINIKUBE_CACHE=true` - instead of pushing the built docker image to the | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. So by default when someone runs There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yep. This shouldn't be surprising. Delivering images via registry is a standard way to make an image available to the Kubernetes. There's simply no other generic way of doing it. |
||
registry under the specified name, directly load the image into | ||
a `minikube`-controlled cluster node. | ||
Requires you to test against a `minikube` cluster. Eliminates the need to have | ||
a registry to run tests. | ||
When `USE_MINIKUBE_CACHE=true` is set, we provide a default value for the | ||
`CONTAINER_IMAGE_REPO` so it can be omitted. | ||
|
||
* `CONTAINER_IMAGE=<your name>/vector-test:tag` - completely skip the step | ||
of building the Vector docker image, and use the specified image instead. | ||
Useful to speed up the iterations speed when you already have a Vector docker | ||
image you want to test against. | ||
|
||
* `SKIP_CONTAINER_IMAGE_PUBLISHING=true` - completely skip the image publishing | ||
step. Useful when you want to speed up the iteration speed and when you know | ||
the Vector image you want to test is already available to the cluster you're | ||
testing against. | ||
|
||
* `SCOPE` - pass a filter to the `cargo test` command to filter out the tests, | ||
effectively equivalent to `cargo test -- $SCOPE`. | ||
|
||
Passing additional commands is done like so: | ||
|
||
```shell | ||
QUICK_BUILD=true USE_MINIKUBE_CACHE=true make test-e2e-kubernetes | ||
``` | ||
|
||
or | ||
|
||
```shell | ||
QUICK_BUILD=true CONTAINER_IMAGE_REPO=<your name>/vector-test make test-e2e-kubernetes | ||
``` | ||
|
||
## Humans | ||
|
||
After making your change, you'll want to prepare it for Vector's users | ||
|
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,16 @@ | ||
[package] | ||
name = "k8s-test-framework" | ||
version = "0.1.0" | ||
authors = ["Vector Contributors <[email protected]>"] | ||
edition = "2018" | ||
description = "Kubernetes Test Framework used to test Vector in Kubernetes" | ||
|
||
[dependencies] | ||
k8s-openapi = { version = "0.9", default-features = false, features = ["v1_15"] } | ||
serde_json = "1" | ||
tempfile = "3" | ||
once_cell = "1" | ||
tokio = { version = "0.2", features = ["process", "io-util"] } | ||
|
||
[dev-dependencies] | ||
tokio = { version = "0.2", features = ["macros", "rt-threaded"] } |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
//! Perform a log lookup. | ||
|
||
use super::{Reader, Result}; | ||
use std::process::Stdio; | ||
use tokio::process::Command; | ||
|
||
/// Exec a `tail` command reading the specified `file` within a `Container` | ||
/// in a `Pod` of a specified `resource` at the specified `namespace` via the | ||
/// specified `kubectl_command`. | ||
/// Returns a [`Reader`] that managed the reading process. | ||
pub fn exec_tail( | ||
kubectl_command: &str, | ||
namespace: &str, | ||
resource: &str, | ||
file: &str, | ||
) -> Result<Reader> { | ||
let mut command = Command::new(kubectl_command); | ||
|
||
command.stdin(Stdio::null()).stderr(Stdio::inherit()); | ||
|
||
command.arg("exec"); | ||
command.arg("-n").arg(namespace); | ||
command.arg(resource); | ||
command.arg("--"); | ||
command.arg("tail"); | ||
command.arg("--follow=name"); | ||
command.arg("--retry"); | ||
command.arg(file); | ||
|
||
let reader = Reader::spawn(command)?; | ||
Ok(reader) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this is... 5 * 3 test runners just for k8s?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, it was worth it during development - helped to find a couple of interesting bugs along the way in unexpected places, that were only present in non-trivial combinations.
I don't mind reducing it, but it's not clear what subset of those is reasonable.
On a side note, there are a lot of ways we could optimize the E2E tests in general, to make them run quicker so that they don't create as much load on the system. Technically, the "payload" of these tests is very lightweight, and the most time, yet again, is spent during env prep and build. If we could share build artifacts - it would be great.
I feel like if we optimize the process in general we won't have to worry about having 15 tests...
We could move them into a separate workflow though, such that they'll live under their own section.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this would consume actually 15% of our total possible github actions runners so we should consider the value here very, very carefully.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. Well, we can exclude some of the matrix incarnations if we start facing issues... Just need to figure out which ones are the safest to omit. Alternatively, this can be transformed into a single sequential invocation - but I'd hold on with that until we hit an actual problem.