Integration test for reinitialize-pods
(Note this will fail until linkerd/linkerd2#11699 lands)

The `integration-cni-plugin.yml` workflow (formerly known as `cni-plugin-integration.yml`) has been expanded to run the new recipe `reinitialize-pods-integration`, which performs the following steps:

- Rebuilds the `linkerd-reinitialize-pods` crate and the `cni-plugin` image. The `Dockerfile-cni-plugin` file has been refactored to have two main targets, `runtime` and `runtime-test`; the latter picks up the `linkerd-reinitialize-pods` binary that has just been built locally.
- Creates a new cluster pinned to version `v1.27.6-k3s1`, because Calico doesn't work on later versions (see k3d-io/k3d#1375)
- Triggers a new `./reinitialize-pods/integration/run.sh` script which:
  - Installs Calico
  - Installs the latest linkerd-edge CLI
  - Installs `linkerd-cni` and waits for it to become ready
  - Installs the linkerd control plane in CNI mode
  - Installs a `pause` DaemonSet

The `linkerd-cni` instance has been configured with an extra initContainer that delays its start by 15s. Since the script waits for `linkerd-cni` to become ready, this doesn't affect the initial install. When a new node is then added to the cluster, the delay allows the new `pause` DaemonSet replica to start before the full CNI config is ready, so we can observe it failing to come up (exit code 95). Once the new `linkerd-cni` replica becomes ready, we observe the failed `pause` replica being replaced by a new healthy one.
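
The "wait until the replica is replaced" check boils down to polling a command until it succeeds. As a minimal sketch only (the `retry` helper and its arguments are hypothetical and not part of this commit; `run.sh` uses an inline `for` loop instead), the idea could be expressed as:

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of this commit: run a command up to
# <attempts> times, sleeping <delay> seconds between failed attempts.
# Usage: retry <attempts> <delay> <command> [args...]
retry() {
  local attempts="$1"
  local delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # Stop at the first successful run of the command.
    if "$@"; then
      return 0
    fi
    # Only sleep when another attempt is still to come.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "command failed after ${attempts} attempts: $*" >&2
  return 1
}

# Example (requires a cluster, so it is left commented out):
# retry 5 0 kubectl wait --for=condition=ready --timeout=10s -l app=pause-app po
```

Unlike a bare loop followed by a final check, a helper like this returns a non-zero status itself once the attempts run out, so it fails the script directly under `set -e`.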
alpeb committed Dec 11, 2023
1 parent d0e2384 commit 2f34615
Showing 9 changed files with 143 additions and 12 deletions.
1 change: 0 additions & 1 deletion .dockerignore
@@ -2,4 +2,3 @@ Cargo.toml
Cargo.lock
rust-toolchain
validator/
target/
@@ -1,13 +1,13 @@
name: cni-plugin-integration
name: integration-cni-plugin

on:
workflow_dispatch:
pull_request:
paths:
- .github/workflows/integration-cni-plugin.yml
- Dockerfile-cni-plugin
- cni-plugin/integration/flannel/Dockerfile-tester
- cni-plugin/integration/run.sh
- cni-plugin/**
- reinitialize-pods/**

jobs:
cni-flannel-test:
@@ -46,3 +46,11 @@ jobs:
- uses: actions/checkout@3df4ab11eba7bda6032a0b82a6bb43b11571feac
- name: Run CNI ordering tests
run: just cni-plugin-test-ordering
reinitialize-pods:
timeout-minutes: 15
runs-on: ubuntu-latest
steps:
- uses: linkerd/dev/actions/setup-tools@v42
- uses: actions/checkout@3df4ab11eba7bda6032a0b82a6bb43b11571feac
- name: Run reinitialize-pods tests
run: just reinitialize-pods-integration
2 changes: 1 addition & 1 deletion .github/workflows/release-reinitialize-pods.yml
@@ -37,7 +37,7 @@ jobs:
container: ghcr.io/linkerd/dev:v42-rust-musl
steps:
- uses: actions/checkout@3df4ab11eba7bda6032a0b82a6bb43b11571feac
- run: just reinitialize-pods arch=${{ matrix.arch }} profile=release version=${{ needs.meta.outputs.version }} package
- run: just --justfile=justfile-rust reinitialize-pods arch=${{ matrix.arch }} profile=release version=${{ needs.meta.outputs.version }} package
- uses: actions/upload-artifact@v3
with:
name: ${{ matrix.arch }}-artifacts
18 changes: 14 additions & 4 deletions Dockerfile-cni-plugin
@@ -16,12 +16,17 @@ FROM --platform=$BUILDPLATFORM curlimages/curl:7.86.0 as fetch
WORKDIR /build
ARG TARGETARCH
ARG LINKERD_REINITIALIZE_PODS_VERSION=v0.1.0
# TODO: replace org with linkerd, once we make a first release
RUN curl -LO https://github.com/alpeb/linkerd2-proxy-init/releases/download/reinitialize-pods%2F${LINKERD_REINITIALIZE_PODS_VERSION}/linkerd-reinitialize-pods-${LINKERD_REINITIALIZE_PODS_VERSION}-${TARGETARCH}.tgz
RUN curl -LO https://github.com/linkerd/linkerd2-proxy-init/releases/download/reinitialize-pods%2F${LINKERD_REINITIALIZE_PODS_VERSION}/linkerd-reinitialize-pods-${LINKERD_REINITIALIZE_PODS_VERSION}-${TARGETARCH}.tgz
RUN tar -zxvf linkerd-reinitialize-pods-${LINKERD_REINITIALIZE_PODS_VERSION}-${TARGETARCH}.tgz && \
mv linkerd-reinitialize-pods-${LINKERD_REINITIALIZE_PODS_VERSION}-${TARGETARCH}/linkerd-reinitialize-pods .

FROM --platform=$TARGETPLATFORM alpine:3.18.5 as runtime
FROM --platform=$BUILDPLATFORM golang:1.21-alpine as copy-test
WORKDIR /build
COPY ./target/package/linkerd-reinitialize-pods-test-amd64.tgz .
RUN tar -zxvf linkerd-reinitialize-pods-test-amd64.tgz && \
mv ./linkerd-reinitialize-pods-test-amd64/linkerd-reinitialize-pods .

FROM --platform=$TARGETPLATFORM alpine:3.18.5 as runtime-base
WORKDIR /linkerd
RUN apk add \
# For inotifywait
@@ -32,10 +37,15 @@ RUN apk add \
jq

COPY --from=go /go/bin/linkerd-cni /opt/cni/bin/
COPY --from=fetch /build/linkerd-reinitialize-pods /usr/lib/linkerd/
COPY LICENSE .
COPY cni-plugin/deployment/scripts/install-cni.sh .
COPY cni-plugin/deployment/linkerd-cni.conf.default .
COPY cni-plugin/deployment/scripts/filter.jq .
ENV PATH=/linkerd:/opt/cni/bin:$PATH
CMD ["install-cni.sh"]

FROM --platform=$TARGETPLATFORM runtime-base as runtime-test
COPY --from=copy-test /build/linkerd-reinitialize-pods /usr/lib/linkerd/

FROM --platform=$TARGETPLATFORM runtime-base as runtime
COPY --from=fetch /build/linkerd-reinitialize-pods /usr/lib/linkerd/
11 changes: 11 additions & 0 deletions cni-plugin/integration/manifests/calico/k3s-images.json
@@ -0,0 +1,11 @@
{
  "name": "docker.io/rancher/k3s",
  "channels": {
    "stable": "v1.27.6-k3s1",
    "latest": "v1.27.6-k3s1",
    "v1.27": "v1.27.6-k3s1"
  },
  "digests": {
    "v1.27.6-k3s1": "sha256:9486bbb9ca9b81c098ecd07f1c45441e143dab12577e22cf062586edcfd9d952"
  }
}
15 changes: 12 additions & 3 deletions justfile
Original file line number Diff line number Diff line change
@@ -17,7 +17,7 @@ lint: sh-lint md-lint rs-clippy action-lint action-dev-check

go-lint *flags: (proxy-init-lint flags) (cni-plugin-lint flags)

test: rs-test proxy-init-test-unit proxy-init-test-integration
test: rs-test proxy-init-test-unit proxy-init-test-integration reinitialize-pods-integration

# Check whether the Go code is formatted.
go-fmt-check:
@@ -78,9 +78,18 @@ validator *args:
## reinitialize-pods
##

reinitialize-pods *args:
reinitialize-pods version *args:
TARGETCRATE=linkerd-reinitialize-pods \
{{ just_executable() }} --justfile=justfile-rust {{ args }}
{{ just_executable() }} --justfile=justfile-rust version={{version}} {{ args }}

# The K3S_IMAGES_JSON file used instructs the creation of a cluster on version
# v1.27.6-k3s1, because after that Calico won't work.
# See https://github.com/k3d-io/k3d/issues/1375
reinitialize-pods-integration $K3S_IMAGES_JSON='./cni-plugin/integration/manifests/calico/k3s-images.json': (reinitialize-pods "test" "package") (build-cni-plugin-image "--target=runtime-test")
@{{ just_executable() }} K3D_CREATE_FLAGS='{{ _K3D_CREATE_FLAGS_NO_CNI }}' _k3d-cni-create
@just-k3d use
@just-k3d import {{ cni-plugin-image }}
./reinitialize-pods/integration/run.sh {{ cni-plugin-image }}

##
## cni-plugin
4 changes: 4 additions & 0 deletions reinitialize-pods/integration/linkerd-cni-config.yml
@@ -0,0 +1,4 @@
extraInitContainers:
- name: sleep
  image: busybox
  command: ["/bin/sh", "-c", "sleep 15"]
19 changes: 19 additions & 0 deletions reinitialize-pods/integration/pause-ds.yml
@@ -0,0 +1,19 @@
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: pause
spec:
  selector:
    matchLabels:
      app: pause-app
  template:
    metadata:
      annotations:
        linkerd.io/inject: enabled
      labels:
        app: pause-app
    spec:
      priorityClassName: system-node-critical
      containers:
      - name: pause-container
        image: k8s.gcr.io/pause
71 changes: 71 additions & 0 deletions reinitialize-pods/integration/run.sh
@@ -0,0 +1,71 @@
#!/usr/bin/env bash

set -euo pipefail

function step() {
  repeat=$(seq 1 $(echo $1 | wc -c))
  printf "%0.s#" $repeat
  printf "#####\n# $1...\n"
  printf "%0.s#" $repeat
  printf "#####\n"
}

regex='(.*):(.*)'
if [[ ! "$1" =~ $regex ]]; then
  echo 'Usage: run.sh name:tag'
  exit 1
fi
cni_plugin_image=${BASH_REMATCH[1]}
cni_image_version=${BASH_REMATCH[2]}

cd "${BASH_SOURCE[0]%/*}"

step 'Installing Calico'
kubectl apply -f https://k3d.io/v5.1.0/usage/advanced/calico.yaml
kubectl --namespace=kube-system wait --for=condition=available --timeout=120s \
deploy/calico-kube-controllers

step 'Installing latest linkerd edge'
scurl https://run.linkerd.io/install-edge | sh
export PATH=$PATH:$HOME/.linkerd2/bin
linkerd install --crds | kubectl apply -f -
# The linkerd-cni-config.yml config adds an extra initContainer that will make
# linkerd-cni to delay its start for 15s, so to allow time for the pause
# DaemonSet to start before the full CNI config is ready and enter a failure
# mode
linkerd install-cni \
  --use-wait-flag \
  --cni-image $cni_plugin_image \
  --cni-image-version $cni_image_version \
  --set reinitializePods.image.name=$cni_plugin_image \
  --set reinitializePods.image.version=$cni_image_version \
  -f linkerd-cni-config.yml \
  | kubectl apply -f -
linkerd check --pre --linkerd-cni-enabled
linkerd install --linkerd-cni-enabled | kubectl apply -f -
linkerd check

step 'Installing pause DaemonSet'
kubectl apply -f pause-ds.yml
kubectl wait --for=condition=ready --timeout=120s -l app=pause-app po

step 'Adding a node'
cluster=$(just-k3d --evaluate K3D_CLUSTER_NAME)
image=$(just --evaluate cni-plugin-image)
k3d node create node2 --cluster $cluster
k3d image import --cluster $cluster $image

step 'Checking new DS replica fails with code 95'
sleep 10
kubectl wait \
  --for=jsonpath='{.status.initContainerStatuses[0].lastState.terminated.exitCode}'=95 \
  --field-selector=spec.nodeName=k3d-node2-0 \
  pod

step 'Checking new DS replica gets replaced'
for i in {1..5}; do
  if kubectl wait --for=condition=ready --timeout=10s -l app=pause-app po; then
    break
  fi
done
kubectl wait --for=condition=ready --timeout=10s -l app=pause-app po;
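
Note on the `name:tag` argument check near the top of `run.sh`: the greedy regex `(.*):(.*)` splits the image reference at its last colon. The sketch below shows the same split using POSIX parameter expansion instead of `BASH_REMATCH` (a swapped-in technique, not what the script does); the function name `split_image_ref` and the variables `image_name` and `image_tag` are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of this commit: split "name:tag" at the
# LAST colon, which is also where the greedy regex '(.*):(.*)' splits.
# On success it sets the (hypothetical) variables image_name and image_tag.
split_image_ref() {
  case "$1" in
    *:*) ;;
    *)
      echo 'Usage: split_image_ref name:tag' >&2
      return 1
      ;;
  esac
  image_name=${1%:*}   # strip the shortest suffix starting at a colon
  image_tag=${1##*:}   # strip the longest prefix ending at a colon
}

# Example: a registry port in the name is preserved.
split_image_ref 'localhost:5000/cni-plugin:test'
echo "name=${image_name} tag=${image_tag}"
```

One caveat applies to both this sketch and the regex: a reference with a registry port but no tag, such as `localhost:5000/cni-plugin`, is split at the port colon, so callers are assumed to always pass an explicit tag.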
