This repository (https://gitlab.com/Obmondo/k8sops) implements a solution for running any Kubernetes cluster using GitOps principles, while keeping your customizations in a separate repository.
This way you can always pull the latest commits and get security updates, fixes and new features, while still being able to make your own adjustments.
Copyright EnableIT ApS (Denmark). Released under the GPL 3 license (fsf.org).
Obmondo.com offers subscriptions where we manage your Kubernetes cluster using this code and react to incidents 24/7/365, if this should be of interest.
We typically help smaller setups with only 1 or 2 operations people, where 24/7 operations cannot be run reliably. With our subscription, we take care of the 24/7 monitoring, and our knowledgeable developers and operations people are ready to resolve any issues that arise. You can keep working on the day-to-day stuff as usual, and still get your vacations and weekends without being on call yourself.
This operations design aims to deliver a setup that covers all 3 aspects of security: Confidentiality, Integrity and Availability.
This setup uses Calico CNI to enforce network filtering on both ingress and egress, which means we work hard to ensure that no pod in this cluster has (or needs) internet access. Should a pod be compromised, it is then harder to get data out, or to pull code from the internet to further exploit your systems.
You can disable network filtering for any application, but do so at your own peril, with the above risk in mind.
We store EVERY upstream Helm chart for an application inside this repository. This has the following benefits:
- You can always install your application or do a recovery with no internet access needed. If an upstream chart repo is down, this will not affect you (and this happens quite often when you have many sources/charts in use).
- On EVERY upgrade, we review the diff of the chart for any unexpected changes, and this way we HOPE to catch a compromised upstream release before it is included in this repository.
- At the very least, cases where other codebases were found to be compromised show that this kind of structure is what enabled the compromise to be detected.
- We run a daily CI job that fetches the same charts we have in our repo from the upstream chart repo and checks whether upstream matches OUR copy. If it does not, then either:
  - Upstream updated an existing release, which is a VERY BAD idea, or
  - The upstream Helm repo suffered a supply-chain attack.
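The daily comparison described above could be sketched roughly as follows. This is a minimal illustration, not this repository's actual CI job; the function names `charts_match` and `check_chart` and all paths are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: compare a vendored chart directory with a fresh upstream copy.
# Function names and paths are placeholders, not this repository's CI code.

# Returns 0 when both directory trees are identical,
# non-zero when they differ or cannot be read.
charts_match() {
  local vendored_dir="$1" upstream_dir="$2"
  diff -r -q "$vendored_dir" "$upstream_dir" >/dev/null
}

# Prints the result for one chart and returns 1 on a mismatch.
check_chart() {
  local name="$1" vendored_dir="$2" upstream_dir="$3"
  if charts_match "$vendored_dir" "$upstream_dir"; then
    echo "OK: $name matches upstream"
  else
    echo "MISMATCH: $name differs from upstream, investigate before upgrading"
    return 1
  fi
}
```

The upstream copy could be fetched with something like `helm pull <chart> --repo <repo-url> --version <version> --untar --untardir /tmp/upstream` and then compared with `check_chart <chart> <vendored-dir> /tmp/upstream/<chart>`.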
We change EVERY Docker image reference to use the sha256 digest of the image instead of the tag. This means that if the upstream registry (e.g. Docker Hub) is compromised and a tag is pointed to a different image, we will still get the original image, or the pull will fail.
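As an illustration of what pinning by digest means for an image reference, here is a small sketch. The helper names `is_digest_pinned` and `pin_image` are hypothetical and not part of this repository.

```shell
#!/usr/bin/env bash
# Sketch: helpers for digest-pinned image references (hypothetical names).

# Returns 0 when the image reference ends in a sha256 digest.
is_digest_pinned() {
  [[ "$1" =~ @sha256:[0-9a-f]{64}$ ]]
}

# Rewrites "repo:tag" (or "repo") into "repo@sha256:<digest>".
pin_image() {
  local image="$1" digest="$2"
  local repo="${image%@*}"   # drop an existing digest, if any
  # drop the tag, but keep a registry port such as "registry:5000/app"
  if [[ "${repo##*/}" == *:* ]]; then
    repo="${repo%:*}"
  fi
  echo "${repo}@${digest}"
}
```

For example, `pin_image docker.io/library/nginx:1.25 sha256:<digest>` prints `docker.io/library/nginx@sha256:<digest>`, a reference that can only resolve to that exact image content.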
This repository has a CI job that, on every MR/PR, fetches the Docker images used and pushes them to YOUR Docker registry (in GitLab, the GitHub Docker service or whatever you prefer), so your Kubernetes cluster doesn't talk to anything on the Internet to pull the Docker images needed for operations.
Here we also run a daily CI job that compares each upstream tag to the sha256 digest we have, to detect any supply-chain attacks (which won't affect those using this setup) so they can be reported upstream.
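A tag-versus-digest check of this kind might look like the sketch below. It assumes a plain text file with one `<image:tag> <sha256:digest>` pair per line and uses `skopeo` to resolve tags; both the file format and the function names are assumptions made for illustration, not how this repository's CI job is implemented.

```shell
#!/usr/bin/env bash
# Sketch: detect tags that no longer point at the recorded digest.

# Resolves the digest a tag currently points to.
# Assumption: skopeo is installed; any registry client would do.
resolve_digest() {
  skopeo inspect --format '{{.Digest}}' "docker://$1"
}

# Reads "<image:tag> <sha256:digest>" lines from the given file and
# reports every tag whose current digest differs from the recorded one.
# Returns 1 if at least one mismatch was found.
verify_pins() {
  local pins_file="$1" image expected actual status=0
  while read -r image expected || [ -n "$image" ]; do
    [ -z "$image" ] && continue
    actual="$(resolve_digest "$image")"
    if [ "$actual" != "$expected" ]; then
      echo "MISMATCH: $image expected $expected got $actual"
      status=1
    fi
  done < "$pins_file"
  return "$status"
}
```

A mismatch does not affect clusters that pull by digest, but it is a signal worth reporting upstream.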
We scan the Docker images used for known vulnerabilities (e.g. software in the image with a known CVE) and report them upstream. If upstream will accept it, we will gladly submit fixes upstream, to avoid users of this repository deploying vulnerable code.
This repo contains all the Kubernetes application Helm charts that are used on EnableIT managed clusters, and the install script for setting up a new cluster with the root app. In build there is also a script for setting up kube-prometheus. Each customer has their own kubernetes-config repo, and each cluster has a folder in that repo. This repo works in tandem with those cluster folders: they reference the charts stored here, and the install scripts here create the content in those folders.
| folder | description |
|---|---|
| argocd-helm-charts | Contains ArgoCD Helm charts that point to the actual Helm charts (as a dependency listed in Chart.yaml), with the default values we want. Each cluster can add override/extra values by listing an extra values file in their argocd-clusters-managed/$clustername folder. |
| argocd-k8sconfig | Kubernetes config objects. Used by all clusters in common, and per cluster in their individual $clustername folder. |
| argocd-application-templates | Collection of applications, to be optionally modified and copied into argocd-clusters-managed/$clustername/templates to be installed on that cluster. |
| build | Contains the script and config files for installing kube-prometheus on a cluster. Has its own README with a usage guide. |
- kubectl, helm, kubeseal, bcrypt-tool and pwgen
- You can connect to k8s from your workstation/laptop/desktop.
- See wiki/guides/kubernetes-desktop-setup on how to install everything, including switch
  # optionally switch cluster
  switch
  # To verify
  kubectl get nodes
- To generate an access token for https access and for the GitHub repo, follow this guide
  # https://gitlab.enableit.dk/kubernetes/kubernetes-config-enableit/-/settings/access_tokens > Give token name > "read_repository" scopes > "Maintainer" role > submit
- Generate username and password for OCI
  # TODO: We don't use OCI currently. Update this when it is implemented.
- Create a git repo for the customer k8s data. Its name SHOULD follow this format only: kubernetes-config-<customer-id>, and it sits at the same level as where your argocd-apps is cloned
  # https://gitlab.enableit.dk/kubernetes/kubernetes-config-enableit.git
- Wiki: wiki/guides/kuberenetes-desktop-setup.md
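Before running any of the scripts, it can help to confirm that the required tools (kubectl, helm, kubeseal, bcrypt-tool and pwgen) are actually on your PATH. A minimal sketch, where `check_tools` is a hypothetical helper and not part of this repository:

```shell
#!/usr/bin/env bash
# Sketch: verify that the prerequisite tools are installed.

# Prints each missing tool and returns non-zero if any are absent from PATH.
check_tools() {
  local tool missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

Usage: `check_tools kubectl helm kubeseal bcrypt-tool pwgen && kubectl get nodes` only tries to reach the cluster once all tools have been found.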
1. For a Puppet-managed cluster
   ./bin/generate-puppet-hiera.sh --cluster-name kam.obmondo.com --version 1.22.5 --san-domains kam.obmondo.com,localhost,176.9.67.43,htzsb44fsn1a.enableit.dk:78.46.72.21,htzsb45fsn1a.enableit.dk:176.9.124.207,htzsb45fsn1b.enableit.dk:85.10.211.48 --customer-id enableit
2. Log in to the server and run Puppet
3. If things break, refer to wiki/guides/kubernetes-cluster-setup.md
4. Run the cluster setup script:
   ./bin/setup-k8s-cluster.sh --customer-id enableit --cluster-name kam.obmondo.com --settings-file <path-to-file>/customer-settings.yaml --install-k8s false
5. Push all the files to git from the kubernetes-config-<customer-id> git repo
6. Add the ArgoCD password to the pass repo (you will get the password in step 4)
1. Get the k8s cluster up and running with KOPS with one single command line.
   ./bin/setup-k8s-cluster.sh --customer-id <customer-id> --cluster-name <cluster-name> --settings-file <path-to-settings-file> --k8s-type kops
2. Add the ArgoCD password to the pass repo (you will get the password in step 1)
Always run the uninstall script with --keep-files, or it won't save the keys for reading sealed secrets.
./bin/uninstall-argocd.sh --keep-files
This will remove sealed-secrets, all ArgoCD apps and ArgoCD itself from the current k8s context (undoing the install script). It will prompt for cluster_name and the ArgoCD password (which you received from the install script).
The install script has a --recovery mode, which you use by calling it like this:
./bin/setup-k8s-cluster.sh --customer-id enableit --cluster-name kam.obmondo.com --settings-file customer-settings.yaml --recovery --private-key-path private_keys --public-key-path public_keys --install-k8s false
In recovery mode the install script will install ArgoCD using the existing manifests in your local repo clone.
./bin/setup-k8s-cluster.sh --cluster-name k8s.staging.blackwoodseven.com --settings-file customer-settings.yaml --setup-root-app false --recovery --private-key-path ./private_keys --public-key-path ./public_certs --customer-id bw7 --install-k8s false --setup-sealed-secret false
Note: we don't support this
Set up a GitLab user and generate an SSH key pair (and add the public part to that GitLab user). Grant that user ONLY developer access to the projects it needs. Make sure those projects have the master branch and tags protected in their config.
Add a secret with the SSH keys for ArgoCD's SSH access to GitLab:
kubectl create secret generic argocd-sshkey --from-file=ssh-privatekey=/path/to/.ssh/id_rsa --from-file=ssh-publickey=/path/to/.ssh/id_rsa.pub
Make sure sshPrivateKeySecret.name for repositories in argocd-clusters-managed/$yourclustername/values-argocd.yaml has this repo added, matching the above secret name.
A command to get the existing SSH private key:
kubectl get secrets -n argocd argo-cd-enableit-gitlab-ssh -o jsonpath="{.data.ssh-privatekey}" | base64 --decode
Log in to the UI. To get the credentials, refer to ArgoCD admin credentials.
The install script adds root.yaml to the cluster folder under argocd-apps/templates in kubernetes-config-<customer-id>. This ArgoCD app has the cluster's argocd-apps folder as its path, which makes ArgoCD detect all the apps in there automatically. So we can install apps just by adding them to the cluster's argocd-apps folder in the customer's kubernetes-config repo, and then syncing with ArgoCD.
- If a Helm chart creates a secret, ArgoCD will expect it to remain unchanged (otherwise it complains that the application is out-of-sync).
- If this happens, it means you have a secret that is changed via the application (typically a user login password), and we NEED a backup of these.
To resolve the out-of-sync complaint in ArgoCD AND get backup/recovery, do this:
- Let the Helm chart create the secret and the application generate it, so you get the out-of-sync complaint from ArgoCD.
- Dump the secret in JSON format, remove unnecessary metadata/Helm labels, encode it into the cluster secrets repo, and delete the secret from k8s (before pushing to the secrets repo).
- Update the values for the chart so it does NOT generate the secret. Typically the setting is called something like useExistingSecret: $name-of-secret
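The "dump and clean" step could be sketched with `jq` as below. Which metadata fields count as unnecessary is an assumption here (cluster-generated fields plus the standard Helm ownership labels and annotations); `clean_secret_json` is a hypothetical helper, so adjust the list to your own needs.

```shell
#!/usr/bin/env bash
# Sketch: strip cluster-generated metadata and Helm ownership markers
# from a Secret dumped as JSON. Assumption: jq is installed.

# Reads a Secret as JSON on stdin and writes the cleaned JSON to stdout.
clean_secret_json() {
  jq 'del(
    .metadata.creationTimestamp,
    .metadata.resourceVersion,
    .metadata.uid,
    .metadata.managedFields,
    .metadata.ownerReferences,
    .metadata.labels["app.kubernetes.io/managed-by"],
    .metadata.labels["helm.sh/chart"],
    .metadata.annotations["meta.helm.sh/release-name"],
    .metadata.annotations["meta.helm.sh/release-namespace"]
  )'
}
```

Usage: `kubectl get secret <name> -n <namespace> -o json | clean_secret_json > secret.json`. If the cluster secrets repo holds sealed secrets, the result could then be encrypted with something like `kubeseal --format yaml < secret.json` before it is committed.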
- You might see pods getting evicted. Most likely the disk is around 70% used, or you have a small disk (> 5GB). To fix it, increase the disk size.
  root@htzsb44fsn1a ~ # kubectl get pods
  NAME READY STATUS RESTARTS AGE
  argo-cd-argocd-application-controller-d6c576f5d-4d9bv 0/1 Evicted 0 84m
  argo-cd-argocd-application-controller-d6c576f5d-5xwq5 0/1 Evicted 0 44m
  argo-cd-argocd-application-controller-d6c576f5d-7cm78 0/1 Evicted 0 100m
- It's important to ensure that acme.cert-manager.io/http01-edit-in-place: true is placed in the annotations for the ingress when traefik is used
- Please refer: Wiki for kubernetes
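For the evicted-pods case above, a small filter can pull the affected pod names out of the `kubectl get pods` output. This sketch assumes the default column layout (NAME, READY, STATUS, ...) without `-A`; `evicted_pod_names` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: list evicted pods from `kubectl get pods` output.

# Reads `kubectl get pods` output on stdin (default columns) and
# prints the names of pods whose STATUS column is "Evicted".
evicted_pod_names() {
  awk 'NR > 1 && $3 == "Evicted" { print $1 }'
}
```

Usage: `kubectl get pods | evicted_pod_names`. Once the disk has been enlarged, the leftover evicted pods can be removed by passing those names to `kubectl delete pod`.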
1. Run helm dep
   helm dep up argocd-helm-charts/<nameofchart>
2. Run helm template
   helm template argocd-helm-charts/<nameofchart> --values ../kubernetes-config-<customer-id>/k8s/<targetcluster>/argocd-apps/values-<nameofchart>.yaml >/tmp/before.yaml
3. Read the YAML and see if you like it.
4. Adjust the values to your liking and run step 2 again, saving the result to /tmp/after.yaml.
5. Check the difference by running
   diff -bduNr /tmp/before.yaml /tmp/after.yaml
   and verify that the changes your value update should have caused are present.
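The render-and-compare steps above can be wrapped in two small helpers. This is only a convenience sketch; `render_chart` and `show_render_diff` are hypothetical names, and `render_chart` assumes `helm dep up` has already been run for the chart.

```shell
#!/usr/bin/env bash
# Sketch: render a chart before and after a values change, then compare.

# Renders a chart with the given values file into an output file.
# Assumption: helm is installed and chart dependencies are up to date.
render_chart() {
  local chart_dir="$1" values_file="$2" out_file="$3"
  helm template "$chart_dir" --values "$values_file" > "$out_file"
}

# Shows the difference between two rendered manifests, using the same
# diff flags as above. Returns 0 when identical, 1 when they differ.
show_render_diff() {
  diff -bduNr "$1" "$2"
}
```

Usage: call `render_chart` once into /tmp/before.yaml, edit the values, call it again into /tmp/after.yaml, then run `show_render_diff /tmp/before.yaml /tmp/after.yaml`.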
- Create/copy the application YAML and set targetRevision to $yourbranchname instead of HEAD
- Load that YAML manually (kubectl create -n argocd -f yourapplication.yaml)
- Work in your branch, adjusting values as needed etc.
- When it works, simply update the application YAML to point to targetRevision: HEAD and make your MR/PR
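Switching targetRevision back and forth by hand is easy to get wrong, so the edit can be scripted. A minimal sketch, assuming the application YAML has targetRevision on a line of its own; `set_target_revision` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Sketch: point an ArgoCD application YAML at a branch (or back at HEAD).

# Prints the application YAML with every targetRevision set to the
# given revision. The input file itself is not modified.
set_target_revision() {
  local app_file="$1" revision="$2" escaped
  # escape characters that are special in a sed replacement
  escaped="$(printf '%s' "$revision" | sed 's/[&|\\]/\\&/g')"
  sed -E "s|^([[:space:]]*targetRevision:[[:space:]]*).*$|\1${escaped}|" "$app_file"
}
```

Usage: `set_target_revision yourapplication.yaml "$yourbranchname" | kubectl create -n argocd -f -` while testing, and `set_target_revision yourapplication.yaml HEAD` before opening the MR/PR.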
- Main script: helm-chart-cache.sh
- Run the helm-chart-cache.sh script.
bash helm-chart-cache.sh -u <Username> -p <Password> -r <Registry>
In the above:
u - Username to log in to your registry. E.g. Obmondo
p - Password. Here you need to pass the PA token, which you can create on your GitHub profile
r - Name of the registry. E.g. ghcr.io/Obmondo
This script will first download the charts from the upstream repo and then save and push them to your registry. When the script has run without any issues, it will update the Chart.lock file of your charts to point to your OCI repo path. So you just need to commit and push the Chart.lock file, and then your apps will start pointing to GHCR.
Remember: the lock file will only change for charts which are not yet in your registry.
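To give a feel for what a script with these flags does, here is a rough sketch of the flag parsing and the push of a single packaged chart. It is not the actual helm-chart-cache.sh; the function and variable names are hypothetical, and the push assumes Helm 3.8 or newer for OCI support.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the actual helm-chart-cache.sh.

# Parses -u/-p/-r into REG_USER, REG_PASS and REGISTRY.
# Returns non-zero when a flag is unknown or one of the three is missing.
parse_registry_args() {
  local opt OPTIND=1
  REG_USER="" REG_PASS="" REGISTRY=""
  while getopts "u:p:r:" opt; do
    case "$opt" in
      u) REG_USER="$OPTARG" ;;
      p) REG_PASS="$OPTARG" ;;
      r) REGISTRY="$OPTARG" ;;
      *) return 1 ;;
    esac
  done
  [ -n "$REG_USER" ] && [ -n "$REG_PASS" ] && [ -n "$REGISTRY" ]
}

# Logs in to the registry host and pushes one packaged chart (.tgz).
# Assumption: Helm 3.8 or newer.
push_chart() {
  local chart_tgz="$1"
  printf '%s' "$REG_PASS" | helm registry login "${REGISTRY%%/*}" \
    --username "$REG_USER" --password-stdin
  helm push "$chart_tgz" "oci://${REGISTRY}"
}
```

Passing the token on stdin keeps it out of the process list, which is one reason to prefer `--password-stdin` over `--password`.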
- Repositories which are configured via SSH might end up not working, since the SSH known_hosts key has changed.
- To fix
  ssh-keyscan -p <your-port-number> <git-server>
- Copy the ecdsa-sha2-nistp256 key type. Other types might work too, but we haven't tried them.
- Now ArgoCD won't be working, since it can't connect to the git server.
- Go to repository, click on 'Certificates', remove the old entry, and create a new one with the key that you got from ssh-keyscan
- Go to application, click on argo-cd and sync the argocd-ssh-known-hosts-cm, and then it should work
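Picking the right line out of the ssh-keyscan output can be done with a one-line filter. A small sketch; `ecdsa_known_hosts` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: keep only the ecdsa-sha2-nistp256 entries from ssh-keyscan output.

# Reads ssh-keyscan output on stdin ("<host> <key-type> <key>" per line)
# and prints only the lines whose key type is ecdsa-sha2-nistp256.
ecdsa_known_hosts() {
  awk '$2 == "ecdsa-sha2-nistp256"'
}
```

Usage: `ssh-keyscan -p <your-port-number> <git-server> 2>/dev/null | ecdsa_known_hosts` prints the entry to paste into the 'Certificates' page.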
See https://github.com/Obmondo/github-action-obmondo-deploy
KUBERNETES_CONFIG_REPO_TOKEN
A project access token used for accessing the config repo.