Merge latest changes into release-1.31 #829

Closed
wants to merge 122 commits into from

bschimke95
Contributor

Notable changes

Cilium socketLB and apiserver-proxy change

We've configured Cilium to talk to the localhost address instead of the kube-proxy-provided service, which addresses the issue we faced with socketLB. #775 introduced a new way to determine the localhost address, which provides a smooth upgrade path. No action is needed.

LoadBalancer change from Cilium to MetalLB

We've changed the load-balancer implementation from Cilium to MetalLB. Services that use the load balancer will see downtime or interruption while the upgrade is happening. The specific upgrade steps must be followed; not following them properly will lead to issues and conflicts.

Feature components version upgrade

The versions for underlying feature implementations have been upgraded.

Containerd Default Path Change

This does not affect existing nodes; new nodes start using the new designated path. Older nodes might need their args files adjusted by hand.

addyess and others added 30 commits September 13, 2024 18:35
At the moment, the "k8sd cluster-recover" command displays interactive
prompts and text editors that assist the user in updating the dqlite
configuration.

We need to be able to run the command non-interactively in order
to automate the cluster recovery procedure.

This change adds a "--non-interactive" flag. If set, we'll no longer
show confirmation prompts and we'll assume that the configuration
files have already been updated, proceeding with the dqlite recovery.
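
The behaviour described above can be sketched as follows. This is a minimal Python illustration of the control flow only, not the k8sd implementation (which is written in Go); the parser name, the prompt text, and the `should_proceed` helper are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in for the real command-line definition.
    parser = argparse.ArgumentParser(prog="cluster-recover-sketch")
    parser.add_argument(
        "--non-interactive",
        action="store_true",
        help="skip confirmation prompts and assume the config files are already updated",
    )
    return parser


def should_proceed(non_interactive: bool, ask=input) -> bool:
    """Decide whether to continue with the recovery step."""
    if non_interactive:
        # No prompt is shown; the configuration is assumed to be ready.
        return True
    answer = ask("Proceed with the recovery? [y/N] ")
    return answer.strip().lower() in ("y", "yes")
```

Passing the prompt function as a parameter keeps the decision testable without a terminal, which is the same property the flag gives to automation scripts.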
---------

Co-authored-by: Yanisa Haley Scherber <[email protected]>
* Automerge every 4 hours any PR with passing tests labeled with 'automerge'
* Make sure the bot can approve the PRs too
* Update Bot information only if git email currently unset
* consistently use private key secret to setup ssh git-remote
* Rename secret to BOT_SSH_KEY
* Reimagine auto-merge scripts as python
Warnings that k8s service may not work (#657)
KU-1475
* Add epa-howto

Co-authored-by: Louise K. Schmidtgen <[email protected]>
Co-authored-by: Yanisa Haley Scherber <[email protected]>
* Add how-to for capi in place upgrades

* Addressing comments

* Linting fixes

* Update docs/src/capi/howto/in-place-upgrades.md

---------

Co-authored-by: Nick Veitch <[email protected]>
* Create more tests on branches and recipes

* Apply review comments
* Add unit tests for coredns
KU-1515
The Metrics Server feature lacks unit tests. This PR implements tests for Metrics Server functionality.
KU-1515
---------

Co-authored-by: Adam Dyess <[email protected]>
* Return non-zero exit code in case of errors

At the moment, k8s and k8sd return 0 even if the command fails,
which is a problem especially when used inside scripts.

We'll ensure that a non-zero exit code is returned if the commands
fail.
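
Why the exit code matters for scripts can be shown with a small sketch. It is written in Python for brevity and is not the k8s or k8sd code; the `run` wrapper is a hypothetical name.

```python
import sys


def run(command) -> int:
    """Run a command callable and translate failures into an exit code."""
    try:
        command()
    except Exception as err:
        # Report the failure on stderr and signal it to the caller.
        print(f"Error: {err}", file=sys.stderr)
        return 1
    return 0
```

A program entry point would then end with `sys.exit(run(main))`, so that a calling shell script can detect the failure from the exit status.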

* Update the cluster recovery command to use cobra "Run"

The cluster recovery command currently uses "RunE" and returns an
error in case of failures.

To stay consistent with other commands, we'll use "Run" and call
env.Exit as part of the command callback instead of returning the
errors.
Refactor Certificates Refresh endpoints to flush the response early and restart the services asynchronously
* Update dualstack.md

We have determined that /108 is the maximum supported size. The cluster fails to bootstrap with /64 and /96.
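
The size limit can be checked up front. The sketch below assumes that "maximum supported size" refers to the IPv6 CIDR prefix, so any prefix shorter than /108 (a larger network) is rejected; the function name and constant are hypothetical.

```python
import ipaddress

# Assumption taken from the note above: /108 is the largest supported network.
MAX_SUPPORTED_PREFIX = 108


def ipv6_cidr_size_supported(cidr: str) -> bool:
    """Return True if the IPv6 CIDR is no larger than a /108."""
    network = ipaddress.ip_network(cidr, strict=False)
    if network.version != 6:
        raise ValueError(f"{cidr} is not an IPv6 CIDR")
    # A smaller prefix length means a larger network.
    return network.prefixlen >= MAX_SUPPORTED_PREFIX
```

With this check, `fd98::/108` passes, while `fd98::/64` and `fd98::/96` are rejected, matching the failures reported above.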
eaudetcobello and others added 26 commits November 14, 2024 12:09
* fix pic order

In a previous PR the order of the arch docs got mixed up. They are back in the correct places now.

* clarify k8sd diagram

Fix typos in the diagram and clarify, in the text and on the diagram, the difference between deploying with Juju and deploying with the snap.
Corrected typos in docs and created more extensive custom_wordslist
change spellcheck to manual launch rather than on every pull request
* fix: ensure nf_conntrack module loaded for kube-proxy.

This patch ensures that the `nf_conntrack` kernel module is loaded
before the `kube-proxy` service is started so it can read some
necessary conntrack module-related params from procfs.

Previously, although the `kube-proxy` service always crashed if the module
wasn't loaded, this wasn't that common of an occurrence in practice as
there are quite a few ways `nf_conntrack` gets loaded transparently:
* Cilium [automatically loads `iptable_nat`](https://github.com/cilium/cilium/blob/63cd391f93b4e2c865268241d384504348672042/pkg/datapath/iptables/iptables.go#L367-L368)
after a small startup delay, whose dependency tree includes `nf_conntrack`
* starting firewalld/ufw/most other firewall services
* setting iptables/nftables rules which imply session tracking

Explicitly loading `nf_conntrack` from the `kube-proxy` service
wrapper should ensure that the procfs values kube-proxy reads
are always present on startup.

Signed-off-by: Nashwan Azhari <[email protected]>
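
To verify on a node whether the module is loaded, the contents of `/proc/modules` can be inspected. The following is a minimal Python sketch and is not part of the patch, which loads the module from the service wrapper; the `module_loaded` helper is hypothetical. A module compiled directly into the kernel does not appear in `/proc/modules`, so a negative result is only a hint.

```python
def module_loaded(name: str, proc_modules_text: str) -> bool:
    """Return True if `name` is listed as a module in /proc/modules content."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field of each line is the module name.
        if fields and fields[0] == name:
            return True
    return False
```

On a Linux host this could be called as `module_loaded("nf_conntrack", open("/proc/modules").read())`.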

* ci: install nf_conntrack module in integration test base LXC image.

Signed-off-by: Nashwan Azhari <[email protected]>

---------

Signed-off-by: Nashwan Azhari <[email protected]>
---------

Co-authored-by: Benjamin Schimke <[email protected]>
We need to properly clean up the containerd path on snap removal.
For that, the path needs to be stored in a file.
This serves two purposes:
1. The existence of the file indicates that the cluster was already bootstrapped
   and the containerd directory is not created by some other service.
2. The containerd path is configurable, having this information in a file makes it easy
   to access even after the k8sd service is already stopped.
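
The two purposes listed above can be sketched as a pair of helpers. This is a Python illustration under stated assumptions, not the snap's actual hook code; the function names and the marker-file layout (a single line holding the path) are hypothetical.

```python
import shutil
from pathlib import Path


def record_containerd_path(marker: Path, containerd_dir: Path) -> None:
    """At bootstrap, store the configured containerd directory in a marker file."""
    marker.write_text(f"{containerd_dir}\n")


def cleanup_containerd(marker: Path) -> bool:
    """On removal, delete the recorded directory only if the marker exists."""
    if not marker.exists():
        # No marker: the cluster was never bootstrapped, or the directory
        # belongs to some other service, so nothing is touched.
        return False
    recorded = marker.read_text().strip()
    if not recorded:
        # Refuse to act on an empty marker rather than guess a path.
        return False
    shutil.rmtree(Path(recorded), ignore_errors=True)
    marker.unlink()
    return True
```

Because the path is read from the file, the cleanup does not depend on the k8sd service still running.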
* reformatted annotations table

due to formatting issues the annotations table was quite unclear. Edited to make it more readable

Co-authored-by: Nick Veitch <[email protected]>
reviewed tutorials and edited them to make them clearer, fix md linter issues and formatting
apiv1 is deprecated for annotations.
`apiv1_annotations` is the recommended package now.
We now use our own Rocks everywhere, so syncing
is not required anymore.
Currently, when removing the snap, the /var/run/containerd folder is not
properly removed, as it is a folder. This fixes this issue.

Additionally removes other containerd-related folders: /etc/containerd
and /var/lib/containerd.

We're also removing /opt/cni/bin on snap removal, which is created when
bootstrapping the node. As we're removing the k8s snap, we no longer
need this folder either.
bschimke95 requested a review from a team as a code owner November 22, 2024 08:43

steps:
- name: Harden Runner
uses: step-security/harden-runner@v2

Check warning: Code scanning / Scorecard, Pinned-Dependencies (Medium)
score is 0: third-party GitHubAction not pinned by hash
Remediation tip: update your workflow using https://app.stepsecurity.io
Click Remediation section below for further remediation help
with:
egress-policy: audit
- name: Checking out repo
uses: actions/checkout@v4

Check warning: Code scanning / Scorecard, Pinned-Dependencies (Medium)
score is 1: GitHub-owned GitHubAction not pinned by hash
egress-policy: audit
- name: Checking out repo
uses: actions/checkout@v4
- uses: actions/setup-python@v5

Check warning: Code scanning / Scorecard, Pinned-Dependencies (Medium)
score is 1: GitHub-owned GitHubAction not pinned by hash
@@ -27,6 +27,9 @@
uses: actions/checkout@v4
with:
ref: ${{matrix.branch}}
- uses: actions/setup-python@v5

Check warning: Code scanning / Scorecard, Pinned-Dependencies (Medium)
score is 1: GitHub-owned GitHubAction not pinned by hash
spell-check:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4

Check warning: Code scanning / Scorecard, Pinned-Dependencies (Medium)
score is 1: GitHub-owned GitHubAction not pinned by hash
sudo apt install dqlite-tools libdqlite-dev

- name: golangci-lint
uses: golangci/golangci-lint-action@v6

Check warning: Code scanning / Scorecard, Pinned-Dependencies (Medium)
score is 1: third-party GitHubAction not pinned by hash
@@ -121,6 +136,13 @@
uses: step-security/harden-runner@v2
with:
egress-policy: audit
- name: Login to GitHub Container Registry
uses: docker/login-action@v3

Check warning: Code scanning / Scorecard, Pinned-Dependencies (Medium)
score is 1: third-party GitHubAction not pinned by hash
echo "artifact_name=inspection-reports-${{ matrix.os }}-${{ matrix.arch }}" | sed 's/:/-/g' >> $GITHUB_ENV
- name: Upload inspection report artifact
if: failure()
uses: actions/upload-artifact@v4

Check warning: Code scanning / Scorecard, Pinned-Dependencies (Medium)
score is 1: GitHub-owned GitHubAction not pinned by hash
path: ${{ github.workspace }}/inspection-reports.tar.gz
- name: Tmate debugging session
if: ${{ failure() && github.event_name == 'pull_request' }}
uses: mxschmitt/action-tmate@v3

Check warning: Code scanning / Scorecard, Pinned-Dependencies (Medium)
score is 1: third-party GitHubAction not pinned by hash
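
The Pinned-Dependencies warnings above all flag `uses:` references that point at a tag such as `@v4` rather than a full commit hash. A rough local check could look like the sketch below. It is a simplified, line-based illustration and not how Scorecard itself works; the function name and the regular expressions are assumptions.

```python
import re

# Matches "uses: owner/repo@ref" with an optional leading list dash.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*(\S+)")
# A full Git commit SHA-1 is 40 lowercase hexadecimal characters.
SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[str]:
    """List `uses:` references that are not pinned to a full commit hash."""
    found = []
    for line in workflow_text.splitlines():
        match = USES_RE.match(line)
        if not match:
            continue
        ref = match.group(1)
        if ref.startswith("./"):
            # Local actions live in the repository and carry no version.
            continue
        _, _, version = ref.partition("@")
        if not SHA_RE.match(version):
            found.append(ref)
    return found
```

Run against the workflow snippets quoted in the warnings, this would report references such as `actions/checkout@v4` and `actions/setup-python@v5`.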
@bschimke95
Contributor Author

used wrong branch for source - closing

bschimke95 closed this Nov 22, 2024