Upgrade systemd from 255 to 256 #2145

Merged
ader1990 merged 8 commits into main from ader1990/systemd-major-version-upgrade-256 on Dec 20, 2024

Contributor

@ader1990 ader1990 commented Jul 23, 2024

Upgrade systemd from 255 to 256

Fixes: flatcar/Flatcar#1501

Testing done

[Describe the testing you have done before submitting this PR. Please include both the commands you issued as well as the output you got.]

  • Changelog entries added in the respective changelog/ directory (user-facing change, bug fix, security fix, update)
  • Inspected CI output for image differences: /boot and /usr size, packages, list files for any missing binaries, kernel modules, config files, etc.

Patches required on other subprojects:

@@ -254,14 +254,11 @@ src_prepare() {
"${FILESDIR}/systemd-test-process-util.patch"
# Flatcar: Adding our own patches here.
"${FILESDIR}/0001-wait-online-set-any-by-default.patch"
"${FILESDIR}/0002-networkd-default-to-kernel-IPForwarding-setting.patch"
Contributor Author

To be investigated whether the 256 changes make this patch irrelevant: https://github.com/systemd/systemd-stable/blob/v256/src/network/networkd-network.c#L470

@ader1990
Contributor Author

The GitHub Actions run fails: the VM drops to an emergency shell at boot with the error: systemd: system is tainted: unmerged bin. This needs investigation.


github-actions bot commented Jul 23, 2024

Build action triggered: https://github.com/flatcar/scripts/actions/runs/12409046532

@ader1990
Contributor Author

ader1990 commented Jul 24, 2024

After some investigation, it seems that ignition-setup.service fails to run because /usr is now read-only.

From https://lists.freedesktop.org/archives/systemd-devel/2024-June/050407.html:

        Service Management:

        * New system manager setting ProtectSystem= has been added. It is
          analogous to the unit setting, but applies to the whole system. It is
          enabled by default in the initrd.

          Note that this means that code executed in the initrd cannot naively
          expect to be able to write to /usr/ during boot. This affects
          dracut <= 101, which wrote "hooks" to /lib/dracut/hooks/. See
          https://github.com/dracut-ng/dracut-ng/commit/a45048b80c27ee5a45a380.

dracut-ng/dracut-ng@a45048b80c27ee5a45a380 shows how to fix dracut 100. It is not applicable here, because the dracut used by Flatcar is an older version, 053.

However, ignition-setup.service fails to run at this line, because /usr is mounted read-only: https://github.com/flatcar/bootengine/blob/flatcar-master/dracut/30ignition/ignition-setup.sh#L15.

@ader1990
Contributor Author

One option is, obviously, to disable the default ProtectSystem=, since the Flatcar initrd workflow relies on writing to the rootfs. This could be done in bootengine with a dracut module definition, similar to: https://github.com/flatcar/bootengine/blob/flatcar-master/dracut/99switch-root/nocgroup.conf

Another option would be to fix all the bootengine /usr writes, perhaps by moving them to /etc or /var.
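
As a rough illustration of the first option, a manager drop-in could switch the system-wide setting off inside the initrd. This is only a sketch: the helper name, the drop-in file name, and the idea of passing the initrd root as an argument are assumptions, not what bootengine actually ships. Only the [Manager] ProtectSystem= setting itself comes from the systemd 256 release notes.

```shell
#!/bin/sh
# Sketch only (hypothetical helper): write a systemd manager drop-in that
# disables the system-wide ProtectSystem= default inside an initrd tree.
# $1: root of the initrd tree being assembled.
write_protect_system_dropin() {
    initrd_root="$1"
    dropin_dir="${initrd_root}/etc/systemd/system.conf.d"
    mkdir -p "${dropin_dir}" || return 1
    # ProtectSystem= in [Manager] is the manager setting added in systemd 256.
    # The file name below is an assumption made for this example.
    printf '[Manager]\nProtectSystem=no\n' \
        > "${dropin_dir}/10-no-protect-system.conf"
}
```

In a dracut module, the function would presumably be called with dracut's initrd staging directory as its argument. The trade-off is that this gives up the new protection for the whole initrd rather than for a single unit.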

@ader1990 ader1990 force-pushed the ader1990/systemd-major-version-upgrade-256 branch from 9d88f29 to a3f885c Compare July 25, 2024 07:43
@chewi
Contributor

chewi commented Aug 13, 2024

Independently of this, I have tried to update Dracut to 060, and am having trouble with cyclic boot dependencies. I wonder if this is somehow related to the above. Here's what it looks like with the verity stuff disabled, otherwise it's a bit more complicated.

Screenshot_20240813_120607

@ader1990
Contributor Author

Independently of this, I have tried to update Dracut to 060, and am having trouble with cyclic boot dependencies. I wonder if this is somehow related to the above. Here's what it looks like with the verity stuff disabled, otherwise it's a bit more complicated.

Screenshot_20240813_120607

Last time I tried, a few months ago, I got the same cyclic dependencies and gave up. Our bootengine heavily modifies the upstream dracut logic, so it needs to be adapted (again) for the dracut upgrade to work.

@ader1990
Contributor Author

Full error for the initrd break point:

journalctl -xeu ignition-setup.service

Sep 25 08:17:15 localhost systemd[1]: Starting ignition-setup.service - Ignition (setup)...
Sep 25 08:17:15 localhost ignition-setup[891]: cp: cannot create regular file '/bin/is-live-image': Read-only file system
Sep 25 08:17:15 localhost systemd[1]: ignition-setup.service: Main process exited, code=exited, status=1/FAILURE
Sep 25 08:17:15 localhost systemd[1]: ignition-setup.service: Failed with result 'exit-code'.
Sep 25 08:17:15 localhost systemd[1]: Failed to start ignition-setup.service - Ignition (setup).
Sep 25 08:17:15 localhost systemd[1]: ignition-setup.service: Triggering OnFailure= dependencies.

@ader1990
Contributor Author

To overcome the limitation imposed by systemd 256 (ignition-setup[891]: cp: cannot create regular file '/bin/is-live-image': Read-only file system), there are two options I can think of:

  1. Use /mnt/oem or even /tmp as a temporary place to store is-live-image. This needs a $PATH change so that Ignition can find it -> https://github.com/search?q=repo%3Acoreos%2Fignition%20is-live-image&type=code
  2. Temporarily remount /usr read-write with mount -o remount,rw /usr, make the required change, and then run mount -o remount,ro /usr.

I am not convinced these two ideas are the best; maybe there is another option?

Thanks.
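
The first option can be sketched roughly as follows. The helper name and the candidate directory list are hypothetical, and this is not the fix that was eventually applied in bootengine. It only illustrates copying the helper to a writable location and then extending $PATH.

```shell
#!/bin/sh
# Sketch only (hypothetical helper): copy a file into the first writable
# directory from a list of candidates and print the directory used.
# $1: file to install; remaining arguments: candidate directories.
# Returns 1 if no candidate directory accepted the copy.
install_helper() {
    helper="$1"
    shift
    for dir in "$@"; do
        if [ -d "${dir}" ] && [ -w "${dir}" ] \
            && cp "${helper}" "${dir}/" 2>/dev/null; then
            printf '%s\n' "${dir}"
            return 0
        fi
    done
    return 1
}

# Possible usage (the source path of is-live-image is an assumption):
#   if dest="$(install_helper ./is-live-image /tmp /mnt/oem)"; then
#       PATH="${dest}:${PATH}"
#       export PATH
#   fi
```

Compared with remounting /usr read-write, this leaves the read-only mount untouched, at the cost of having to make sure every consumer of is-live-image sees the modified $PATH.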

@ader1990 ader1990 self-assigned this Sep 25, 2024
@ader1990 ader1990 requested a review from a team September 25, 2024 08:36
@ader1990
Contributor Author

ader1990 commented Sep 25, 2024

With the is-live-image issue fixed, systemd 256 expects dracut 058 -> see systemd/systemd@1c585a4

Until dracut is updated, we need to revert this commit manually in systemd: systemd/systemd@1c585a4

@chewi
Contributor

chewi commented Sep 25, 2024

Updating Dracut has proven tricky, mainly due to size issues, so I wouldn't wait for that.

@ader1990
Contributor Author

Updating Dracut has proven tricky, mainly due to size issues, so I wouldn't wait for that.

Yep, I will add the patch to the systemd ebuild.

@ader1990
Contributor Author

ader1990 commented Sep 26, 2024

AMD64 Flatcar running with systemd 256 and linux 6.10:

root@localhost ~ # cat /etc/os-release

NAME="Flatcar Container Linux by Kinvolk"
ID=flatcar
ID_LIKE=coreos
VERSION=4102.0.0+nightly-20240923-2100-26-g730775213c
VERSION_ID=4102.0.0
BUILD_ID=nightly-20240923-2100-26-g730775213c
SYSEXT_LEVEL=1.0
PRETTY_NAME="Flatcar Container Linux by Kinvolk 4102.0.0+nightly-20240923-2100-26-g730775213c (Oklo)"
ANSI_COLOR="38;5;75"
HOME_URL="https://flatcar.org/"
BUG_REPORT_URL="https://issues.flatcar.org"
FLATCAR_BOARD="amd64-usr"
CPE_NAME="cpe:2.3:o:flatcar-linux:flatcar_linux:4102.0.0+nightly-20240923-2100-26-g730775213c:*:*:*:*:*:*:*"

root@localhost ~ # uname -a
Linux localhost 6.10.9-flatcar #1 SMP PREEMPT_DYNAMIC Thu Sep 26 06:21:38 -00 2024 x86_64 Intel(R) Xeon(R) Gold 6134 CPU @ 3.20GHz GenuineIntel GNU/Linux

root@localhost ~ # systemctl --version
systemd 256 (256.2)
+PAM +AUDIT +SELINUX -APPARMOR +IMA +SMACK +SECCOMP +GCRYPT -GNUTLS +OPENSSL -ACL +BLKID +CURL +ELFUTILS -FIDO2 +IDN2 -IDN +IPTC +KMOD +LIBCRYPTSETUP +LIBCRYPTSETUP_PLUGINS +LIBFDISK +PCRE2 -PWQUALITY -P11KIT -QRENCODE +TPM2 +BZIP2 +LZ4 +XZ +ZLIB +ZSTD -BPF_FRAMEWORK -XKBCOMMON +UTMP -SYSVINIT +LIBARCHIVE

root@localhost ~ # df /boot
Filesystem     1K-blocks  Used Available Use% Mounted on
/dev/vda1         129039 62922     66118  49% /boot

@ader1990 ader1990 force-pushed the ader1990/systemd-major-version-upgrade-256 branch from a3f885c to af427b9 Compare September 26, 2024 07:33
@ader1990 ader1990 marked this pull request as ready for review September 26, 2024 07:34
@ader1990
Contributor Author

The GitHub Actions run failed because of the Mantle tests that use cgroup v1. See https://github.com/systemd/systemd/releases/tag/v256-rc3 -> by default, systemd will now refuse to boot under cgroup v1.

Support for cgroup v1 ('legacy' and 'hybrid' hierarchies) is now
      considered obsolete and systemd by default will refuse to boot under
      it. To forcibly reenable cgroup v1 support,
      SYSTEMD_CGROUP_ENABLE_LEGACY_FORCE=1 must be set on kernel command
      line. The meson option 'default-hierarchy=' is also deprecated, i.e.
      only cgroup v2 ('unified' hierarchy) can be selected as build-time
      default.
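
For reference, a common way to tell which hierarchy a running system uses is to look at the filesystem type mounted on /sys/fs/cgroup. The sketch below assumes GNU stat and uses a made-up function name. It is an illustration, not code from this PR.

```shell
#!/bin/sh
# Sketch only (hypothetical helper): map the filesystem type mounted on
# /sys/fs/cgroup to a cgroup mode.
# $1: filesystem type, e.g. the output of: stat -fc %T /sys/fs/cgroup
cgroup_mode_from_fstype() {
    case "$1" in
        cgroup2fs) echo "unified" ;;           # cgroup v2 only
        tmpfs)     echo "legacy-or-hybrid" ;;  # v1 controllers mounted below a tmpfs
        *)         echo "unknown" ;;
    esac
}

# Possible usage on a live system:
#   cgroup_mode_from_fstype "$(stat -fc %T /sys/fs/cgroup)"
```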

@jepio
Member

jepio commented Sep 27, 2024 via email

@jepio
Member

jepio commented Sep 27, 2024 via email

@ader1990
Contributor Author

ader1990 commented Sep 30, 2024

Or is it worth still letting users stay with cgroups v1 if they set this flag? We would still want to validate before committing the update.

Quoted email from Jeremi Piotrowski (Thursday, September 26, 2024), Subject: Re: [flatcar/scripts] Upgrade systemd from 255 to 256 (PR #2145):

We should add code to our update postinstall hook to detect if the user is still on cgroups v1 and abort the update.

Hello @jepio, indeed, we have just a few paths forward:

  1. Follow the systemd approach to cgroup v1 without exceptions and block the Flatcar upgrade in the postinstall hook.
    Document the transition and announce it on all possible channels.
    Document the usage of SYSTEMD_CGROUP_ENABLE_LEGACY_FORCE=1 with Ignition / manual update for new and existing instances.
  2. Set SYSTEMD_CGROUP_ENABLE_LEGACY_FORCE=1 on all images (not the best option); this flag might also be removed in subsequent versions.
  3. Patch systemd to remove the check (an even less desirable option).
  4. Block the systemd upgrade until another option appears. Meanwhile, announce this future upgrade path and wait for more feedback from the community.

Thanks.
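
The decision logic behind option 1 might look roughly like the sketch below. The function names are hypothetical and this is not the code from the update_engine postinstall hook. It also assumes that users who explicitly set the force flag are allowed to update, which is still an open question in this thread.

```shell
#!/bin/sh
# Sketch only (hypothetical helpers): decide whether an update to a
# systemd 256 image should be refused on this machine.

# Return 0 if the given kernel command line contains the force flag.
has_legacy_force() {
    # Word splitting of $1 is intentional: one kernel argument per word.
    for arg in $1; do
        if [ "${arg}" = "SYSTEMD_CGROUP_ENABLE_LEGACY_FORCE=1" ]; then
            return 0
        fi
    done
    return 1
}

# $1: filesystem type of /sys/fs/cgroup, $2: kernel command line.
# Return 0 if the update should be blocked, 1 otherwise.
should_block_update() {
    if [ "$1" = "cgroup2fs" ]; then
        return 1    # cgroup v2: nothing to block
    fi
    if has_legacy_force "$2"; then
        return 1    # assumption: explicit opt-in to cgroup v1 is honoured
    fi
    return 0
}

# Possible usage on a live system:
#   should_block_update "$(stat -fc %T /sys/fs/cgroup)" "$(cat /proc/cmdline)"
```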

@dongsupark
Member

we have just a few paths forward:

1. follow systemd approach no questions asked related to the cgroupv1 and block the Flatcar upgrade in the postinstall hook.
   Document the transition and disseminate the information on all possible channels about it.
   Document the usage of SYSTEMD_CGROUP_ENABLE_LEGACY_FORCE=1 with ignition / manual update for the new/old instances.

2. set SYSTEMD_CGROUP_ENABLE_LEGACY_FORCE=1 on all images (not the best), this flag might be also removed in subsequent versions

3. patch systemd to remove the check (even less)

4. block the systemd upgrade until another option appears. Meanwhile, announce this future upgrade path and wait for more feedback from the community.

I am for option 1: deprecate cgroup v1 and document as much as possible.
The other options look like nothing more than delaying the issue.
Even if we go with option 1, users still have a workaround to revert the behavior.

@sayanchowdhury
Member

  1. follow systemd approach no questions asked related to the cgroupv1 and block the Flatcar upgrade in the postinstall hook.

I agree with Dongsu on taking approach 1, and on adding this to the notes section of the release notes and to the documentation.

@ader1990 ader1990 force-pushed the ader1990/systemd-major-version-upgrade-256 branch from 1514993 to 120668a Compare December 17, 2024 09:55
Contributor

I think we can update this to pull the latest version from upstream Gentoo, and we need to be sure that it's a regular copy (i.e., no modifications).

Contributor Author

It is already at the latest Gentoo upstream version, and I have checked it.

Contributor

Ok, I asked because I see this commit here: gentoo/gentoo@44519e4 that brings 256.9 (what we do in the next commit: "apply flatcar modification")

Contributor

While you are here, can you drop the following Flatcar modification:

+ # Flatcar: Drop sec-policy/selinux-ntp from deps (under selinux use
+ # flag). The image stage fails with "Failed to resolve
+ # typeattributeset statement at
+ # /var/lib/selinux/mcs/tmp/modules/400/ntp/cil:120"
selinux? (
		sec-policy/selinux-base-policy[systemd]
-		sec-policy/selinux-ntp

and add selinux-ntp to the selinux policies:

  1. Add the package to ::portage-stable
  2. Add the package to .github/workflows/portage-stable-packages-list
  3. Add the package to sdk_container/src/third_party/coreos-overlay/coreos-base/coreos/coreos-0.0.1-r316.ebuild

This will help in the SELinux effort.

Contributor Author

I can do it after this change in another patch, as it is already complicated to properly test as is.

# the filesystem is already read-write. Conveniently the
# systemd Makefile sets this up completely wrong.
#
# Flatcar: TODO: Is this still a problem?
Contributor

I see quite a few TODOs here. Can we open a tracking issue for them? They can be done on the side, but they will be forgotten if they are not tracked somewhere.

…setting.patch

According to https://github.com/systemd/systemd-stable/blob/v256/src/network/networkd-network.c#L470,
the forwarding settings have changed on systemd 256.

From the discussions upstream, if systemd is configured to manage an interface,
it will manage it completely, and it will set that interface to not forward packets
by default.

From the current systemd code, it would be easy to either enable the forwarding or disable it,
but there does not seem to be a way now to inherit it from the sysctl / kernel implementation.
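
To illustrate what explicitly enabling or disabling forwarding could look like, the sketch below writes a networkd.conf drop-in using the IPv4Forwarding= setting documented for systemd 256. The helper names and the drop-in file name are hypothetical. Copying the kernel's current value at configuration time is only an approximation of inheriting it, not a replacement for the dropped patch.

```shell
#!/bin/sh
# Sketch only (hypothetical helpers), assuming the [Network] IPv4Forwarding=
# setting of networkd.conf in systemd 256.

# Translate the kernel's ip_forward value (0 or 1) into a networkd boolean.
forwarding_from_sysctl() {
    if [ "$1" = "1" ]; then
        echo "yes"
    else
        echo "no"
    fi
}

# Write a networkd.conf drop-in below the given root directory.
# $1: root directory (empty string for the live system), $2: yes or no.
write_forwarding_dropin() {
    root="$1"
    value="$2"
    dir="${root}/etc/systemd/networkd.conf.d"
    mkdir -p "${dir}" || return 1
    # The file name is an assumption made for this example.
    printf '[Network]\nIPv4Forwarding=%s\n' "${value}" \
        > "${dir}/10-ip-forwarding.conf"
}

# Possible usage on a live system (as root):
#   write_forwarding_dropin "" \
#       "$(forwarding_from_sysctl "$(cat /proc/sys/net/ipv4/ip_forward)")"
```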
@ader1990 ader1990 force-pushed the ader1990/systemd-major-version-upgrade-256 branch from 120668a to e8996b1 Compare December 17, 2024 14:57
@ader1990 ader1990 requested a review from a team December 18, 2024 09:22
@ader1990
Contributor Author

ader1990 commented Dec 18, 2024

This PR is ready to be merged as-is, with the following to-do items:

  • Suggested by @tormath1: handle selinux-ntp in another PR (the functionality is unrelated to this PR; the issue to be solved is a historical one)

flatcar/update_engine#41 -> needs to be merged during the same release window - MERGED

  • Investigation needed for 0002-networkd-default-to-kernel-IPForwarding-setting.patch: this patch no longer applies and has been dropped because of that. My local checks are good and the CI passes, so IP forwarding works without a similar patch, but a more careful investigation is required to see whether any real-world use cases are affected

@sayanchowdhury
Member

@ader1990 ader1990 force-pushed the ader1990/systemd-major-version-upgrade-256 branch from 93e3768 to 089df88 Compare December 18, 2024 21:17
@tormath1
Contributor

CI http://localhost:8080/job/container/job/sdk/1883/

@sayanchowdhury can you restart a CI please? I'd like to see the update test passing (as the update-engine commit was missing in this run)

@ader1990 ader1990 merged commit 6b96a3f into main Dec 20, 2024
5 of 7 checks passed

Successfully merging this pull request may close these issues.

[RFE] Upgrade systemd 255 to systemd 256
8 participants