[LTS 8.8] tipc: Fix use-after-free of kernel socket in cleanup_bearer(). #115

Open
wants to merge 1 commit into
base: ciqlts8_8

@dvdgomez dvdgomez commented Feb 12, 2025

jira VULN-8672
cve CVE-2024-56642
commit-author Kuniyuki Iwashima [email protected]
commit 6a2fa13

Commit that this fixes: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=26abe14379f8
Our matching commit (no diff seen): 26abe14379f8

syzkaller reported a use-after-free of UDP kernel socket in cleanup_bearer() without repro. [0][1]

When bearer_disable() calls tipc_udp_disable(), cleanup of the UDP kernel socket is deferred by work calling cleanup_bearer().

tipc_net_stop() waits for such works to finish by checking tipc_net(net)->wq_count.  However, the work decrements the count too early before releasing the kernel socket, unblocking cleanup_net() and resulting in use-after-free.

Let's move the decrement after releasing the socket in cleanup_bearer().

[0]:
ref_tracker: net notrefcnt@000000009b3d1faf has 1/1 users at
     sk_alloc+0x438/0x608
     inet_create+0x4c8/0xcb0
     __sock_create+0x350/0x6b8
     sock_create_kern+0x58/0x78
     udp_sock_create4+0x68/0x398
     udp_sock_create+0x88/0xc8
     tipc_udp_enable+0x5e8/0x848
     __tipc_nl_bearer_enable+0x84c/0xed8
     tipc_nl_bearer_enable+0x38/0x60
     genl_family_rcv_msg_doit+0x170/0x248
     genl_rcv_msg+0x400/0x5b0
     netlink_rcv_skb+0x1dc/0x398
     genl_rcv+0x44/0x68
     netlink_unicast+0x678/0x8b0
     netlink_sendmsg+0x5e4/0x898
     ____sys_sendmsg+0x500/0x830

[1]:
BUG: KMSAN: use-after-free in udp_hashslot include/net/udp.h:85 [inline]
BUG: KMSAN: use-after-free in udp_lib_unhash+0x3b8/0x930 net/ipv4/udp.c:1979
 udp_hashslot include/net/udp.h:85 [inline]
 udp_lib_unhash+0x3b8/0x930 net/ipv4/udp.c:1979
 sk_common_release+0xaf/0x3f0 net/core/sock.c:3820
 inet_release+0x1e0/0x260 net/ipv4/af_inet.c:437
 inet6_release+0x6f/0xd0 net/ipv6/af_inet6.c:489
 __sock_release net/socket.c:658 [inline]
 sock_release+0xa0/0x210 net/socket.c:686
 cleanup_bearer+0x42d/0x4c0 net/tipc/udp_media.c:819
 process_one_work kernel/workqueue.c:3229 [inline]
 process_scheduled_works+0xcaf/0x1c90 kernel/workqueue.c:3310
 worker_thread+0xf6c/0x1510 kernel/workqueue.c:3391
 kthread+0x531/0x6b0 kernel/kthread.c:389
 ret_from_fork+0x60/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:244

Uninit was created at:
 slab_free_hook mm/slub.c:2269 [inline]
 slab_free mm/slub.c:4580 [inline]
 kmem_cache_free+0x207/0xc40 mm/slub.c:4682
 net_free net/core/net_namespace.c:454 [inline]
 cleanup_net+0x16f2/0x19d0 net/core/net_namespace.c:647
 process_one_work kernel/workqueue.c:3229 [inline]
 process_scheduled_works+0xcaf/0x1c90 kernel/workqueue.c:3310
 worker_thread+0xf6c/0x1510 kernel/workqueue.c:3391
 kthread+0x531/0x6b0 kernel/kthread.c:389
 ret_from_fork+0x60/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:244

CPU: 0 UID: 0 PID: 54 Comm: kworker/0:2 Not tainted 6.12.0-rc1-00131-gf66ebf37d69c #7 91723d6f74857f70725e1583cba3cf4adc716cfa
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014
Workqueue: events cleanup_bearer

Fixes: 26abe14 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.")
Reported-by: syzkaller [email protected]
Signed-off-by: Kuniyuki Iwashima [email protected]
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Paolo Abeni [email protected]

(cherry picked from commit 6a2fa13)
Signed-off-by: David Gomez [email protected]

Build.log:

INSTALL sound/virtio/virtio_snd.ko
  INSTALL sound/x86/snd-hdmi-lpe-audio.ko
  INSTALL sound/xen/snd_xen_front.ko
  INSTALL virt/lib/irqbypass.ko
  DEPMOD  4.18.0-dgomez_ciqlts8_8_CVE-2024-56642+
[TIMER]{MODULES}: 38s
Making Install
sh ./arch/x86/boot/install.sh 4.18.0-dgomez_ciqlts8_8_CVE-2024-56642+ arch/x86/boot/bzImage \
	System.map "/boot"
[TIMER]{INSTALL}: 24s
Checking kABI
Checking kABI
kABI check passed
Setting Default Kernel to /boot/vmlinuz-4.18.0-dgomez_ciqlts8_8_CVE-2024-56642+ and Index to 0
The default is /boot/loader/entries/047c42a070cd4f44bad852979f322840-4.18.0-dgomez_ciqlts8_8_CVE-2024-56642+.conf with index 0 and kernel /boot/vmlinuz-4.18.0-dgomez_ciqlts8_8_CVE-2024-56642+
The default is /boot/loader/entries/047c42a070cd4f44bad852979f322840-4.18.0-dgomez_ciqlts8_8_CVE-2024-56642+.conf with index 0 and kernel /boot/vmlinuz-4.18.0-dgomez_ciqlts8_8_CVE-2024-56642+
Generating grub configuration file ...
done
Hopefully Grub2.0 took everything ... rebooting after time metrices
[TIMER]{MRPROPER}: 7s
[TIMER]{BUILD}: 2650s
[TIMER]{MODULES}: 38s
[TIMER]{INSTALL}: 24s
[TIMER]{TOTAL} 2724s
Rebooting in 10 seconds

$ uname -a
Linux r88 4.18.0-dgomez_ciqlts8_8_CVE-2024-56642+ #1 SMP Tue Feb 11 23:12:50 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Where kselftest_before.log reflects:
$ uname -a
Linux r88 4.18.0-477.27.1.el8_8.88ciq_lts.2.1.x86_64 #1 SMP Thu Dec 12 15:27:14 CST 2024 x86_64 x86_64 x86_64 GNU/Linux

and the other reflects:
$ uname -a
Linux r88 4.18.0-dgomez_ciqlts8_8_CVE-2024-56642+ #1 SMP Tue Feb 11 23:12:50 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

[dgomez@r88 kernel-src-tree]$ grep '^ok' kernel-selftest_before.log | wc -l
204
[dgomez@r88 kernel-src-tree]$ grep '^ok' kernel-selftest_after.log | wc -l
199
[dgomez@r88 kernel-src-tree]$ grep 'not ok' kernel-selftest_before.log | wc -l
46
[dgomez@r88 kernel-src-tree]$ grep 'not ok' kernel-selftest_after.log | wc -l
51

Kernel selftest logs:
kernel-selftest_before.log
kernel-selftest_after.log

@dvdgomez dvdgomez self-assigned this Feb 12, 2025
@gvrose8192
Collaborator

gvrose8192 commented Feb 12, 2025

David, the commit is clean and I'm pretty sure it won't cause a problem. I appreciate that you've attached a kernel selftest log as well.

The issue is that we need at least one kernel selftest log from before the change was made and another from after the change is made. Then a diff is done of the resultant logs. The way to see the diffs and changes is explained here: https://ciqinc.atlassian.net/wiki/spaces/ENG/pages/574652455/Kernel+Self+Testing

@dvdgomez
Author

@gvrose8192 Added the previous kernel selftest log and diffs as requested 👍

@PlaidCat
Collaborator

I fixed up the PR header, it looks like some markdown got munged.

Where kselftest_before.log reflects $ uname -a Linux r88 4.18.0-477.27.1.el8_8.88ciq_lts.2.1.x86_64 #1 SMP Thu Dec 12 15:27:14 CST 2024 x86_64 x86_64 x86_64 GNU/Linux and the other reflects $ uname -a Linux r88 4.18.0-dgomez_ciqlts8_8_CVE-2024-56642+ #1 SMP Tue Feb 11 23:12:50 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

$ grep ^ok kselftest_before.log | wc -l
199
$ grep ^ok kselftest.log | wc -l
167

$ grep 'not ok' kselftest_before.log | wc -l
50
$ grep 'not ok' kselftest.log | wc -l
37

This looks odd, how did you run the test before and after?
There appears to be a 45 test delta in here (maybe skipped?)
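
The delta mentioned here can be checked by summing both result classes per log. The following is a minimal sketch that reuses the `grep` patterns from the commands above; the `total_results` helper is a hypothetical name and not part of the kselftest tooling.

```shell
#!/bin/sh
# total_results is a hypothetical helper: it prints the number of result
# lines in a kselftest log, counted with the same patterns used in this
# thread ('^ok' for passes and skips, 'not ok' for failures).
total_results() {
    # grep -c still prints 0 when nothing matches, but exits non-zero,
    # so "|| true" keeps the helper usable under "set -e".
    ok=$(grep -c '^ok' "$1" || true)
    not_ok=$(grep -c 'not ok' "$1" || true)
    echo $((ok + not_ok))
}

# Usage (file names taken from the commands above):
#   before=$(total_results kselftest_before.log)
#   after=$(total_results kselftest.log)
#   echo "delta: $((before - after))"
```

With the counts quoted above, 199 + 50 = 249 results before and 167 + 37 = 204 after, which gives the 45-test delta.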

@dvdgomez dvdgomez changed the title tipc: Fix use-after-free of kernel socket in cleanup_bearer(). [LTS 8.8] tipc: Fix use-after-free of kernel socket in cleanup_bearer(). Feb 13, 2025
@dvdgomez
Author

I ended up tearing down my previous lts8.8 VM because it had some artifacts from my 8.6 build that were causing issues, so I started over from scratch. After my new lts8.8 VM was set up, I ran the kernel-selftest, then ran kernel_build with the cherry-picked commit, and ran the kernel-selftest again to get the following.

This is what the kernel selftest output looks like now:

[dgomez@r88 kernel-src-tree]$ grep '^ok' kernel-selftest_before.log | wc -l
204
[dgomez@r88 kernel-src-tree]$ grep '^ok' kernel-selftest_after.log | wc -l
199
[dgomez@r88 kernel-src-tree]$ grep 'not ok' kernel-selftest_before.log | wc -l
46
[dgomez@r88 kernel-src-tree]$ grep 'not ok' kernel-selftest_after.log | wc -l
51

@dvdgomez dvdgomez marked this pull request as ready for review February 13, 2025 21:57
@gvrose8192
Collaborator

Thank you David! I have one further suggestion. Try this:

[dgomez@r88 kernel-src-tree]$ grep '^ok' kernel-selftest_before.log > tmp-before-ok.log
[dgomez@r88 kernel-src-tree]$ grep '^ok' kernel-selftest_after.log > tmp-after-ok.log
[dgomez@r88 kernel-src-tree]$ grep 'not ok' kernel-selftest_before.log > tmp-before-notok.log
[dgomez@r88 kernel-src-tree]$ grep 'not ok' kernel-selftest_after.log > tmp-after-notok.log

And then check the diffs between the before and after logs for both ok and not ok. Then we can see whether any of the tests that were OK before but fail now are related to the area this change touches.

TBH I seriously doubt it - but it's good to check.

Thanks I appreciate your efforts!
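
As a complement to a positional diff, the tests that passed before and fail after can be listed by name. This is a rough sketch that assumes the four temporary files created above; `test_names` and `regressions` are hypothetical helper names, not existing tools.

```shell
#!/bin/sh
# test_names is a hypothetical helper: it reduces result lines such as
#   "not ok 17 selftests: net: ip_defrag.sh # exit=1"
# to "selftests: net: ip_defrag.sh", so test numbering and trailing
# "# SKIP" / "# exit=N" annotations do not affect the comparison.
test_names() {
    sed -E 's/^(not )?ok [0-9]+ //; s/ #.*$//' "$1" | sort -u
}

# regressions prints the tests that appear in the "ok" list given as $1
# and in the "not ok" list given as $2.
regressions() {
    names_ok=$(mktemp)
    names_bad=$(mktemp)
    test_names "$1" > "$names_ok"
    test_names "$2" > "$names_bad"
    # comm -12 keeps only the lines common to both sorted inputs.
    comm -12 "$names_ok" "$names_bad"
    rm -f "$names_ok" "$names_bad"
}

# Usage:
#   regressions tmp-before-ok.log tmp-after-notok.log
```

Swapping the arguments to `tmp-after-ok.log` and `tmp-before-notok.log` lists the opposite direction, that is, tests that failed before and pass after.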

@dvdgomez
Author

Here is the diff on the ^ok logs:

$ diff tmp-before-ok.log tmp-after-ok.log 
75,79c75,79
< ok 1 selftests: livepatch: test-livepatch.sh # SKIP
< ok 2 selftests: livepatch: test-callbacks.sh # SKIP
< ok 3 selftests: livepatch: test-shadow-vars.sh # SKIP
< ok 4 selftests: livepatch: test-state.sh # SKIP
< ok 5 selftests: livepatch: test-ftrace.sh # SKIP
---
> ok 1 selftests: livepatch: test-livepatch.sh
> ok 2 selftests: livepatch: test-callbacks.sh
> ok 3 selftests: livepatch: test-shadow-vars.sh
> ok 4 selftests: livepatch: test-state.sh
> ok 5 selftests: livepatch: test-ftrace.sh
91a92
> ok 9 selftests: net: test_bpf.sh
99,100d99
< ok 17 selftests: net: ip_defrag.sh
< ok 18 selftests: net: udpgso_bench.sh # SKIP
102,104d100
< ok 22 selftests: net: udpgro_bench.sh
< ok 23 selftests: net: udpgro.sh
< ok 24 selftests: net: reuseport_addr_any.sh
107a104
> ok 29 selftests: net: l2tp.sh
116,117d112
< ok 39 selftests: net: udpgro_fwd.sh
< ok 40 selftests: net: veth.sh
204c199
< ok 1 selftests: zram: zram.sh # SKIP
---
> ok 1 selftests: zram: zram.sh

Here is the diff on the notok logs:

$ diff tmp-before-notok.log tmp-after-notok.log 
15c15,16
< not ok 9 selftests: net: test_bpf.sh # exit=1
---
> not ok 17 selftests: net: ip_defrag.sh # exit=1
> not ok 18 selftests: net: udpgso_bench.sh # exit=1
17a19,21
> not ok 22 selftests: net: udpgro_bench.sh # exit=255
> not ok 23 selftests: net: udpgro.sh # exit=255
> not ok 24 selftests: net: reuseport_addr_any.sh # exit=1
19d22
< not ok 29 selftests: net: l2tp.sh # exit=1
20a24,25
> not ok 39 selftests: net: udpgro_fwd.sh # exit=1
> not ok 40 selftests: net: veth.sh # exit=255
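
Several of the differences above involve the `# SKIP` annotation rather than a pass or fail change. A small sketch for listing skipped tests per log follows; `skipped_tests` is a hypothetical helper name, and the file names are the temporary files used above.

```shell
#!/bin/sh
# skipped_tests is a hypothetical helper: it prints the names of tests
# whose "ok" result line carries a "# SKIP" annotation.
skipped_tests() {
    grep '^ok .*# SKIP' "$1" | sed -E 's/^ok [0-9]+ //; s/ #.*$//' | sort -u
}

# Usage:
#   skipped_tests tmp-before-ok.log
#   skipped_tests tmp-after-ok.log
```

In the diff above, the five livepatch tests and zram.sh are marked `# SKIP` only on the before side, and udpgso_bench.sh goes from skipped to failing. A change in skip status may point at a difference in the test environment between the two runs rather than at the code under test.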

@gvrose8192
Collaborator

Thanks David! LGTM

Collaborator

@gvrose8192 gvrose8192 left a comment

Thanks!
