netdev CI testing #6666

kuba-moo · 2024-03-27T20:02:33Z

Reusable PR for hooking netdev CI to BPF testing.

During socket release, sock_orphan() is called without considering that it sets sk->sk_wq to NULL. Later, if SO_LINGER is enabled, this leads to a null pointer dereferenced in virtio_transport_wait_close(). Orphan the socket only after transport release. Partially reverts the 'Fixes:' commit. KASAN: null-ptr-deref in range [0x0000000000000018-0x000000000000001f] lock_acquire+0x19e/0x500 _raw_spin_lock_irqsave+0x47/0x70 add_wait_queue+0x46/0x230 virtio_transport_release+0x4e7/0x7f0 __vsock_release+0xfd/0x490 vsock_release+0x90/0x120 __sock_release+0xa3/0x250 sock_close+0x14/0x20 __fput+0x35e/0xa90 __x64_sys_close+0x78/0xd0 do_syscall_64+0x93/0x1b0 entry_SYSCALL_64_after_hwframe+0x76/0x7e Reported-by: [email protected] Closes: https://syzkaller.appspot.com/bug?extid=9d55b199192a4be7d02c Fixes: fcdd224 ("vsock: Keep the binding until socket destruction") Signed-off-by: Michal Luczaj <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Explicitly close() a TCP_ESTABLISHED (connectible) socket with SO_LINGER enabled. May trigger a null pointer dereference. Signed-off-by: Michal Luczaj <[email protected]> Signed-off-by: NipaLocal <nipa@local>

NPAR 1.2 adds a transparent VLAN tag for all packets between the NIC and the switch. Because of that, RX VLAN acceleration cannot be supported for any additional host configured VLANs. The driver has to acknowledge that it can support no RX VLAN acceleration and set the NPAR 1.2 supported flag when registering with the FW. Otherwise, the FW call will fail and the driver will abort on these NPAR 1.2 NICs with this error: bnxt_en 0000:26:00.0 (unnamed net_device) (uninitialized): hwrm req_type 0x1d seq id 0xb error 0x2 Reviewed-by: Somnath Kotur <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Add a new bnxt_hwrm_cp_ring_alloc_p5() function to handle allocating one completion ring on P5_PLUS chips. This simplifies the existing code and will be useful later in the series. Reviewed-by: Ajit Khaparde <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Add a new bnxt_hwrm_tx_ring_alloc() function to handle allocating a transmit ring. This will be useful later in the series. Reviewed-by: Ajit Khaparde <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Add a wrapper routine to free L2 completion rings. This will be useful later in the series. Reviewed-by: Kalesh AP <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Signed-off-by: Somnath Kotur <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Modify bnxt_free_tx_rings() to free the skbs per TX ring. This will be useful later in the series. Reviewed-by: Michal Swiatkowski <[email protected]> Signed-off-by: Somnath Kotur <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: NipaLocal <nipa@local>

There is some common code for setting up RX and RX AGG ring allocation parameters for P5_PLUS chips. Refactor the logic into a new function. Reviewed-by: Hongguang Gao <[email protected]> Reviewed-by: Ajit Khaparde <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Newer firmware can use the NQ ring ID associated with each RX/RX AGG ring to enable PCIe Steering Tags on P5_PLUS chips. When allocating RX/RX AGG rings, pass along NQ ring ID for the firmware to use. This information helps optimize DMA writes by directing them to the cache closer to the CPU consuming the data, potentially improving the processing speed. This change is backward-compatible with older firmware, which will simply disregard the information. Reviewed-by: Hongguang Gao <[email protected]> Reviewed-by: Ajit Khaparde <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Signed-off-by: Andy Gospodarek <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: NipaLocal <nipa@local>

In order to program the correct Steering Tag during an IRQ affinity change, we need to free/re-allocate the RX completion ring during queue_restart. If TPH is enabled, call FW to free the Rx completion ring and clear the ring entries in queue_stop(). Re-allocate it in queue_start() if TPH is enabled. Note that TPH mode is not enabled in this patch and will be enabled later in the patch series. While modifying bnxt_queue_start(), remove the unnecessary zeroing of rxr->rx_next_cons. It gets overwritten by the clone in bnxt_queue_start(). Reviewed-by: Michal Swiatkowski <[email protected]> Signed-off-by: Somnath Kotur <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: NipaLocal <nipa@local>

In order to use queue_stop/queue_start to support the new Steering Tags, we need to free the TX ring and TX completion ring if it is a combined channel with TX/RX sharing the same NAPI. Otherwise TX completions will not have the updated Steering Tag. If TPH is not enabled, we just stop the TX ring without freeing the TX/TX cmpl rings. With that we can now add napi_disable() and napi_enable() during queue_stop()/ queue_start(). This will guarantee that NAPI will stop processing the completion entries in case there are additional pending entries in the completion rings after queue_stop(). There could be some NQEs sitting unprocessed while NAPI is disabled thereby leaving the NQ unarmed. Explicitly re-arm the NQ after napi_enable() in queue start so that NAPI will resume properly. Error handling in bnxt_queue_start() requires a reset. If a TX ring cannot be allocated or initialized properly, it will cause TX timeout. The reset will also free any partially allocated rings. Reviewed-by: Ajit Khaparde <[email protected]> Reviewed-by: Michal Swiatkowski <[email protected]> Signed-off-by: Somnath Kotur <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Add TPH support to the Broadcom BNXT device driver. This allows the driver to utilize TPH functions for retrieving and configuring Steering Tags when changing interrupt affinity. With compatible NIC firmware, network traffic will be tagged correctly with Steering Tags, resulting in significant memory bandwidth savings and other advantages as demonstrated by real network benchmarks on TPH-capable platforms. Co-developed-by: Somnath Kotur <[email protected]> Signed-off-by: Somnath Kotur <[email protected]> Co-developed-by: Wei Huang <[email protected]> Signed-off-by: Wei Huang <[email protected]> Signed-off-by: Manoj Panicker <[email protected]> Reviewed-by: Ajit Khaparde <[email protected]> Reviewed-by: Andy Gospodarek <[email protected]> Signed-off-by: Michael Chan <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Expected behaviour: In case we reach scheduler's limit, pfifo_tail_enqueue() will drop a packet in scheduler's queue and decrease scheduler's qlen by one. Then, pfifo_tail_enqueue() enqueue new packet and increase scheduler's qlen by one. Finally, pfifo_tail_enqueue() return `NET_XMIT_CN` status code. Weird behaviour: In case we set `sch->limit == 0` and trigger pfifo_tail_enqueue() on a scheduler that has no packet, the 'drop a packet' step will do nothing. This means the scheduler's qlen still has value equal 0. Then, we continue to enqueue new packet and increase scheduler's qlen by one. In summary, we can leverage pfifo_tail_enqueue() to increase qlen by one and return `NET_XMIT_CN` status code. The problem is: Let's say we have two qdiscs: Qdisc_A and Qdisc_B. - Qdisc_A's type must have '->graft()' function to create parent/child relationship. Let's say Qdisc_A's type is `hfsc`. Enqueue packet to this qdisc will trigger `hfsc_enqueue`. - Qdisc_B's type is pfifo_head_drop. Enqueue packet to this qdisc will trigger `pfifo_tail_enqueue`. - Qdisc_B is configured to have `sch->limit == 0`. - Qdisc_A is configured to route the enqueued's packet to Qdisc_B. Enqueue packet through Qdisc_A will lead to: - hfsc_enqueue(Qdisc_A) -> pfifo_tail_enqueue(Qdisc_B) - Qdisc_B->q.qlen += 1 - pfifo_tail_enqueue() return `NET_XMIT_CN` - hfsc_enqueue() check for `NET_XMIT_SUCCESS` and see `NET_XMIT_CN` => hfsc_enqueue() don't increase qlen of Qdisc_A. The whole process lead to a situation where Qdisc_A->q.qlen == 0 and Qdisc_B->q.qlen == 1. Replace 'hfsc' with other type (for example: 'drr') still lead to the same problem. This violate the design where parent's qlen should equal to the sum of its childrens'qlen. Bug impact: This issue can be used for user->kernel privilege escalation when it is reachable. Fixes: f70f90672a2c ("sched: add head drop fifo queue") Reported-by: Quang Le <[email protected]> Signed-off-by: Quang Le <[email protected]> Signed-off-by: Cong Wang <[email protected]> Signed-off-by: NipaLocal <nipa@local>

…limit==0 When limit == 0, pfifo_tail_enqueue() must drop new packet and increase dropped packets count of the qdisc. All test results: 1..16 ok 1 a519 - Add bfifo qdisc with system default parameters on egress ok 2 585c - Add pfifo qdisc with system default parameters on egress ok 3 a86e - Add bfifo qdisc with system default parameters on egress with handle of maximum value ok 4 9ac8 - Add bfifo qdisc on egress with queue size of 3000 bytes ok 5 f4e6 - Add pfifo qdisc on egress with queue size of 3000 packets ok 6 b1b1 - Add bfifo qdisc with system default parameters on egress with invalid handle exceeding maximum value ok 7 8d5e - Add bfifo qdisc on egress with unsupported argument ok 8 7787 - Add pfifo qdisc on egress with unsupported argument ok 9 c4b6 - Replace bfifo qdisc on egress with new queue size ok 10 3df6 - Replace pfifo qdisc on egress with new queue size ok 11 7a67 - Add bfifo qdisc on egress with queue size in invalid format ok 12 1298 - Add duplicate bfifo qdisc on egress ok 13 45a0 - Delete nonexistent bfifo qdisc ok 14 972b - Add prio qdisc on egress with invalid format for handles ok 15 4d39 - Delete bfifo qdisc twice ok 16 d774 - Check pfifo_head_drop qdisc enqueue behaviour when limit == 0 Signed-off-by: Quang Le <[email protected]> Signed-off-by: Cong Wang <[email protected]> Signed-off-by: NipaLocal <nipa@local>

qdisc_tree_reduce_backlog() notifies parent qdisc only if child qdisc becomes empty, therefore we need to reduce the backlog of the child qdisc before calling it. Otherwise it would miss the opportunity to call cops->qlen_notify(), in the case of DRR, it resulted in UAF since DRR uses ->qlen_notify() to maintain its active list. Fixes: f8d4bc4 ("net/sched: netem: account for backlog updates from child qdisc") Cc: Martin Ottens <[email protected]> Reported-by: Mingi Cho <[email protected]> Signed-off-by: Cong Wang <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Integrate the test case provided by Mingi Cho into TDC. All test results: 1..4 ok 1 ca5e - Check class delete notification for ffff: ok 2 e4b7 - Check class delete notification for root ffff: ok 3 33a9 - Check ingress is not searchable on backlog update ok 4 a4b9 - Test class qlen notification Cc: Mingi Cho <[email protected]> Signed-off-by: Cong Wang <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Add read only access to the 32-entry MAC address TCAM via debugfs. BMC filtering shares the same table so this is quite useful to access during debug. See next commit for an example output. Signed-off-by: Alexander Duyck <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

…en adding unicast addrs I realized when we were adding unicast addresses we were enabling promiscous mode. I did a bit of digging and realized we had overlooked setting the driver private flag to indicate we supported unicast filtering. Example below shows the table with 00deadbeef01 as the main NIC address, and 5 additional addresses in the 00deadbeefX0 format. # cat $dbgfs/mac_addr Idx S TCAM Bitmap Addr/Mask ---------------------------------- 00 0 00000000,00000000 000000000000 000000000000 01 0 00000000,00000000 000000000000 000000000000 02 0 00000000,00000000 000000000000 000000000000 ... 24 0 00000000,00000000 000000000000 000000000000 25 1 00100000,00000000 00deadbeef50 000000000000 26 1 00100000,00000000 00deadbeef40 000000000000 27 1 00100000,00000000 00deadbeef30 000000000000 28 1 00100000,00000000 00deadbeef20 000000000000 29 1 00100000,00000000 00deadbeef10 000000000000 30 1 00100000,00000000 00deadbeef01 000000000000 31 0 00000000,00000000 000000000000 000000000000 Before rule 31 would be active. With this change it correctly sticks to just the unicast filters. Signed-off-by: Alexander Duyck <[email protected]> Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

-Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. Remove unused flexible-array member `buf` and, with this, fix the following warnings: drivers/net/ethernet/aquantia/atlantic/aq_hw.h:197:36: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] drivers/net/ethernet/aquantia/atlantic/hw_atl/../aq_hw.h:197:36: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Suggested-by: Igor Russkikh <[email protected]> Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: NipaLocal <nipa@local>

-Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. Move the conflicting declaration to the end of the structure. Notice that `struct ethtool_dump` is a flexible structure --a structure that contains a flexible-array member. Fix the following warning: ./drivers/net/ethernet/chelsio/cxgb4/cxgb4.h:1215:29: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Reviewed-by: Simon Horman <[email protected]> Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: NipaLocal <nipa@local>

-Wflex-array-member-not-at-end was introduced in GCC-14, and we are getting ready to enable it, globally. So, in order to avoid ending up with a flexible-array member in the middle of other structs, we use the `struct_group_tagged()` helper to create a new tagged `struct mlx5e_umr_wqe_hdr`. This structure groups together all the members of the flexible `struct mlx5e_umr_wqe` except the flexible array. As a result, the array is effectively separated from the rest of the members without modifying the memory layout of the flexible structure. We then change the type of the middle struct member currently causing trouble from `struct mlx5e_umr_wqe` to `struct mlx5e_umr_wqe_hdr`. We also want to ensure that when new members need to be added to the flexible structure, they are always included within the newly created tagged struct. For this, we use `static_assert()`. This ensures that the memory layout for both the flexible structure and the new tagged struct is the same after any changes. This approach avoids having to implement `struct mlx5e_umr_wqe_hdr` as a completely separate structure, thus preventing having to maintain two independent but basically identical structures, closing the door to potential bugs in the future. We also use `container_of()` whenever we need to retrieve a pointer to the flexible structure, through which we can access the flexible-array member, if necessary. So, with these changes, fix 124 of the following warnings: drivers/net/ethernet/mellanox/mlx5/core/en.h:664:48: warning: structure containing a flexible array member is not at the end of another structure [-Wflex-array-member-not-at-end] Signed-off-by: Gustavo A. R. Silva <[email protected]> Signed-off-by: NipaLocal <nipa@local>

tc_actions.sh keeps hanging the forwarding tests. sdf@: tdc & tdc-dbg started intermittenly failing around Sep 25th Signed-off-by: NipaLocal <nipa@local>

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

It won't show up in 6.8 as kmemleak did not report per-cpu allocation leaks. But even with the latest kernel, it's probabilistic, some data somewhere may look like a pointer and not be reported (I couldn't reproduce it on arm64). It turns out to be a false positive. The __percpu pointers are referenced from node_data[]. The latter is populated in alloc_node_data() and kmemleak registers the pg_data_t object from the memblock allocation. However, due to an incorrect pfn range check introduced by commit 84c3262 ("mm: kmemleak: check physical address when scan"), we ignore the node_data[0] allocation. Some printks in alloc_node_data() show: nd_pa = 0x3ffda140 nd_size = 0x4ec0 min_low_pfn = 0x0 max_low_pfn = 0x3ffdf nd_pa + nd_size == 0x3ffdf000 So the "PHYS_PFN(nd_pa + nd_size) >= max_low_pfn" check in kmemleak is true and the whole pg_data_t object ignored (not scanned). The __percpu pointers won't be detected. The fix is simple: Signed-off-by: NipaLocal <nipa@local>

On Wed, Jan 29, 2025 at 08:13:02AM +0100, Greg Kroah-Hartman wrote: > > Both are needed, actually. Slightly longer term I would rather > > split full_proxy_{read,write,lseek}() into short and full variant, > > getting rid of the "check which pointer is non-NULL" and killed > > the two remaining users of debugfs_real_fops() outside of > > fs/debugfs/file.c; then we could union these ->..._fops pointers, > > but until then they need to be initialized. > > > > And yes, ->methods obviously needs to be initialized. > > > > Al, bloody embarrassed ;-/ > > No worries, want to send a patch to fix both of these up so we can fix > up Linus's tree now? [PATCH] Fix the missing initializations in __debugfs_file_get() both method table pointers in debugfs_fsdata need to be initialized, obviously, and calculating the bitmap of present methods would also go better if we start with initialized state. Fixes: 41a0ecc "debugfs: get rid of dynamically allocation proxy_ops" Fucked-up-by: Al Viro <[email protected]> Signed-off-by: Al Viro <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Disable tests we don't care about, we use alltests in kunit. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

# Conflicts: # drivers/firmware/cirrus/Kconfig # lib/Kconfig.debug

kuba-moo force-pushed the to-test branch from 6bd5e75 to bdd05e2 Compare March 27, 2024 21:49

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch 3 times, most recently from 4f22ee0 to 8a9a8e0 Compare March 28, 2024 04:46

kuba-moo force-pushed the to-test branch 11 times, most recently from 64c403f to 8da1f58 Compare March 29, 2024 00:01

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch 3 times, most recently from 78ebb17 to 9325308 Compare March 29, 2024 02:14

kuba-moo force-pushed the to-test branch 6 times, most recently from c8c7b2f to a71aae6 Compare March 29, 2024 18:01

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch from 9325308 to 7940ae1 Compare March 29, 2024 18:12

kuba-moo force-pushed the to-test branch 2 times, most recently from d8feb00 to b16a6b9 Compare March 30, 2024 00:01

kernel-patches-daemon-bpf bot force-pushed the bpf-next_base branch from 7940ae1 to 8f1ff3c Compare March 30, 2024 00:21

kuba-moo force-pushed the to-test branch 2 times, most recently from 4164329 to c5cecb3 Compare March 30, 2024 06:00

mmhal and others added 29 commits February 3, 2025 22:00

vsock/test: Add test for SO_LINGER null ptr deref

ca41425

Explicitly close() a TCP_ESTABLISHED (connectible) socket with SO_LINGER enabled. May trigger a null pointer dereference. Signed-off-by: Michal Luczaj <[email protected]> Signed-off-by: NipaLocal <nipa@local>

forwarding: set timeout to 3 hours

5c9e844

tc_actions.sh keeps hanging the forwarding tests. sdf@: tdc & tdc-dbg started intermittenly failing around Sep 25th Signed-off-by: NipaLocal <nipa@local>

profile patch

3d922d2

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

tc_action dbg

2845f2a

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

selftests: net: enable profiling

0b5dc21

Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

kunit: try to disable broken and unneccessary tests

22d9273

Disable tests we don't care about, we use alltests in kunit. Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: NipaLocal <nipa@local>

Merge branch 'net-next-2025-02-04--06-00' into HEAD

d13f53a

# Conflicts: # drivers/firmware/cirrus/Kconfig # lib/Kconfig.debug

kuba-moo force-pushed the to-test branch from d5c0a9b to d13f53a Compare February 4, 2025 06:01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

netdev CI testing #6666

netdev CI testing #6666

kuba-moo commented Mar 27, 2024

netdev CI testing #6666

Are you sure you want to change the base?

netdev CI testing #6666

Conversation

kuba-moo commented Mar 27, 2024