rust: control factory: fix bug of control factory #13
Closed
Makefile: make: add rros_defconfig
The rros_defconfig enables the RROS kernel config and the related options, such as `General setup->Rust support` and `Kernel hacking->Rust hacking->Optimization level`. BUPT-OS#10
- An incorrect `PAGE_SHARED` setting made memory obtained through mmap unwritable. I worked around it temporarily with a HACK, but I think this part could be implemented by Rust-For-Linux.
- The `rust_helper_pa` function calls the `__virt_to_phys` macro in C code. The macro evaluates to a phys_addr_t (typedef u64 phys_addr_t), but the helper's return type was int, which truncated the return value, so I changed it to unsigned long. BUPT-OS#12
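The truncation described for `rust_helper_pa` can be reproduced in plain user-space C. This is a minimal sketch, not the RROS helper itself: `fake_virt_to_phys`, its offset, and the `helper_pa_*` names are hypothetical stand-ins, and it assumes an LP64 target where `unsigned long` is 64 bits wide.

```c
#include <assert.h>
#include <stdint.h>

/* Mirrors `typedef u64 phys_addr_t` from the description. */
typedef uint64_t phys_addr_t;

/* Hypothetical stand-in for the kernel's __virt_to_phys(); the offset is made up. */
static phys_addr_t fake_virt_to_phys(uint64_t vaddr)
{
    return vaddr - 0xffff800000000000ULL;
}

/* Before: an int return type keeps only the low 32 bits of the address
 * (the conversion is implementation-defined; GCC and Clang wrap modulo 2^32). */
static int helper_pa_int(uint64_t vaddr)
{
    return (int)fake_virt_to_phys(vaddr);
}

/* After: unsigned long holds the full 64-bit address on LP64 targets. */
static unsigned long helper_pa_ulong(uint64_t vaddr)
{
    return (unsigned long)fake_virt_to_phys(vaddr);
}

/* Small self-check: a physical address above 4 GiB survives only in the fixed helper. */
static void demo_truncation(void)
{
    uint64_t vaddr = 0xffff800123456000ULL;   /* maps to 0x123456000 in this model */
    unsigned long full = helper_pa_ulong(vaddr);
    uint32_t low = (uint32_t)helper_pa_int(vaddr);

    assert(full == 0x123456000UL);
    assert(low == 0x23456000u);               /* bit 32 was lost */
    assert((unsigned long)low != full);
}
```

Calling `demo_truncation()` from a test harness exercises both variants; any physical address that fits in 31 bits would hide the bug, which is one reason it can go unnoticed on small-memory setups.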
Maybe you need to pull the latest mainline? Why do you push three commits?
The CI failed with
Fix it and organize your commits with
This was referenced Dec 17, 2023
Merged
There is a new pull request #29. Close this.
shannmu pushed a commit to shannmu/RROS that referenced this pull request Dec 15, 2024
When a system with E810 with existing VFs gets rebooted the following hang may be observed. Pid 1 is hung in iavf_remove(), part of a network driver:

    PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow"
     #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb
     #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d
     #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc
     #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930
     #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf]
     #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513
     #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa
     #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc
     #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e
     #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429
    #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4
    #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice]
    #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice]
    #13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice]
    #14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1
    #15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386
    #16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870
    #17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6
    #18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159
    #19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc
    #20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d
    #21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169
    #22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b
        RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202
        RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7
        RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead
        RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90
        R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005
        R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000
        ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b

During reboot all drivers PM shutdown callbacks are invoked. In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE. In ice_shutdown() the call chain above is executed, which at some point calls iavf_remove(). However iavf_remove() expects the VF to be in one of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If that's not the case it sleeps forever. So if iavf_shutdown() gets invoked before iavf_remove() the system will hang indefinitely because the adapter is already in state __IAVF_REMOVE.

Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE, as we already went through iavf_shutdown().

Fixes: 9745780 ("iavf: Add waiting so the port is initialized in remove")
Fixes: a841733 ("iavf: Fix race condition between iavf_shutdown and iavf_remove")
Reported-by: Marius Cornea <[email protected]>
Signed-off-by: Stefan Assmann <[email protected]>
Reviewed-by: Michal Kubiak <[email protected]>
Tested-by: Rafal Romanowski <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
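The hang and the early-return fix can be modeled in a few lines of user-space C. This is only a sketch of the control flow the commit message describes, not the driver code: the `VF_*` states, `model_remove`, and the bounded poll count (used so the model terminates instead of sleeping forever) are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the adapter states named in the commit message. */
enum vf_state { VF_INIT_FAILED, VF_DOWN, VF_RUNNING, VF_REMOVE };

enum remove_result {
    REMOVE_DONE,     /* state was acceptable, teardown proceeds */
    REMOVE_SKIPPED,  /* shutdown already ran, nothing left to do */
    REMOVE_STUCK     /* would sleep forever in the real driver */
};

/* Models the wait loop in remove(); max_polls bounds what is an endless loop in the kernel. */
static enum remove_result model_remove(enum vf_state state, bool with_fix, int max_polls)
{
    /* The fix: bail out early when shutdown has already set the REMOVE state. */
    if (with_fix && state == VF_REMOVE)
        return REMOVE_SKIPPED;

    for (int i = 0; i < max_polls; i++) {
        if (state == VF_RUNNING || state == VF_DOWN || state == VF_INIT_FAILED)
            return REMOVE_DONE;
        /* The real driver sleeps here; nothing changes the state in this model. */
    }
    return REMOVE_STUCK;
}

/* Self-check covering the reboot ordering from the report. */
static void demo_remove_after_shutdown(void)
{
    assert(model_remove(VF_REMOVE, false, 1000) == REMOVE_STUCK);
    assert(model_remove(VF_REMOVE, true, 1000) == REMOVE_SKIPPED);
    assert(model_remove(VF_RUNNING, true, 1000) == REMOVE_DONE);
}
```

The model makes the ordering dependency explicit: without the early return, a remove call that arrives after shutdown has no state transition left to wait for.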
shannmu pushed a commit to shannmu/RROS that referenced this pull request Dec 15, 2024
The following processes run into a deadlock. CPU 41 was waiting for CPU 29 to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29 was hung by that spinlock with IRQs disabled.

    PID: 17360 TASK: ffff95c1090c5c40 CPU: 41 COMMAND: "mrdiagd"
    !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0
    !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0
    !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0
     # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0
     # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0
     # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0
     # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0
     # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0
     # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0
     # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0
     #10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0
     #11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0
     #12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0
     #13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0
     #14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0
     #15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0
     #16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0
     #17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0
     #18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0
     #19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0
     #20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0
     #21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0
     #22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0
     #23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0
     #24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0
     #25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0
     #26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0
     #27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0

    PID: 17355 TASK: ffff95c1090c3d80 CPU: 29 COMMAND: "mrdiagd"
    !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0
    !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0
     # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0
     # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0
     # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0
     # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0
     # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0
     # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0
     # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0
     # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0
     #10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0
     #11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0
     #12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0
     #13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0
     #14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0
     #15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0
     #16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0
     #17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0

The lock is used to synchronize different sysfs operations, it doesn't protect any resource that will be touched by an interrupt. Consequently it's not required to disable IRQs. Replace the spinlock with a mutex to fix the deadlock.

Signed-off-by: Junxiao Bi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Reviewed-by: Mike Christie <[email protected]>
Cc: [email protected]
Signed-off-by: Martin K. Petersen <[email protected]>
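Why swapping the lock type breaks the cycle can be shown with a deliberately tiny model. It is a sketch under simplifying assumptions, not kernel code: `cpu_model`, `lock_kind`, and `holder_flush_completes` are hypothetical names, the cross-CPU request is reduced to a single pending flag, and a bounded poll count stands in for the endless wait.

```c
#include <assert.h>
#include <stdbool.h>

/* Two ways the second CPU can wait for the lock. */
enum lock_kind { LOCK_SPIN_IRQSAVE, LOCK_MUTEX };

/* Hypothetical, heavily simplified view of the CPU that is waiting for the lock. */
struct cpu_model {
    bool irqs_enabled;
    bool ipi_pending;
};

/* A spin_lock_irqsave() waiter spins with IRQs off; a mutex waiter sleeps with IRQs on. */
static void block_on_lock(struct cpu_model *cpu, enum lock_kind kind)
{
    cpu->irqs_enabled = (kind == LOCK_MUTEX);
}

/* The lock holder sends a cross-CPU request (the TLB flush in the trace) and waits
 * for the other CPU to handle it. Returns true if the request is handled in time. */
static bool holder_flush_completes(enum lock_kind kind, int max_polls)
{
    struct cpu_model waiter = { .irqs_enabled = true, .ipi_pending = false };

    block_on_lock(&waiter, kind);   /* second writer is stuck behind the lock */
    waiter.ipi_pending = true;      /* holder requests the cross-CPU call */

    for (int i = 0; i < max_polls; i++) {
        if (waiter.irqs_enabled)
            waiter.ipi_pending = false;   /* interrupt is delivered and handled */
        if (!waiter.ipi_pending)
            return true;
    }
    return false;                   /* modeled deadlock: request never handled */
}

/* Self-check for both lock kinds. */
static void demo_lock_kinds(void)
{
    assert(!holder_flush_completes(LOCK_SPIN_IRQSAVE, 1000));
    assert(holder_flush_completes(LOCK_MUTEX, 1000));
}
```

The model leaves out everything about scheduling and real interrupt delivery; it only captures the one property the commit message relies on, namely that the waiter must be able to take interrupts while the holder is inside the critical section.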
shannmu pushed a commit to shannmu/RROS that referenced this pull request Dec 15, 2024
Currently when rmdir on an instance is done, eventfs_remove_events_dir() is called and it does a dput on the dentry and then frees the eventfs_inode that represents the events directory.

But there's no protection against a reader reading the top level events directory at the same time and we can get a use after free error. Instead, use the dput() associated to the dentry to also free the eventfs_inode associated to the events directory, as that will get called when the last reference to the directory is released.

This issue triggered the following KASAN report:

    ==================================================================
    BUG: KASAN: slab-use-after-free in eventfs_root_lookup+0x88/0x1b0
    Read of size 8 at addr ffff888120130ca0 by task ftracetest/1201

    CPU: 4 PID: 1201 Comm: ftracetest Not tainted 6.5.0-test-10737-g469e0a8194e7 #13
    Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
    Call Trace:
     <TASK>
     dump_stack_lvl+0x57/0x90
     print_report+0xcf/0x670
     ? __pfx_ring_buffer_record_off+0x10/0x10
     ? _raw_spin_lock_irqsave+0x2b/0x70
     ? __virt_addr_valid+0xd9/0x160
     kasan_report+0xd4/0x110
     ? eventfs_root_lookup+0x88/0x1b0
     ? eventfs_root_lookup+0x88/0x1b0
     eventfs_root_lookup+0x88/0x1b0
     ? eventfs_root_lookup+0x33/0x1b0
     __lookup_slow+0x194/0x2a0
     ? __pfx___lookup_slow+0x10/0x10
     ? down_read+0x11c/0x330
     walk_component+0x166/0x220
     link_path_walk.part.0.constprop.0+0x3a3/0x5a0
     ? seqcount_lockdep_reader_access+0x82/0x90
     ? __pfx_link_path_walk.part.0.constprop.0+0x10/0x10
     path_openat+0x143/0x11f0
     ? __lock_acquire+0xa1a/0x3220
     ? __pfx_path_openat+0x10/0x10
     ? __pfx___lock_acquire+0x10/0x10
     do_filp_open+0x166/0x290
     ? __pfx_do_filp_open+0x10/0x10
     ? lock_is_held_type+0xce/0x120
     ? preempt_count_sub+0xb7/0x100
     ? _raw_spin_unlock+0x29/0x50
     ? alloc_fd+0x1a0/0x320
     do_sys_openat2+0x126/0x160
     ? rcu_is_watching+0x34/0x60
     ? __pfx_do_sys_openat2+0x10/0x10
     ? __might_resched+0x2cf/0x3b0
     ? __fget_light+0xdf/0x100
     __x64_sys_openat+0xcd/0x140
     ? __pfx___x64_sys_openat+0x10/0x10
     ? syscall_enter_from_user_mode+0x22/0x90
     ? lockdep_hardirqs_on+0x7d/0x100
     do_syscall_64+0x3b/0xc0
     entry_SYSCALL_64_after_hwframe+0x6e/0xd8
    RIP: 0033:0x7f1dceef5e51
    Code: 75 57 89 f0 25 00 00 41 00 3d 00 00 41 00 74 49 80 3d 9a 27 0e 00 00 74 6d 89 da 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 93 00 00 00 48 8b 54 24 28 64 48 2b 14 25
    RSP: 002b:00007fff2cddf380 EFLAGS: 00000202 ORIG_RAX: 0000000000000101
    RAX: ffffffffffffffda RBX: 0000000000000241 RCX: 00007f1dceef5e51
    RDX: 0000000000000241 RSI: 000055d7520677d0 RDI: 00000000ffffff9c
    RBP: 000055d7520677d0 R08: 000000000000001e R09: 0000000000000001
    R10: 00000000000001b6 R11: 0000000000000202 R12: 0000000000000000
    R13: 0000000000000003 R14: 000055d752035678 R15: 000055d752067788
     </TASK>

    Allocated by task 1200:
     kasan_save_stack+0x2f/0x50
     kasan_set_track+0x21/0x30
     __kasan_kmalloc+0x8b/0x90
     eventfs_create_events_dir+0x54/0x220
     create_event_toplevel_files+0x42/0x130
     event_trace_add_tracer+0x33/0x180
     trace_array_create_dir+0x52/0xf0
     trace_array_create+0x361/0x410
     instance_mkdir+0x6b/0xb0
     tracefs_syscall_mkdir+0x57/0x80
     vfs_mkdir+0x275/0x380
     do_mkdirat+0x1da/0x210
     __x64_sys_mkdir+0x74/0xa0
     do_syscall_64+0x3b/0xc0
     entry_SYSCALL_64_after_hwframe+0x6e/0xd8

    Freed by task 1251:
     kasan_save_stack+0x2f/0x50
     kasan_set_track+0x21/0x30
     kasan_save_free_info+0x27/0x40
     __kasan_slab_free+0x106/0x180
     __kmem_cache_free+0x149/0x2e0
     event_trace_del_tracer+0xcb/0x120
     __remove_instance+0x16a/0x340
     instance_rmdir+0x77/0xa0
     tracefs_syscall_rmdir+0x77/0xc0
     vfs_rmdir+0xed/0x2d0
     do_rmdir+0x235/0x280
     __x64_sys_rmdir+0x5f/0x90
     do_syscall_64+0x3b/0xc0
     entry_SYSCALL_64_after_hwframe+0x6e/0xd8

    The buggy address belongs to the object at ffff888120130ca0
     which belongs to the cache kmalloc-16 of size 16
    The buggy address is located 0 bytes inside of
     freed 16-byte region [ffff888120130ca0, ffff888120130cb0)

    The buggy address belongs to the physical page:
    page:000000004dbddbb0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x120130
    flags: 0x17ffffc0000800(slab|node=0|zone=2|lastcpupid=0x1fffff)
    page_type: 0xffffffff()
    raw: 0017ffffc0000800 ffff8881000423c0 dead000000000122 0000000000000000
    raw: 0000000000000000 0000000000800080 00000001ffffffff 0000000000000000
    page dumped because: kasan: bad access detected

    Memory state around the buggy address:
     ffff888120130b80: 00 00 fc fc 00 05 fc fc 00 00 fc fc 00 02 fc fc
     ffff888120130c00: 00 07 fc fc 00 00 fc fc 00 00 fc fc fa fb fc fc
    >ffff888120130c80: 00 00 fc fc fa fb fc fc 00 00 fc fc 00 00 fc fc
                                   ^
     ffff888120130d00: 00 00 fc fc 00 00 fc fc 00 00 fc fc fa fb fc fc
     ffff888120130d80: 00 00 fc fc 00 00 fc fc 00 00 fc fc 00 00 fc fc
    ==================================================================

Link: https://lkml.kernel.org/r/[email protected]
Link: https://lore.kernel.org/all/[email protected]/
Cc: Ajay Kaher <[email protected]>
Cc: Masami Hiramatsu <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Andrew Morton <[email protected]>
Fixes: 5bdcd5f eventfs: ("Implement removal of meta data from eventfs")
Tested-by: Linux Kernel Functional Testing <[email protected]>
Tested-by: Naresh Kamboju <[email protected]>
Reported-by: Zheng Yejian <[email protected]>
Signed-off-by: Steven Rostedt (Google) <[email protected]>
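The idea of freeing the per-directory data only when the last reference is dropped can be sketched with a plain reference count. This is a single-threaded user-space illustration, not the eventfs implementation: `events_dir`, its `payload` field (standing in for the eventfs_inode), and the helper names are hypothetical, and a real kernel refcount would be atomic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a dentry that owns per-directory metadata. */
struct events_dir {
    int refcount;
    int *payload;   /* plays the role of the eventfs_inode */
};

/* Create a directory object holding one reference for its creator. */
static struct events_dir *events_dir_create(int value)
{
    struct events_dir *d = malloc(sizeof(*d));
    if (!d)
        return NULL;
    d->payload = malloc(sizeof(*d->payload));
    if (!d->payload) {
        free(d);
        return NULL;
    }
    *d->payload = value;
    d->refcount = 1;
    return d;
}

/* A reader takes a reference before touching the payload. */
static void events_dir_get(struct events_dir *d)
{
    d->refcount++;
}

/* Drop a reference; the payload is freed only with the last one. Returns true if freed. */
static bool events_dir_put(struct events_dir *d)
{
    if (--d->refcount > 0)
        return false;
    free(d->payload);
    free(d);
    return true;
}

/* rmdir racing with a reader: the reader's reference keeps the payload alive. */
static void demo_rmdir_with_reader(void)
{
    struct events_dir *d = events_dir_create(42);
    assert(d != NULL);

    events_dir_get(d);                        /* reader starts a lookup */
    bool freed_by_rmdir = events_dir_put(d);  /* rmdir drops the creator's reference */
    assert(!freed_by_rmdir);
    assert(*d->payload == 42);                /* reader still sees valid data */

    bool freed_by_reader = events_dir_put(d); /* reader finishes: last reference */
    assert(freed_by_reader);
}
```

Tying the payload's lifetime to the final put, rather than to the rmdir call, is what removes the window in which a concurrent reader could dereference freed memory.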