Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cancellation events may not be processed #103

Open
elfmimi opened this issue Jan 1, 2025 · 3 comments
Open

Cancellation events may not be processed #103

elfmimi opened this issue Jan 1, 2025 · 3 comments

Comments

@elfmimi
Copy link
Contributor

elfmimi commented Jan 1, 2025

This issue is a sequel to the previous one I reported for probe-rs.
probe-rs/probe-rs#2995

When probe-rs is establishing connection with CMSIS-DAP probes,
it takes a moment longer than I would expect, depending on the setup of the USB devices.
one known setup I'm testing with is like this.

  1. ASMedia USB host controller ASM1142 (in my case)
  2. Use a USB Hub
  3. USB Full-Speed CMSIS-DAP debug probe
  4. Windows OS

The above mentioned issue was about the correct handling of the timeout.
But why the timeout here at the first place?

I have created a simple test code to mimic the initial part of probe-rs's connection process.
and I think I've come very close to what is happening.

the test code.
https://gist.github.com/elfmimi/a52416d87c1b616cd9c7145a6e272fe6

when it fails.

 TRACE nusb_cmsis_dap > initialized logger
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_0451&PID_0002\\5&521A615&0&1"
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_0819&PID_0101\\6&153B7FA1&0&1"
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_FEED&PID_6061\\0"
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_046D&PID_C52B\\6&153B7FA1&0&2"
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_0D28&PID_0204\\0000008004A0247A00000000000000000000000097969902"
 DEBUG nusb::platform::windows_winusb::device      > Creating device for "USB\\VID_0D28&PID_0204\\0000008004A0247A00000000000000000000000097969902"
 DEBUG nusb::platform::windows_winusb::device      > Claiming device DevInst(13) interface 3 with interface path `\\?\USB#VID_0D28&PID_0204&MI_03#8&167822a1&1&0003#{cdb3b5ad-293b-4663-aa36-1aae46463776}`
 DEBUG nusb::platform::windows_winusb::transfer    > Submit transfer 0x22808423bc0 on endpoint 81 for 64 bytes IN
Start Timeout
End Timeout
 DEBUG nusb::platform::windows_winusb::transfer    > Cancelling transfer 0x22808423bc0
 DEBUG nusb::platform::windows_winusb::transfer    > Submit transfer 0x228084235f0 on endpoint 01 for 2 bytes OUT
 DEBUG nusb::platform::windows_winusb::transfer    > Handling completion for transfer 0x228084235f0
 DEBUG nusb::platform::windows_winusb::transfer    > Transfer 0x228084235f0 on endpoint 01 complete: 2 bytes transferred
 DEBUG nusb::platform::windows_winusb::transfer    > Handling completion for transfer 0x22808423bc0
bulk_out done
 DEBUG nusb::platform::windows_winusb::transfer    > Submit transfer 0x22808423c80 on endpoint 81 for 64 bytes IN

the program gets stuck and never completes.

when it is okay.

 TRACE nusb_cmsis_dap > initialized logger
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_0451&PID_0002\\5&521A615&0&1"
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_0819&PID_0101\\6&153B7FA1&0&1"
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_FEED&PID_6061\\0"
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_046D&PID_C52B\\6&153B7FA1&0&2"
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_0D28&PID_0204\\0000008004A0247A00000000000000000000000097969902"
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_05E3&PID_0608\\6&567194E&0&4"
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_0451&PID_2046\\5&521A615&0&10"
 DEBUG nusb::platform::windows_winusb::device      > Creating device for "USB\\VID_0D28&PID_0204\\0000008004A0247A00000000000000000000000097969902"
 DEBUG nusb::platform::windows_winusb::device      > Opened WinUSB handle for \\?\USB#VID_0D28&PID_0204&MI_03#8&167822a1&1&0003#{cdb3b5ad-293b-4663-aa36-1aae46463776} (interface 3)
 DEBUG nusb::platform::windows_winusb::device      > Claiming interface 3 using handle for 3
 DEBUG nusb::platform::windows_winusb::transfer    > Submit transfer 0x2911e3a4c60 on endpoint 81 for 64 bytes IN
Start Timeout
End Timeout
 DEBUG nusb::platform::windows_winusb::transfer    > Cancelling transfer 0x2911e3a4c60
 DEBUG nusb::platform::windows_winusb::transfer    > Submit transfer 0x2911e3a4d20 on endpoint 01 for 2 bytes OUT
 DEBUG nusb::platform::windows_winusb::transfer    > Handling completion for transfer 0x2911e3a4c60
 DEBUG nusb::platform::windows_winusb::transfer    > Handling completion for transfer 0x2911e3a4d20
 DEBUG nusb::platform::windows_winusb::transfer    > Transfer 0x2911e3a4d20 on endpoint 01 complete: 2 bytes transferred
bulk_out done
 DEBUG nusb::platform::windows_winusb::transfer    > Submit transfer 0x2911e3a4990 on endpoint 81 for 64 bytes IN
 DEBUG nusb::platform::windows_winusb::transfer    > Handling completion for transfer 0x2911e3a4990
 DEBUG nusb::platform::windows_winusb::transfer    > Transfer 0x2911e3a4990 on endpoint 81 complete: 3 bytes transferred
recv 3 bytes
00, 01, 11
 DEBUG nusb::platform::windows_winusb::device      > Closing WinUSB handle for interface 3
OK

uncomment the following line, and then it starts to work fine every time.

// sleep(Duration::from_millis(1));

the same trick applies to probe-rs. adding sleep() to the end of drain() function seems to eliminate the delay I am seeing at the start of probe-rs's operation.
https://github.com/probe-rs/probe-rs/blob/5cc700f6160b1e3e40a0a569c635eed9371ec40b/probe-rs/src/probe/cmsisdap/commands/mod.rs#L252

btw, thanks for sharing the great piece of your work, the nusb!

@kevinmehall
Copy link
Owner

I tested and can't reproduce this, but I have a guess at what could be happening.

  1. The device has no data to send for the first IN transfer, because it does not yet have a command to reply to, so the transfer waits indefinitely.
  2. The Timer future completes first, so the transfer future is dropped.
  3. Dropping the transfer future requests cancellation of this transfer, but cancellation happens asynchronously.
  4. The OUT transfer is submitted, and the device receives and responds to it.
  5. The host controller has not yet processed the cancellation, so is still polling the IN endpoint and receives the response to the command sent in the OUT transfer.
  6. The cancelled transfer completes anyway, but because the future is dropped, the data is ignored.
  7. You then submit another IN transfer, but like in (1), the device doesn't have a command to reply to, so this never completes, and this one doesn't have a timeout.

This is consistent with your log showing both the event thread and the main thread handling the completion of the OUT transfer before we see the cancelled IN transfer come back. The extra delay gives the cancellation time to finish before submitting the OUT, which means the second IN transfer correctly receives the response.

You might be able to confirm this with the following patch to log the completion size and status earlier in the event handling, before it throws out the response because there is no future to wake.

diff --git a/src/platform/windows_winusb/transfer.rs b/src/platform/windows_winusb/transfer.rs
index b6134ba..14af6b9 100644
--- a/src/platform/windows_winusb/transfer.rs
+++ b/src/platform/windows_winusb/transfer.rs
@@ -326,7 +326,7 @@ impl PlatformSubmit<ControlOut<'_>> for TransferData {

 pub(super) fn handle_event(completion: *mut OVERLAPPED) {
     let completion = completion as *mut EventNotify;
-    debug!("Handling completion for transfer {completion:?}");
+    debug!("Handling completion for transfer {completion:?} size {} status {:x}", unsafe { (*completion).overlapped.InternalHigh }, unsafe { (*completion).overlapped.Internal });
     unsafe {
         let p = addr_of_mut!((*completion).ptr).read();
         notify_completion::<TransferData>(p)

I've known from the beginning that USB transfer semantics especially around cancellation are fundamentally incompatible with the way Rust Futures work. The Queue API is designed to address those issues, and in 0.2 I plan to promote the Queue API to be the main one, with convenience wrappers on top for simpler use cases. There definitely needs to be a simple way to do single transfers with a timeout that avoids these kinds of issues!

@elfmimi
Copy link
Contributor Author

elfmimi commented Jan 2, 2025

Thank you for your extensive explanation.

here are the logs with a suggested change to transfer.rs .

when it fails.

 TRACE nusb_cmsis_dap > initialized logger
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_0451&PID_0002\\5&521A615&0&1"
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_0819&PID_0101\\6&153B7FA1&0&1"
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_FEED&PID_6061\\0"
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_046D&PID_C52B\\6&153B7FA1&0&2"
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_0D28&PID_0204\\0000008004A0247A00000000000000000000000097969902"
 DEBUG nusb::platform::windows_winusb::device      > Creating device for "USB\\VID_0D28&PID_0204\\0000008004A0247A00000000000000000000000097969902"
 DEBUG nusb::platform::windows_winusb::device      > Claiming device DevInst(13) interface 3 with interface path `\\?\USB#VID_0D28&PID_0204&MI_03#8&167822a1&1&0003#{cdb3b5ad-293b-4663-aa36-1aae46463776}`
 DEBUG nusb::platform::windows_winusb::transfer    > Submit transfer 0x222c3ea3f30 on endpoint 81 for 64 bytes IN
Start Timeout
End Timeout
 DEBUG nusb::platform::windows_winusb::transfer    > Cancelling transfer 0x222c3ea3f30
 DEBUG nusb::platform::windows_winusb::transfer    > Submit transfer 0x222c3ea42c0 on endpoint 01 for 2 bytes OUT
 DEBUG nusb::platform::windows_winusb::transfer    > Handling completion for transfer 0x222c3ea42c0 size 2 status 0
 DEBUG nusb::platform::windows_winusb::transfer    > Handling completion for transfer 0x222c3ea3f30 size 3 status c0000120
 DEBUG nusb::platform::windows_winusb::transfer    > Transfer 0x222c3ea42c0 on endpoint 01 complete: 2 bytes transferred
bulk_out done
 DEBUG nusb::platform::windows_winusb::transfer    > Submit transfer 0x222c3ea42f0 on endpoint 81 for 64 bytes IN

when it is okay.

 TRACE nusb_cmsis_dap > initialized logger
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_0451&PID_0002\\5&521A615&0&1"
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_0819&PID_0101\\6&153B7FA1&0&1"
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_FEED&PID_6061\\0"
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_046D&PID_C52B\\6&153B7FA1&0&2"
 DEBUG nusb::platform::windows_winusb::enumeration > Probing device "USB\\VID_0D28&PID_0204\\0000008004A0247A00000000000000000000000097969902"
 DEBUG nusb::platform::windows_winusb::device      > Creating device for "USB\\VID_0D28&PID_0204\\0000008004A0247A00000000000000000000000097969902"
 DEBUG nusb::platform::windows_winusb::device      > Claiming device DevInst(13) interface 3 with interface path `\\?\USB#VID_0D28&PID_0204&MI_03#8&167822a1&1&0003#{cdb3b5ad-293b-4663-aa36-1aae46463776}`
 DEBUG nusb::platform::windows_winusb::transfer    > Submit transfer 0x28b4e0559c0 on endpoint 81 for 64 bytes IN
Start Timeout
End Timeout
 DEBUG nusb::platform::windows_winusb::transfer    > Cancelling transfer 0x28b4e0559c0
 DEBUG nusb::platform::windows_winusb::transfer    > Submit transfer 0x28b4e055a20 on endpoint 01 for 2 bytes OUT
 DEBUG nusb::platform::windows_winusb::transfer    > Handling completion for transfer 0x28b4e0559c0 size 0 status c0000120
 DEBUG nusb::platform::windows_winusb::transfer    > Handling completion for transfer 0x28b4e055a20 size 2 status 0
 DEBUG nusb::platform::windows_winusb::transfer    > Transfer 0x28b4e055a20 on endpoint 01 complete: 2 bytes transferred
bulk_out done
 DEBUG nusb::platform::windows_winusb::transfer    > Submit transfer 0x28b4e055b10 on endpoint 81 for 64 bytes IN
 DEBUG nusb::platform::windows_winusb::transfer    > Handling completion for transfer 0x28b4e055b10 size 3 status 0
 DEBUG nusb::platform::windows_winusb::transfer    > Transfer 0x28b4e055b10 on endpoint 81 complete: 3 bytes transferred
recv 3 bytes
00, 01, 11
OK

note: 0xC0000120 is the value of STATUS_CANCELLED

Yes, now it's clear even for me that, in the failure case, 3 bytes of data is being trapped by the first IN transfer which has already been cancelled.

next I will try to implement drain() function with nusb::transfer::Queue .

@elfmimi
Copy link
Contributor Author

elfmimi commented Jan 3, 2025

I realized that I can just modify read_bulk() to achieve cancel-safety regarding timeout.

Before:

    fn read_bulk(&self, endpoint: u8, buf: &mut [u8], timeout: Duration) -> io::Result<usize> {
        let fut = async {
            let comp = self.bulk_in(endpoint, RequestBuffer::new(buf.len())).await;
            comp.status.map_err(io::Error::other)?;
            let n = comp.data.len();
            buf[..n].copy_from_slice(&comp.data);
            Ok(n)
        };
        block_on(fut.or(async {
            Timer::after(timeout).await;
            Err(std::io::ErrorKind::TimedOut.into())
        }))
    }


After:

    fn read_bulk(&self, endpoint: u8, buf: &mut [u8], timeout: Duration) -> io::Result<usize> {
        let mut queue = self.bulk_in_queue(endpoint);
        queue.submit(RequestBuffer::new(buf.len()));
        let Some(comp) = block_on(
            async {
                let comp = queue.next_complete().await;
                Some(comp)
            }
            .or(async {
                Timer::after(timeout).await;
                None
            }),
        ) else {
            queue.cancel_all();
            let _ = block_on(queue.next_complete());
            return Err(std::io::ErrorKind::TimedOut.into());
        };
        comp.status.map_err(io::Error::other)?;
        let n = comp.data.len();
        buf[..n].copy_from_slice(&comp.data);
        Ok(n)
    }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants