Use RAW_IO on Windows for bulk and interrupt IN endpoints #6
Comments
I'm the author of the libusb RAW_IO support and have been running into this issue on different projects for many years. Everyone's instinct is always to try to abstract the issue away, but it's trickier than it seems. I don't think your approach of requiring that all inbound bulk & interrupt transfers be a multiple of maxPacketSize is viable if you want the library to be usable for any and all devices, and I think that limitation should be removed.

Sometimes you will need to read a specific, small number of bytes from an endpoint, and the device just doesn't have any more than that to give you. Think simple serial protocols implemented over a pair of bulk endpoints, for instance. Making the host keep polling for more bytes than are actually expected may cause unwanted behaviour. Even if it is safe to keep polling, the maxPacketSize limitation means that the caller cannot express their wish for a transfer to return once it has N bytes, or at a given timeout. They must either set a long timeout, and wait unnecessarily for more bytes even after N have arrived, or set a short one that works in most cases and then handle the case where the device is a little slower. It would make sense for the caller to be able to express this directly.

This raises a related issue, though: does a Queue have exclusive use of its endpoint? The current API appears to allow individual transfers to be submitted on an endpoint whilst that endpoint is simultaneously in use by a Queue. Personally I would have been inclined to make the API use an exclusive per-endpoint object.
If you submit an IN transfer of maxPacketSize, and the device responds with a packet that is smaller than that, it's a short packet that ends the transfer and you get the data immediately, right? Or does WinUSB with RAW_IO not behave like other OSs in that regard? It seems like the only drawback is requiring you to allocate more memory than you actually use if you know the device never sends a full packet, but we're talking about <1024 bytes per transfer here, so not a big deal on the host.
Currently, no. If you use multiple queues on an endpoint or submit transfers outside of a queue, the ordering is not defined and you probably shouldn't do this, but it won't do anything bad.
This was my initial plan, basically making each endpoint its own object.
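To make the short-packet behaviour described a couple of replies up concrete, here is a rough sketch of the kind of caller-side loop being discussed. The names used (bulk_in_queue, RequestBuffer, next_complete, the data field on the completion) are assumptions about what nusb's Queue API looks like, not verified signatures:

```rust
use nusb::transfer::RequestBuffer;

// Stream from a bulk IN endpoint by keeping several requests in flight.
// Every request length is a multiple of maxPacketSize (here 512 * 32).
async fn read_stream(interface: &nusb::Interface) {
    const LEN: usize = 512 * 32;
    let mut queue = interface.bulk_in_queue(0x81);
    for _ in 0..4 {
        queue.submit(RequestBuffer::new(LEN));
    }
    loop {
        // Error handling via the completion status is omitted for brevity.
        let completion = queue.next_complete().await;
        let data = completion.data;
        // A short packet completes the transfer early, so `data` may be
        // shorter than LEN and is delivered as soon as it arrives.
        if data.len() < LEN {
            break; // e.g. treat the short packet as end of stream
        }
        queue.submit(RequestBuffer::new(LEN));
    }
    // Any transfers still queued here would need to be cancelled or drained.
}
```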
Ah, we're thinking at different layers of abstraction. I was thinking in terms of asking for length N causing the library to make as many requests to the OS as needed to fill that length, until hitting a timeout if set. Since you're treating a short packet as ending a transfer, and also requiring the use of a specific buffer type, this is a lower-level API than the one I had in mind.

There are also cases where one may need to receive exactly N bytes from an endpoint, and where those N bytes may come in multiple short packets, but that can be addressed one layer above this API.
It may not do anything bad in the sense of memory safety, but it still seems bad - is there any use case where it would do something useful?
I think you're seeing a false dichotomy here; it's not a question of either one or the other.
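On the "receive exactly N bytes, one layer above this API" point from a couple of replies up, a layered helper might look roughly like this. It reuses the same assumed Queue/RequestBuffer names as the earlier sketch and is illustrative only:

```rust
use nusb::transfer::{Queue, RequestBuffer};

// Collect exactly `n` bytes from an IN endpoint, even if they arrive as
// several short packets. Each request is still a multiple of maxPacketSize.
async fn read_exactly(queue: &mut Queue<RequestBuffer>, n: usize, max_packet_size: usize) -> Vec<u8> {
    let mut out = Vec::with_capacity(n);
    while out.len() < n {
        let want = (n - out.len()).div_ceil(max_packet_size) * max_packet_size;
        queue.submit(RequestBuffer::new(want));
        let completion = queue.next_complete().await;
        if completion.data.is_empty() {
            break; // zero-length packet: stop rather than spin forever
        }
        out.extend_from_slice(&completion.data);
    }
    // Anything beyond `n` is discarded here, which is exactly the hazard
    // discussed above when the host polls for more than the protocol expects.
    out.truncate(n);
    out
}
```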
I started sketching out what a PR for this might look like. There are a few decisions to take.

The first one is whether to require a program to specifically request RAW_IO. Personally I would love it if everything just worked, at maximum performance, without downstream developers ever having to think specifically about this nasty little quirk of WinUSB. And unlike libusb, you've made the decision upfront to require that all transfers be a multiple of the endpoint's maximum packet size, which makes it possible for this to be done transparently. So I'm in favour of this being fully automatic, but making that choice does impose some new complexity: because RAW_IO imposes a per-transfer size limit, nusb would have to transparently split up large transfers into smaller ones.

If nusb does take that approach, there's a further choice: when should it actually make the necessary calls to enable RAW_IO and query the endpoint's transfer size limit? This is one area where I feel like including an explicit per-endpoint object in the API would help, since it would give an obvious place to do that setup.
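For reference, the WinUSB side of that setup is small. Here is a hand-rolled sketch: the FFI declarations and the enable_raw_io helper are written out purely for illustration, and a real implementation would presumably go through whatever Windows bindings nusb already uses:

```rust
use std::ffi::c_void;

// Pipe policy types from winusbio.h.
const RAW_IO: u32 = 0x07;
const MAXIMUM_TRANSFER_SIZE: u32 = 0x08;

#[link(name = "winusb")]
extern "system" {
    fn WinUsb_SetPipePolicy(handle: *mut c_void, pipe_id: u8, policy: u32,
                            value_len: u32, value: *const c_void) -> i32;
    fn WinUsb_GetPipePolicy(handle: *mut c_void, pipe_id: u8, policy: u32,
                            value_len: *mut u32, value: *mut c_void) -> i32;
}

/// Enable RAW_IO on an IN endpoint and return the per-transfer size limit
/// Windows reports for it, or None if the policy change or query failed.
unsafe fn enable_raw_io(handle: *mut c_void, pipe_id: u8) -> Option<u32> {
    let enable: u8 = 1;
    if WinUsb_SetPipePolicy(handle, pipe_id, RAW_IO,
                            1, &enable as *const u8 as *const c_void) == 0 {
        // Failure here is only a performance loss: log and fall back to
        // the default (serialized) behaviour.
        return None;
    }
    let mut max: u32 = 0;
    let mut len = std::mem::size_of::<u32>() as u32;
    if WinUsb_GetPipePolicy(handle, pipe_id, MAXIMUM_TRANSFER_SIZE,
                            &mut len, &mut max as *mut u32 as *mut c_void) == 0 {
        return None;
    }
    Some(max) // e.g. 0x1FEC00 (~2.1 MB) in practice on recent Windows
}
```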
Yes, I think it would ideally be automatic. What kind of per-transfer size limits are we talking about here? I remember seeing 4 MB somewhere, which is similar in magnitude to what you might want to limit yourself to on Linux, given its 16 MB default system-wide transfer limit. I'm wondering if it's viable to punt that to the library user, either by documentation or by exposing the limit in a new API (though is there an API that would make sense for that cross-platform?).

My design philosophy with nusb has been to keep everything as thin a wrapper around OS capabilities as possible and avoid hidden internal state. So I think it would ideally not split up transfers, but I could be convinced that it's worth it here if the limit is small. Libusb goes too far in trying to cover up OS differences, but it is possible to go too far in the other direction and make it hard to write portable code.
Yeah, I've tentatively decided that this is a good idea for v0.2 or v1.0. I kind of want to get the library feature-complete before redesigning the API, though, just to make sure all constraints are considered (e.g. #11, #47).
In older versions of Windows, IIRC, it can be as small as 64 KB minus a page or so. These days it's usually 2.1 MB (0x1FEC00), but I think it can also be ~256 KB for Low/Full speed devices. The problem with trying to document it is that there is no guarantee of these numbers. There are no defined constants; you can only query what the limit is at runtime for a specific endpoint. We could document the behaviour seen in practice, but Windows could legitimately just decide to start using different limits.
I think that the options are either:

1. nusb transparently splits any transfer larger than the OS limit into as many OS-level transfers as needed, and reports a single completion for the whole thing; or
2. nusb exposes the per-endpoint limit and leaves it to the caller to keep each transfer within it.
Personally I'm in favour of option 1. I feel like option 2 just creates some extra needless hoops for the caller to jump through, without really providing any additional control or insight as a result. From the program's point of view, the possible outcomes for an internally-subdivided transfer are exactly the same as for one that's submitted as a unit: it either completes successfully, or it completes partially with an error.

Option 2 may also lead to surprising differences in behaviour when the same program is run on different systems. Consider a program that needs to make a very large read. It queries the maximum transfer size it can use, and splits its transfers accordingly, as required. On the developer's Windows system the program reads at say 20-40 MB/s and the max transfer size is 2.1 MB. They find they can use the completion of a transfer as a good opportunity to update a progress bar during a large read operation; on Windows it will update smoothly at 10-20 Hz or so. Then a user runs that same program on another OS with no transfer size limit, and now the progress bar hangs until the end of the operation. Or they run it on a system with a very small transfer limit, and now it eats a lot of CPU because it's trying to redraw the UI too often.

If we choose option 1, then the behaviour from the point of view of the caller is always consistent. If you submit a large transfer, you get a single event when it completes. If you want more frequent events, you submit smaller transfers. The observed behaviour is always the same, even if different things may be happening behind the scenes.

In fact it's not necessary to choose just one or the other. We could both provide an API that allows querying the maximum OS transfer size supported for an endpoint, and also give nusb the ability to split up transfers that are larger than this. Then if the caller does not want their transfers to be split, they can query the max size and act accordingly. If they don't care, they can just submit whatever size they want - provided it's a multiple of max packet size - and nusb will deal with it.
I'm not sure how fallible it is either, but I don't think there's ever a case where we'd have to signal an error there. If SetPipePolicy fails, we can just log a warning and continue without RAW_IO; it's only a performance loss. In libusb this issue is much nastier, because they may need to make it possible to disable RAW_IO in order to go back to submitting odd-size transfers. If SetPipePolicy fails at that point, the program is completely stuck.
That makes sense. One thing I think it would help with is reducing the amount of locking needed. Currently you may have multiple threads working on different endpoints, but they all need to share the interface handle.
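To illustrate the kind of internal subdivision option 1 implies, here is a minimal sketch of the size arithmetic involved. The names are placeholders; the only real constraint shown is that each chunk must itself stay a multiple of maxPacketSize:

```rust
// Split one logical request into OS-level chunk sizes that respect the
// per-endpoint limit reported by Windows (e.g. 0x1FEC00 bytes).
fn plan_chunks(total_len: usize, os_limit: usize, max_packet_size: usize) -> Vec<usize> {
    // Round the OS limit down to a multiple of maxPacketSize so every
    // sub-transfer still satisfies the multiple-of-maxPacketSize rule.
    let chunk = os_limit - (os_limit % max_packet_size);
    debug_assert!(chunk > 0 && total_len % max_packet_size == 0);
    let mut sizes = Vec::new();
    let mut remaining = total_len;
    while remaining > 0 {
        let n = remaining.min(chunk);
        sizes.push(n);
        remaining -= n;
    }
    sizes
}

// plan_chunks(8 * 1024 * 1024, 0x1FEC00, 512) yields four chunks of
// 2,092,032 bytes plus a final 20,480-byte chunk.
```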
Rust dropped support for Windows versions older than 10, which seems like a reasonable minimum support target, and Windows 10 is the only version I've tested on.
Yeah, I wouldn't be comfortable with promising a specific value either, but the trend seems to be upwards, and it seems unlikely that Microsoft would lower the limit.
Does WinUSB have a way to atomically cancel subsequent queued transfers on an endpoint when one completes with a short packet?
I've been meaning to write some examples / convenience wrappers on top of the Queue API.

Anyway, I'm not sure here. I think I'd prefer to start with the simple approach of not splitting transfers, and revisit if it actually becomes a practical problem for someone. But if you've already started that code, I'd be happy to take a look at it.
I agree that targeting Windows 10 for support/testing is reasonable, but note that support was not completely dropped for older versions. Rust is still able to build programs for even much older Windows versions. They just no longer have Tier 1 support.
That's a great point, and no, not that I'm aware of. In all the cases I've encountered where I've needed RAW_IO, I've been streaming from endpoints where either I never expect a short packet at all, or where a short packet marks the end of the stream and no further data will follow if the device is polled again - or at least not unless some control request is sent first.

If you had an endpoint that will send chunks of data one after the other, with each delimited by a short packet, and you wanted to be sure to grab only one chunk and not poll further -- then I think really it's not safe to be queuing up multiple transfers at all. You could only do so if you could be certain there was a feature like that atomic cancellation available.
That sounds good, and would certainly alleviate a lot of the hoop-jumping.
OK. I hadn't gotten as far as implementing that, so I'll go with the simpler option and see how it looks.
Windows serializes IN transfers by default, negating the usefulness of Queue. RAW_IO disables this.

Libusb is adding an option for RAW_IO and decided to not turn it on by default for fear of breaking existing code that makes requests that are not a multiple of maxPacketSize. But nusb has no such consideration, and already documents that IN transfers must be a multiple of maxPacketSize, because it's a good idea anyway to avoid errors on overflow. I think RAW_IO should just be turned on unconditionally, and not doing so is just failing to provide the streaming behavior that users would expect.