-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add descriptor parsing. #1
Conversation
Nice! I've been meaning to open some "next steps" issues here, with descriptors near the top of that list, but you beat me to it. For Windows, I believe this ioctl on the parent hub device allows getting descriptors from any device, not just those with the WinUSB driver bound. We already have code to find the parent hub device and port number for a similar ioctl used to get several fields for I also have the |
/// Returns the device's unparsed descriptors. | ||
/// | ||
/// To access parsed data, use [`configurations()`](Self::configurations). | ||
pub fn descriptors(&self) -> Result<Vec<u8>, Error> { | ||
crate::platform::get_descriptors(self) | ||
} | ||
|
||
/// Returns a collection of the device's configurations. | ||
pub fn configurations(&self) -> Result<impl Iterator<Item = descriptor::Configuration>, Error> { | ||
let descriptors = self.descriptors()?; | ||
descriptor::parse_configurations(&descriptors) | ||
.map_err(|_| Error::new(ErrorKind::Unsupported, "malformed descriptor")) | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
descriptors
is probably too generic of a name. It seems like other libraries call this "get config descriptor" even though it contains other nested descriptors and not strictly just the configuration descriptor.
Eventually, I think we'll want a fn get_descriptor(descriptor_type: u8, descriptor_index: u8) -> Result<Vec<u8>>
since that's ultimately what the request on the bus looks like. Then convenience methods like get_string_descriptor
, get_config_descriptor
that call into it, or use OS and type specific APIs to get cached data, and return more specific types. We do want to be clear what DeviceInfo
methods potentially perform IO vs return cached data.
Also not sure whether it should return all configurations, or just specified or current configuration like libusb. Multi-configuration devices are rare in any case.
What if instead of separate methods for raw and parsed access, the Configuration
wrapped Vec<u8>
instead of doing all the parsing upfront? Then it could have an into_inner()
and/or as_slice()
for raw access, and methods to iterate and parse child descriptors on demand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
descriptors is probably too generic of a name. It seems like other libraries call this "get config descriptor" even though it contains other nested descriptors and not strictly just the configuration descriptor.
I mirrored the Linux API, which gives device descriptor + all config descriptors concatenated. It's 99% likely what someone enumerating devices to decide which one to open needs.
Eventually, I think we'll want a fn get_descriptor(descriptor_type: u8, descriptor_index: u8) -> Result<Vec> since that's ultimately what the request on the bus looks like. Then convenience methods like get_string_descriptor, get_config_descriptor that call into it, or use OS and type specific APIs to get cached data, and return more specific types. We do want to be clear what DeviceInfo methods potentially perform IO vs return cached data.
I feel the general get_descriptor
would have to do bus IO? It'd be strange to have it cache the "main" descriptors (device, configuration) but do IO for the others. IIUC sysfs /descriptors
is cached by the kernel on device plug, it doesn't do IO.
IMO if get_descriptor
does IO it is another "feature", and perhaps it needs to go into Device
instead of DeviceInfo
. Perhaps we can remove DeviceInfo::descriptors()
for now, and think about adding something later?
What if instead of separate methods for raw and parsed access, the Configuration wrapped Vec instead of doing all the parsing upfront? Then it could have an into_inner() and/or as_slice() for raw access, and methods to iterate and parse child descriptors on demand.
if parsing is on-demand, iterators have to be impl Iterator<Item = Result<...>>
which I find much less ergonomic. It's nicer to parse everything upfront, so you can handle errors in 1 place only, and have all the iteration be infallible. It also allows grouping interface altesettings by number.
Also parsing on-demand is a bit harder: to go from one configuration to the next you have to iterate all descriptors one-by-one, vs the current code relies on the fact that parsing a child advances the reader, leaving it at the next child.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mirrored the Linux API, which gives device descriptor + all config descriptors concatenated. It's 99% likely what someone enumerating devices to decide which one to open needs.
The common case is going to be having a single relevant configuration, and if it has multiple, you're kind of stuck anyway since WinUSB can't change the configuration.
The Linux API is certainly the nicer one here, but one of the failings of libusb that I'm trying to avoid is designing everything around Linux and then emulating that less-elegantly on the other OSs. On the other platforms and on the USB bus itself, the request returns a single configuration descriptor and all its associated interfaces, endpoints, etc.
I feel the general get_descriptor would have to do bus IO?
As far as I can tell, getting any raw descriptor on Windows (with either WinUSB or the hub ioctl) may perform IO, without documentation making promises that particular descriptors are going to be cached. macOS is a similar story.. Could test this and see for particular descriptors in particular circumstances, but I'm not sure how much to rely on undocumented behavior. So it's Linux as the outlier that guarantees that it will cache the descriptor read at enumeration time.
Yeah, I was also questioning whether it should be on DeviceInfo
or Device
for that reason. In terms of use cases:
lsusb -v
-like enumerating many devices and dumping as much info as possible -> would slightly prefer it onDeviceInfo
since it's not really opening the device to use it, but could do either and isn't necessarily the case to optimize for.- finding a device from the list that matches something from a descriptor to decide what to open -> would prefer
DeviceInfo
, and would strongly prefer not to do IO. nusb already has the serial number string descriptor inDeviceInfo
, which is a very common need that it's crazy that libusb requires opening the device and performing IO for. Maybe other attributes for this use case (interface class and subclass IDs?) could similarly be obtained cross-platform inDeviceInfo
using APIs that guarantee no IO, short of the full descriptor. - with an opened device, figuring out what interfaces or endpoints to use, or reading class-specific descriptors -> Would prefer
Device
or motivate a way to go back fromDevice
toDeviceInfo
. Already has the device open, doesn't care too much if it requires IO, and wants as much flexibility as possible in what it can retrieve.
So that makes me lean towards putting this on Device
, and drawing the line at "Device
does IO, and DeviceInfo
doesn't". I guess there could be a Linux-specific DeviceInfo
method that reads the sysfs attribute, and I'd be fine with that one returning all the configurations.
if parsing is on-demand, iterators have to be impl Iterator<Item = Result<...>> which I find much less ergonomic. It's nicer to parse everything upfront, so you can handle errors in 1 place only, and have all the iteration be infallible. It also allows grouping interface altesettings by number.
This is a good point. However on the other hand, I've come across devices with broken descriptors, and it would be nice if the non-broken parts were accessible instead of just returning an error for the whole thing. Not sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how broken? currently this code requires only device, config, iface, endpoint descriptors to be well-formed (others are only required to have a good bLength
so we can skip over them). If those are broken, wouldn't the OS reject the device? In that case I'd say it's OK for nusb
to fail as well. Otherwise, if the OS does accept the broken descriptors we could adjust our parser to match.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, the scenarios where it would help would basically be limited to bLength
being wrong for the descriptor type but correctly pointing to the next descriptor, or if the chain is broken, allowing access to descriptors prior to the problem.
Looks like libusb also bails out and returns nothing when handling most errors. So that's probably fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By the way, I just rearranged the Linux transfer code a little bit to allow submitting control transfers without claiming an interface, which the generic get_descriptor
could be built on for descriptors that aren't cached.
yeah I saw libusb does that... seems cursed (why the parent hub??) but it's definitely the way to go if it works for non-winusb devices. Will give it a try.
We discussed this a bit for embassy-usb. Also another challenge is keeping the descriptors and the "behavior" in sync. It has a builder that creates both the endpoints/interfaces, and the descriptors for them, so as long as you use that the "behavior" of the device will always match what the descriptors say. If you pre-bake the descriptors in ROM then you have to manually take care to instantiate the right things at runtime to match them. See example. So I'm not sure if there's much code to be shared. What you want is very different if you're building or parsing descriptors, and if you're on std you can use upfront parsing with nice easy Vec/HashMap, on no-std you don't want alloc so you have to do things more "on-demand". |
How USB devices fit into the Windows device model is certainly very cursed, but this makes a certain amount of sense given that -- USB device nodes all have different drivers, and there's not a common "USB Device" object with a single common driver to put that functionality in. So the hub is the closest thing to that. |
Good feedback on the I guess the most that library could do at this point is the Anyway, seems like my main motivation for doing it would just be putting the crate name to good use, which doesn't seem like a good enough reason. (I considered using the |
c20c589
to
f6951d6
Compare
Regarding the It's probably not exactly what's needed for other projects in its current form, but it would be interesting to see if we could collaborate to make a |
I've started iterating on this in #13 Main differences:
Any feedback and testing appreciated. While this only directly uses a little bit of your code, this PR and the resulting discussion was quite useful in designing this. |
Merged #13, but still open to feedback if you have any when trying it out. |
udev: add function `Udev::has_devtmpfs`
This PR adds methods to
DeviceInfo
to get either un-parsed descriptors (as raw bytes), or parsed descriptors (as a tree of Configuration, Interface, InterfaceAlternateSetting, Endpoint) structs.Parsing is based off what libusb does, API is similar to rusb's.
Tested working on Linux and Windows. On windows it's only possible to get descriptors for winusb devices.