-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
More validation for pointer casts? Use buffer protocol for importing data? #78
Comments
Neither 🙃. I rely on
Looks like
Indeed! We are validating before calling the function, but validation should be inside it |
It is. I'm not well versed enough in your code to know what kind of array you'd expect there. However if you only ever need a |
@kylebarron have a look at #80. Maybe the |
Could you explain how that can possibly be an issue? Numpy arrays can’t have variable-length items inline, it’s either fixed length or pointers, right? Otherwise the invariants around shape, stride, and itemsize aren’t upheld, right? For reference: |
Yes, but it seems odd to me to go to bytes and back for object pointers |
Hmm, the numpy crate knows We don’t actually try to deal with object arrays/strings (yet), right? We only wrap our |
Yea we deny these data types early. Also I consider the variable length data type support in
Indeed, in the end we need to arrive at the This currently matches the on-disk representation for a proposed new |
I was reading through the code for accessing the numpy data:
zarrs-python/src/lib.rs
Lines 157 to 177 in cca07fc
I'm not sure I'm well versed enough to assert that it could be unsound, but it looks a little quick to cast the pointer. It seems to expect that the input array is C-order contiguous, but doesn't provide any validation to that effect.
Instead of doing raw unsafe casts through pointers, you could use the safe
as_slice
method to access the underlying&[u8]
of a numpy array.Alternatively, you could use the buffer protocol to access data. Which is unsafe for different reasons, and requires the Python user to not mutate your buffers, but more flexible for varied inputs. https://alexgaynor.net/2022/oct/23/buffers-on-the-edge/
The text was updated successfully, but these errors were encountered: