Make raw substream data available to their corresponding user types? #5
Comments
I've missed one important use case: My
In such a case there's no parent element I could retrieve the underlying
One could consider working with different instances of
This way all user types would always get their associated data without the need to manually consider substreams. It would even make the per-language compilers a bit simpler because creating the
Ok, let me elaborate on this a little bit. First of all, KaitaiStream in its general sense (across the different languages and platforms) is a stream, and it's not really guaranteed to have all the bytes in memory (or even memory-mapped to some local file). The Java implementation indeed uses a memory-mapped approach, which keeps things relatively simple, but other languages/platforms might not have that, so it's generally wise to think of KaitaiStream as a stream.

If you want raw byte contents out of it, there's only one true way to do it: read it as bytes. If you have a KaitaiStruct-based object, this approach is guaranteed to work in all supported languages (of course, the syntax would change):

FmtOmsRec omsRec;
// omsRec somehow gets populated
omsRec._io().seek(0);
byte[] omsRecRaw = omsRec._io().readBytesFull();

Of course, this might be overkill, but it's the only way that would definitely work everywhere. That said, in Java we can cut some corners and I don't see any harm in providing access to the underlying ByteBuffer, especially if it can be done in a read-only manner.

Last, but not least, there's been a long-running initiative to actually reduce copying of raw byte buffers around and implement proper substreams: kaitai-io/kaitai_struct#44. That's probably what you'd also want to have.
With your example and kaitai-io/kaitai_struct#44 in mind: Should all runtimes "officially" provide a way to get the bytes of their associated type? If yes, I'm going to close this issue and create one in kaitai_struct like kaitai-io/kaitai_struct#44 for all runtimes as a reminder and link all these 3 issues. If no, I'll only close here.

In Java it was easy to add as a workaround; in other runtimes it might not even be a problem currently because private fields don't exist anyway or such. But in the end, from my point of view, it's a design decision one needs to make and provide a proper API for, maybe with your (unoptimized) default implementation in the worst case. A name like "asRoBuffer" simply wouldn't hold if a raw

In my opinion the runtimes should provide access to the raw data, simply because I needed it already. :-)
Runtimes in general do not have raw byte data, thus there can't be an easy generic way to provide access to it. The only generic way is to do the actual "seek to 0; read all bytes; return". It's possible to add exactly that to all runtimes, but I'm not sure that's worth the effort; plus, it could encourage a suboptimal programming style. Java-only methods, given a possible byte buffer implementation, are perfectly ok from my point of view.
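The generic "seek to 0; read all bytes; return" approach described above could be sketched roughly as follows. This is only an illustration under stated assumptions: SketchStream and BufferSketchStream are hypothetical stand-ins written for this example, not the runtime's classes. Only seek and readBytesFull are named in this thread; pos() and the restoring of the previous position are assumptions of the sketch.

```java
import java.nio.ByteBuffer;

final class RawBytesSketch {
    // Hypothetical stand-in for a stream API. seek and readBytesFull follow
    // the names used in the thread; pos() is an assumption of this sketch.
    interface SketchStream {
        void seek(long newPos);
        long pos();
        byte[] readBytesFull();
    }

    // Toy in-memory implementation, used only to exercise the helper.
    static final class BufferSketchStream implements SketchStream {
        private final ByteBuffer buf;

        BufferSketchStream(byte[] data) {
            this.buf = ByteBuffer.wrap(data.clone());
        }

        public void seek(long newPos) {
            buf.position((int) newPos);
        }

        public long pos() {
            return buf.position();
        }

        // Reads everything from the current position to the end.
        public byte[] readBytesFull() {
            byte[] out = new byte[buf.remaining()];
            buf.get(out);
            return out;
        }
    }

    // Seek to 0, read all bytes, then restore the caller's position.
    static byte[] rawBytes(SketchStream io) {
        long saved = io.pos();
        try {
            io.seek(0);
            return io.readBytesFull();
        } finally {
            io.seek(saved);
        }
    }

    public static void main(String[] args) {
        SketchStream io = new BufferSketchStream(new byte[] {1, 2, 3, 4});
        io.seek(2);
        byte[] raw = rawBytes(io);
        System.out.println(raw.length + " bytes read, position is " + io.pos());
    }
}
```

The copy is the cost of staying generic: nothing here assumes the bytes are already held in memory, which is the point made above about treating the stream as a stream.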
I don't see that: either users need all bytes, like me in some cases, or they don't care. It's perfectly fine to argue that this might be a rare case not worth adding an "API" to all runtimes, and that users therefore need to deal with this on their own, though. Is this worth documenting? I could add a little paragraph to pitfalls, linking this discussion and such.
Just for the record:
I'm closing this in favour of some other statement; in the end, I already have my Java-runtime enhancement. :-)
Consider the following KSY:

The corresponding _read method is the following:

fmt_oms_rec is an opaque type from another file, and I not only need to parse that type and access whatever I've described in the corresponding KSY; in some circumstances I simply need to take the whole byte[] of such an instance and send it somewhere, store it in a database or such. It's only raw, low-level read access to the data.

The problem now is that while KaitaiStream is available to each instance based on a substream, it doesn't provide access to the raw data. One can only seek, read some bytes etc., but accessing the backing byte[] of the ByteBuffer is not possible, even though ByteBuffer itself would allow it.

The only way to access the raw data currently is to access the parent of an instance of fmt_oms_rec and use _raw_omsRec from there. But that doesn't seem obvious to me, and in the case of an opaque type like mine, one has to cast to get the parent type and such.

So, is it by design that the raw data is only accessible using the parent? Wouldn't it make sense to make it accessible using KaitaiStream as well? This could even be considered an implementation detail of the Java runtime and not necessarily be part of the API, like size, eof or such, because if/how the raw data is available to a user already depends highly on how KaitaiStream is implemented in each runtime.

The only thing to consider, as I see it, is whether one wants to make the raw data available always or only in case of substreams. But I think making it always available and letting the user decide if/how to use it is the easiest.

Additionally, I would suggest simply making asReadOnlyBuffer available; using that, one would have access to the underlying byte[] and can't break things too easily.
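To illustrate the asReadOnlyBuffer suggestion, here is a minimal sketch using only java.nio.ByteBuffer from the standard library. The class ReadOnlyViewSketch and the helper copyAll are hypothetical names made up for this example and are not part of any runtime. One caveat worth knowing: a read-only buffer reports hasArray() as false, so the bytes are copied out with get instead of being fetched through array().

```java
import java.nio.ByteBuffer;

final class ReadOnlyViewSketch {
    // Copies the bytes from index 0 up to the buffer's limit without
    // touching the position of the buffer that was passed in.
    static byte[] copyAll(ByteBuffer source) {
        // The read-only view shares content but has its own position and limit.
        ByteBuffer view = source.asReadOnlyBuffer();
        view.rewind(); // position back to 0, limit unchanged
        byte[] out = new byte[view.remaining()];
        view.get(out);
        return out;
    }

    public static void main(String[] args) {
        ByteBuffer backing = ByteBuffer.wrap(new byte[] {10, 20, 30});
        backing.position(2); // pretend some parsing already happened
        byte[] copy = copyAll(backing);
        System.out.println(copy.length + " bytes copied, source position is "
                + backing.position());
        System.out.println("read-only view exposes array: "
                + backing.asReadOnlyBuffer().hasArray());
    }
}
```

Handing out the read-only view keeps callers from modifying the parsed data or disturbing the parser's position, at the price of one copy when an actual byte[] is needed.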