txth format specification #840
I am struggling with the .txth format. The documentation states:
I understand this to mean that the newline character is only used to separate top-level messages. In that case the file would look something like the following, although it is not clearly defined what each line would actually look like:
If you use osi2read.py on a SensorView .osi file, you end up with something like this:
So the newline character is clearly also used within messages (to separate sub-messages and fields), and the number of lines per message may vary depending on the content (repeated fields etc.). The format of each top-level message seems to follow protobuf's "reverse-engineered" Text Format Language Specification, where "incompatibilities are likely to exist" between languages. While this format might be useful for debugging, it is challenging to parse or import automatically, because you cannot reliably tell where a new top-level message begins. Ignoring the possible language incompatibilities mentioned by protobuf, one could read the first line of the file to get the first unique key at the outermost level of the .txth file and use that as a separator (assuming the field order does not change with different content). In the above example this would be the "version" key. I assume the order is maintained so it might be okay to assume it is always
In case my understanding is correct
@jdsika, @pmai this is currently holding up my PR for osi-utilities; it would really help me if you could briefly clarify .txth for me.
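To make the separator idea above concrete, here is a minimal Python sketch of it. It is only a heuristic under stated assumptions, not a specified way to read .txth: it assumes that top-level fields start at column 0, that nested fields are indented, that every top-level message starts with the same key, and that this key is not itself a repeated top-level field. The function name `split_top_level_messages` and the sample text are hypothetical and chosen for illustration.

```python
import re
from typing import Iterable, Iterator, List

# Matches an unindented field name followed by ':' or '{'.
# Indented (nested) lines and closing braces do not match.
_TOP_LEVEL_KEY = re.compile(r"^([A-Za-z_][A-Za-z0-9_]*)\s*[:{]")


def split_top_level_messages(lines: Iterable[str]) -> Iterator[List[str]]:
    """Group lines into top-level messages.

    Assumption: the key on the first non-empty line (e.g. "version")
    opens every top-level message and never repeats inside one.
    """
    first_key = None
    current: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        match = _TOP_LEVEL_KEY.match(line)
        key = match.group(1) if match else None
        if first_key is None:
            if key is None:
                raise ValueError("first line is not a top-level field")
            first_key = key
        elif key == first_key and current:
            # The separator key appeared again: a new message begins.
            yield current
            current = []
        current.append(line)
    if current:
        yield current


if __name__ == "__main__":
    # Illustrative sample only, not real osi2read.py output.
    sample = (
        "version {\n  version_major: 3\n}\n"
        "timestamp {\n  seconds: 1\n}\n"
        "version {\n  version_major: 3\n}\n"
        "timestamp {\n  seconds: 2\n}\n"
    )
    for index, message in enumerate(split_top_level_messages(sample.splitlines())):
        print(index, len(message))
```

If the first key were optional or repeated at the top level, this would split in the wrong places, which is the ambiguity I am asking about.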
Replies: 1 comment 1 reply
If it would mean that it would state it as such, I would say.
The txth format was, as far as I can tell, quickly devised as a debugging aid, where one-way conversion from *.osi is sufficient, and not necessarily as a format that is easy to round-trip. As such it is a bit lossy and, as you point out, not easy to parse back into the original data stream.
Would I have designed this format differently? Likely. Does this pose much of a problem currently? Not one that people complain much about. If we came up with a better format in the future, with added requirements, one could think of deprecating the current txth format. Until that time I don't see the need to either deprecate the current format or put too much effort into specifying it, as this will only lead people down a wrong path. Actually specifying a round-trippable, human-readable format is also a bit of work to get right, so while I can see use cases for this, I have yet to see a cost-benefit analysis that warrants the invested effort. We might want to spell out the current state of affairs more plainly in the standard.