Improve json formatting to improve readability #147
Apart from avoiding excess newlines, writing pointer payloads "inline" instead of at the end of the file makes the output much easier for a human to read.
Personally I like the "excessive" newlines; I find them easier to read.
Cap'n'proto flattens (i.e. removes) all pointers except AnyPointers (unions) when going to json, so circular references don't work at all and all data gets duplicated. So to keep the structural integrity when pointers are referenced from multiple places, they still need to be identifiable, i.e. have a unique name or tag, and the inlined data could be written either in all of the places or in just one of them, e.g. where the first reference to the data is. If the data is flattened and written in all places, then #14 could solve deduplicating it. And even if you went the "flatten everything" route, you would still need to abort on cyclic references and have an identifier to refer to. I guess the same is true for arrays, but since they are already written without a unique name, I guess that dl already flattens them even if they refer to the same pointer? The two main points of the newlines are that:
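To make the "inline at the first reference, refer by name afterwards" idea concrete, here is a minimal sketch in Python. It is not how dl or cap'n'proto work internally; the `{"ptr": name}` marker, the `__name` key and the function name are all made up for illustration.

```python
def inline_pointers(root_name, instances):
    """Inline each named instance where it is first referenced.

    `instances` maps a unique name to a dict of members. A member value of
    the (hypothetical) form {"ptr": name} is a pointer to another instance.
    Later references, including cyclic ones, are written as the bare name.
    """
    emitted = set()

    def expand(name):
        if name in emitted:
            # Already written, or currently being written (a cycle):
            # refer to it by name instead of duplicating or recursing.
            return name
        emitted.add(name)
        out = {"__name": name}  # keep the tag so other references can find it
        for key, value in instances[name].items():
            if isinstance(value, dict) and "ptr" in value:
                out[key] = expand(value["ptr"])
            else:
                out[key] = value
        return out

    return expand(root_name)


instances = {
    "a": {"x": 1, "next": {"ptr": "b"}, "also": {"ptr": "b"}},
    "b": {"y": 2, "back": {"ptr": "a"}},
}
result = inline_pointers("a", instances)
```

Here `b` is written inline under `next`, the second reference `also` and the cyclic `back` are just names, so nothing is duplicated and the cycle does not need to abort.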
It would be awesome if the json formatting could be driven by formatting rules similar to "clang-format", so each user can choose their own style. The more I think of it, the more I think that reformatting the json is something that can be done after DL has created the json, i.e. by piping the data to another tool. So DL could just avoid writing any whitespace and let the formatting tool add all of it. It would be slower, but if the data could be piped in chunks, then formatting could mostly be done in parallel with DL's json generation, so a GB file would not require twice the time.
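A rough sketch of what such a chunk-wise formatter could look like, in Python. It assumes the input is json with no whitespace outside strings, arriving as an iterable of text chunks; `format_stream` is a made-up name, not an existing tool or a dl API. It only tracks nesting depth and whether it is inside a string, so it never needs the whole document in memory.

```python
def format_stream(chunks, indent="  "):
    """Yield indented json text for each chunk of whitespace-free json."""
    depth = 0
    in_string = False    # inside a "..." literal: copy characters verbatim
    escaped = False      # previous character was a backslash inside a string
    just_opened = False  # last token was { or [, so {} and [] stay compact
    for chunk in chunks:
        out = []
        for ch in chunk:
            if in_string:
                out.append(ch)
                if escaped:
                    escaped = False
                elif ch == "\\":
                    escaped = True
                elif ch == '"':
                    in_string = False
                continue
            if ch in "}]":
                depth -= 1
                if not just_opened:
                    out.append("\n" + indent * depth)
                just_opened = False
                out.append(ch)
                continue
            if just_opened:
                # First element of a non-empty container goes on a new line.
                out.append("\n" + indent * depth)
                just_opened = False
            if ch in "{[":
                depth += 1
                just_opened = True
                out.append(ch)
            elif ch == ",":
                out.append(",\n" + indent * depth)
            elif ch == ":":
                out.append(": ")
            else:
                if ch == '"':
                    in_string = True
                out.append(ch)
        yield "".join(out)


# Chunks may split the document anywhere, even between a key and its colon.
chunks = ['{"a":[1,2],"b"', ':{"c":"x,y:{"},"e":{}}']
formatted = "".join(format_stream(chunks))
```

A user-selectable style would replace the fixed rules for `,`, `:` and brackets with configurable ones; the state machine itself would stay the same.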
Yes, I wouldn't mind member-data alignment either, if what you mean by that is:
{
  "member_1" : 1234,
  "short"    : 3456
}
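For reference, that kind of alignment is cheap to produce: pad every key to the width of the longest one before writing the colon. A small Python sketch for a flat object (the helper name `format_aligned` is made up, and nested values are simply written compactly):

```python
import json


def format_aligned(obj, indent="  "):
    """Write a flat json object with the colons of all members aligned."""
    if not obj:
        return "{}"
    keys = [json.dumps(k) for k in obj]   # quoted and escaped key text
    width = max(len(k) for k in keys)     # column where the colons line up
    lines = [
        f"{indent}{k.ljust(width)} : {json.dumps(v)}"
        for k, v in zip(keys, obj.values())
    ]
    return "{\n" + ",\n".join(lines) + "\n}"


aligned = format_aligned({"member_1": 1234, "short": 3456})
```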
Also, I think vectors of numbers are single-line, right? If they are not, I think they should be.
But as you say, formatting is highly personal, so being able to pipe the output through some kind of formatter might be the best solution. However, the current API does not support streaming output, and I think it would require quite a bit of new API that would probably "break" the current API structure. But would an "unformatted" json output just have no newlines at all, basically one big long single line?
Yes, arrays of primitives and pointers are always single-line, even when they are epic in length. And yes, you understood the data alignment correctly. In my mind the unformatted style is just without any whitespace or newlines at all: the smallest memory footprint to start the reformatting from, and no need to strip whitespace before adding new. The implementation cap'n'proto uses to make the formatting simple is a "string tree", where all elements are leaves in the tree, parented by the lists and objects owning them. Each branch can provide the summed length of all its children, making it trivial to know which lists are appropriate to keep on one line, and between which elements to insert newlines and indentation. It makes it easy to insert sub-strings into the tree while building it, and it can also reduce the memory footprint since identical strings can be reused instead of duplicated. Unfortunately it is terribly modern code: very big interfaces, very few lines of implementation, and utterly impossible to understand by reading. Source here:
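The idea is easier to see in a toy version than in the original source. The Python sketch below is only an illustration of the principle, not cap'n'proto's actual classes or API; `Leaf`, `Member` and `Branch` are names invented here, and the fit check ignores the current indentation to stay short.

```python
class Leaf:
    """A finished piece of text, e.g. a number or a quoted string."""
    def __init__(self, text):
        self.text = text

    def length(self):
        return len(self.text)

    def render(self, width, indent, depth):
        return self.text


class Member:
    """An object member: key text plus a value node."""
    def __init__(self, key, value):
        self.key, self.value = key, value

    def length(self):
        return len(self.key) + 2 + self.value.length()  # 2 for ": "

    def render(self, width, indent, depth):
        return self.key + ": " + self.value.render(width, indent, depth)


class Branch:
    """A list or object; knows the summed one-line length of its children."""
    def __init__(self, open_ch, close_ch, children):
        self.open_ch, self.close_ch, self.children = open_ch, close_ch, children
        separators = 2 * max(len(children) - 1, 0)      # ", " between children
        self._length = 2 + separators + sum(c.length() for c in children)

    def length(self):
        return self._length

    def render(self, width=40, indent="  ", depth=0):
        if self._length <= width:
            # Short enough: keep the whole subtree on one line.
            inner = ", ".join(c.render(width, indent, depth) for c in self.children)
            return self.open_ch + inner + self.close_ch
        pad = indent * (depth + 1)
        body = ",\n".join(pad + c.render(width, indent, depth + 1)
                          for c in self.children)
        return self.open_ch + "\n" + body + "\n" + indent * depth + self.close_ch


values = Branch("[", "]", [Leaf("1"), Leaf("2"), Leaf("3")])
root = Branch("{", "}", [Leaf('"id": 7'), Member('"values"', values)])
narrow = root.render(width=20)   # object is split, the short list stays inline
wide = root.render(width=40)     # everything fits on one line
```

Because each branch caches its length at construction, the one-line-or-not decision is a single comparison per container rather than a re-scan of the subtree.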
When I compared dl to cap'n'proto, the most striking thing cap'n'proto did better was its awesome json formatting:
Example cap'n'proto:
And the same in dl: