Enable JSON <-> YAML, JSON <-> binary conversion? #16
Comments
@julesjacobsen I'm not sure this is absolutely needed. If we stick with JSON, will that mean we encourage people to use JSON as the primary format?
@julesjacobsen see the new class DefaultPhenopacketIngestor. We could add some functions to this class, such as public fromYamlFile(...) and DefaultPhenopacketIngestor(Message message). Thoughts?
@ielis is this issue closable? I think this is supported for some operations.
In principle, yes. Each command that reads or writes a phenopacket accepts or produces a phenopacket, family, or cohort in any of these formats. We do not have a command solely for format conversion (something similar to
Just revisiting this: is JSON the primary format for phenopackets? Is this written down somewhere? I am trying to do some dataset sharing (à la EGA) and was considering placing a phenopacket alongside each individual's genomic artifacts. But I was assuming that it needed to be a protobuf file with some sort of known file suffix like e.g.
And so, to that end, I was going to store some v2 JSON or YAML phenopackets for ease of editing and then convert them to protobuf using the CLI tool. (So this is my +1 for the general feature of converting between formats with just the CLI tool, which is currently not possible: convert requires the input to be in v1 format.) But if JSON is the primary way we expect phenopackets to be exchanged in the wild, then I can skip protobuf entirely. Are there any suggested file naming conventions to let people know that a file is a phenopacket (in JSON)?
I should add that I am starting by hand-crafting some examples for a demonstration of how this would all work, hence the hand editing of JSON or YAML. Obviously, for a real system I would be translating from some clinical source such as an EHR or REDCap, so I guess I would do that using the Java library and output whatever format I wanted. I think the broader question still stands: if I have unlimited choice here, what is the primary "phenopacket" file format, and how should I name the files to make this clear?
Hi Andrew, there could be a lossless conversion between protobuf (binary), JSON, YAML, XML, SQL ... so there really isn't a primary format. My guess is that almost everybody would prefer JSON because of the tooling for JSON.
In which case, having a tool that seamlessly converts between the formats might be useful: if I get a batch of phenopackets in protobuf but would prefer them in JSON, I can just run the CLI tool to convert them (rather than dusting off my Java and writing a small snippet using the library to do the same).
Currently the converter only handles JSON. Might be an idea to offer conversion of other formats too.
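A converter that accepts more than JSON first has to work out which format it was handed. The following is a minimal sketch of one way to guess that from the raw bytes, using only the Java standard library. The class name FormatSniffer, its Format enum, and the heuristic itself are assumptions made for illustration. They are not taken from phenopacket-tools, and the heuristic can misclassify unusual inputs (for example, a binary message that happens to contain no control bytes).

```java
/**
 * Hypothetical helper (not part of phenopacket-tools): guesses whether a
 * payload is JSON, YAML, or protobuf binary by inspecting its bytes.
 */
final class FormatSniffer {

    enum Format { JSON, YAML, PROTOBUF }

    private FormatSniffer() {
        // static utility, no instances
    }

    static Format sniff(byte[] payload) {
        // Assumption: text formats contain no control bytes other than
        // tab, line feed, and carriage return, whereas protobuf binary
        // usually does (small length prefixes, varints, and so on).
        for (byte b : payload) {
            int c = b & 0xFF;
            if (c < 0x20 && c != '\t' && c != '\n' && c != '\r') {
                return Format.PROTOBUF;
            }
        }

        // Skip leading whitespace; a top-level JSON object starts with '{'.
        int i = 0;
        while (i < payload.length && isWhitespace(payload[i])) {
            i++;
        }
        if (i < payload.length && payload[i] == '{') {
            return Format.JSON;
        }

        // Anything else that looks like text is treated as YAML.
        return Format.YAML;
    }

    private static boolean isWhitespace(byte b) {
        return b == ' ' || b == '\t' || b == '\n' || b == '\r';
    }
}
```

Calling FormatSniffer.sniff(Files.readAllBytes(path)) before parsing would let a single convert command choose the right reader. A file suffix, where one is agreed on, would be a more reliable signal than byte sniffing.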