At the moment, the getData call returns either a single file or a ZIP file containing multiple files. In the latter case, the ZIP file contains only the requested files and the path of datafiles within the ZIP file is determined by the zipMapper class in the plugin.
The present feature request suggests using a standardized package format when a ZIP is returned. Such a package should contain a minimum set of metadata along with the files. There are different efforts to define a common package format:
A recommendation from the RDA Research Data Repository Interoperability WG defines a package format that is based on BagIt but adds a few more requirements on the included metadata.
There was an "Approaches to Research Data Packaging" BoF meeting at the last RDA plenary in March, with the goal of starting another group on data packaging in RDA. I don't know whether such a group is going to be established, though. The session page links to more existing package formats.
In general, these package formats may be serialized as ZIP files, so this would fit into the schema of the getData call. The advantage of this would be to have some metadata included in the returned data, so that there is a chance to understand what this blob of data is supposed to be. The metadata also include manifest files with checksums, so that the receiving end may check the integrity of the data. Another advantage would be improved interoperability with other tools and repositories that are able to understand the package format. The drawback would be a little more effort in preparing the package and slightly larger downloads. I would estimate that the difference might be negligible compared to the size of the original data, though.
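To make the idea more concrete, here is a minimal sketch of what a BagIt-style package serialized as a ZIP could look like, with a checksum manifest that the receiving end can verify. It is written in Python using only the standard library and is not tied to the actual getData implementation or the zipMapper class. The function names `build_bag` and `verify_bag`, the `root` directory name, and the sample metadata are assumptions made for illustration. The additional metadata requirements of the RDA recommendation are not covered.

```python
import hashlib
import io
import zipfile


def build_bag(files, bag_info, root="bag"):
    """Return ZIP bytes holding a minimal BagIt-style bag.

    files:    mapping of relative path -> bytes (hypothetical payload)
    bag_info: mapping of label -> value, written to bag-info.txt
    root:     name of the top-level directory in the ZIP (an assumption)
    """
    manifest_lines = []
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # Bag declaration, required by BagIt
        zf.writestr(f"{root}/bagit.txt",
                    "BagIt-Version: 1.0\n"
                    "Tag-File-Character-Encoding: UTF-8\n")
        # Payload files go below data/; record one checksum per file
        for path, content in sorted(files.items()):
            zf.writestr(f"{root}/data/{path}", content)
            digest = hashlib.sha256(content).hexdigest()
            manifest_lines.append(f"{digest}  data/{path}\n")
        zf.writestr(f"{root}/manifest-sha256.txt", "".join(manifest_lines))
        # Free-form metadata describing the bag
        info = "".join(f"{key}: {value}\n" for key, value in bag_info.items())
        zf.writestr(f"{root}/bag-info.txt", info)
    return buf.getvalue()


def verify_bag(zip_bytes, root="bag"):
    """Return the payload paths whose checksum does not match the manifest."""
    mismatched = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        manifest = zf.read(f"{root}/manifest-sha256.txt").decode("utf-8")
        for line in manifest.splitlines():
            digest, path = line.split(maxsplit=1)
            actual = hashlib.sha256(zf.read(f"{root}/{path}")).hexdigest()
            if actual != digest:
                mismatched.append(path)
    return mismatched
```

A client receiving such a ZIP could call `verify_bag` on the downloaded bytes and treat an empty list as a successful integrity check. The extra files (`bagit.txt`, `bag-info.txt`, the manifest) are a few hundred bytes, which is in line with the estimate above that the overhead is small compared to the data itself.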