Save and load in binary, compatible with NumPy/Matlab and others #486
This seems useful and in scope. I often work with MAT files (of various versions) from colleagues, and I use SciPy.io. For my own interoperable binary data between Fortran and Python, I use NetCDF. I don't think any of the language-specific binary formats will beat it in terms of features, performance, or stability. The same goes for HDF5, which is suitable for unstructured data. |
Both NetCDF and HDF5 are great. The only issue with HDF5 is that there is literally only one library that can read and write it, and it's not that easy to build and ship. It's not easy to write a writer in pure Fortran, for example. While it is easy for the So that makes me hesitant to just depend on HDF5. However, it is worth investigating what it would take to support just a very small subset of HDF5, say for writing a set of double precision arrays. It might not be that difficult to write a writer for such a small subset in pure Fortran. Here is the format: https://portal.hdfgroup.org/display/HDF5/File+Format+Specification The huge advantage would be no dependency on the hdf5 library while still using a widely supported format. |
There is NPY for Fortran by MRedies, which allows saving numerical Fortran arrays in NumPy's .npy or .npz format. I have not tried it. |
There is already an HDF5 writer/reader which looks promising: https://github.com/geospace-code/h5fortran. It uses the Fortran bindings of the C library. |
I got the basic structure for reading and writing npy files implemented in #581. It needs some polishing, especially the reading, and many more unit tests to cover all the possible errors the loading can encounter. |
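To make the "possible errors" a loader can hit more concrete, here is a minimal Python sketch of NPY header validation, using only the standard library. It is not the code from #581; the function name `parse_npy_header` is hypothetical, and only format versions 1.x and 2.x are handled as an assumption to keep it short. Each `raise` marks a failure mode that a unit test could target.

```python
import ast
import struct

MAGIC = b"\x93NUMPY"
EXPECTED_KEYS = {"descr", "fortran_order", "shape"}


def parse_npy_header(raw):
    """Validate the start of an NPY byte string.

    Returns (header_dict, data_offset). Hypothetical helper for illustration.
    """
    if len(raw) < 10 or raw[:6] != MAGIC:
        raise ValueError("not an NPY file: bad or missing magic string")
    major, minor = raw[6], raw[7]
    if major == 1:
        # Version 1.x stores the header length as a 2-byte little-endian integer.
        (hlen,) = struct.unpack("<H", raw[8:10])
        start = 10
    elif major == 2:
        # Version 2.x widens the header length to 4 bytes.
        if len(raw) < 12:
            raise ValueError("truncated file: header length missing")
        (hlen,) = struct.unpack("<I", raw[8:12])
        start = 12
    else:
        raise ValueError(f"unsupported NPY version {major}.{minor}")
    if len(raw) < start + hlen:
        raise ValueError("truncated file: header is incomplete")
    text = raw[start:start + hlen].decode("latin1")
    try:
        # The header is the literal text of a Python dict.
        header = ast.literal_eval(text)
    except (SyntaxError, ValueError) as err:
        raise ValueError("header is not a valid dict literal") from err
    if not isinstance(header, dict) or set(header) != EXPECTED_KEYS:
        raise ValueError("header must contain exactly descr, fortran_order, shape")
    if not isinstance(header["shape"], tuple):
        raise ValueError("shape must be a tuple")
    return header, start + hlen
```

A Fortran loader would additionally have to check that `descr` and the rank of `shape` match the type, kind, and rank of the array it was asked to fill.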
libnpy seems to be a library that provides simple routines for saving a C or Fortran array to a data file using NumPy's own binary format.
|
I agree with @MarDiehl . I recently used @scivision 's |
How do we want to handle the npz format? It is a zip archive of npy files, so we probably have to develop a general interface for interacting with compressed archives first. For the mat format I found a specification of the layout (linked in the description at the top). It should be straightforward to code up, but I don't think I have a MATLAB version I could use to verify it. I could try SciPy instead. |
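As a rough illustration of the npz layout described above, the following Python sketch writes NPY version 1.0 payloads by hand and stores them as members of a zip archive, using only the standard library. The helper names `npy_bytes` and `write_npz` are hypothetical, only float64 data is covered, and the values are assumed to be passed already flattened in column-major order, as a Fortran writer would dump them.

```python
import struct
import zipfile


def npy_bytes(values, shape):
    """Serialize float64 values (column-major order) as NPY version 1.0."""
    header = f"{{'descr': '<f8', 'fortran_order': True, 'shape': {tuple(shape)!r}, }}"
    # Magic string (6 bytes) + version (2) + header length (2) = 10 bytes.
    # Pad with spaces so that the data section starts on a 64-byte boundary;
    # the header itself must end with a newline.
    pad = -(10 + len(header) + 1) % 64
    header = header + " " * pad + "\n"
    out = b"\x93NUMPY" + bytes([1, 0]) + struct.pack("<H", len(header))
    out += header.encode("latin1")
    out += struct.pack(f"<{len(values)}d", *values)
    return out


def write_npz(target, arrays, compress=False):
    """Store each array as '<name>.npy' inside a zip archive.

    `arrays` maps a name to a (values, shape) pair; `target` is a path or
    a writable binary file object.
    """
    method = zipfile.ZIP_DEFLATED if compress else zipfile.ZIP_STORED
    with zipfile.ZipFile(target, "w", compression=method) as archive:
        for name, (values, shape) in arrays.items():
            archive.writestr(name + ".npy", npy_bytes(values, shape))
```

With `compress=False` the members are stored without compression, so a writer for that case needs the zip container format but no deflate implementation. That may be a smaller first step than a general compressed-archive interface.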
Was your idea to implement the reader/writer entirely in Fortran based upon the PDF document, or call into the MATLAB C API to Read MAT-File Data? The latter requires the client has the |
I was reading the specs, and it sounds easy enough to implement this from scratch and verify using SciPy. Unfortunately, the data can be compressed, so we need an interface to zlib or similar first. Dynamically loading a library with dlopen, in case the MATLAB runtime libraries are around, would be another option. However, then we would first need an interface for dynamic loading. |
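To show where zlib enters the picture, here is a hedged Python sketch that reads the 128-byte header of a Level 5 MAT-file and unwraps the first data element if it is compressed. The function name `read_first_element` is hypothetical, and the sketch assumes the first element uses the regular 8-byte tag rather than the small data element form. It is meant as a reading aid for the specification, not as a complete reader.

```python
import struct
import zlib

MI_MATRIX = 14      # data type code of an array element
MI_COMPRESSED = 15  # data type code of a zlib-compressed element


def read_first_element(raw):
    """Return (data_type, payload) for the first element of a Level 5 MAT-file."""
    if len(raw) < 136:
        raise ValueError("file too short for a header and one element tag")
    # Bytes 0-115 hold descriptive text, 116-123 a subsystem offset,
    # 124-125 the version and 126-127 the endian indicator.
    # The characters 'MI' are written as a 16-bit value, so a little-endian
    # file shows the bytes 'IM' on disk.
    endian = "<" if raw[126:128] == b"IM" else ">"
    # Regular tag: 4-byte data type followed by the 4-byte payload size.
    dtype, nbytes = struct.unpack(endian + "II", raw[128:136])
    payload = raw[136:136 + nbytes]
    if dtype == MI_COMPRESSED:
        # The payload is a zlib stream that inflates to a complete element,
        # tag included.
        inner = zlib.decompress(payload)
        dtype, nbytes = struct.unpack(endian + "II", inner[:8])
        payload = inner[8:8 + nbytes]
    return dtype, payload
```

Uncompressed files skip the `zlib.decompress` branch entirely, so a pure Fortran reader could support those before any zlib interface exists.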
My dynlib module in https://sourceforge.net/p/flibs/svncode/HEAD/tree/trunk/src/dynlib/ could serve as a starting point.
|
I see stdlib has save_npy and load_npy functionality! I tested it out and it works great! I was wondering, if possible, if dim(:,:,:,:) arrays could also be supported. I only see interfaces up to rank-3. |
They should be supported up to the maximum rank stdlib was configured for. The docs are only generated up to rank 3 to save space. The fpm version allows up to rank 4, and the CMake version can go up to rank 15. |
Oh, thanks! I should have tested first, I took the docs too literally. |
While scrolling through the ARCHER2 supercomputing service documentation I learned there is a BSD-licensed library for MATLAB MAT files called As @awvwgk has remarked above, supporting MATLAB binary files would require a zlib interface and potentially also HDF5, both of which are available as C libraries. It looks more straightforward to just have a thin Fortran wrapper around a C/C++ implementation than to write an interface/implementation for zlib (and HDF5) first. |
In case of the compressed npz files created with Irrespective of how we manage to do the zipping/compression (either in C or Fortran), with respect to the zipped format a big question is how to replace positional and keyword arguments in Fortran, without getting overwhelmed by the combinatorial explosion of type/kind/rank + number of saved arrays. |
Since Fortran doesn't have positional or keyword arguments in the way Python does, for

    subroutine add_npz(zipfile,var_name,array)
    character(len=*), intent(in) :: zipfile
    character(len=*), intent(in) :: var_name
    real|complex|integer, intent(in) :: array(..)

Alternatively, we could have a handle-based approach:

    integer :: npz_unit
    real :: A(2,2)
    complex :: B(3,3)
    call open_npz(newunit=npz_unit,filename="foo.npz")
    call stage_npz(npz_unit,A,"A")
    call stage_npz(npz_unit,B,"B")
    call close_npz(npz_unit)

Since Fortran uses integer units as file handles, the concept should be familiar already. |
The |
First requested here.