Problem

Very early on, we explored using XArrays rather than pure numpy.ndarray. The reason was to add clarity about the dimensions, but it could also be used to add TileInformation:
import xarray as xr
import numpy as np

# Create a DataArray with metadata
data = xr.DataArray(
    np.zeros((1, 1, 64, 64)),
    dims=("s", "c", "y", "x"),
    # one could also make use of data.coords to store some info
    attrs={"file": "/path/to/file.tiff", "tile_ID": "meters", "tile_coords": [0, 0, 128, 128]},
)
print(data.attrs)
Now, I don't remember why we ended up not using it, but clearly one of the problems we would run into is that torch.Tensor cannot store the metadata (attrs). But I just learned that torch.Tensor can store labeled dimensions, which is nice, but maybe not strictly necessary for us.
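For reference, a minimal illustration of the labeled-dimension feature mentioned above (PyTorch "named tensors", still flagged as a prototype feature and emitting a warning), using the same axis names as the xarray example. Note that this only covers dimension labels, not attrs-style metadata:

import torch

# factory functions accept a `names` argument; the labels travel with the tensor
t = torch.zeros(1, 1, 64, 64, names=("s", "c", "y", "x"))
print(t.names)           # ('s', 'c', 'y', 'x')
print(t.sum("y").names)  # ('s', 'c', 'x') -- reductions can be done by name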
Crazy idea
Could we create a subclass of torch.Tensor that maintains the Tensor API for the computational part (and can therefore be passed to all of PyTorch computation), but that would hold metadata equivalent to the metadata stored in the XArray?
Of course, one of the problems is to make sure that internally PyTorch does not get rid of it somewhere (through a copy() for instance). But it seems that torch has a mechanism for that.
[edit]: they actually have an example: https://pytorch.org/docs/main/notes/extending.html#extending-torch-with-a-tensor-wrapper-type. They do warn about potential issues with some operations, but we could try it out to see whether anything we do is a problem.
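To make the idea concrete, here is a rough sketch of what such a subclass could look like. The class name (MetaTensor), the attrs dict, and the propagation rule (copy metadata from the first tagged input) are assumptions for illustration, not a tested design; the docs example linked above wraps a plain class around a tensor rather than subclassing, so this is only one possible variant:

import torch

class MetaTensor(torch.Tensor):
    """Hypothetical torch.Tensor subclass carrying xarray-style attrs."""

    @staticmethod
    def __new__(cls, data, attrs=None, **kwargs):
        # share storage with the input data, but mark the result as our subclass
        return torch.as_tensor(data, **kwargs).as_subclass(cls)

    def __init__(self, data, attrs=None, **kwargs):
        self.attrs = dict(attrs or {})

    @classmethod
    def __torch_function__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        # let PyTorch do the actual computation; the default implementation
        # already returns MetaTensor instances, but it drops Python attributes
        out = super().__torch_function__(func, types, args, kwargs)
        # re-attach the attrs of the first MetaTensor input (a naive policy)
        attrs = next((a.attrs for a in args if isinstance(a, cls) and hasattr(a, "attrs")), {})
        if isinstance(out, cls):
            out.attrs = dict(attrs)
        return out

t = MetaTensor(torch.zeros(1, 1, 64, 64), attrs={"file": "/path/to/file.tiff", "tile_coords": [0, 0, 128, 128]})
y = (t * 2).float()
print(type(y).__name__, y.attrs)  # MetaTensor, attrs preserved through the ops

A real implementation would also need a policy for operations that return non-tensor outputs or that mix several tagged inputs with conflicting metadata, which is exactly the kind of issue the docs warn about.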
The advantages for us:
- TileInformation, no need to collate it
- axes everywhere, we know at every point what they are, which also helps visualize data afterwards
- scale metadata (e.g. pixel size), which is very useful to scientists and helpful for viewing the images