NXdata needs additional information for data to be plotted accurately #1527
Comments
The edit was to the final representation of the data array
For the all-present bins case, text has been proposed in #1396 to include histogram axes that contain the bin edges - see L328. Not sure what can be a general solution for the missing/omitted bins case. Options include:
Thanks Peter - I'm going to close this issue and hopefully the rest of the discussion can be had on that PR. I did look at the first couple of pages of PRs, open and closed, but the linked one was tucked away on page 3!
Discussion in #1396 suggests we re-open this issue. I would propose that I submit a PR, but only after the axes definition is pushed. Any thoughts, @rayosborn?
NXdata states that axes should have a shape that matches the data dimension(s), such that a given value, obtained at `data[i][j][k]`, can be plotted at some point given by other symbols. However, it is not possible to use NXdata to plot data which has been integrated without making explicit assumptions. Suppose a set of data has already been integrated (e.g. by a beam monitor) into bins of non-equal width, integrated in 1 ms periods (and equal in each recorded period):
One could choose either the low edge, the high edge, or their mid-point as the point at which to plot the data. Using the `FIELDNAME_errors` field could help to describe the range of the measurement too, but that contradicts what the specification says the field should represent. Therefore, either the first or the last bin edge will be missed, and any information about the data being integrated will be lost. Thus, NXdata cannot represent the data accurately, since the recorded data currently would have to look like:
This could be fixed with four additional requirements:
`point` or `integrated` datum
2b. And if `integrated`, whether the value given corresponds to the `leading` or `trailing` bin edge*
`none`, `first`, `last`, or `both` bin
The data that is to be represented is equivalent to:
which would be represented with
There may be use cases where it is better to assign the value to the `leading` edge of the bin (e.g. if there is an overflow bin at the end without underflow), and so the data could also be written as
without changing the `axes`.
I understand that
but in this example it is vital to be able to specify that this is not point-like data - the integral of each bin is actually equal. Currently, although I can see a possibility to manipulate the data in such a way that bin edges are defined, there is no way to specify what is going on when that file is read by others, and it could easily be interpreted, inaccurately, as a set of points. For similar reasons, it should be explicit that bins must be contiguous.
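To make the ambiguity concrete, here is a minimal sketch in plain Python. The bin edges and counts are hypothetical values chosen for illustration (they are not the example from this issue), and `plot_positions` and `as_density` are made-up helper names, not part of NeXus or any library. It shows that the same integrated values give three different sets of plot positions depending on the edge convention, and that equal integrals in bins of non-equal width are far from flat once divided by bin width.

```python
# Hypothetical bin edges (ms) of non-equal width; illustrative values only.
edges = [0.0, 1.0, 3.0, 7.0]
# Hypothetical integrated value per bin: every bin holds the same integral.
counts = [10.0, 10.0, 10.0]


def plot_positions(edges, where):
    """Return one x position per bin for a given convention."""
    lows, highs = edges[:-1], edges[1:]
    if where == "leading":
        return list(lows)
    if where == "trailing":
        return list(highs)
    if where == "centre":
        return [(lo + hi) / 2 for lo, hi in zip(lows, highs)]
    raise ValueError(f"unknown convention: {where!r}")


def as_density(counts, edges):
    """Divide each integrated value by its bin width."""
    return [c / (hi - lo) for c, lo, hi in zip(counts, edges[:-1], edges[1:])]


for where in ("leading", "trailing", "centre"):
    print(where, plot_positions(edges, where))
print("density", as_density(counts, edges))
```

A reader who is handed only `counts` and one of the three position lists cannot tell which convention was used, nor that the values are integrals rather than points, which is the information the proposed keywords would carry.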
*An alternative is to use the `centre` of the bin, which is most likely what is required for statistical analysis, but this complicates representation: in the example here, it becomes a bit convoluted to work out how big that bin actually is, and would require the length of each axis to be 1 element longer than data. Instead, one workaround might be to use centre always for statistical analysis unless explicitly stated by using the keyword again:
`trailing trailing` for analysis which should be filled at AND use the trailing edge of the bin,
`trailing leading` for analysis which should be filled at the trailing edge of the bin BUT use the leading edge of the bin.
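As a rough illustration of why storing only bin centres is awkward, the sketch below rebuilds edges from centres. It assumes the bins are contiguous and that the first edge is known from elsewhere; `edges_from_centres` is a hypothetical helper, and the numbers are illustrative rather than taken from this issue.

```python
def edges_from_centres(centres, first_edge):
    """Rebuild contiguous bin edges from bin centres and a known first edge.

    Each centre is the mid-point of its bin, so the upper edge is
    2 * centre - lower_edge. Without first_edge the widths are undetermined.
    """
    edges = [first_edge]
    for centre in centres:
        upper = 2 * centre - edges[-1]
        if upper <= edges[-1]:
            raise ValueError("centres are inconsistent with the first edge")
        edges.append(upper)
    return edges


centres = [0.5, 2.0, 5.0]  # hypothetical centres of three bins
edges = edges_from_centres(centres, first_edge=0.0)
print(edges)
print(len(edges) == len(centres) + 1)
```

The rebuilt list has one more element than the data, and a different assumed `first_edge` yields different widths for the same centres, so the centres alone do not say how big each bin is.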