Correct Unexpected floats when reading LI L2 LFL #2998
…he _FillValue is not into the attributes
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #2998 +/- ##
==========================================
- Coverage 96.11% 96.11% -0.01%
==========================================
Files 383 383
Lines 55673 55679 +6
==========================================
+ Hits 53511 53514 +3
- Misses 2162 2165 +3
Pull Request Test Coverage Report for Build 12991062112
💛 - Coveralls
Thanks for the investigations! I left some comments in the code.
I am also wondering if we still need the fix for the float64 upcasting when multiplying with the scale_factor, as discussed in the issue...
…tests to check the dtype improve the mock datas
Thanks for the fix and the extra, more comprehensive tests! Just two things inline.
LGTM, thanks!
One suggestion inline.
Maybe we should start splitting the long test helpers to smaller chunks at some point?
I also restarted the one windows CI run.
satpy/readers/li_l2_nc.py
Outdated
@@ -158,11 +158,9 @@ def get_array_on_fci_grid(self, data_array: xr.DataArray):
 data_2d = da.map_blocks(_np_add_at_wrapper, data_2d, (rows, cols), data_array,
                         dtype=data_array.dtype,
                         chunks=(LI_GRID_SHAPE[0], LI_GRID_SHAPE[1]))
-data_2d = da.where(data_2d > 0, data_2d, np.nan)
+data_2d = da.where(data_2d > 0, data_2d, np.nan).astype(np.float32)
This would prevent one unnecessary upcasting:
- data_2d = da.where(data_2d > 0, data_2d, np.nan).astype(np.float32)
+ data_2d = da.where(data_2d > 0, data_2d, np.float32(np.nan))
@pnuu passing a nan that is a float32 is not enough to make the whole array float32. For example, if data_2d is an int32, the where method will convert it to float64. To prevent that, I have used astype(np.float32).
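The promotion described above can be reproduced with a minimal sketch. It uses plain NumPy rather than dask, on the assumption that da.where follows the same type-promotion rules; the array `counts` is a made-up stand-in for the accumulated grid, not data from the reader.

```python
import numpy as np

# Hypothetical stand-in for an integer accumulation grid (not real LI data).
counts = np.array([0, 3, 0, 7], dtype=np.int32)

# Fill with a float32-typed NaN: int32 combined with float32 promotes to
# float64, because float32 cannot represent every int32 value exactly.
typed_fill = np.where(counts > 0, counts, np.float32(np.nan))

# Fill with a plain NaN, then cast explicitly, as the PR does at this point.
cast_after = np.where(counts > 0, counts, np.nan).astype(np.float32)

print(typed_fill.dtype)  # float64
print(cast_after.dtype)  # float32
```

So a typed NaN only helps when the input is already float32; for integer input an explicit cast is still needed somewhere.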
Ok, that makes sense I guess.
But for integer data the data would still be converted to floats, which seemed to be the original problem reported in #2854, right? So should there be separate handling for the integer data?
For this specific case, handling the accumulated LI 2-d data arrays, we need floats since we need to support NaN values. So I'm ok with the solution here that avoids float64.
The problem in the original issue was that some integer variables that do not have a _FillValue attribute were still being cast unnecessarily to float; that problem is fixed by the other modification of this PR, in apply_fill_value, here: https://github.com/pytroll/satpy/pull/2998/files#diff-3b2bff08b4001ec6f72cca67791cc4322b38e0db97e68f2109791093e56e6052R445
Oh, ok.
The current version will cast the data to float64 (when setting the 64-bit nan) before recasting it to float32. So to stay in the float32 world, I think the data_2d array should be converted to float32 before applying the nan:
data_2d = data_2d.astype(np.float32)
data_2d = da.where(data_2d > 0, data_2d, np.nan)
Here the np.nan respects the original dtype and does not do an intermediate up-casting for the integer data.
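The cast-first ordering can be sketched as follows. This again uses plain NumPy as a stand-in for dask, and the helper name `mask_non_positive_f32` is hypothetical, introduced only for illustration; it is not a function in satpy.

```python
import numpy as np


def mask_non_positive_f32(data):
    """Return data as float32 with non-positive entries replaced by NaN.

    Hypothetical helper: the cast happens before the masking, so no
    float64 intermediate is created for integer input.
    """
    data = np.asarray(data).astype(np.float32)  # cast first
    return np.where(data > 0, data, np.nan)     # NaN adopts the array dtype


# Made-up integer input standing in for the accumulated grid.
counts = np.array([0, 3, 0, 7], dtype=np.int32)
masked = mask_non_positive_f32(counts)
print(masked.dtype)
```

Compared with masking first and calling astype afterwards, the final values are the same; the difference is only that the intermediate array is never held in float64.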
Modification applied
Makes sure that the method apply_fill_value is not applied when _FillValue is not in the attributes. This fixes the bug reported in #2854.
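A minimal sketch of the guard described above, not the actual satpy implementation: the function name `apply_fill_value_if_present`, its signature, and the choice of float32 for masked integer data are all assumptions made for illustration, and a plain dict stands in for the variable attributes.

```python
import numpy as np


def apply_fill_value_if_present(values, attrs):
    """Replace fill values with NaN only if attrs declares "_FillValue".

    Hypothetical helper: without a "_FillValue" entry the input is
    returned untouched, so integer variables keep their integer dtype.
    """
    if "_FillValue" not in attrs:
        return values
    mask = values == attrs["_FillValue"]
    if not np.issubdtype(values.dtype, np.floating):
        # NaN needs a float dtype; float32 is an assumption of this sketch.
        values = values.astype(np.float32)
    return np.where(mask, np.nan, values)


# Made-up integer variable, with and without a declared fill value.
flags = np.array([1, 2, 255], dtype=np.uint8)
untouched = apply_fill_value_if_present(flags, {})
masked = apply_fill_value_if_present(flags, {"_FillValue": 255})
print(untouched.dtype, masked.dtype)
```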