
Correctly interpolate seasons in Grouper #2019

Merged: 38 commits, Feb 5, 2025

Conversation

saschahofmann
Contributor

@saschahofmann saschahofmann commented Dec 11, 2024

Pull Request Checklist:

What kind of change does this PR introduce?

This PR adds a line to correctly interpolate seasonal values. It also changes the test_timeseries function, which now accepts a calendar argument instead of cftime. Omitting it or passing None is equivalent to the previous cftime=False, and calendar='standard' is equivalent to the previous cftime=True. This allows testing different calendar implementations, e.g. 360_day calendars.

@github-actions github-actions bot added the sdba Issues concerning the sdba submodule. label Dec 11, 2024
@saschahofmann
Contributor Author

I just realised that the factor of 1/6 assumes that all seasons have the same length, which is not necessarily true in Gregorian calendars. I am not sure it matters much, though; at least the function should be smooth.
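To put a number on the equal-length assumption, here is a minimal sketch using only the standard library. It is not xclim code, and `season_lengths` is a hypothetical helper introduced for illustration. In a 360_day calendar every month has 30 days, so all four seasons have 90 days and the assumption holds exactly there.

```python
import calendar

# Months belonging to each meteorological season.
SEASONS = {"DJF": (12, 1, 2), "MAM": (3, 4, 5), "JJA": (6, 7, 8), "SON": (9, 10, 11)}


def season_lengths(year: int) -> dict[str, int]:
    """Number of days per season in the Gregorian calendar.

    December is counted within the same calendar year for simplicity.
    """
    return {
        name: sum(calendar.monthrange(year, month)[1] for month in months)
        for name, months in SEASONS.items()
    }


lengths = season_lengths(2023)  # a non-leap year
print(lengths)
# {'DJF': 90, 'MAM': 92, 'JJA': 92, 'SON': 91}

# Share of the year covered by each season; equal lengths would give 0.25 each.
total = sum(lengths.values())
print({name: round(days / total, 3) for name, days in lengths.items()})
```

The seasons differ by at most two days, so a fractional index built on equal-length seasons would be off by a small fraction of a season at most.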

@saschahofmann
Contributor Author

Just to prove that this leads to a smooth result, same input as in the issue:
image

@Zeitsperre Zeitsperre requested a review from aulemahal December 11, 2024 16:53
@Zeitsperre Zeitsperre added bug Something isn't working standards / conventions Suggestions on ways forward labels Dec 11, 2024

Warning

This Pull Request is coming from a fork and must be manually tagged approved in order to perform additional testing.

@saschahofmann
Contributor Author

Weirdly and contrary to what I showed yesterday, today I am still getting clear transitions as if there still wasn't any linear interpolation.

@Zeitsperre
Collaborator

@saschahofmann We recently changed the layout of xclim to use a src structure. It might be worthwhile to try reinstalling the library.

@Zeitsperre Zeitsperre mentioned this pull request Dec 12, 2024
5 tasks
@github-actions github-actions bot added the docs Improvements to documentation label Dec 12, 2024
@saschahofmann
Contributor Author

I reinstalled xclim but I am still getting very similar results to before the "fix". You have any advice on where else I could look?

@Zeitsperre
Collaborator

I reinstalled xclim but I am still getting very similar results to before the "fix". You have any advice on where else I could look?

Could it be that you have obsolete __pycache__ folders still among your cloned folders? @coxipi is looking into recreating your example based on your branch for validation, but if the tests are working as intended on CI, then it's likely a caching/installation issue.
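For anyone wanting to rule out stale bytecode, a small sketch that lists and removes `__pycache__` folders under a checkout. The helper names are hypothetical and the path in the usage note is a placeholder.

```python
import shutil
from pathlib import Path


def find_pycache(root: str) -> list[Path]:
    """Return every ``__pycache__`` directory below ``root``, sorted by path."""
    return sorted(p for p in Path(root).rglob("__pycache__") if p.is_dir())


def clear_pycache(root: str) -> int:
    """Delete the cached bytecode directories and return how many were found."""
    found = find_pycache(root)
    for directory in found:
        # ignore_errors guards against a directory that has already disappeared
        shutil.rmtree(directory, ignore_errors=True)
    return len(found)
```

Calling `clear_pycache("path/to/clone")` and then reinstalling the package in the active environment should make sure the interpreter compiles the current sources.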

@coxipi
Contributor

coxipi commented Dec 13, 2024

I managed to install the environment. For some reason I only had the branch "main" when I cloned the fork yesterday.

  • I confirmed that the function has the appropriate modifications inside the notebook I'm using:
import inspect
from xclim import sdba
# Print the source of the imported method to check that the fix is present
print(inspect.getsource(sdba.base.Grouper.get_index))
  • I also find that the interpolation is wrong.

I'll try to have a look later. Maybe the interp boolean condition is not triggered properly?
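Besides printing the source, it can help to check which file the object was actually loaded from, since an old install elsewhere on the path would show up there. A minimal sketch, with `describe_origin` as a hypothetical helper and the standard-library `json` module standing in for the package under investigation:

```python
import inspect
import json  # stand-in for the package under investigation


def describe_origin(obj) -> dict:
    """Report which file an object was loaded from and where its source starts."""
    source_lines, first_line = inspect.getsourcelines(obj)
    return {
        "file": inspect.getsourcefile(obj),
        "first_line": first_line,
        "n_lines": len(source_lines),
    }


info = describe_origin(json.JSONDecoder.decode)
print(info["file"], info["first_line"])
```

In this thread, the object to pass would be `sdba.base.Grouper.get_index`; the reported file should live inside the cloned fork rather than in a separate site-packages copy.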

@saschahofmann
Contributor Author

I am pretty sure that the get_index function is updated in my notebook. Either I am wrong in expecting a smoother result (it seems to have changed slightly compared to what I got earlier) or there is something else going on. I will keep investigating.

@coxipi
Contributor

coxipi commented Dec 16, 2024

The problem is simply that interp can't be "nearest"; otherwise no interpolation takes place. I think our only other option is "linear".

import matplotlib.pyplot as plt
from xclim import sdba

# ref, hist and sim are assumed to be already loaded DataArrays
QM = sdba.EmpiricalQuantileMapping.train(
    ref, hist, nquantiles=15, group="time.season", kind="+"
)

# Adjust the same simulation without and with interpolation
scen = QM.adjust(sim, extrapolation="constant", interp="nearest")
scen_interp = QM.adjust(sim, extrapolation="constant", interp="linear")
outd = {
    "Reference": ref,
    "Model - biased": hist,
    "Model - adjusted - no interp": scen,
    "Model - adjusted - linear interp": scen_interp,
}
# Plot the mean annual cycle of each series
for k, da in outd.items():
    da.groupby("time.dayofyear").mean().plot(label=k)
plt.legend()

image

This doesn't reproduce your figure however. It seems your figure above was matching the reference very well, better than what I have even with the linear interpolation. But it does get rid of obvious discontinuities.
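The difference between the two settings can be illustrated with a toy NumPy sketch, independent of xclim. The adjustment factors are made up, seasons are assumed to be of equal length, and the cyclic axis is assumed to start at the beginning of DJF.

```python
import numpy as np

# Hypothetical additive adjustment factors for DJF, MAM, JJA, SON.
season_af = np.array([1.0, 3.0, 6.0, 2.0])
# Mid-season positions on a cyclic axis of length 4 (equal-length seasons assumed).
season_centers = np.array([0.5, 1.5, 2.5, 3.5])

# Position of each day on the seasonal axis, counted from the start of DJF.
days = np.arange(1, 366)
position = (days - 0.5) / 365 * 4

# "nearest": every day takes the factor of the season it falls in (step function).
af_nearest = season_af[np.floor(position).astype(int)]

# "linear": interpolate between mid-season values, wrapping around the year.
af_linear = np.interp(position, season_centers, season_af, period=4)

print("largest day-to-day jump, nearest:", np.abs(np.diff(af_nearest)).max())
print("largest day-to-day jump, linear: ", np.abs(np.diff(af_linear)).max())
```

With the step function the factor jumps by the full difference between neighbouring seasons at each boundary, whereas the interpolated factor changes by a small amount every day. This is the kind of discontinuity that disappears in the plot above.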

@coxipi
Contributor

coxipi commented Dec 16, 2024

There is clearly something wrong going on. Comparing
hist - scen_month
scen_time - scen_month
scen_season - scen_month
scen_month - scen_month

scen_season is way off

image

@saschahofmann
Contributor Author

@coxipi I think I only mentioned this in the original issue: my analysis is done with QuantileDeltaMapping instead of EmpiricalQuantileMapping. Here is the equivalent chart to yours for that:
image
The season grouping still seems kind of weird.

@saschahofmann
Contributor Author

saschahofmann commented Dec 18, 2024

A similar trend becomes apparent when looking at the adjusted - historical (now for EmpiricalQuantileMapping)
image

@coxipi
Contributor

coxipi commented Dec 18, 2024

Yes, I have seen similar things by playing with the choices made in get_index. I feel this should not be this sensitive. Let me try and get back to you on this.

@saschahofmann
Contributor Author

saschahofmann commented Jan 27, 2025

@Zeitsperre I added these two lines to allow linear interpolation with dayofyear but I am wondering whether this combination makes sense?
https://github.com/Ouranosinc/xclim/pull/2019/files#diff-3fdadf9776d54afe05de332ab8810fc3d2939d75bd98273f3661f05d71f296a4R314-R315

It was failing in a test

@Zeitsperre
Collaborator

I added these two lines to allow linear interpolation with dayofyear but I am wondering whether this combination makes sense? https://github.com/Ouranosinc/xclim/pull/2019/files#diff-3fdadf9776d54afe05de332ab8810fc3d2939d75bd98273f3661f05d71f296a4R314-R315

It was failing in a test

I'm going to defer to @coxipi or @aulemahal here. This is a bit out of my depth haha

@coxipi
Contributor

coxipi commented Jan 27, 2025

@Zeitsperre I added these two lines to allow linear interpolation with dayofyear but I am wondering whether this combination makes sense? https://github.com/Ouranosinc/xclim/pull/2019/files#diff-3fdadf9776d54afe05de332ab8810fc3d2939d75bd98273f3661f05d71f296a4R314-R315

It was failing in a test

I don't think there would be a point in doing linear interpolation for dayofyear. We have adjusting factors for each dayofyear, so we don't have in-between values where we need to interpolate the training data. I would say we probably need to change the failing test, let me see

I don't see the failure; can you point to a specific commit?

@saschahofmann
Contributor Author

saschahofmann commented Jan 27, 2025

Yes. It's this test, TestExtremeValues.test_real_data: https://github.com/Ouranosinc/xclim/actions/runs/12985440090/job/36210393696

I fixed it by allowing linear interp with dayofyear again. I think it was allowed previously, but I slightly restructured the code, so it had become disabled by default.

@saschahofmann
Contributor Author

I reverted the change that allows linear interp with dayofyear grouping and instead changed the test to use nearest

@saschahofmann
Contributor Author

saschahofmann commented Jan 29, 2025

@coxipi I think re-enabling linear interpolation with dayofyear for now may be a better solution, since otherwise this represents a breaking change. For example, the tutorial currently does something similar (hence the breaking test in sdba-advanced.ipynb), where we do EQM with dayofyear grouping and linear interp. Instead, I could add a deprecation warning for maybe 0.57.0?
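For reference, a minimal sketch of what such a warning could look like. The helper name, its arguments and the message are all hypothetical and only illustrate the proposal; nothing here is existing xclim code.

```python
import warnings


def check_interp(group_prop: str, interp: str) -> str:
    """Hypothetical validation helper: warn instead of raising for one release cycle."""
    if group_prop == "dayofyear" and interp == "linear":
        warnings.warn(
            "Linear interpolation with dayofyear grouping may be disallowed "
            "in a future release; consider interp='nearest'.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not at this helper
        )
    return interp
```

The combination keeps working for now, and users see the warning at their own call site.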


@coveralls

coveralls commented Jan 29, 2025

Coverage Status

coverage: 89.955% (+0.009%) from 89.946%
when pulling f518971 on saschahofmann:fix-#2014
into b0b2193 on Ouranosinc:main.

@saschahofmann
Contributor Author

@aulemahal or @coxipi, I would love to get this over the line. If you agree, I can add a DeprecationWarning, and then this is ready to merge?

@coxipi
Contributor

coxipi commented Feb 3, 2025

Hi @saschahofmann, sorry for the late response. We were busy with a seminar last week and I missed your message.

One thing I'm starting to realize is that the linear interpolation can act on two fronts: on the grouping dimension (e.g. season, month) but also on the quantiles. I forgot to take this into account in my last comment. I think we really want to leave open the possibility of linear interpolation on quantiles.

Feel free to leave things as they were in the most economical way possible (if you need to deactivate a test, go for it). I really think this is beyond the scope of your PR anyway; you fixed many things independently of this. I can take care of it in another PR. No need for a deprecation warning, we will just fix this.

@aulemahal, I think that in utils, we don't use the 2D interp function for prop=="group". We should not use it for prop=="dayofyear" either, agreed? interp should still be possible to use with quantiles, it's just implicit that dayofyear grouping will also be 1-dimensional

def interp_on_quantiles(
...
    if prop in ["group", "dayofyear"]:
        if prop =="group":
            if "group" in xq.dims:
                xq = xq.squeeze("group", drop=True)
            if "group" in yq.dims:
                yq = yq.squeeze("group", drop=True)
        out = xr.apply_ufunc(
            _interp_on_quantiles_1D,
            ...

@saschahofmann
Contributor Author

I guess we can merge then 😬

Contributor

@coxipi coxipi left a comment


just commenting so I can commit my suggestion

src/xclim/sdba/_adjustment.py (outdated, resolved)
@coxipi
Contributor

coxipi commented Feb 4, 2025

Feel free to merge when it becomes possible again, we're just waiting on tests

@saschahofmann
Contributor Author

I don't have write access, so I can't merge.

@coxipi
Contributor

coxipi commented Feb 5, 2025

I don't have write access, so I can't merge.

Ah, right.

Thanks again for your nice contribution!

@coxipi coxipi merged commit 2c9fe06 into Ouranosinc:main Feb 5, 2025
21 checks passed
@saschahofmann saschahofmann deleted the fix-#2014 branch February 5, 2025 13:32
@coxipi coxipi mentioned this pull request Feb 7, 2025
5 tasks
@sylvainmarchi

Hello,
I’m having trouble understanding this thread and the one titled "Correctly interpolate seasons in Grouper #2019." Could someone clarify the role of the interp argument, particularly in the context of MBCn as shown below? Does it apply to the group dimension, the quantiles, or both?
Thank you,
Sylvain

group = sdba.Grouper("time.dayofyear", window=31)

ADJ = sdba.MBCn.train(
    ref,
    hist,
    base_kws={"nquantiles": 50, "group": group},
    adj_kws={"interp": "linear", "extrapolation": "constant"},
    n_iter=20,  # perform 20 iterations
    n_escore=1000,  # only send 1000 points to the escore metric
)

@coxipi
Contributor

coxipi commented Feb 11, 2025

Hi Sylvain,

MBCn with sdba.Grouper("time.dayofyear", window=31) and interp="linear" is not affected by this; there is only interpolation on quantiles.

With sdba.Grouper("time.dayofyear", window=31), a linear interpolation is performed only on the quantiles. Consider the adjustment on day of year 1. I will label the subsets selected by the rolling window as ref1w31 and hist1w31. We want to use them to adjust the simulation on day of year 1, sim1. First, find the values of 50 equally spaced quantiles:

{ref1w31_q01, ref1w31_q03, ... , ref1w31_q99}
{hist1w31_q01, hist1w31_q03, ... , hist1w31_q99}

Compute adjustment factors from those:
{af1_q01, af1_q03, ..., af1_q99}

Then the linear interpolation gives you a continuous function of the adjustment factors, af1(q). You want to apply these adjustment factors on the ranks of sim1.
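The steps so far can be sketched with NumPy on synthetic data. This is only a schematic of the description above for an additive adjustment, not the xclim implementation; the sample sizes and distributions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-ins for the windowed samples around day of year 1.
ref1w31 = rng.normal(loc=2.0, scale=1.0, size=3000)
hist1w31 = rng.normal(loc=0.0, scale=1.5, size=3000)
sim1 = rng.normal(loc=0.5, scale=1.5, size=200)

# 50 equally spaced quantile nodes: 0.01, 0.03, ..., 0.99.
nquantiles = 50
q = (np.arange(nquantiles) + 0.5) / nquantiles

# Additive adjustment factor at each node.
af1_nodes = np.quantile(ref1w31, q) - np.quantile(hist1w31, q)


def af1(rank):
    """Continuous adjustment factor; constant beyond the first and last node."""
    return np.interp(rank, q, af1_nodes)


# Rank of each simulated value within its own sample, scaled to (0, 1).
ranks = (np.argsort(np.argsort(sim1)) + 0.5) / sim1.size
scen1 = sim1 + af1(ranks)
```

`np.interp` holds the end values constant outside the node range, which plays the role of extrapolation="constant" in this sketch.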

In the particular case of day-of-year grouping, no more interpolation is needed: we have adjustment factors af1 for sim1, af2 for sim2, af3 for sim3, etc.

When working with seasons, we have fewer adjustment factors: instead of af1, af2, ..., af365, we only have afDJF, afMAM, afJJA, afSON. But we would still like to have distinct adjustment factors for each dayofyear, so as to apply a different adjustment to sim1, sim2, ..., sim365. To achieve this, we interpolate between the seasonal results {afDJF, afMAM, afJJA, afSON}. In this sense, there is also an interpolation on the grouping dimension, not only on quantiles. My description is not exact, because the two-fold interpolation on quantiles and seasons is done in one step, but I think it still schematizes well what is happening.
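A schematic of the two-fold interpolation, written as two explicit steps for readability (as noted above, the actual computation is done in one step). All numbers are made up, seasons are assumed to have equal length, and `adjustment_factor` is a hypothetical helper.

```python
import numpy as np

q = np.array([0.25, 0.5, 0.75])  # a few quantile nodes, for brevity
# Hypothetical adjustment factors, one row per season (DJF, MAM, JJA, SON).
af = np.array([
    [1.0, 1.5, 2.0],
    [2.0, 2.5, 3.0],
    [4.0, 4.5, 5.0],
    [2.0, 2.0, 2.0],
])
season_centers = np.array([0.5, 1.5, 2.5, 3.5])  # cyclic axis of length 4


def adjustment_factor(season_pos: float, rank: float) -> float:
    """Interpolate first along the cyclic season axis, then along the quantiles."""
    # Step 1: factor at each quantile node for this position in the seasonal cycle.
    af_at_pos = np.array(
        [np.interp(season_pos, season_centers, af[:, j], period=4) for j in range(q.size)]
    )
    # Step 2: continuous function of the rank, constant outside the node range.
    return float(np.interp(rank, q, af_at_pos))


print(adjustment_factor(0.5, 0.5))  # mid-DJF, median: the DJF node value
print(adjustment_factor(1.0, 0.5))  # halfway between the DJF and MAM centers
```

Each day of the year maps to its own `season_pos`, so every day ends up with a distinct set of factors even though only four were trained.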

Hope this clears this up!

Éric

@sylvainmarchi

Hello Éric,
Thank you for your very clear explanation.
Sylvain

Labels
approved: Approved for additional tests
bug: Something isn't working
docs: Improvements to documentation
sdba: Issues concerning the sdba submodule.
standards / conventions: Suggestions on ways forward

6 participants