Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cdtime bug on NERSC with 365 and 360 day calendars #46

Closed
mfwehner opened this issue May 29, 2020 · 13 comments
Closed

cdtime bug on NERSC with 365 and 360 day calendars #46

mfwehner opened this issue May 29, 2020 · 13 comments

Comments

@mfwehner
Copy link

I have encountered a cdtime bug that seriously impacts analyses of daily data. Now perhaps the bug is only on the NERSC system, I don’t know. I tested it on an old installation that Hari did for me and on jupyter.nersc.gov which Charles probably made.

The script below demonstrates it. It should print out the calendar date of the first and last entry in a nc file, which for CMIP6 daily files is in the name of the file. They do not match for the 365 and 360 day calendars. But they do for the proleptic_gregorian calendar.

About half the CMIP6 models use a 365 day calendar. To test, pick any daily file in ESGF from a CESM model. To test the 360 day calendar, use a HadCM model.

Here is the script.
import MV2 as MV, cdtime,os, cdms2 as cdms, sys, string
var=sys.argv[1]
f=cdms.open(sys.argv[2])

tim=f.dimensionarray('time')
u=f.getdimensionunits('time')

cdtime.DefaultCalendar=cdtime.NoLeapCalendar
tt=f.dimensionobject('time')
if hasattr(tt, 'calendar'):
if tt.calendar=='360_day':cdtime.DefaultCalendar=cdtime.Calendar360
if tt.calendar=='gregorian':cdtime.DefaultCalendar=cdtime.MixedCalendar
if tt.calendar=='365_day':cdtime.DefaultCalendar=cdtime.NoLeapCalendar
if tt.calendar=='noleap':cdtime.DefaultCalendar=cdtime.NoLeapCalendar
if tt.calendar=='proleptic_gregorian':cdtime.DefaultCalendar=cdtime.GregorianCalendar
if tt.calendar=='standard':cdtime.DefaultCalendar=cdtime.StandardCalendar

r = cdtime.reltime(tim[tim.shape[0]-1],u)
last_day=r.tocomp()
r = cdtime.reltime(tim[0],u)
first_day=r.tocomp()
print(first_day,last_day)

@jasonb5
Copy link
Contributor

jasonb5 commented May 29, 2020

I used the following source and observed this behavior.

http://esgf-data.ucar.edu/thredds/dodsC/esg_dataroot/CMIP6/CMIP/NCAR/CESM2/amip/r2i1p1f1/Amon/pr/gn/v20190319/pr_Amon_CESM2_amip_r2i1p1f1_gn_195001-201412.nc

Observed behavior

1948-9-28 12:0:0.0 2013-8-12 12:0:0.0

Does changing tocomp() to tocomp(cdtime.DefaultCalendar) resolve the issue?

@mfwehner
Copy link
Author

mfwehner commented May 29, 2020 via email

@jasonb5
Copy link
Contributor

jasonb5 commented May 29, 2020

Ok both those examples help confirm that cdtime functions are not using the DefaultCalendar. As soon as you explicitly pass the calendar, NoLeapCalendar in my example, you get the correct values.

I'll get into the code and figure out whats going on.

I think the reason y2=int(time2.torel('years since 0-1-1').value) doesn't work anymore can be explained here CDAT/cdms#334, year 0 is no longer valid.

@durack1
Copy link
Member

durack1 commented May 29, 2020

@pochedls this thread may interest you, @mzelinka same for you

@durack1
Copy link
Member

durack1 commented May 29, 2020

@jasonb5 thanks for doing your homework and finding CDAT/cdms#334 in the process of fixing this issue, it'd be great to also implement the change described by @pochedls, as year = 0 is invalid following the CF (climate forecast) convention that the software is built on see http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/build/ch04s04.html

To be honest I would be amazed if cdms2 could deal with the paleo example on that page, with my particular query related to the calendar attribute:

Example 4.6. Paleoclimate time axis

double time(time) ;
  time:long_name = "time" ;
  time:units = "days since 1-1-1 0:0:0" ;
  time:calendar = "126 kyr B.P." ;
  time:month_lengths = 34, 31, 32, 30, 29, 27, 28, 28, 28, 32, 32, 34 ;

@mfwehner
Copy link
Author

mfwehner commented May 30, 2020 via email

@jasonb5
Copy link
Contributor

jasonb5 commented May 30, 2020

The c code is working correctly and using it's DefaultCalendar value.

The issue here is cdtime.DefaultCalendar=cdtime.NoLeapCalendar is actually setting the value in the python module and not the c module object. Which is why it has no effect. A reference to the c object is actually at cdtime._cdtime.

There's a couple paths forward. I think the preferred option would be the latter.

  1. Set the DefaultCalender on the c object. This would cause all subsequent calls to c functions to use the default calendar. Some of the python functions r2c, s2c, etc will actually override this becuase they use the DefaultCalender in the python module and pass it through to the c functions.
cdtime._cdtime.DefaultCalendar = cdtime.NoLeapCalendar
  1. Explicilty pass the target calender to tocomp or any call that require a calendar.

Lets continue the year 0 conversation here. Could you descirbe the desired behavior and we can continue from there.

@durack1
Copy link
Member

durack1 commented Jun 3, 2020

@jasonb5 your suggestion RE: 2 above is exactly what I would recommend, in most CMIPx model cases the calendar is explicitly defined as an axis variable, so for e.g.:

$ ncdump -h /p/css03/esgf_publish/CMIP6/CMIP/IPSL/IPSL-CM6A-LR/historical/r1i1p1f1/Amon/tas/gr/v20180803/tas_Amon_IPSL-CM6A-LR_historical_r1i1p1f1_gr_185001-201412.nc | grep time
	time = UNLIMITED ; // (1980 currently)
	double time(time) ;
		time:axis = "T" ;
		time:standard_name = "time" ;
		time:long_name = "Time axis" ;
		time:calendar = "gregorian" ;
		time:units = "days since 1850-01-01 00:00:00" ;
		time:time_origin = "1850-01-01 00:00:00" ;
		time:bounds = "time_bounds" ;
	double time_bounds(time, axis_nbounds) ;
	float tas(time, lat, lon) ;
		tas:cell_methods = "area: time: mean" ;
		:parent_time_units = "days since 1850-01-01 00:00:00" ;
		:branch_time_in_parent = 21914. ;
		:branch_time_in_child = 0. ;

@jasonb5
Copy link
Contributor

jasonb5 commented Jun 10, 2020

@mfwehner, @durack1

Just to clarify, setting the default calendar e.g. cdtime.DefaultCalendar = cdtime.NoLeapCalendar will never be able to produce the desired behavior. This is an inherit flaw from how cdtime is setup and there is no straight forward way to change this without making major changes to the structure of the package.

With that said here are a list of solutions that will produce the desired behavior.

  1. From @mfwehner's original example. You can retrieve the FileAxis for time with getAxis('time'). Any conversion called on this object should honor the calendar attribute if it's set, as mention by @durack1 in his previous comment.
t = f.getAxis('time')
comptime = t.asComponentTime()
print(t.attributes['calendar'], comptime[0], comptime[-1])
# noleap 1950-1-15 12:0:0.0 2014-12-15 12:0:0.0

If it's not set you can still pass it for conversion.

comptime = t.asComponentTime(cdtime.NoLeapCalendar)
  1. Modifying the DefaultCalendar on the c module object. I cannot guarantee any behavior when proceeding this way as cdtime was not intended to be used in this fashion.
import cdtime._cdtime as cdtime
cdtime.DefaultCalendar = cdtime.NoLeapCalendar

print(cdtime.reltime(711399.5, 'days since 0001-01-01 00:00:00').tocomp())
print(cdtime.reltime(735093.5, 'days since 0001-01-01 00:00:00').tocomp())
# 1950-1-15 12:0:0.0
# 2014-12-15 12:0:0.0
  1. Explicitly setting the calendar when converting from relative to component time. The calendar if set on the FileAxis is accessible from the tt variable by calling tt.getCalendar() and can be passed to tocomp.
r = cdtime.reltime(tim[tim.shape[0]-1],u)
last_day=r.tocomp(tt.getCalendar())
r = cdtime.reltime(tim[0],u)
first_day=r.tocomp(tt.getCalendar())
print(first_day,last_day)
# 1950-1-15 12:0:0.0 2014-12-15 12:0:0.0

@durack1
Copy link
Member

durack1 commented Jun 11, 2020

@jasonb5 moving forward if there are obvious limitations with cdtime, then would it be possible to resolve these with a reformulation of the C library? It is 23 years old after all. Synchronizing this with the latest calendars and UDUNITs would certainly be welcomed (not that I can imagine much has changed with time). Not sure it's on the roadmap?

I have always found my use of cdtime requires me to jump through hoops that aren't always obvious to a new user, and the documentation is pretty bare, so the trial and error time investment is something that would benefit greatly from a refresh (which could make usage much more straight forward, and documented)

@jasonb5
Copy link
Contributor

jasonb5 commented Jun 11, 2020

@durack1 I just realized I never mentioned this on the roadmap, i'll have to update it, but we are looking into refreshing a lot of the existing libraries. We want the redesign to provide an up to date, comprehensive and feature rich package.

I completely agree and understand the lack of documentation and moving forward with the redesign, this is one of our highest priorities.

@durack1
Copy link
Member

durack1 commented Jun 11, 2020

@jasonb5 excellent news, more than happy to be a guinea pig in the process, I am sure many of us would be very willing participants

@jasonb5
Copy link
Contributor

jasonb5 commented Aug 21, 2020

Closing as a work around has been provided.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants