-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add tests for time conversions in tools package #2341
base: main
Are you sure you want to change the base?
Conversation
I realize no one planned/asked for this, but I think it's a good addition and ready to be reviewed by a maintainer. Increases test coverage by a whopping 0.07%! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @markcampanelli
The tests expose a weakness in localize_to_utc
: what happens if location
has a timezone defined as an offset? That is permissible, and if done produces an unhelpful message:
if zone.upper() == 'UTC':
AttributeError: 'int' object has no attribute 'upper'
pvlib/tests/test_tools.py
Outdated
'input,expected', | ||
[ | ||
( | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using a dict
here allows the function call to be double-splatted (**input)
and may make it easier to "see" each tested set of inputs and output. It's not the style used for other tests are parameterized.
I'm interested in feedback from others: is this parameterization style easier or more difficult, as a reviewer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My 2 cents: a more compact representation of the changing values is helpful for reviewing, and is worth a bit of processing inside the test function itself. For example, it seems like the following:
(
{
"time": pd.DatetimeIndex(
["1974-06-22T18:30:15"],
tz=ZoneInfo("Etc/GMT+5"),
),
"location": location.Location(
43.19262774396091, -77.58782907414867, tz="Etc/GMT+5"
)
},
pd.DatetimeIndex(["1974-06-22T23:30:15"], tz=ZoneInfo("UTC")),
),
(
{
"time": pd.DatetimeIndex(["1974-06-22T18:30:15"]),
"location": location.Location(
43.19262774396091, -77.58782907414867, tz="Etc/GMT+5"
)
},
pd.DatetimeIndex(["1974-06-22T23:30:15"], tz=ZoneInfo("UTC")),
),
could be replaced with something like this:
("1974-06-22T18:30:15", "Etc/GMT+5", 43, -77, "Etc/GMT+5", "1974-06-22T23:30:15+00:00"),
("1974-06-22T18:30:15", None, 43, -77, "Etc/GMT+5", "1974-06-22T23:30:15+00:00"),
And the test function could then contain the necessary calls to Location
and pd.DatetimeIndex
to construct the parameters themselves.
I think the latter format makes it easier to see what changes between each set of parameters. I found my eyes having to flick back and forth and subsequently getting lost with the former format.
Good catch! Lemme see what I might do about fixing that here (if it's not too much scope creep). BTW: I'm pretty sure this is the sort of issue that I solved (hopefully!) by simplifying the timezone representation in |
@cwhanse @kandersolar I addressed the bug Cliff spotted, and updated a Due to the high repetition, I decided not to use test parameterization at all for the new Should be ready for another look. |
pvlib/tools.py
Outdated
|
||
Parameters | ||
---------- | ||
time : datetime.datetime, pandas.DatetimeIndex, | ||
or pandas.Series/DataFrame with a DatetimeIndex. | ||
location : pvlib.Location object | ||
location : pvlib.Location object (unused if time is localized) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
location : pvlib.Location object (unused if time is localized) | |
location : pvlib.Location object (unused if ``time`` is localized) |
pvlib/location.py
Outdated
elif isinstance(tz, (int, float)): | ||
self.tz = tz | ||
self.pytz = pytz.FixedOffset(tz*60) | ||
self.tz = f"Etc/GMT{int(-tz):+d}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm inclined to ask to revert this change. The docstring says tz
can be a float. There are places that have a non-integer offset from UTC, e.g., the Marquesas are UTC−09:30UTC−09:30. localize_to_utc
is changed in this PR to examine Location.pytz
rather than Location.tz
, so the error we discussed won't occur.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a point in the Location docstring that says this:
Location objects have two timezone attributes:
* ``tz`` is a IANA timezone string.
* ``pytz`` is a pytz timezone object.
While I appreciate my changes may break some things, I think this existing docstring suggests a much more sane approach. (Indeed, in the sequel using zoneinfo
, I propose using one, and only one, internal representation of the timezone, instead of two, potentially with helper functions s to update this internal representation using int/float offsets, etc..)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not saying revert from storing a string, but revert the int
portion. If tz
is truncated to an integer, how are timezones such as UTC+05:30 represented?
I'm open to doing away with accepting int
or float
for tz
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cwhanse Sorry I misunderstood your meaning. I was aware that some locations are on half-hour offsets and such, but what is unclear to me is if/how that is represented in pytz
(or zoneinfo
for that matter). I do not see any fractional offsets in zoneinfo.available_timezones()
and this also does not work for me: pytz.timezone("Etc/GMT+5.5")
.
I must be missing something here. Does something non-standard have to be done to support fractional offsets?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm open to doing away with accepting
int
orfloat
fortz
.
Hi again @cwhanse. It looks like the standard "non-integral-hour" offsets are only supported in the IANA standard through place names such as "Asia/Kathmandu" and "Asia/Yangon", but not through Etc/GMT+X.Y
style offsets.
Given that, I think the most trouble-free thing to do is to remove the int
and float
options and be sure to discuss this issue in the documentation tutorial. (Also, the only supported tz
strings would be the IANA ones.) Sounds like you are amenable to this.
Keeping both fields tz
(only as a string) and pytz
(as a pytz
timezone type), then I think I should add setters that change one when the other is changed after the Location object has already been already initialized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the most trouble-free thing to do is to remove the int and float options
I'm in favor of this change. @pvlib/pvlib-maintainer @pvlib/pvlib-triage your opinions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TMY3 files represent timezone in numeric form, so this change may complicate some workflows. Outside of that, I never use this attribute (and don't know why I would), so I have no opinion either.
Remember that we usually have a deprecation period before outright removal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did some more investigating on this yesterday, and I think I see an implementation that would better support the existing int
and float
offsets. However, this also raised some new questions about the existing implementation. I will try to push some concrete code here by end of day for people to consider, along with the new questions raised.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. I have added support back for int and non-fractional float being passed to tz
in the Location
constructor. I also test that pytz
and zoneinfo
timezones are likewise supported for initialization, and I beefed up the tests even more.
Note that the pytz
Location attribute is now the single source of truth for the time zone in a Location
object (which I recommend updating to a standard-library timezone.TimeZone
object in #2342). self.tz
simply returns the stringified self.pytz
, but one can still directly set self.tz
with any tz
value supported by the Location
intializer. Also, the tz
and pytz
attributes are kept in sync now, as changing one updates the other.
Ready for another review.
The two test failures appear to be unrelated to my changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think we should switch and regard tz
as the source of truth, and pytz
as the optional attribute? With an eye toward replacing pytz
with python's timezone
.
pvlib/location.py
Outdated
from the pytz and zoneinfo packages), default 'UTC'. | ||
See http://en.wikipedia.org/wiki/List_of_tz_database_time_zones for a | ||
list of valid name strings for IANA time zones. | ||
ints and floats must be non-fractional N-hour offsets from UTC, which |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ints and floats must be non-fractional N-hour offsets from UTC, which | |
ints and floats must be whole number hour offsets from UTC, which |
pvlib/location.py
Outdated
"floating point tz does not have zero fractional part: " | ||
f"{tz_}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"floating point tz does not have zero fractional part: " | |
f"{tz_}" | |
"floating point tz has non-zero fractional part: " | |
f"{tz_}. Only whole number offsets are supported." |
pvlib/location.py
Outdated
else: | ||
raise TypeError( | ||
f"invalid tz specification: {tz_}, must be an IANA time zone " | ||
"string, a non-fractional int/float UTC offset, or a " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"string, a non-fractional int/float UTC offset, or a " | |
"string, a whole number int/float UTC offset, or a " |
@cwhanse Ok, so we make the single source of truth the standardized IANA-supported string in the Question: Would it be too disruptive at this point to make the (newly implemented) |
I don't think we do this. Since pytz is in maintenance, it seems reasonable to expect to replace pytz with timezone (as you anticipate). At that time, the pytz attribute can be removed. I don't see the value in having both the timezone object and a string as attributes, since one can print the object and get the string representation. |
I decided to store the single source of truth about a I also made all of @cwhanse's recommended changes. If this implementation and the removal of directly setting the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@markcampanelli let's keep this PR to the scope described in the title. Importing and adding the zoneinfo
attribute feels like half of the journey to shifting from pytz
to zoneinfo
and I think that should be done in its own PR.
Same for fixing asv - do that in a separate PR.
@@ -26,5 +35,4 @@ Requirements | |||
|
|||
Contributors | |||
~~~~~~~~~~~~ | |||
|
|||
|
|||
* Mark Campanellli (:ghuser:`markcampanelli`) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* Mark Campanellli (:ghuser:`markcampanelli`) | |
* Mark Campanelli (:ghuser:`markcampanelli`) |
@pvlib/pvlib-maintainer @pvlib/pvlib-triage Can someone review my If this looks ok, then should I move the fix to a separate PR for better visibility? |
Moved to #2352, which now should go in before this. |
docs/sphinx/source/reference
for API changes.docs/sphinx/source/whatsnew
for all changes. Includes link to the GitHub Issue with:issue:`num`
or this Pull Request with:pull:`num`
. Includes contributor name and/or GitHub username (link with:ghuser:`user`
).remote-data
) and Milestone are assigned to the Pull Request and linked Issue.