Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Access output in timeseries by its time or redshift #4717

Merged
merged 20 commits into from
Oct 3, 2024

Conversation

cphyc
Copy link
Member

@cphyc cphyc commented Oct 27, 2023

PR Summary

This provides the ability to quickly find the dataset that has the closest value to a given one (e.g. time or redshift). Example

import yt

# Load all RAMSES dataset in a folder - we assume here we have a cosmological simulation
ts = yt.load("/path/to/simulation/folder/output*")

# Get the snapshot that's as close as possible from z=2
ds = ts.get_by_redshift(2)

# Get the snapshot that's as close as possible from t=10Gyr after Big-Bang,
t = ts[0].quan(10, "Gyr")
ds = ts.get_by_time(t)  

PR Checklist

  • New features are documented, with docstrings and narrative docs
  • Adds a test for any bugs fixed. Adds tests for new features.

@cphyc cphyc added the new feature Something fun and new! label Oct 27, 2023
@neutrinoceros
Copy link
Member

neutrinoceros commented Oct 27, 2023

Quick input:

  • I think this is a great idea
  • I'd prefer we avoid making it overly flexible: supporting the ability to search via arbitrary attributes may create unforeseen edge cases which will be impossible to test systematically. I think supporting search by time or redshift is sufficient for now, especially if it can be done in a way that doesn't involve typing the exact attribute names (IMO current_time is somewhat badly named, because its meaning seems excessively context-dependent)

@matthewturk
Copy link
Member

Not sure I completely agree with @neutrinoceros about limiting the flexibility. The question the other day about flash datasets with particles would suggest there may be times when we don't know the attribute someone will be looking for.

Can you make it use "Price Is Right" rules? i.e., closest-without-going-over?

@cphyc
Copy link
Member Author

cphyc commented Nov 1, 2023

I am not sure I understand what you mean by "Price Is Right" rule. Anyways, in its current state, it is already possible to query a dataset by attribute, but the method is hidden (_get_by_attribute), because “with great power comes great responsibility”.

Let me know what you think! I can always strip the leading _ to make it more user-facing, with an obvious comment in the docstring warning against misuses?

@matthewturk
Copy link
Member

matthewturk commented Nov 1, 2023 via email

@cphyc
Copy link
Member Author

cphyc commented Nov 1, 2023

Oh I see, yes, I'll implement a side argument. To either get the output on the left or on the right!

@neutrinoceros
Copy link
Member

I have nothing against that "the price is right rule", but I'd like to point out that left-to-right reading order isn't culturally neutral. I'd prefer that the interface be more inclusive.

@cphyc
Copy link
Member Author

cphyc commented Nov 1, 2023

@neutrinoceros I've implemented a "smaller", "larger" or "nearest" switch so that there is no ambiguity arising from left-to-right vs. right-to-left reading order. Thanks for pointing it out!

@matthewturk
Copy link
Member

I think the new language is better. That being said, it's not necessarily true that RTL languages have different number line concepts:

https://math.stackexchange.com/questions/2618537/do-right-to-left-readers-also-reverse-mathematical-concepts-like-number-line

There's also some evidence that filmmaking conventions (such as progression from the left side of the screen to the right) are also commonly used outside of left-to-right languages.

I will say that with redshift the entire situation is odd because the numeric progression is reversed from temporal progression. And I did choose redshift as my example... So I guess this is on me!

@cphyc
Copy link
Member Author

cphyc commented Nov 2, 2023

@yt-fido test this please

@cphyc cphyc marked this pull request as ready for review November 2, 2023 09:31
Copy link
Member

@neutrinoceros neutrinoceros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The functionality is useful and the implementation is sound ! I have a couple questions and suggestions (mostly about type annotations)

# This will fail if no output is found within 100 Myr
ds = ts.get_by_time((3, "Gyr"), tolerance=(100, "Myr"))
# Get the output at the time right before and after 3 Gyr
ds_before = ts.get_by_time((3, "Gyr"), side="smaller")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think the name of the argument could be clearer than side. What about prefer="smaller" ?

yt/data_objects/time_series.py Outdated Show resolved Hide resolved
yt/data_objects/time_series.py Outdated Show resolved Hide resolved
yt/data_objects/time_series.py Outdated Show resolved Hide resolved
yt/data_objects/time_series.py Outdated Show resolved Hide resolved
yt/data_objects/time_series.py Outdated Show resolved Hide resolved
yt/data_objects/time_series.py Outdated Show resolved Hide resolved
yt/data_objects/time_series.py Outdated Show resolved Hide resolved
yt/data_objects/time_series.py Outdated Show resolved Hide resolved
yt/data_objects/time_series.py Outdated Show resolved Hide resolved
@cphyc
Copy link
Member Author

cphyc commented Nov 27, 2023

@yt-fido test this please

1 similar comment
@cphyc
Copy link
Member Author

cphyc commented Nov 27, 2023

@yt-fido test this please

Copy link
Contributor

@chrishavlin chrishavlin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

I think the case of choosing a value exactly between two datasets needs to either explicitly handled or have its behavior defined in the docstring (see comments), but otherwise this is very useful!

yt/data_objects/tests/test_time_series.py Show resolved Hide resolved
yt/data_objects/tests/test_time_series.py Show resolved Hide resolved
yt/data_objects/tests/test_time_series.py Show resolved Hide resolved
yt/data_objects/time_series.py Show resolved Hide resolved
@cphyc cphyc added this to the 4.4.0 milestone Apr 12, 2024
@neutrinoceros
Copy link
Member

pre-commit.ci autofix

@cphyc
Copy link
Member Author

cphyc commented Apr 16, 2024

@yt-fido test this please

@neutrinoceros
Copy link
Member

@yt-fido Test this please

2 similar comments
@neutrinoceros
Copy link
Member

@yt-fido Test this please

@neutrinoceros
Copy link
Member

@yt-fido Test this please

@chrishavlin
Copy link
Contributor

Hey @cphyc , you still hoping for this to go into 4.4?

I'd still like some clarification on expected behavior when you choose a time exactly between two points while setting the prefer keyword (see #4717 (comment) ), but otherwise I think everything has been addressed?

@cphyc
Copy link
Member Author

cphyc commented Oct 3, 2024

@chrishavlin should be good now :) I am now testing the behaviour when exactly halfway which is working as expected as far as I could tell.

@chrishavlin
Copy link
Contributor

great! thanks, @cphyc !

@chrishavlin
Copy link
Contributor

@neutrinoceros I think all your comments have been addressed, mind giving this a quick re-review when you have a moment?

Copy link
Member

@neutrinoceros neutrinoceros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the ping ! Indeed my comments were all addressed !

@neutrinoceros
Copy link
Member

Last thing before we merge this: @cphyc, can you re-title the PR to something a bit more descriptive that I can include directly in release notes ?

@cphyc cphyc changed the title Find output by its attribute [ENH] Access output in timeseries by its time or redshift Oct 3, 2024
@cphyc
Copy link
Member Author

cphyc commented Oct 3, 2024

Done!

@chrishavlin chrishavlin merged commit fcec2c5 into yt-project:main Oct 3, 2024
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
new feature Something fun and new!
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants