Introduce decorator for broadcasting 1D methods #397

Open
wants to merge 18 commits into base: main


willGraham01
Contributor

@willGraham01 willGraham01 commented Jan 31, 2025

Description

What is this PR

  • Bug fix
  • Addition of a new feature
  • Other

Why is this PR needed?

The shapely library does not offer vectorised functions. The data we will typically see is 2-dimensional in space, but may have several other dimensions for time, individuals, etc. Because shapely is not vectorised, it cannot broadcast a single operation across the corresponding dimension of a numpy array. Furthermore, shapely largely depends on casting to its own internal Geometry objects before such methods can be applied.

We would therefore benefit greatly from being able to automatically broadcast the 1D functions we need to write across the remaining dimensions.

What does this PR do?

  • Introduces the make_broadcastable decorator into the utils folder. More on this decorator below.

make_broadcastable is a decorator designed to reduce the code bloat that "vectorising" the shapely operations would otherwise need. It turns functions that act on 1D data (notably, the shapely functions we will be using) into functions that can act along a given axis of a DataArray input. More broadly, make_broadcastable is also useful to expose publicly to users: they can write functions that assume they are only operating on one "piece" of data (e.g. one spatial coordinate) and then use the decorator to extend them in the manner described above.

As an example, suppose we have a function is_in_unit_square that takes a 1D (x, y) coordinate array as its input, and returns a boolean value indicating whether or not (x, y) is inside the unit square:

def is_in_unit_square(xy: ArrayLike) -> bool:
    ...
    return true_or_false

If we now have a DataArray of shape (10, 2, 3) and dims ("time", "space", "individuals"),

import numpy as np
import xarray as xr

da = xr.DataArray(
    data=np.ones((10, 2, 3)),
    dims=("time", "space", "individuals"),
    coords=...,
)

we would have to apply is_in_unit_square 30 (=10 * 3) times to compute whether each spatial coordinate in da was inside the unit square.
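For illustration, the explicit loop might look like the sketch below. The body of is_in_unit_square is an assumption (the description above elides it), and the random data and the "x"/"y" coordinate labels are made up for the example.

```python
import numpy as np
import xarray as xr


def is_in_unit_square(xy) -> bool:
    # Hypothetical body: the PR description elides the implementation.
    x, y = xy
    return bool(0.0 <= x <= 1.0 and 0.0 <= y <= 1.0)


# Made-up data with the same shape and dims as in the description.
rng = np.random.default_rng(seed=0)
da = xr.DataArray(
    data=rng.uniform(-1.0, 2.0, size=(10, 2, 3)),
    dims=("time", "space", "individuals"),
    coords={"space": ["x", "y"]},
)

# One call per (time, individual) pair: 10 * 3 = 30 calls in total.
out = np.empty((da.sizes["time"], da.sizes["individuals"]), dtype=bool)
n_calls = 0
for t in range(da.sizes["time"]):
    for i in range(da.sizes["individuals"]):
        xy = da.isel(time=t, individuals=i).values  # 1D array of length 2
        out[t, i] = is_in_unit_square(xy)
        n_calls += 1

result = xr.DataArray(out, dims=("time", "individuals"))
```

Every new 1D function would need its own copy of this bookkeeping, which is the repetition the decorator is meant to remove.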

To avoid having to do this in a for loop, or to re-write is_in_unit_square to accommodate DataArray inputs, we can simply decorate it with make_broadcastable:

@make_broadcastable(only_broadcastable_along="space")
def is_in_unit_square(xy: ArrayLike) -> bool:
    ...
    return true_or_false

and now we can call

was_in_unit_square = is_in_unit_square(da)

was_in_unit_square is a (10, 3) DataArray of boolean values with dims ("time", "individuals") - each entry is the result of applying is_in_unit_square along the corresponding "space" dimension.
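To give a feel for the mechanics, here is a minimal sketch of a decorator with similar behaviour, built on xarray's apply_ufunc. It is not necessarily how make_broadcastable is implemented in this PR: the name broadcast_along, the body of is_in_unit_square, and the test data are all assumptions made for illustration.

```python
import functools

import numpy as np
import xarray as xr


def broadcast_along(dim):
    """Hypothetical stand-in for make_broadcastable (illustration only)."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(data):
            # Treat `dim` as the core dimension: func receives one 1D slice
            # along `dim` at a time, and its scalar results are collected
            # over the remaining dimensions.
            return xr.apply_ufunc(
                func,
                data,
                input_core_dims=[[dim]],
                vectorize=True,
            )

        return wrapper

    return decorator


@broadcast_along("space")
def is_in_unit_square(xy) -> bool:
    # Hypothetical body: the PR description elides the implementation.
    x, y = xy
    return bool(0.0 <= x <= 1.0 and 0.0 <= y <= 1.0)


# Made-up data: every point is (0.5, 0.5) except one, which is (2.0, 2.0).
values = np.full((10, 2, 3), 0.5)
values[0, :, 1] = 2.0
da = xr.DataArray(values, dims=("time", "space", "individuals"))

was_in_unit_square = is_in_unit_square(da)
```

In this sketch the "space" dimension is consumed by the function, so the result keeps only the "time" and "individuals" dimensions.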

make_broadcastable also works for functions that output 1D numpy arrays - in this case, the dimension being acted along (e.g. "space") is replaced in the output DataArray by a new dimension that contains the result.
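A plain-NumPy analogue of this "dimension replacement" is np.apply_along_axis, sketched below. The helper min_max_mean and the input array are hypothetical, and this only illustrates the shape behaviour; it makes no claim about how make_broadcastable handles the array-output case internally.

```python
import numpy as np


def min_max_mean(values):
    # Hypothetical 1D function: takes a 1D array, returns a 1D array of length 3.
    return np.array([values.min(), values.max(), values.mean()])


# Same layout as before: axis 1 plays the role of the "space" dimension.
data = np.arange(60, dtype=float).reshape(10, 2, 3)

# Axis 1 (length 2) is replaced by the length-3 output of min_max_mean.
summary = np.apply_along_axis(min_max_mean, axis=1, arr=data)
```

Here the input has shape (10, 2, 3) and the output has shape (10, 3, 3): the length-2 axis has been swapped for the length-3 result, while the other axes are untouched.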

References

Tangentially related to #377, as this makes our lives a lot easier when interacting with shapely.

How has this PR been tested?

Addition of local test suite.
The example workbook also runs without errors, which confirms the syntax is correctly employed too.

Is this a breaking change?

No

Does this PR require an update to the documentation?

An example has been added to the gallery, although it might be more tailored for use in the developer docs rather than the showcase. Happy to take opinions on this.

Checklist:

  • The code has been tested locally
  • Tests have been added to cover all new functionality
  • The documentation has been updated to reflect any changes
  • The code has been formatted with pre-commit

This comment was marked as resolved.

@willGraham01 willGraham01 mentioned this pull request Jan 31, 2025
7 tasks
@willGraham01 willGraham01 force-pushed the wgraham-broadcasting-decorator branch from a56f748 to ccaa458 Compare January 31, 2025 14:23
@willGraham01 willGraham01 marked this pull request as ready for review February 3, 2025 11:18
@willGraham01 willGraham01 requested a review from niksirbi February 3, 2025 11:19
Member

@niksirbi niksirbi left a comment

Thanks a lot for this monumental piece of decorator wizardry @willGraham01!

I like the idea, and find it well tested and meticulously documented.

Most of my comments are minor and purely cosmetic in nature.

My more substantial feedback has to do with making the example conceptually simpler, and somewhat shorter. Broadcasting and decorators are tough concepts to wrap one's head around, even for me. If we want users to benefit from the power of these decorators, we need to make them easy to understand (or as easy as we can). The success of this feature will mostly hinge on the quality of the example, which is why I've focused my feedback there.

As you'll see in the relevant comment, I have two proposals for reducing the length and the conceptual complexity of the example:

  • use one of our existing sample datasets instead of constructing your own at the start
  • motivate the problem with something simpler than camera FOV. Though this application is very cool and relevant for the audience, having to think about angles and trigonometry detracts from the main message of this example, at least in my opinion.

movement/utils/broadcasting.py (resolved)
movement/utils/broadcasting.py (resolved)
movement/utils/broadcasting.py (resolved)
tests/test_unit/test_make_broadcastable.py (resolved)
tests/test_unit/test_make_broadcastable.py (resolved)
examples/broadcasting_your_own_methods.py (outdated, resolved)
examples/broadcasting_your_own_methods.py (resolved)
examples/broadcasting_your_own_methods.py (outdated, resolved)
)

xr.testing.assert_equal(was_in_view_da_broadcasting, was_in_view_space_only)

Member

Maybe also mention the space_broadcastable alias here.

fig.show()

# %%
# Motivation
Member

This example is already very long, and broadcasting and decorators are hard concepts to understand.
Although the camera example is very cool, and even topically relevant for our audience, I find that it's too complex for this example, because we are asking readers to consider trigonometry concepts, and it also takes up a lot of space.

My take is that since broadcasting and decorators are complex enough, every other concept in this example should be as simple as possible. So I'd suggest replacing the camera FOV calculation with something much simpler. For example, now that we've merged the basic ROI classes, perhaps the example could check whether the individuals are within a certain polygon, or whether they are on the left/right side of a vertical line. I'm also open to other, simpler ideas.

If you adopt my suggestion, I'd also consider replacing the custom dataset you construct at the beginning of this example with one of our sample datasets, for example the one with the three mice that is used in "Compute and visualise kinematics" example. Reasoning is similar: the code for creating the dataset takes up lots of space and delays getting to the essence of this example.

I'm in general worried that long examples will scare away readers. I'm aware some of our existing examples are similarly long, but I don't want to add to this problem.

Contributor Author

For example, now that we've merged the basic ROI classes, perhaps the example could check whether the individuals are within a certain polygon, or whether they are on the left/right side of a vertical line. I'm also open to other, simpler ideas.

Yeah you've hit the nail on the head as to why my original PR scope-drifted! 😂 I can defo re-write the example though - I'll probably use "points inside a square" as an example. The RoI class can make an appearance at the end (since this decorator is how we're intending to write the RoI methods!)

Member

Sounds good to me!

sonarqubecloud bot commented Feb 7, 2025

2 participants