Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Define setter for Image.array to allow inplace operations #1276

Closed
wants to merge 9 commits into from

Conversation

sidneymau
Copy link
Contributor

(Duplicate of #1273 due to rebasing/etc.)
This pull request addresses #1272 by defining a setter for the Image.array attribute. This protects against any case where a user does an inplace operation on the image array attribute in an interactive environment: without the setter, this would raise an error while still performing the operation, leading to some confusion. This change also obviates the need to operate on a view of the underling array; e.g., instead of

image.array[:] += 1

we can now do

image.array += 1

Copy link
Contributor Author

@sidneymau sidneymau Feb 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copying comment from Mike:

I'd like a few more tests that guarantee that the setter doesn't allocate new memory or change fundamental things about the array. I think the implementation is ok, but there are other ways to have written the setter which would pass this test, but would break the image object.
e.g.

image1.array = 1  # should keep array shape and dtype
image1.array = 1+2j  # should only work if array is already complex
image1.array = np.zeros((3,3))  # should only work if current shape is (3,3)

Also, all of those should fail if the image is a cont view.
In other words, try to come up with as many stress tests as possible and make sure it behaves as we would want it to behave.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative implementation of the setter is the following:

    @array.setter
    def array(self, other):
        """Set the numpy array in-place.
        """
        self_array = self.array
        other_array = np.asarray(other)
        if self_array.shape != other_array.shape:
            raise GalSimError("Other array shape is not equal to current array shape")
        if self_array.dtype != other_array.dtype:
            raise GalSimError("Other array dtype is not equal to current array dtype")
        self._array = other_array

This has an advantage in explicitly guarding against mutations to shape and/or dtype. I believe this is more robust and closer to what we want

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks worse to me. I think numpy already guarded against these things. And you lost the feature that writing to a const view would raise an exception.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the dtype check is wrong. For instance, if dtype is float, you should be allowed to assign integers to it and let it remain a float array. (Which I believe the previous implementation did correctly.)

Copy link
Member

@rmjarvis rmjarvis Feb 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, another problem with this implementation, which might merit a test is that it loses the careful 128 bit alignment that we do for image arrays. The array should always satisfy:

assert image.array.ctypes.data % 16 == 0

Looks like we don't currently have tests for this, but it makes the SSE stuff in C++ work faster.

Copy link
Contributor Author

@sidneymau sidneymau Feb 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose there's a third version: self._array = np.asarray(other) without the type checking:

>>> image = galsim.Image(np.zeros((2, 2), dtype=int))
>>> image.array = np.array([[1.2, 3.8], [5.6, 6.7]])
>>> image.array
array([[1.2, 3.8],
       [5.6, 6.7]])

This is probably preferred to each of the above? Unless we want to enforce that the dtype of an Image array doesn't mutate.

We would still want to test against shapes, I think, otherwise this leads to

>>> image.array = 1
>>> image.array
array(1)

Copy link
Contributor Author

@sidneymau sidneymau Feb 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, another problem with this implementation, which might merit a test is that it loses the careful 128 bit alignment that we do for image arrays. The array should always satisfy:

assert image.array.ctypes.data % 16 == 0

Looks like we don't currently have tests for this, but it makes the SSE stuff in C++ work faster.

There's currently a commented assert checking this elsewhere in Image—is this worth raising an error/warning for in the class, or just writing a test?

followup: I have not succeeded in making a numpy array that does not satisfy array.ctypes.data % 16 == 0, so will have to figure out how to do that to write this test

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not succeeded in making a numpy array that does not satisfy array.ctypes.data % 16 == 0, so will have to figure out how to do that to write this test

Maybe numpy added that as a feature. They didn't used to guarantee that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

>>> image.array = 1
>>> image.array
array(1)

This should set all values to 1, which your original implementation did correctly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

>>> image = galsim.Image(np.zeros((2, 2), dtype=int))
>>> image.array = np.array([[1.2, 3.8], [5.6, 6.7]])
>>> image.array
array([[1.2, 3.8],
       [5.6, 6.7]])

This one is definitely wrong. Assignment should absolutely not change the dtype. To be consistent with how things like += work currently, I think assigning e.g. float to int should just convert them. The implementation would be

self.array[:] = self._safe_cast(other)

cf. methods like __iadd__ etc. in the Image class.

@sidneymau sidneymau changed the title Define setter for image.array to allow inplace operations Define setter for Image.array to allow inplace operations Feb 26, 2024
shape2=other_array.shape
)
if other_array.ctypes.data % 16 != 0:
galsim_warn("Array may not be memory aligned")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be a warning or an error? We could also try realigning the array, but I'm not sure if that's within scope here...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you're getting my point. I don't want the array object to change upon setting. Use the one that is already in the image object. The original implementation you had was basically right. Here is another test that I think might make my point clearer.

im = galsim.ImageI(3,3)
ar = np.ones((3,3))
im.array = ar   # This test currently works if use im.array[:] = ar instead
assert np.all(im.array == 1)
ar *= 2
assert np.all(im.array == 1)

In other words, IMO the setter should work exactly like if you used [:].

The only exception is that floats should round rather than truncate to match how +=, etc. work.

Another probably useful test is that if the image starts out all 0's, then im.array = ar should be equivalent to im.array += ar. For instance, the above test should also work if you use replace the assignment with

im.array += ar

This is close to the original reason for this PR -- you had wanted to do that += step, and it gave an error, but still worked.

zeros_array = np.zeros((m, n))
ones_array = np.ones((m, n))
assert zeros_array.ctypes.data % 16 == 0
assert ones_array.ctypes.data % 16 == 0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These will break if numpy is not guaranteeing that arrays are memory-aligned by default, which I gather was not previously the case. We can get around this by writing a function to explicitly make memory-aligned arrays, but doing so may not be necessary

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code that used to be necessary, but apparently might not be anymore is here:
https://github.com/GalSim-developers/GalSim/blob/releases/2.5/galsim/image.py#L593

farray = np.array([[1.2, 3.8], [5.6, 6.7]])
image = galsim.Image(np.zeros((2, 2), dtype=int))
image.array = farray
np.testing.assert_array_equal(image.array, farray)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the preference for broadcasting Image arrays is—this tests that an int array gets successfully cast to a float array, which may not be the desired behavior

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not desired behavior.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

>>> image.array = 1
>>> image.array
array(1)

This should set all values to 1, which your original implementation did correctly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

>>> image = galsim.Image(np.zeros((2, 2), dtype=int))
>>> image.array = np.array([[1.2, 3.8], [5.6, 6.7]])
>>> image.array
array([[1.2, 3.8],
       [5.6, 6.7]])

This one is definitely wrong. Assignment should absolutely not change the dtype. To be consistent with how things like += work currently, I think assigning e.g. float to int should just convert them. The implementation would be

self.array[:] = self._safe_cast(other)

cf. methods like __iadd__ etc. in the Image class.

shape2=other_array.shape
)
if other_array.ctypes.data % 16 != 0:
galsim_warn("Array may not be memory aligned")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you're getting my point. I don't want the array object to change upon setting. Use the one that is already in the image object. The original implementation you had was basically right. Here is another test that I think might make my point clearer.

im = galsim.ImageI(3,3)
ar = np.ones((3,3))
im.array = ar   # This test currently works if use im.array[:] = ar instead
assert np.all(im.array == 1)
ar *= 2
assert np.all(im.array == 1)

In other words, IMO the setter should work exactly like if you used [:].

The only exception is that floats should round rather than truncate to match how +=, etc. work.

Another probably useful test is that if the image starts out all 0's, then im.array = ar should be equivalent to im.array += ar. For instance, the above test should also work if you use replace the assignment with

im.array += ar

This is close to the original reason for this PR -- you had wanted to do that += step, and it gave an error, but still worked.

zeros_array = np.zeros((m, n))
ones_array = np.ones((m, n))
assert zeros_array.ctypes.data % 16 == 0
assert ones_array.ctypes.data % 16 == 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code that used to be necessary, but apparently might not be anymore is here:
https://github.com/GalSim-developers/GalSim/blob/releases/2.5/galsim/image.py#L593

farray = np.array([[1.2, 3.8], [5.6, 6.7]])
image = galsim.Image(np.zeros((2, 2), dtype=int))
image.array = farray
np.testing.assert_array_equal(image.array, farray)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not desired behavior.

@rmjarvis
Copy link
Member

@sidneymau This has been dormant for a couple months now. Did you want to try to finish it, or should we abandon this effort?

@sidneymau
Copy link
Contributor Author

Thanks for bumping this—I'm happy to consider this abandoned for now and have the PR closed and the branch pruned. I may pick it up again when I have more time, but I don't expect that to be any time soon.

@sidneymau sidneymau closed this May 13, 2024
@sidneymau sidneymau deleted the #1272 branch May 13, 2024 06:42
@rmjarvis rmjarvis added this to the No action milestone Jun 6, 2024
@rmjarvis rmjarvis added base classes Related to GSObject, Image, or other base GalSim classes feature request Request for a new feature in GalSim labels Aug 20, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
base classes Related to GSObject, Image, or other base GalSim classes feature request Request for a new feature in GalSim
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants