Improve serialization efficiency when objects are proxied #120

Merged on Dec 13, 2023 (3 commits)

gpauloski (Collaborator)

To avoid double serialization of objects that end up being proxied, this PR passes a shim serializer and a deserializer wrapper to store.proxy(). This takes advantage of the fact that we already have a serialized string by the time an object is proxied.

Let me know what you think about this solution. I think it's the simplest in terms of implementation, but as I note in the commit message and code comment, there is still a bytes -> str -> bytes conversion when using pickle as the Colmena serialization method.

This PR also raises the maximum supported version of ProxyStore to allow v0.6.

Fixes #117

Proxied objects end up being serialized twice: first when the object is
serialized with Colmena to check the size of the serialized object and
then again by ProxyStore when the object is proxied.

ProxyStore supports passing custom serializer/deserializer functions so
this commit adds a shim serializer and a deserializer wrapper to
minimize serialization overheads when an object gets proxied.

Note that there is still one extra step when using Colmena's pickle
serialization method with ProxyStore. SerializationMethod.serialize
pickles the object to produce a byte string and converts those bytes
to a hex string; the shims then convert that hex string back to bytes.
I.e., an intermediate string representation is created and then
discarded in this path. I don't really see a way of avoiding this
because it's an artifact of the different serialized types used by
Colmena and ProxyStore (str vs. bytes).
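
For readers who want to see the shape of the idea, here is a minimal sketch of the shim approach described above. It does not use Colmena or ProxyStore; `hex_pickle_serialize`, `hex_pickle_deserialize`, `shim_serializer`, and `deserializer_wrapper` are hypothetical stand-ins, and the hex encoding is assumed from the commit message rather than copied from the actual code.

```python
import pickle
from typing import Any


def hex_pickle_serialize(obj: Any) -> str:
    """Stand-in for a pickle-based serializer that returns a str (assumed)."""
    return pickle.dumps(obj).hex()


def hex_pickle_deserialize(data: str) -> Any:
    """Inverse of hex_pickle_serialize."""
    return pickle.loads(bytes.fromhex(data))


def shim_serializer(already_serialized: str) -> bytes:
    """Convert the existing hex string to bytes without pickling again."""
    return bytes.fromhex(already_serialized)


def deserializer_wrapper(data: bytes) -> Any:
    """Convert stored bytes back to a hex string, then deserialize once."""
    return hex_pickle_deserialize(data.hex())


obj = {"name": "task", "values": [1, 2, 3]}

# Step 1: the object is serialized once, e.g. to check its size.
serialized = hex_pickle_serialize(obj)

# Step 2: the shim reuses that string instead of serializing obj again.
stored_bytes = shim_serializer(serialized)

# Step 3: resolving the stored bytes recovers the original object.
restored = deserializer_wrapper(stored_bytes)
```

In this sketch the object is pickled exactly once; the remaining cost is the bytes -> str -> bytes conversion noted in the commit message.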
gpauloski requested a review from WardLT on December 13, 2023 17:53
WardLT (Collaborator) commented Dec 13, 2023

LGTM. Thanks, @gpauloski!

WardLT merged commit 6ad3625 into master on Dec 13, 2023
2 checks passed
gpauloski deleted the issue-117 branch on December 13, 2023 19:06
gpauloski added a commit that referenced this pull request Feb 21, 2024
PR #120 introduced some shims for converting between Colmena's
serialization utilities and ProxyStore's expected serialized type of
bytes. One of these shims had a typo: "serialize" instead of
"deserialize". I think this was not caught by CI because there's no
test for queues with ProxyStore enabled; there are only tests for
manually passing proxies as function inputs.
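
As an illustration of the kind of check that would catch such a typo, here is a hedged sketch of a round-trip test. It is not the test added to Colmena; `to_hex`, `from_hex`, `make_shims`, and `round_trips` are hypothetical names, and the `buggy` flag is only a guess at how a "serialize" for "deserialize" mix-up could look.

```python
import pickle
from typing import Any, Callable, Tuple


def to_hex(obj: Any) -> str:
    """Stand-in str serializer: pickle, then hex-encode (assumed)."""
    return pickle.dumps(obj).hex()


def from_hex(data: str) -> Any:
    """Inverse of to_hex."""
    return pickle.loads(bytes.fromhex(data))


def make_shims(buggy: bool = False) -> Tuple[Callable[[str], bytes], Callable[[bytes], Any]]:
    """Build a (serializer, deserializer) pair; buggy=True mimics the typo."""

    def serializer(serialized: str) -> bytes:
        return bytes.fromhex(serialized)

    def deserializer(data: bytes) -> Any:
        if buggy:
            # Hypothetical typo: serializes again instead of deserializing.
            return to_hex(data.hex())
        return from_hex(data.hex())

    return serializer, deserializer


def round_trips(obj: Any, shims: Tuple[Callable[[str], bytes], Callable[[bytes], Any]]) -> bool:
    """Return True if obj survives serialize -> shim -> deserialize unchanged."""
    serializer, deserializer = shims
    return deserializer(serializer(to_hex(obj))) == obj


sample = {"x": 1, "y": [2, 3]}
good_result = round_trips(sample, make_shims())
bad_result = round_trips(sample, make_shims(buggy=True))
```

A test that exercises the full path through both shims, rather than only the serializer side, fails as soon as the deserializer returns something other than the original object.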
gpauloski added a commit that referenced this pull request Feb 21, 2024

Successfully merging this pull request may close these issues.

Avoid double serialization with proxystore