chore(Python): Improve slicing performance #6042

Open: lucasmcdonald3 wants to merge 18 commits into master

lucasmcdonald3
Contributor

@lucasmcdonald3 lucasmcdonald3 commented Jan 10, 2025

What was changed?

Adds a new class, Slice: a lazy slice for Seqs. Because the elements of a Seq are immutable, a slice can refer to the original Seq's elements and compute each access from the slice indices on demand. Previously, slicing a Seq copied every element by value from the original into the new Seq.

  • Slicing a Seq now takes constant time.
  • Slicing a Slice takes constant time.
  • Getting a single element from a Slice takes constant time.
  • Getting the length of a Slice takes constant time.
  • Enumerating the elements in a Slice takes linear time.
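The complexity claims above can be illustrated with a minimal Python sketch. This is not the code in this PR: `LazySlice`, its fields, and its restriction to step-1 slices are all assumptions made for illustration. It only shows how a view that stores a base sequence plus two offsets gets constant-time slicing, indexing, and length, with linear-time enumeration.

```python
class LazySlice:
    """Hypothetical read-only view over base[start:stop]; no elements are copied."""

    def __init__(self, base, start=0, stop=None):
        n = len(base)
        stop = n if stop is None else stop
        # Clamp the bounds to the base length (non-negative indices assumed).
        start = max(0, min(start, n))
        stop = max(start, min(stop, n))
        if isinstance(base, LazySlice):
            # Slicing a slice: point at the underlying sequence so views never nest.
            start += base._start
            stop += base._start
            base = base._base
        self._base, self._start, self._stop = base, start, stop

    def __len__(self):
        # Constant time: only the two offsets are needed.
        return self._stop - self._start

    def __getitem__(self, index):
        if isinstance(index, slice):
            start, stop, step = index.indices(len(self))
            if step != 1:
                raise ValueError("this sketch only supports step 1")
            # Constant time: a new view, not a copy.
            return LazySlice(self, start, stop)
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError("index out of range")
        # Constant time: one offset addition, then one lookup in the base.
        return self._base[self._start + index]

    def __iter__(self):
        # Linear time in the length of the view.
        for i in range(self._start, self._stop):
            yield self._base[i]


base = tuple(range(10))
view = LazySlice(base)[2:8]   # views base[2:8]
inner = view[1:4]             # views base[3:6], still backed by `base` directly
print(list(inner), len(inner), inner[0], inner[-1])
```

Re-pointing a slice of a slice at the underlying sequence keeps element access at a single indirection no matter how many times a sequence is re-sliced, which matters for loops that repeatedly drop the first element.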

How has this been tested?

No new functional testing. I hope that existing tests would catch any issues, but let me know if there are any doubts.

I ran a slicing performance test locally.

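module Main {
  // Assumed declaration: the original snippet left TestNum undefined.
  // Set it to 10000 or 1000000 to match the runs reported here.
  const TestNum: nat := 10000

  method Main(args: seq<string>) {
    var longList := seq(TestNum, i => i);
    var currentSeq := longList;
    // Repeatedly read the head, then slice it off, until the sequence is empty.
    while |currentSeq| > 0
    {
        var firstElem := currentSeq[0];
        currentSeq := currentSeq[1..];
    }
  }
}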

Results from my local machine:

  • TestNum = 10000, Python before change: 9s
  • TestNum = 10000, Python after change: <1s
  • TestNum = 10000, .NET: ~1s
  • TestNum = 1000000, Python before change: at least 10 minutes
  • TestNum = 1000000, Python after change: ~1s
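A rough way to see why the "before" timings grow so quickly: if every `currentSeq[1..]` copies the remaining elements, as described for the old behavior, the benchmark loop copies n(n-1)/2 elements in total for a sequence of length n. The Python sketch below counts those copies using plain list slicing as a stand-in for the old copying Seq; the function names are hypothetical and nothing here measures the actual Dafny runtime.

```python
def copies_by_simulation(n):
    """Count elements copied while slicing off the head until the list is empty."""
    seq = list(range(n))
    copied = 0
    while len(seq) > 0:
        seq = seq[1:]        # a Python list slice copies the remaining elements
        copied += len(seq)
    return copied


def copies_closed_form(n):
    """Sum of 0 + 1 + ... + (n - 1)."""
    return n * (n - 1) // 2


# The simulation and the closed form agree on a small input.
assert copies_by_simulation(1000) == copies_closed_form(1000)

for n in (10_000, 1_000_000):
    print(n, copies_closed_form(n))
```

Going from 10000 to 1000000 elements multiplies the copy count by roughly 10,000, which is consistent with the large run not finishing in 10 minutes before the change. With a lazy slice, the same loop performs no element copies at all.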

I don't plan on committing the performance test file, since it contains no functional testing, but let me know if I should.

(I have separate testing in Crypto Tools' Python DBESDK (JSON encryption library) that brings a particular test execution from ~15 minutes to ~30 seconds.)

By submitting this pull request, I confirm that my contribution is made under the terms of the MIT license.

@lucasmcdonald3 lucasmcdonald3 marked this pull request as ready for review January 10, 2025 22:53
@lucasmcdonald3
Contributor Author

(I think I'm missing permissions to add #2313 as an issue linked on the PR)

@robin-aws
Member

Thanks for the contribution!

> (I think I'm missing permissions to add #2313 as an issue linked on the PR)

AFAIK you can just say phrases like "Resolves #2313" to make that link happen.

But in this case we shouldn't anyway, since that issue is tracking a more general cross-backend improvement, and this change is just improving things for Python.

@robin-aws robin-aws requested a review from fabiomadge January 13, 2025 17:40
@lucasmcdonald3
Contributor Author

Cool, thanks!

I made these changes as part of debugging performance issues in Crypto Tools products. Now I'm running into more performance "hiccups" and might need to make more changes.
Would y'all prefer I add any other changes to this PR, or a new PR?

@robin-aws
Member

I'd say if the additional changes are small and somewhat related, feel free to add them here. Otherwise getting this merged and working on a new PR is probably better.

@lucasmcdonald3
Contributor Author

I added my changes here because my new changes conflicted with the previous changes.
I'm probably done making significant changes.

Collaborator

@fabiomadge fabiomadge left a comment

Thanks, this is great! Do you have data on the impact of _SeqSlice as well?

@lucasmcdonald3
Contributor Author

Thanks Fabio, updated the code.

I also updated the performance test code and results for the latest implementation.


3 participants