Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

k256+p256: performance optimization tracking issue #362

Closed
1 of 3 tasks
tarcieri opened this issue Jun 14, 2021 · 2 comments
Closed
1 of 3 tasks

k256+p256: performance optimization tracking issue #362

tarcieri opened this issue Jun 14, 2021 · 2 comments

Comments

@tarcieri
Copy link
Member

tarcieri commented Jun 14, 2021

This is a tracking issue for various performance optimizations that could potentially be applied to the k256 and p256 crates. It's not entirely clear whether any of these will actually improve performance, and some are mutually exclusive.

ECDSA verification

Note: these optimizations could be applied to secp256k1 public key recovery as well (i.e. k256::ecdsa::recoverable::Signature::recover_verify_key)

  • variable-time inversions
  • Shamir's Trick
  • wNAF multiplications

Use variable-time scalar inversions

Both k256 and p256 are presently using constant-time algorithms for scalar inversions.

p256 implements Scalar::invert_vartime using Stein's algorithm. The same algorithm could be applied to k256 as well.

Use Shamir's Trick

ECDSA verification involves computing aP + bQ, which is presently performed in both k256 and p256 using two scalar multiplications and a point addition.

For signature verification, this can be reduced to one scalar multiplication by using Shamir's Trick, which performs additions inside the chain instead of outside. With a precomputed P + Q, the performance is only slightly slower than a single scalar multiplication.

Use wNAF multiplications

Some generic arithmetic for this is implemented in the group crate.

Semi-related: this resulted in a ~30% speedup for Zcash trial decryption: zcash/librustzcash#332

@tarcieri
Copy link
Member Author

I did some preliminary investigations of all of these options and unfortunately none of them are panning out too well.

I wasn't able to get any of the implementations to compute correct results, but went ahead and benchmarked them anyway to see if they were worth further investigation.

A naive implementation of Shamir's Trick was minutely slower than the current implementation.

wNAF multiplications appeared to be ~25% slower.

@tarcieri
Copy link
Member Author

I'm going to go ahead and close this as an overall issue.

#380 implemented Shamir's Trick for k256. We can still do something similar for p256.

I think there might be some merit in still investigating these approaches, but at this point, I'd suggest opening a new tracking issue for any given one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant