Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Transaction Cleanup #1127

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Transaction Cleanup #1127

wants to merge 3 commits into from

Conversation

netopyr
Copy link
Contributor

@netopyr netopyr commented Feb 17, 2025

Description:

This PR adds a new HIP that describes the cleanup of transactions by defining a single standard of how TransactionBody and SignatureMap should be stored.

Signed-off-by: Michael Heinrichs <[email protected]>
@netopyr netopyr self-assigned this Feb 17, 2025
@netopyr netopyr requested a review from mgarbs as a code owner February 17, 2025 18:32
Copy link

netlify bot commented Feb 17, 2025

Deploy Preview for hedera-hips ready!

Name Link
🔨 Latest commit 322c730
🔍 Latest deploy log https://app.netlify.com/sites/hedera-hips/deploys/67b380d703920300089c1177
😎 Deploy Preview https://deploy-preview-1127--hedera-hips.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

## Rationale
To simplify the handling of transactions, we selected one of the possible ways to store the transaction data.

All incoming transactions that do not follow the standard format will be adjusted accordingly.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently in AtomicBatch if we do not follow the standard we fail the transaction. Do we want to adjust it there to keep it consistent with this HIP?
Or maybe because it's a new logic and we don't want to support the deprecated fields we are failing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot allow the deprecated format in an inner transaction. If we would try to normalize an inner transaction, the signature of the batch transaction would no longer be valid. We avoid this issue by enforcing the normalized format right from the start. This is a new functionality, and therefore, we do not need to fear we break existing applications.

Comment on lines +70 to +71
The record stream and block stream will contain the normalized transactions only.
The hash of a transaction is also calculated based on the normalized format.
Copy link
Member

@steven-sheehy steven-sheehy Feb 24, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So now if down stream system like block or mirror nodes want to generate the transaction hash or verify signatures separately it will be impossible since they don't have the original bytes. Specifically, block streams do not pass the transaction hash in the output/result to save bytes so mirror node performs a SHA384 of the signed transaction body bytes. It would be non-deterministic (and less performant) to re-serialize the TransactionBody back to bytes for transaction hash generation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If downstream systems need to calculate a hash or verify a signature, they have to use the raw bytes, similar to what we have to do in the consensus node.

Copy link
Member

@steven-sheehy steven-sheehy Feb 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But as I understand it we no longer have the raw bytes after this HIP. If user submits a TransactionBody and a TransactionBody is in the record/block stream then where is the raw bytes? Similarly if they submitted a signedTransactionBytes and the record stream and block stream will contain the normalized transactions only how will get the raw bytes to perform the calculations?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The raw bytes of TransactionBody are still available. One approach to get them is defining a protobuf object like this and reading the body from there:

message Transaction2 {
    bytes body = 1;
    [...]
}

Or you can use the functionality that we currently add to PBJ.

The byte array in signedTransactionBytes that represents the TransactionBody will be preserved.

Copy link
Member

@steven-sheehy steven-sheehy Feb 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think what you're saying is even though it's normalized to TransactionBody body you'll be passing the raw bytes from the wire to the mirror node (whether from bodyBytes, signedTransactionBytes.bodyBytes, or from capturing the raw bytes of body from gRPC server) and we have to use a custom proto if we also want to treat it as bytes?

All incoming transactions that do not follow the standard format will be adjusted.

If the consensus node receives a transaction that uses `bodyBytes` instead of `body`, we create a new `Transaction` with `body` set to the `TransactionBody` contained in `bodyBytes`.
The original byte array will be preserved to ensure all signatures can still be verified.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section doesn't discuss how consensus nodes will change signature verification to account for receiving a transaction with TransactionBody. Signature verification should require a byte array to operate upon. As I understand it, the custom gRPC server can capture the raw client bytes for these. But perhaps we should call this out explicitly here as another scenario?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think how we do signature verification is an implementation detail and should not be part of this specification.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps I was unclear. I'm not asking how to do signature verification. Currently, the section talks about how it will handles bodyBytes and signedTransactionBytes flow. It does not address how it will handle the body flow and this was a large source of my confusion and should be called out explicitly. I'm sure other readers of this HIP will have a similar misunderstanding.

This flow is special in that in addition to having the de-serialized TransactionBody body the gRPC server will also have to separately capture the raw bytes on the wire for use in transaction hash generation and signature verification. So effectively it maintains two copies of each transaction in memory. Additionally, these raw bytes, if I'm not mistaken, will be the one passed onto record/block streams and not the serialization of the TransactionBody? These sorts of details should be in the HIP. So I'm not interested in signature verification details but how the bytes are flowing through the system.


### Changes in record stream and block stream
The record stream and block stream will contain the normalized transactions only.
The hash of a transaction is also calculated based on the normalized format.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently the transaction hash is generated from the signedTransactionBytes. Since the data is being normalized in the record stream it sounds like we won't have the exact bytes of the signedTransactionBytes. We could attempt to reconstruct it but our deserializer could order the fields differently or it could have unknown fields that are now gone.

Is this algorithm changing so that it's generated from the TransactionBody body plus sigMap since that's what's been normalized? If so, can you please detail the algorithm as mirror node needs to implement it as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants