Docs: Create Sparse Merkle Tree test spec #66

bvrooman · 2022-02-14T21:48:11Z

Steps toward an automated test suite:

Define the test spec (this PR)
Codify test spec in a machine readable format: Feature Request: Machine readable test suite for SMT testing #67
Read machine readable test spec and generate test suite: Feature Request: Implement a test execution layer to automate the machine readable test suite #68

Voxelot · 2022-02-16T19:36:59Z

This kind of works, although I was hoping we'd have something more along the lines of a machine readable format e.g. toml or json that the library could execute against.

bvrooman · 2022-02-17T02:12:14Z

This kind of works, although I was hoping we'd have something more along the lines of a machine readable format e.g. toml or json that the library could execute against.

I considered a DSL approach for this. I think that such an approach could be built alongside this spec.

Regardless of the ultimate testing strategy implemented by an SMT library, we would need some sort of specification document, since any sort of machine readable test spec still has to be manually compiled. While the Celestia SMT currently provides the expected hashes, it is not the producer of this spec - it is a consumer. It will need to implement this spec too. Ultimately, that could mean testing against the machine readable version. Maybe in the future, the machine readable test spec can be produced automatically, too.

But there are limitations of a DSL approach.

A simple declarative format, like JSON can work, but it will become unwieldy when tests require a large number of steps. A 100-step test would require 100 objects defining each step.
Translating leaf keys and leaf data into binary inputs requires specification anyway. What is the data type/size of numeric input? Is the binary big endian or little endian? Is the text input UTF-8 or another format?
The client must interpret the DSL and inputs, and map them to function calls, which could be brittle if the language/notation ever need updating.

We can agree to avoid large tests or any ambiguity in the data formats (e.g. UTF-8 string inputs only), but ultimately the limitations of this test language become limitations in the usefulness of the test suite altogether. All this considered, is this approach superior to implementing the test suite by hand?

I would say that it's possible to go with a machine readable format, but I think we should have a design discussion around that first. I would argue this shouldn't gate our own SMT implementation in the meantime.

Voxelot · 2022-02-22T05:49:25Z

I agree that we shouldn't gate on the automated harness as long we we have this spec manually implemented for now. Please create an issue for the machine readable spec if you haven't already and we can discuss that in more detail later.

docs/test-specs/sparse_merkle_tree_tests.md

bvrooman · 2022-02-22T19:59:18Z

I agree that we shouldn't gate on the automated harness as long we we have this spec manually implemented for now. Please create an issue for the machine readable spec if you haven't already and we can discuss that in more detail later.

I wrote up two tickets for the automated harness:

Feature Request: Machine readable test suite for SMT testing #67
Feature Request: Implement a test execution layer to automate the machine readable test suite #68

I haven't fully implemented this spec as unit tests in the SMT PR yet (as you saw, a couple tests are missing or deviate in some way). I would like to make sure we have a consensus on this document before I codify it in the library. A few tests are affected by the Celestia spec discrepancy, so there's some ambiguity there. As you suggest, I may just write that those tests are still WIP until the issue is resolved.

adlerjohn

Hmm. Shouldn't keys be 32 bytes?

bvrooman · 2022-02-24T19:05:08Z

Hmm. Shouldn't keys be 32 bytes?

Key input can be any length - upon calling update(&key, &data), the key input gets hashed into 32 bytes (and similarly, the data input gets hashed into 32 bytes). These hashed keys and data of 32 bytes are what become leaves and nodes. The intuition is that update(..) can be called with any key, as long as the key can be represented as an array of bytes, and we don't impose the internal requirement of hashing keys on the end user.

bvrooman · 2022-02-24T19:09:14Z

I am changing empty root values in this document to the zero sum (000...) in the meantime, until we can come to a conclusion on the issue in Celestia. I am going with the zero sum for now since that is currently what both code implementations produce, so we are at least compliant (possibly wrong, but at least compliant!).

Voxelot · 2022-02-24T20:50:48Z

Should we also make a repo of the Go code used to generate these roots from celestia's impl as a future reference in case we want to update these tests? Or just put them in a local subdir somewhere in the docs of this repo?

bvrooman · 2022-02-25T18:19:16Z

Should we also make a repo of the Go code used to generate these roots from celestia's impl as a future reference in case we want to update these tests? Or just put them in a local subdir somewhere in the docs of this repo?

I would suggest that we fork Celestia's SMT repo. In the fork, I can put up the tests I wrote to generate the roots for this document. We could then add or modify the tests in the fork and record the outputs of those tests in this document. I imagine with a few iterations, we could get a high degree of confidence in our test coverage.

Voxelot · 2022-02-25T21:59:35Z

That makes sense, from the fork it would be easy to PR the tests back to the official celestia SMT repo once we're happy with the coverage.

adlerjohn · 2022-02-27T00:21:57Z

I'm okay with forking the Go repo only if it's used purely to upstream the changes. We don't want to maintain a Go implementation of the SMT. With that said, do we really need to fork the repo, or can @bvrooman just contribute the change upstream immediately?

adlerjohn · 2022-02-27T00:23:24Z

the key input gets hashed into 32 bytes

Wait no, the key is not hashed. Or rather, should not be. See celestiaorg/smt#40. Note that the SMT spec does not require hashed keys. That's left to the application.

bvrooman · 2022-02-28T16:52:26Z

the key input gets hashed into 32 bytes

Wait no, the key is not hashed. Or rather, should not be. See celestiaorg/smt#40. Note that the SMT spec does not require hashed keys. That's left to the application.

I see! On the Fuel side, I can modify the SMT interface to take in a Bytes32. In order for the tests to line up with the Celestia results, I can continue to hash the keys via sha256, and pass in the resulting 32 bytes. I see that there is already a PR in Celestia to move towards raw keys.

For the test vectors in this spec, I will move the hashing from the library side to the client side to produce 32 byte keys. If we specify that the test inputs are generated by using Sha256 on an input to create the 32 byte keys, we can ensure continuity of keys between implementations.

bvrooman · 2022-02-28T16:57:57Z

I'm okay with forking the Go repo only if it's used purely to upstream the changes. We don't want to maintain a Go implementation of the SMT. With that said, do we really need to fork the repo, or can @bvrooman just contribute the change upstream immediately?

The fork may not be necessary - it was just an idea to stage changes coming from us before committing to Celestia. I anticipate that the test spec will undergo at least couple of iterations. But I'm also happy to just commit any changes directly to Celestia.

It looks like the migration to using raw keys definitely affects the test cases I used to generate the roots for this document. I can put the current test suite up in a Celestia PR, but it may be worth waiting for that change to go in before I commit it.

adlerjohn · 2022-02-28T17:12:33Z

The Celestia SMT is going slow because it's not on the required path for now. I would say ignore adding anything to that repo for the time being.

bvrooman added 3 commits February 14, 2022 16:47

Create sparse_merkle_tree_tests.md

1ef65f9

Typo

6820a48

Remove language hints for pseudocode

e8c3443

bvrooman added the documentation Improvements or additions to documentation label Feb 14, 2022

bvrooman self-assigned this Feb 14, 2022

bvrooman added 6 commits February 15, 2022 16:21

Additional tests

a31c982

Additional tests

966f451

Abstract

060e3fb

Wording

63b7600

Wording

1daf50d

Update test pseudocode

c241a33

bvrooman marked this pull request as ready for review February 16, 2022 04:55

bvrooman requested review from adlerjohn and Voxelot February 16, 2022 04:56

Correction to Update 2 Delete 1

8d043a4

Voxelot reviewed Feb 22, 2022

View reviewed changes

docs/test-specs/sparse_merkle_tree_tests.md Outdated Show resolved Hide resolved

Voxelot reviewed Feb 22, 2022

View reviewed changes

docs/test-specs/sparse_merkle_tree_tests.md Outdated Show resolved Hide resolved

bvrooman mentioned this pull request Feb 22, 2022

Feature Request: Machine readable test suite for SMT testing #67

Closed

4 tasks

bvrooman added 2 commits February 22, 2022 15:51

Additional test and revisions

a843482

Update sparse_merkle_tree_tests.md

93a33d7

adlerjohn reviewed Feb 24, 2022

View reviewed changes

Clean up spec

63a331f

Add last test to TOC

fa587c8

Voxelot previously approved these changes Feb 24, 2022

View reviewed changes

Update sparse_merkle_tree_tests.md

13c71db

bvrooman dismissed Voxelot’s stale review via 13c71db February 26, 2022 01:51

Typos

1c65093

bvrooman added 4 commits March 2, 2022 13:59

Convert key inputs to 32 byte keys

d9fe653

Remove deprecated description

43f2d6b

Specify big endian

f4f8700

Missed characters

7dee3fe

Voxelot approved these changes Mar 3, 2022

View reviewed changes

adlerjohn approved these changes Mar 3, 2022

View reviewed changes

bvrooman merged commit 0a7d91c into master Mar 3, 2022

bvrooman deleted the bvrooman-docs-sparse-merkle-tree-test-spec branch March 3, 2022 20:00

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Docs: Create Sparse Merkle Tree test spec #66

Docs: Create Sparse Merkle Tree test spec #66

bvrooman commented Feb 14, 2022 •

edited

Loading

Voxelot commented Feb 16, 2022

bvrooman commented Feb 17, 2022 •

edited

Loading

Voxelot commented Feb 22, 2022 •

edited

Loading

bvrooman commented Feb 22, 2022 •

edited

Loading

adlerjohn left a comment

bvrooman commented Feb 24, 2022 •

edited

Loading

bvrooman commented Feb 24, 2022 •

edited

Loading

Voxelot commented Feb 24, 2022 •

edited

Loading

bvrooman commented Feb 25, 2022

Voxelot commented Feb 25, 2022 •

edited

Loading

adlerjohn commented Feb 27, 2022

adlerjohn commented Feb 27, 2022

bvrooman commented Feb 28, 2022 •

edited

Loading

bvrooman commented Feb 28, 2022 •

edited

Loading

adlerjohn commented Feb 28, 2022

Docs: Create Sparse Merkle Tree test spec #66

Docs: Create Sparse Merkle Tree test spec #66

Conversation

bvrooman commented Feb 14, 2022 • edited Loading

Voxelot commented Feb 16, 2022

bvrooman commented Feb 17, 2022 • edited Loading

Voxelot commented Feb 22, 2022 • edited Loading

bvrooman commented Feb 22, 2022 • edited Loading

adlerjohn left a comment

Choose a reason for hiding this comment

bvrooman commented Feb 24, 2022 • edited Loading

bvrooman commented Feb 24, 2022 • edited Loading

Voxelot commented Feb 24, 2022 • edited Loading

bvrooman commented Feb 25, 2022

Voxelot commented Feb 25, 2022 • edited Loading

adlerjohn commented Feb 27, 2022

adlerjohn commented Feb 27, 2022

bvrooman commented Feb 28, 2022 • edited Loading

bvrooman commented Feb 28, 2022 • edited Loading

adlerjohn commented Feb 28, 2022

bvrooman commented Feb 14, 2022 •

edited

Loading

bvrooman commented Feb 17, 2022 •

edited

Loading

Voxelot commented Feb 22, 2022 •

edited

Loading

bvrooman commented Feb 22, 2022 •

edited

Loading

bvrooman commented Feb 24, 2022 •

edited

Loading

bvrooman commented Feb 24, 2022 •

edited

Loading

Voxelot commented Feb 24, 2022 •

edited

Loading

Voxelot commented Feb 25, 2022 •

edited

Loading

bvrooman commented Feb 28, 2022 •

edited

Loading

bvrooman commented Feb 28, 2022 •

edited

Loading