
API SSZ response #5128

Closed
wemeetagain opened this issue Feb 10, 2023 · 15 comments · Fixed by #6749
Labels
meta-feature-request Issues to track feature requests.

Comments

@wemeetagain
Member

Is your feature request related to a problem? Please describe.

Describe the solution you'd like

Systematically support returning ssz-encoded responses throughout all/most API endpoints.
Systematically return the Eth-Consensus-Version header

Figure out how to expose encoding choice + raw response in our api client. (Take notes from fetch API?)
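
For illustration, content negotiation on an endpoint that already supports SSZ (debug getState) could look like the following; the port is Lodestar's default REST port and the exact headers are an assumption based on the beacon-API spec:

# Hypothetical example: request the head state as SSZ and dump response headers
curl -s -D - -o state.ssz \
  -H "Accept: application/octet-stream" \
  http://localhost:9596/eth/v2/debug/beacon/states/head
# The response headers should include Content-Type: application/octet-stream
# and, for versioned endpoints, Eth-Consensus-Version: <fork name>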

Additional context

There is a benefit to using SSZ across the board in production: SSZ encoding is more performant than JSON encoding.

We also already have some consumers that are requesting this feature.

@philknows philknows added the meta-feature-request Issues to track feature requests. label Feb 18, 2023
@maschad
Contributor

maschad commented Mar 6, 2023

The getReqSerializers() function checks the request headers for the requested format. Given this function is implemented across a variety of other clients, I am thinking it should be an interface that does this check and then, based on that, generates a JSON client or an SSZ client. WDYT @nazarhussain @wemeetagain @dapplion

@nazarhussain
Contributor

nazarhussain commented Mar 6, 2023

I suggest the API client should not be encoding-specific; rather, each endpoint should be. I can initialize a client and need SSZ for some endpoints and JSON for others.

We have one endpoint already implemented with this approach. Check this reference.

getState(
  stateId: StateId,
  format?: "json"
): Promise<
  ApiClientResponse<{[HttpStatusCode.OK]: {data: allForks.BeaconState; executionOptimistic: ExecutionOptimistic}}>
>;
getState(stateId: StateId, format: "ssz"): Promise<ApiClientResponse<{[HttpStatusCode.OK]: Uint8Array}>>;
getState(
  stateId: StateId,
  format?: StateFormat
): Promise<
  ApiClientResponse<{
    [HttpStatusCode.OK]: Uint8Array | {data: allForks.BeaconState; executionOptimistic: ExecutionOptimistic};
  }>
>;

The second argument of the API client call should be passed as a header on the request and then parsed by the RequestSerializers.
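
A rough sketch of that mapping on the client side, assuming the standard beacon-API media types (illustrative only, not the actual serializer code):

// Illustrative only: translate the format argument into the Accept header
const accept = format === "ssz" ? "application/octet-stream" : "application/json";
// ...and attach {Accept: accept} to the outgoing request headers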

@dapplion
Contributor

dapplion commented Mar 7, 2023

@nazarhussain I imagine API ergonomics like this would be nice:

api.beacon.getGenesis() // queries with mime-type json
api.beacon.ssz.getGenesis() // queries with mime-type ssz

WDYT

@nazarhussain
Contributor

My preference would be:

api.beacon.getGenesis() // queries with mime-type json
api.beacon.getGenesis('ssz') // queries with mime-type ssz

For the following reasons:

  1. Namespaces in the client are derived from the eth API, so we should not use them for other purposes.
  2. Passing a parameter is a consistent convention to use.
  3. We can easily extend it to more mime types in the future.

@dapplion
Contributor

dapplion commented Mar 7, 2023

@nazarhussain The API has a lot of surface, so supporting that generically should not involve a huge change that makes maintenance a pain. Could you draft some ideas on how to implement that last argument generically?

A current feature of the library's types is that the API has the same interface whether consumed programmatically or via REST. Adding this last parameter would have to break this symmetry; otherwise api/impl/* would be a complete nightmare.

What about name-spacing at the top level?

api.beacon.getGenesis() // queries with mime-type json
api.ssz.beacon.getGenesis() // queries with mime-type ssz

@maschad
Contributor

maschad commented Mar 7, 2023

@dapplion @nazarhussain I think we should focus on the server for now as this issue was originally raised with the consumers of these endpoints in mind.

@nazarhussain
Contributor

Could you draft some ideas on how to implement that last argument generically?

There could be two options:

  1. We can do it explicitly, the same way as the debug.getState endpoint.
  2. Add an additional parameter implicitly with a custom type before setting the export type Api in the routes.

Another approach, which looks more appropriate than namespacing, is function chaining.

api.beacon.getGenesis().json() // queries with mime-type json
api.beacon.getGenesis().ssz() // queries with mime-type ssz

Here the actual request is triggered by calling a specific format function after the request function. This approach will be more explicit for the consumer (so less confusion) and more extensible for us in the future.
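
A minimal sketch of how that chaining could be typed; the names here are hypothetical, not the existing API:

// Hypothetical request builder returned by each route method
interface PendingRequest<T> {
  // execute the request with Accept: application/json and parse the body
  json(): Promise<T>;
  // execute the request with Accept: application/octet-stream and return raw bytes
  ssz(): Promise<Uint8Array>;
}
// e.g. api.beacon.getGenesis() would return PendingRequest<{data: phase0.Genesis}>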

@dapplion
Contributor

dapplion commented Mar 8, 2023

@nazarhussain please check the usage pattern here

let api: Api;
if (typeof opts.api === "string" || Array.isArray(opts.api)) {
  // This new api instance can make do with default timeout as a faster timeout is
  // not necessary since this instance won't be used for validator duties
  api = getClient(
    {
      urls: typeof opts.api === "string" ? [opts.api] : opts.api,
      // Validator would need the beacon to respond within the slot
      timeoutMs: config.SECONDS_PER_SLOT * 1000,
      getAbortSignal: () => this.controller.signal,
    },
    {config, logger, metrics: metrics?.restApiClient}
  );
} else {
  api = opts.api;
}
on how the validator can consume the api either directly in-memory or via HTTP. In the in-memory case, what would calling that suffixed .ssz() do?

Also, this tooling must serve us as well as the broader community. Adding more boilerplate like that sounds annoying. I want to support the simplest case of "just give me a block, I don't care". PR #4994 already kind of defeated that by requiring a non-trivial amount of boilerplate to do anything, but we should first agree on the goals and audience of the library.

@nazarhussain
Contributor

on how the validator can consume the api either directly in-memory or via HTTP. In the in-memory case, what would calling that suffixed .ssz() do?

Yes, the user would have to call .ssz() in that case as well.

"just give me a block I don't care"

But as a user you should care about the format of that block. So telling the API client whether you need the block as SSZ or JSON is not trivial boilerplate.

@wemeetagain
Member Author

agree with @maschad that the server is actually the more important feature for now

Regarding the client, I have some opinions about it, having used it fairly extensively in ChainSafe side projects.
I feel that the approach mentioned by @nazarhussain is the right direction for a full-featured client. We can also support a simple "I don't care" client, if necessary. (It's easy to wrap a full-featured client into a simple client. It's not easy to extend a simple client into a full-featured client.)

PR #4994 already kind of defeated that requiring non-trivial amount of boilerplate to do anything

This PR made the error handling that was already needed more explicit. The "non-trivial" amount of boilerplate is either an if(result.ok) { or ApiError.assert(result), imo, neither of which can be considered "non-trivial". We could still improve the situation though by adding something like an assertOk method that could do the assertion as a one-liner and return the success type.

The way I'd like to see the full-featured client operate is taking inspiration from the fetch api, which operates on the model of Request and Response objects.

Something like this:

type RequestOpts = {
  wireFormat?: 'json' | 'ssz';
  timeoutMs?: number;
}

type FailResponse = {
  ok: false;
  code: number;
  message?: string;
}
type SuccessResponse<T> = {
  ok: true;
  type: Type;
  forkVersion?: Version;
  fork: Fork;
  value(): T;
  ssz(): Uint8Array;
}
type Response<T> = (SuccessResponse<T> | FailResponse) & {
  assertOk(): SuccessResponse<T>;
}

const response = await api[namespace][method](...args, opts)

If we want a simplified client, we can wrap responses with something like:

function simplifyResponse<T>(response: Response<T>): {fork: Fork; value: T} {
  const success = response.assertOk();
  return {
    fork: success.fork,
    value: success.value(),
  };
}
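
Usage of the wrapper would then be a one-liner (hypothetical, building on the sketch above):

const {fork, value} = simplifyResponse(await api.beacon.getGenesis());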

@nazarhussain
Contributor

adding something like an assertOk method that could do the assertion as a one-liner and return the success type.

We already have ApiError.assert(res) everywhere, which is just one extra line, so it's not a big deal.

@dapplion
Contributor

dapplion commented Mar 9, 2023

Ok yeah, you are right, it's not that bad. Let's go with the full-featured client.

@jshufro
Contributor

jshufro commented Oct 3, 2023

At @wemeetagain's request here: #6014 (review), I'm going to share a couple of use cases for SSZ handlers that I've run across while building applications downstream of the consensus clients.

It's my preference that every beacon-api endpoint eventually support ssz as a matter of course.

Feel free to ignore this post, unless you really want to know a lot more about how rocket pool and the rocket rescue node work.


The first use case pertains to rocket pool's smoothing pool tree generation. The tree itself is a merkle tree containing performance data pertinent to all rocket pool validators that have opted in to the smoothing pool. A node operator's share of the smoothing pool is proportional to how many successful attestations they made during a rewards interval (28 days). At the end of the interval, the oDAO generates this tree and votes on the root. Once vote consensus is reached, the root is published on-chain and users can submit merkle proofs to get their rewards.

The most compute-intensive portion of this process is scanning 28 days' worth of epochs (6300) and collecting performance data for over 14400 validators. First, we check the epoch for duty assignments for each validator using /eth/v1/beacon/states/finalized/committees?state_id={epoch}. This handler only returns JSON, and the responses are quite large:

$ curl -Ss http://eth2:5052/eth/v1/beacon/states/finalized/committees?state_id=230000 | wc -c
7576856

7 MB of data, 6300 times over, comes to around 48 GB of JSON that we must serialize, pass over the wire, and deserialize. It took long enough that, just for fun, I wrote a reverse proxy to man-in-the-middle requests to /beacon/states/finalized/committees, convert them to protobuf, and cache them on disk. I found quite a large speedup from that alone.
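
If this handler gained SSZ support, the same query could request it via content negotiation; a hypothetical example (only valid once the endpoint supports SSZ responses):

$ curl -Ss -H "Accept: application/octet-stream" "http://eth2:5052/eth/v1/beacon/states/finalized/committees?state_id=230000" | wc -c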


The second use case is probably one you're already aware of from other users: namely, the rocket rescue node wants to build a bidirectional mapping of 0x01 credentials to validator pubkeys. The rescue node, if you're unfamiliar, is free to use but access-limited. We needed a way to reject access for specific users after they reach some quota, so we settled on HMAC credentials which include an expiry, and we keep track in a database of which users have requested credentials and when. We settled on using the address of the node operator's node wallet as a user identifier, because we could easily have users prove custody of the node wallet via a signed message (see the form on https://rescuenode.com ).

Now, we're expanding access to solo validators, but we still need a way for them to prove custody of a given validator, and to rate-limit them. We settled on a signed message from the 0x01 credential, which gives us a small amount of sybil protection... however, we need to know if an address is a 0x01 credential for at least one validator before we issue the HMAC credential, ergo the need for a 0x01->pubkey map. Additionally, when the validator uses the rescue node, we need to do some fee_recipient policing to mitigate the prepare_beacon_proposer attack outlined here. For RP node operators, the valid fee_recipient is well known and derived from on-chain data. For solo validators, we will require them to use their withdrawal address as their fee recipient. As such, when we receive a prepare_beacon_proposer request, we need to fetch the 0x01 credential for a given pubkey, ergo the pubkey->0x01 map.

The result of this is that rescue-proxy would like to poll /eth/v1/beacon/states/finalized/validators to create the index:

$ curl -Ss http://eth2:5052/eth/v1/beacon/states/finalized/validators | wc -c
456968766

Since the request is so slow and large, we reduced the polling frequency to every few hours. The fallout of this is that a user who migrates from 0x00 to 0x01 credentials in order to use the rescue node may not be able to until the polling boundary elapses, which we generally consider acceptable... but, in addition, our rescue proxy's startup speed, and therefore our ability to recover from service degradation, are diminished.

We were able to side-step this through attestantio's beacon client, which has an optimization to pull the full set from the state instead of the beacon/states/finalized/validators endpoint: https://github.com/attestantio/go-eth2-client/blob/master/http/validators.go#L78-L80

Ultimately, this is how we discovered #5966, since attestantio sets weights in the Accept header.
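
For reference, a weighted Accept header of that kind looks roughly like this (weights are illustrative):

Accept: application/octet-stream;q=1.0,application/json;q=0.9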

@dapplion
Contributor

dapplion commented Oct 3, 2023

These are very valid points. I would also like to see broad support for SSZ on most of the API endpoints.

@philknows
Member

Great idea from @nflaig as well: add a flag that disallows JSON once SSZ is implemented on all API responses. Good for any public-facing nodes.
