op-node derivation: add metrics for batch queue lengths #271

Open
ezdac opened this issue Nov 19, 2024 · 2 comments · May be fixed by #337
Labels
type:enhancement New feature or request

Comments


ezdac commented Nov 19, 2024

Problem definition

During our latest L2 safe head stall incident in Alfajores, we spent a long time figuring out what was going on with
the L1 derivation pipeline.
We did not catch the EOF error in the op-batcher, and the sequencer's op-node logs did not actively indicate
that the L2 batch queue was stuck.
This is because only the trace log level would have given some indication of the problem:

if batch.Timestamp > nextTimestamp {
    log.Trace("received out-of-order batch for future processing after next batch", "next_timestamp", nextTimestamp)
    return BatchFuture
}

At the info log level, the main indication of a batch queue stall is the absence of "Found next batch" logs:

if nextBatch != nil {
    nextBatch.Batch.LogContext(bq.log).Info("Found next batch")
    return nextBatch.Batch, nil
}

Proposed solution

Add metrics for the different batch queues within the derivation pipeline, so that
we can observe the growth of the "remaining" queue versus the BatchQueue.batches queue.

case BatchFuture:
    remaining = append(remaining, batch)
    continue

With such metrics, we would have seen linear growth of the "remaining" batches and a constant length of BatchQueue.batches,
while the opposite is expected during normal operation.

This could potentially be done in upstream-optimism, since this is not a Celo specific problem.
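
As a rough illustration of the proposal, here is a minimal, standard-library-only sketch of recording both queue lengths each time the batches are classified. Everything in it is an assumption made for illustration: `QueueMetrics`, `RecordBatchQueueLengths`, `gaugeRecorder`, `checkBatch` and `splitBatches` are hypothetical names, and `Batch` is a stripped-down stand-in, not the op-node type. A real change would more likely report these values through the node's existing metrics setup.

```go
package main

import "fmt"

// BatchValidity is a simplified stand-in for the validity result used in the
// snippets above; only the two cases needed for this sketch are modelled.
type BatchValidity int

const (
	BatchAccept BatchValidity = iota
	BatchFuture
)

// Batch is a hypothetical, stripped-down batch carrying only a timestamp.
type Batch struct {
	Timestamp uint64
}

// QueueMetrics is a hypothetical metrics sink for the two queue lengths.
type QueueMetrics interface {
	RecordBatchQueueLengths(queued, remaining int)
}

// gaugeRecorder keeps the most recently recorded values, like a pair of gauges.
type gaugeRecorder struct {
	Queued    int
	Remaining int
}

func (g *gaugeRecorder) RecordBatchQueueLengths(queued, remaining int) {
	g.Queued, g.Remaining = queued, remaining
}

// checkBatch reproduces only the "future batch" check quoted in the issue.
func checkBatch(b Batch, nextTimestamp uint64) BatchValidity {
	if b.Timestamp > nextTimestamp {
		return BatchFuture
	}
	return BatchAccept
}

// splitBatches partitions the queued batches into ready and remaining (future)
// ones, then reports both lengths so a stall shows up as a growing gauge.
func splitBatches(batches []Batch, nextTimestamp uint64, m QueueMetrics) (ready, remaining []Batch) {
	for _, b := range batches {
		switch checkBatch(b, nextTimestamp) {
		case BatchFuture:
			remaining = append(remaining, b)
			continue
		default:
			ready = append(ready, b)
		}
	}
	m.RecordBatchQueueLengths(len(batches), len(remaining))
	return ready, remaining
}

func main() {
	m := &gaugeRecorder{}
	// Simulated stall: the batch for timestamp 10 never arrives, so every
	// queued batch is classified as a future batch.
	stalled := []Batch{{Timestamp: 12}, {Timestamp: 14}, {Timestamp: 16}}
	splitBatches(stalled, 10, m)
	fmt.Printf("queued=%d remaining=%d\n", m.Queued, m.Remaining)
}
```

With gauges like these exported, an alert on a steadily increasing "remaining" value would flag the stall without requiring trace-level logs.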

ezdac added the type:enhancement (New feature or request) label Nov 19, 2024
Collaborator

palango commented Jan 22, 2025

Opened a PR for this here: ethereum-optimism#13654

Collaborator

palango commented Feb 5, 2025

Upstream didn't want to merge this because it's only relevant for pre-Holocene. So either we apply the above code to our codebase or close this.

palango linked a pull request Feb 27, 2025 that will close this issue
2 participants