pindexer: add dex block summary #5063

Merged
merged 17 commits into from
Feb 13, 2025
@erwanor erwanor commented Feb 6, 2025

Describe your changes

WIP pairing

Issue ticket number and link

penumbra-zone/dex-explorer#338

Checklist before requesting a review

  • I have added guiding text to explain how a reviewer should test these changes.

  • If this code contains consensus-breaking changes, I have added the "consensus-breaking" label. Otherwise, I declare my belief that there are not consensus-breaking changes, for the following reason:

    REPLACE THIS TEXT WITH RATIONALE (CAN BE BRIEF)

@erwanor erwanor removed their assignment Feb 10, 2025
@conorsch
Contributor

@JasonMHasperhoven and I paired a bit to debug the failing smoke test. To investigate, I pulled down this branch, ran `just dev` to spin up the fullnode indexing config, and quickly identified the specific error that pindexer was showing:

pindexer-debug-2

That error, in text, is:

Error: error returned from database: column "batch_swaps" is of type batch_swap_summary[] but expression is of type jsonb
Caused by:
    column "batch_swaps" is of type batch_swap_summary[] but expression is of type jsonb

Clear enough: we've got to massage the types a bit. Typically we use JSON blobs to store Penumbra types in postgres via pindexer, but as the spec in penumbra-zone/dex-explorer#338 indicates, we don't want to do that here. Instead, we want a full custom type inside the db. I took a stab at implementing sqlx::Encode for the BatchSwapSummary type, but wasn't able to finish it. The build-time error I'm getting now is:

error[E0277]: the trait bound `for<'a> &'a [BatchSwapSummary]: sqlx::Encode<'_, Postgres>` is not satisfied
    --> crates/bin/pindexer/src/dex_ex/mod.rs:1270:16
     |
1270 |         .bind(&batch_swap_summaries)
     |          ----  ^^^^^^^^^^^^^^^^^^^^ the trait `for<'a> sqlx::Encode<'_, Postgres>` is not implemented for `&'a [BatchSwapSummary]`, which is required by `&Vec<BatchSwapSummary>: sqlx::Encode<'_, Postgres>`

Subsequent work should pick up from there. Take the Encode impl with a grain of salt, @JasonMHasperhoven, because this code isn't actually working yet! But hopefully this is one step closer.
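For anyone picking this up: one plausible reading of the E0277 error above is that encoding a slice needs more than an `Encode` impl on the element type. In sqlx's Postgres driver, the element type also has to declare which array type it maps to (sqlx has a `PgHasArrayType` trait for this). The sketch below models that shape with local stand-in traits only. `Encode`, `HasArrayType`, the `height` field, and the output format are all hypothetical and are not sqlx's real definitions or the Postgres wire format.

```rust
/// Stand-in for an encoding trait (NOT sqlx's real `Encode`).
trait Encode {
    fn encode(&self) -> String;
}

/// Stand-in for "this element type knows the name of its array type".
trait HasArrayType {
    fn array_type_name() -> &'static str;
}

/// Hypothetical, heavily simplified summary type.
struct BatchSwapSummary {
    height: u64,
}

impl Encode for BatchSwapSummary {
    fn encode(&self) -> String {
        format!("({})", self.height)
    }
}

/// Slices are encodable only when the element type is both
/// `Encode` and `HasArrayType`.
impl<T: Encode + HasArrayType> Encode for [T] {
    fn encode(&self) -> String {
        let items: Vec<String> = self.iter().map(|t| t.encode()).collect();
        format!("{}{{{}}}", T::array_type_name(), items.join(","))
    }
}

/// Removing this impl makes the `encode` call on the slice fail to compile
/// with an unsatisfied trait bound, much like the error in this thread.
impl HasArrayType for BatchSwapSummary {
    fn array_type_name() -> &'static str {
        // Postgres names array types with a leading underscore.
        "_batch_swap_summary"
    }
}

fn main() {
    let summaries = vec![BatchSwapSummary { height: 1 }, BatchSwapSummary { height: 2 }];
    println!("{}", summaries.as_slice().encode());
}
```

If this reading is right, the missing piece in the real code would be the array-type declaration for `BatchSwapSummary`, not the element encoder itself. That is an assumption and has not been checked against this branch.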

@erwanor erwanor assigned erwanor and unassigned JasonMHasperhoven Feb 12, 2025
-- Primary key
rowid SERIAL PRIMARY KEY,
-- The height of the block.
height INTEGER NOT NULL,
Member Author

@cronokirby noticed that some inserts fail if we add a UNIQUE constraint on height, but my understanding is that this part of the pipeline should have exactly-once delivery semantics. Is that right?

I wonder whether this is caused by the catch-up logic: https://github.com/penumbra-zone/penumbra/blob/erwan/pindexer_block_summary/crates/util/cometindex/src/indexer/indexing_state.rs#L271 (`<=` vs. `<`). Not sure.

Contributor

At first glance this line seems wrong, yeah; I think we should optimistically merge this, then look at re-adding the unique constraint and making sure the indexing logic doesn't repeat events.
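To make the suspected off-by-one concrete, here is a minimal in-memory sketch of how an inclusive resume bound replays the last indexed block and trips a UNIQUE constraint on height. It is not the cometindex code: `BlockSummaryTable`, `index_range`, and `catch_up` are hypothetical names, and the "table" is just a set of heights.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for a table with a UNIQUE constraint on `height`.
#[derive(Default)]
struct BlockSummaryTable {
    heights: BTreeSet<u64>,
}

impl BlockSummaryTable {
    /// Insert a row; a repeated height fails like a unique-constraint violation.
    fn insert(&mut self, height: u64) -> Result<(), String> {
        if self.heights.insert(height) {
            Ok(())
        } else {
            Err(format!("duplicate key value for height {height}"))
        }
    }
}

/// Index every block in `first..=last`, stopping at the first failed insert.
fn index_range(table: &mut BlockSummaryTable, first: u64, last: u64) -> Result<(), String> {
    for height in first..=last {
        table.insert(height)?;
    }
    Ok(())
}

/// Resume after a restart. `watermark` is the last height already indexed.
/// With `inclusive = true` the resume point is `watermark` itself (the
/// suspected bug); otherwise it is `watermark + 1`.
fn catch_up(
    table: &mut BlockSummaryTable,
    watermark: u64,
    tip: u64,
    inclusive: bool,
) -> Result<(), String> {
    let start = if inclusive { watermark } else { watermark + 1 };
    index_range(table, start, tip)
}

fn main() {
    let mut table = BlockSummaryTable::default();
    index_range(&mut table, 1, 5).unwrap();
    // Inclusive bound replays height 5 and fails.
    println!("{:?}", catch_up(&mut table, 5, 8, true));
    // Exclusive bound resumes at height 6 and succeeds.
    println!("{:?}", catch_up(&mut table, 5, 8, false));
}
```

Under this model, a replay only surfaces once the UNIQUE constraint exists, which would match the observation that inserts started failing when the constraint was added.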

Contributor

DevOps hiring for RC

https://web3.career/staff-sre-engineer-platform-phantom/95791

Questions:

  1. Given a greenfield project that needs hosting, how do you approach the problem? What do you focus on first? What information do you require? (Look for keywords like monitoring, network surface, observability, SLAs; explore from there.)

  2. Sometimes on small teams there can be disagreements about architecture. Can you tell me about a time when you disagreed with a technical decision made by a colleague, and how you approached that situation?

  3. Describe your preferred working style. Do you prefer to communicate via group chat in Slack or similar, via DMs, on GitHub issues, or on video calls?

  4. What types of work, whether backend systems administration or otherwise, do you find draining? What about rewarding?

  5. If we were to move forward with this role, how could Radiant Commons support you? What does a supportive work environment look like to you?

  6. How do you prefer to run software if left to your own devices? Do you like container orchestration?

  7. How do you decide whether a service should be self-run, internally to the company, versus paid for as a SaaS?

  8. Can you share any "war stories" about running production infrastructure or incident response?

  9. When it comes to writing code vs. automating infrastructure, which do you prefer? How are those responsibilities different, and how are they related?

After deploying pindexer at v1.0.2, the mainnet events pipeline stalled out:

2025-02-17T16:59:01.880880Z DEBUG sqlx::query: summary="SELECT total, staked, price …" db.statement="\n\nSELECT\n  total,\n  staked,\n  price\nFROM\n
 insights_supply\nORDER BY\n  HEIGHT DESC\nLIMIT\n  1\n" rows_affected=0 rows_returned=0 elapsed=246.583µs elapsed_secs=0.000246583
Error: error returned from database: duplicate key value violates unique constraint "block_details_pkey"

Caused by:
    duplicate key value violates unique constraint "block_details_pkey"

After switching to a build based on #5095, that error becomes:

2025-02-17T17:11:08.156215Z DEBUG sqlx::query: summary="INSERT INTO insights_supply(height, total, …" db.statement="\n\nINSERT INTO\n  insights_supply
(\n    height,\n    total,\n    staked,\n    price,\n    price_numeraire_asset_id\n  )\nVALUES\n  ($1, $2, $3, $5, $4) ON CONFLICT (height) DO\nUPDATE
\nSET\n  total = excluded.total,\n  staked = excluded.staked,\n  price = excluded.price,\n  price_numeraire_asset_id = excluded.price_numeraire_asset_
id\n" rows_affected=0 rows_returned=0 elapsed=545.631µs elapsed_secs=0.000545631
Error: error returned from database: type "batch_swap_summary" does not exist

Caused by:
    type "batch_swap_summary" does not exist

which looks like a genuine problem. I'm surprised that the CREATE TYPE batch_swap_summary AS declaration in the modified schema isn't sufficient.
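As a side note on the first error: the second log shows the statement now using `ON CONFLICT (height) DO UPDATE`. The sketch below is an in-memory illustration of why that form tolerates a replayed block where a plain insert fails with a duplicate key. `SupplyTable` and `SupplyRow` are hypothetical stand-ins with made-up fields, and no database is involved.

```rust
use std::collections::HashMap;

/// Hypothetical row, loosely modelled on the columns in the logged query.
#[derive(Debug, Clone, PartialEq)]
struct SupplyRow {
    total: u64,
    staked: u64,
}

/// Hypothetical table keyed by block height.
#[derive(Default)]
struct SupplyTable {
    rows: HashMap<u64, SupplyRow>,
}

impl SupplyTable {
    /// Plain INSERT: a second row for the same height is rejected.
    fn insert(&mut self, height: u64, row: SupplyRow) -> Result<(), String> {
        if self.rows.contains_key(&height) {
            return Err(format!("duplicate key value for height {height}"));
        }
        self.rows.insert(height, row);
        Ok(())
    }

    /// INSERT ... ON CONFLICT (height) DO UPDATE: the new row replaces the old.
    fn upsert(&mut self, height: u64, row: SupplyRow) {
        self.rows.insert(height, row);
    }
}

fn main() {
    let mut table = SupplyTable::default();
    let row = SupplyRow { total: 100, staked: 40 };

    table.insert(7, row.clone()).unwrap();
    // Replaying the same block with a plain insert fails.
    println!("{:?}", table.insert(7, row.clone()));
    // Replaying it with an upsert is harmless.
    table.upsert(7, row);
    println!("rows: {}", table.rows.len());
}
```

An upsert makes replays idempotent but also hides them, so it complements rather than replaces a fix to whatever causes the repeated delivery.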

@erwanor erwanor marked this pull request as ready for review February 12, 2025 23:03
@erwanor
Member Author

erwanor commented Feb 12, 2025

AFAICT this is good to go; the integration test seems to time out because of the commitment source addition.

@JasonMHasperhoven
Contributor

How do we fix the integration test here?

@erwanor erwanor merged commit 41c9a80 into main Feb 13, 2025
9 of 10 checks passed
@erwanor erwanor deleted the erwan/pindexer_block_summary branch February 13, 2025 19:46
@conorsch
Contributor

> How do we fix the integration test here?

The integration test communicates with the live endpoints for the testnet (#4983). Given the ongoing work towards #5010, the endpoint is no longer compatible with main, hence the failing test. We should probably disable it on the main branch for now. I'll follow up in a separate PR.

conorsch added a commit that referenced this pull request Feb 14, 2025
The testnet integration tests are failing PRs targeting main, because the checks
assume the code being tested is compatible with the live testnet
endpoints. That's not currently the case, due to ongoing LQT development #5010.

Refs #5063 (comment)
conorsch added a commit that referenced this pull request Feb 17, 2025
## Describe your changes

The testnet integration tests are failing PRs targeting main, because
the checks assume the code being tested is compatible with the live
testnet endpoints. That's not currently the case, due to ongoing LQT
development #5010.

Refs
#5063 (comment)

## Issue ticket number and link

## Checklist before requesting a review

- [x] I have added guiding text to explain how a reviewer should test
these changes.

- [x] If this code contains consensus-breaking changes, I have added the
"consensus-breaking" label. Otherwise, I declare my belief that there
are not consensus-breaking changes, for the following reason:

  > ci-only, no changes to app code.
conorsch added a commit that referenced this pull request Feb 19, 2025
## Describe your changes

Moving the smoke test invocations out of the process-compose wrapper,
preferring running raw `cargo test` invocations instead. Doing so
ensures that the test output is readable, both locally, and crucially in
CI.

### Screenshot before

![smoke-test-before](https://github.com/user-attachments/assets/219b8518-8525-463e-8465-2c16f787d3f6)

### Screenshot after

![smoke-logs-after](https://github.com/user-attachments/assets/b7721d89-c969-4870-8d33-b9aa8dbc6ac9)


## Issue ticket number and link

No guiding issue, just trying to improve the testing setup
opportunistically. We did have inscrutable smoke test failures on #5063
& #5081, both of which were failures of the new pindexer integration
tests (#5057), but it was hard to see the specific failure at a glance,
which slowed down development.

## Testing and review

Check the CI logs for the smoke test job. Are they readable? Consider
running `just smoke` locally and confirm the same.

## Checklist before requesting a review

- [x] I have added guiding text to explain how a reviewer should test
these changes.

- [x] If this code contains consensus-breaking changes, I have added the
"consensus-breaking" label. Otherwise, I declare my belief that there
are not consensus-breaking changes, for the following reason:

  > tests-only, no code changes