Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent (off-by-one) handling of cursor parameter for Firehose depending on EventPersistence implementation. #837

Open
mrd0ll4r opened this issue Nov 21, 2024 · 0 comments

Comments

@mrd0ll4r
Copy link
Contributor

mrd0ll4r commented Nov 21, 2024

The Situation

I copied code from Palomar to subscribe to the Firehose. That includes persisting the cursor of the last processed message to a database (and also periodically). Note: This is also used by, e.g. Rainbow when saving the cursor to disk.
I'm receiving a message with sequence number N and persist N to the database. On next startup, I read N from the database and pass it as cursor=N to com.atproto.sync.subscribeRepos. I noticed that the first message I receive has sequence number N, not N+1, which means I process that message twice.
This was an easy fix on my end (and I have a PR to fix search/firehose.go), but I was curious if there were other components affected, so I dug a bit deeper.

I didn't find good documentation for com.atproto.sync.subscribeRepos, but I did find this: https://github.com/bluesky-social/atproto/blob/main/lexicons/com/atproto/sync/subscribeRepos.json#L13
...which states "The last known event seq number to backfill from.".
My gut feeling tells me if I already know the sequence number, I don't want it included in the response, i.e., I want everything > cursor.

Checking the Implementation

The Firehose runs EventPersistence.Playback(ctx, cursor, ...) with the user-provided cursor: https://github.com/bluesky-social/indigo/blob/main/events/events.go#L360
Checking all implementations of that interface, I get:

EDIT: What confuses me is that I thought the current Firehose used DbPersistence or maybe DiskPersistence, but it doesn't behave that way 🤔 .

Solutions

  1. It would be nice to have documentation for com.atproto.sync.subscribeRepos that states whether it should return everything > since or >= since.
  2. The EventPersistence interface should have documentation to specify how since should be handled.
  3. The implementations should all behave the same way.

I can submit PRs for 2 and 3, probably. Let me know :)

EDIT2: Seeing how there's a test for PebblePersist and DiskPersistence, apparently they behave the same and I'm confused somewhere. I'll keep looking...

EDIT3: Well, after thinking about this for a while, I can't really put my finger on where exactly, but I believe something is not right. The Firehose (and the EventPersistence interface, based on the existing tests) return events with evt.SeqNo >= since, i.e., including since. A bunch of the consumers, including search/firehose.go and splitter use the last processed sequence number, or the highest sequence number in PebbleDB for since. I think they process one event twice.
If you let me know how it should behave, I can send PRs to fix it.

EDIT4: Just to add, the streaming endpoint of Labelers return events with seqNo > since.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
1 participant