Implement design to reduce amount of CIDs needed to reprovide #45

Open
estebanabaroa opened this issue Aug 31, 2024 · 1 comment

estebanabaroa commented Aug 31, 2024

At the moment, each nested reply and each subplebbit page is a separate IPFS CID that needs to be reprovided to the DHT every 24 hours.

To reduce the number of CIDs to reprovide, we can make pages that few people consume (e.g. page 5+) very large: they could be 10-100 MB each and contain thousands of posts. They will be slower to load, but since few people need them, it's not a big deal. They can also be updated less often, e.g. once a month for posts that are several years old.
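
A rough sketch of what "larger and larger pages" could mean in numbers. The constants and function names below are made up for illustration, they are not part of the proposal; the point is only to compare how many page CIDs a growing page size needs versus a fixed one.

```typescript
// Hypothetical sizing rule; all three constants are illustrative assumptions.
const BASE_PAGE_SIZE = 50; // posts in the first page
const GROWTH_FACTOR = 4; // each page holds 4x more posts than the previous one
const MAX_PAGE_SIZE = 50_000; // arbitrary cap on posts per page

// Number of posts stored in the page at `pageIndex` (0-based).
function pageSizeForIndex(pageIndex: number): number {
  return Math.min(BASE_PAGE_SIZE * GROWTH_FACTOR ** pageIndex, MAX_PAGE_SIZE);
}

// Number of page CIDs needed to hold `totalPosts` under a given sizing rule.
function countPages(
  totalPosts: number,
  sizeFor: (pageIndex: number) => number = pageSizeForIndex
): number {
  let remaining = totalPosts;
  let pages = 0;
  while (remaining > 0) {
    remaining -= sizeFor(pages);
    pages += 1;
  }
  return pages;
}

// Compare the growing rule against fixed 50-post pages for 100,000 posts.
const growingPages = countPages(100_000);
const fixedPages = countPages(100_000, () => 50);
console.log({ growingPages, fixedPages });
```

With these made-up numbers, the growing rule needs a handful of CIDs where fixed-size pages need thousands, and the first pages stay small and fast to load.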

We can also avoid making each nested reply a CID to reprovide by never fetching replies directly by their CID; we would only fetch replies through post.replies pages.

Also, most posts on reddit never have more than a dozen replies, so it's usually not necessary to create replies page CIDs: all the replies can simply be stored in post.replies.pages.top. Other sorts can be computed by the client, since all replies are known.
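
A minimal sketch of client-side sorting, assuming every reply is already present in post.replies.pages.top. The `Reply` shape, the sort names and `sortReplies` are assumptions made for this example, not the actual plebbit-js types.

```typescript
// Hypothetical reply shape; field names are assumptions for this sketch.
interface Reply {
  cid: string;
  timestamp: number;
  upvoteCount: number;
  downvoteCount: number;
}

type SortName = "top" | "new" | "old";

// One comparator per sort the client wants to offer.
const comparators: Record<SortName, (a: Reply, b: Reply) => number> = {
  top: (a, b) =>
    b.upvoteCount - b.downvoteCount - (a.upvoteCount - a.downvoteCount),
  new: (a, b) => b.timestamp - a.timestamp,
  old: (a, b) => a.timestamp - b.timestamp,
};

// Returns a sorted copy so the page loaded from IPFS is left untouched.
function sortReplies(replies: readonly Reply[], sort: SortName): Reply[] {
  return replies.slice().sort(comparators[sort]);
}

const loaded: Reply[] = [
  { cid: "a", timestamp: 1, upvoteCount: 5, downvoteCount: 1 },
  { cid: "b", timestamp: 3, upvoteCount: 10, downvoteCount: 0 },
  { cid: "c", timestamp: 2, upvoteCount: 2, downvoteCount: 2 },
];
console.log(sortReplies(loaded, "new").map((r) => r.cid));
```

Since the sort happens locally, no extra page CID has to exist (or be reprovided) per sort type for small posts.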

It seems to work for all scenarios:

1. Fetch a seedit post

  • load post cid
  • load post update
  • browse first page of post.replies.pages.top
  • load and browse second page of post.replies.pages.top.nextCid
  • etc. Each page should be larger than the previous one, so as not to create too many CIDs. In the rare case someone needs to scroll many pages, they can wait for large IPFS files to load
  • (optional) to see a deeply nested reply, load post.replies.pages.top.comments[x].replies.pageCids.top. pageCids on a nested reply should only exist if there are a lot of nested replies and they can't all fit in the original page; most of the time, pageCids should not exist on nested replies
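
The page walk in scenario 1 could look roughly like this. It is only a sketch: `RepliesPage`, `FetchPage` and `loadReplyPages` are hypothetical names, and the loader is injected so the example does not assume any particular IPFS client API.

```typescript
// Hypothetical shapes: only `comments` and `nextCid` come from the proposal.
interface RepliesPage {
  comments: { cid: string }[];
  nextCid?: string;
}

// Injected loader (assumption), e.g. a wrapper around whatever fetches a CID.
type FetchPage = (cid: string) => Promise<RepliesPage>;

// Start from the page already embedded in the post update
// (post.replies.pages.top) and follow nextCid until the chain ends
// or `maxPages` pages have been collected.
async function loadReplyPages(
  firstPage: RepliesPage,
  fetchPage: FetchPage,
  maxPages: number
): Promise<RepliesPage[]> {
  const pages: RepliesPage[] = [firstPage];
  let nextCid = firstPage.nextCid;
  while (nextCid !== undefined && pages.length < maxPages) {
    const page: RepliesPage = await fetchPage(nextCid);
    pages.push(page);
    nextCid = page.nextCid;
  }
  return pages;
}
```

The first page costs no extra fetch because it ships inside the post update; only users who keep scrolling pay for the larger follow-up pages.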

2. Fetch a plebchan post

  • load post cid
  • load post update
  • load and browse first page of post.replies.pageCids.oldFlat (plebchan needs a new type of replies page that isn't nested, i.e. flat)
  • load and browse second page of post.replies.pageCids.oldFlat

3. Fetch a group chat like UI post (newest first, scrolling up)

  • load post cid
  • load post update
  • load and browse first page of post.replies.pageCids.newFlat
  • load and browse second page of post.replies.pageCids.newFlat
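
To illustrate what a flat replies page contains compared to a nested one, here is a sketch that derives the oldFlat and newFlat orderings from a nested reply tree. The types and function names are assumptions for this example; how the subplebbit would actually build and paginate these pages is not specified in this issue.

```typescript
// Hypothetical shapes for this sketch.
interface NestedReply {
  cid: string;
  timestamp: number;
  replies?: NestedReply[];
}

interface FlatReply {
  cid: string;
  timestamp: number;
  parentCid: string | undefined; // undefined for direct replies to the post
}

// Depth-first walk that drops the nesting but remembers each reply's parent.
function flattenReplies(replies: NestedReply[], parentCid?: string): FlatReply[] {
  const out: FlatReply[] = [];
  for (const reply of replies) {
    out.push({ cid: reply.cid, timestamp: reply.timestamp, parentCid });
    out.push(...flattenReplies(reply.replies ?? [], reply.cid));
  }
  return out;
}

// oldFlat: oldest first (imageboard style). newFlat: newest first (chat style).
function buildFlat(replies: NestedReply[], sort: "oldFlat" | "newFlat"): FlatReply[] {
  const flat = flattenReplies(replies);
  return flat.sort((a, b) =>
    sort === "oldFlat" ? a.timestamp - b.timestamp : b.timestamp - a.timestamp
  );
}

const tree: NestedReply[] = [
  {
    cid: "r1",
    timestamp: 10,
    replies: [{ cid: "r2", timestamp: 30, replies: [{ cid: "r4", timestamp: 20 }] }],
  },
  { cid: "r3", timestamp: 15 },
];
console.log(buildFlat(tree, "oldFlat").map((r) => r.cid));
```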

4. Fetch update (votes/replies/etc) of your own nested reply

  • load reply.postCid
  • load post update
  • based on reply.timestamp, scroll post.replies.pageCids.newFlat or post.replies.pageCids.oldFlat until you find your own reply.
  • note: this algorithm is slow, but looking for updates to your own replies is done in the background, so the user doesn't wait for it. Using the instant replies design, the user can also join a pubsub topic to get updates in real time
  • note: another possible algorithm for very large posts with thousands of replies would be for the reply to include reply.parentCids: string[] and reply.parentTimestamps: number[], which could be used to search for the reply in nested pages
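
A sketch of the background lookup in scenario 4, under stated assumptions: the flat page shape, `pickFlatSort` and `findOwnReply` are hypothetical names, the "closer end" heuristic is just one possible way to use reply.timestamp, and the loader is injected rather than taken from a real client.

```typescript
// Hypothetical shapes for this sketch.
interface FlatReply {
  cid: string;
  timestamp: number;
}

interface FlatPage {
  comments: FlatReply[];
  nextCid?: string;
}

type FetchPage = (cid: string) => Promise<FlatPage>;

// Pick the flat sort whose first page is closer in time to the reply,
// so fewer pages have to be scanned before reaching it.
function pickFlatSort(
  replyTimestamp: number,
  oldestTimestamp: number,
  newestTimestamp: number
): "oldFlat" | "newFlat" {
  const distanceFromOldest = replyTimestamp - oldestTimestamp;
  const distanceFromNewest = newestTimestamp - replyTimestamp;
  return distanceFromOldest <= distanceFromNewest ? "oldFlat" : "newFlat";
}

// Scan flat pages from `firstPageCid` until the reply is found,
// the chain ends, or `maxPages` pages have been loaded.
async function findOwnReply(
  targetCid: string,
  firstPageCid: string,
  fetchPage: FetchPage,
  maxPages = 100
): Promise<FlatReply | undefined> {
  let cid: string | undefined = firstPageCid;
  for (let i = 0; cid !== undefined && i < maxPages; i++) {
    const page: FlatPage = await fetchPage(cid);
    const found = page.comments.find((c) => c.cid === targetCid);
    if (found) return found;
    cid = page.nextCid;
  }
  return undefined;
}
```

Because this runs in the background, the `maxPages` limit mostly protects against downloading a very long chain of large pages for a reply that was removed.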

Rinse12 commented Aug 31, 2024

Shouldn't we figure out how to fix the DHT issues before settling on a design? Maybe we won't even need a DHT. iroh, for example, has a key-value document design that we could use for the subplebbit record. The subplebbit would keep all of its comment/page CIDs in the document, and the sub would update the page CIDs frequently within the document, without having to announce to a DHT.

The user can selectively load keys within a document if they don't want to load the whole subplebbit record with all its pages/comments.
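
To make the idea concrete, here is an in-memory sketch of a subplebbit record stored as key-value entries, with a client loading only the keys under a prefix. This is not the iroh API: the `Map` stand-in, the key layout and the placeholder values are all assumptions for illustration.

```typescript
// In-memory stand-in for a key-value document. NOT the iroh API.
type SubDocument = Map<string, string>;

// Hypothetical key layout; values are placeholders, not real CIDs.
const subDocument: SubDocument = new Map([
  ["pages/hot/0", "cid-hot-0"],
  ["pages/hot/1", "cid-hot-1"],
  ["pages/new/0", "cid-new-0"],
  ["comments/abc/update", "cid-update-abc"],
]);

// Load only the entries whose key starts with `prefix`,
// instead of the whole record.
function loadKeys(doc: SubDocument, prefix: string): Record<string, string> {
  const out: Record<string, string> = {};
  doc.forEach((value, key) => {
    if (key.startsWith(prefix)) out[key] = value;
  });
  return out;
}

console.log(loadKeys(subDocument, "pages/hot/"));
```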

The question is how the user would discover the node ID of the publisher(s). For that, I think we can have a tracker-like system, or the good old BitTorrent DHT.

https://www.iroh.computer/docs/components/docs
