Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

clarify gaps in object/group IDs at relay #587

Closed
wants to merge 14 commits into from
Closed

clarify gaps in object/group IDs at relay #587

wants to merge 14 commits into from

Conversation

fluffy
Copy link
Contributor

@fluffy fluffy commented Oct 11, 2024

No description provided.

@fluffy fluffy requested a review from suhasHere October 11, 2024 20:26
Copy link
Collaborator

@afrind afrind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Individual Comments:

Can you link the issue this PR is addressing (or create one)?

@@ -852,6 +852,32 @@ the active subscribers for that track. Relays MUST forward OBJECT messages to
matching subscribers in accordance to each subscription's priority, group order,
and delivery timeout.

Limited bandwidth upstream of the relay may result in gaps in the
sequences of ObjectID values that the relay receives. For example if
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understood the comments made in Boston, the only way such a gap can happen right now for non-datagrams is due to DELIVERY_TIMEOUT expiration and limited bandwidth by itself is not a cause of gaps. Is that correct?

See also #582

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my understanding, as well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, also the subsequent paragraph is a bit daunting

Comment on lines 874 to 875
gaps and out of order reception. What can be assumed is that the objects
received on a single QUIC stream are in order but may have gaps. These
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is adding non-normative text which is in conflict normative text here:

https://github.com/moq-wg/moq-transport/blob/main/draft-ietf-moq-transport.md?plain=1#L740-L741

What happens to a subgroup with an object in the middle that misses the delivery timeout? This PR says: omit the object and keep going. Previous text says the relay cannot omit the object. I'm not sure how to implement what is suggested in this PR for subgroups:

When I receive the beginning of an object from upstream, it's not past the delivery timeout -- that starts when the object is received -- so I will start forwarding it (eg: write it to a downstream QUIC stream). If the end of the object does not arrive by the delivery timeout, I cannot retract the bytes already written. I think the only recourse is to reset the entire stream (eg: drop the tail of the subgroup). Is there another implementation you have in mind?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC delivery timeout is considered even before inserting object in to the QUIC stream. If the object stays within QUIC stream for ever, it is still considered moving out of moq layer into transport layer.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me restate that to see if I got it:

You envision a system where the incoming object is queued outside of QUIC, and if the delivery timeout fires before it gets written to QUIC, you can skip over it.

Is that correct?

I imagine some relays might operate that way, and others might write incoming objects directly to the downstream stream (necessitating stream reset on timeout). Is there an interoperability problem if clients assume one behavior or the other?

A related question is, when does the relay stop the delivery timeout timer - when the last byte of the object is written to the stream, when it's first transmitted, when it's ACK'd, or something else? Maybe this one needs a separate issue.

Copy link
Collaborator

@suhasHere suhasHere Oct 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine some relays might operate that way, and others might write incoming objects directly to the downstream stream (necessitating stream reset on timeout). Is there an interoperability problem if clients assume one behavior or the other?

This should work. Logic is same in both the case.
Before inserting the object into the downstream QUIC stream, check for the delivery timeout and insert it. Once inserted into QUIC stream, it needs to be treated as handed over from the moq layer to underlying transport.

@fluffy
Copy link
Contributor Author

fluffy commented Oct 16, 2024

Few things to think about in updating from the call.

  1. when talking about gaps, should be clear of timescale and if means not arriving

  2. bandwidth causes delay, cause timout, cuases drop

@fluffy
Copy link
Contributor Author

fluffy commented Oct 23, 2024

@afrind is going to help me on clarifying this.

@martinduke
Copy link
Contributor

Cullen,

I think one of the problems is that you keep using the words "out of order" which does not fully capture what I think we're agreeing on.

Say a subgroup is object IDs 2, 4, 6, 8:

sending 2, 4, 6, 8 = legal
sending 2, 4, 6 = legal
sending 2, 4, 8, 6 = illegal (out of order)
sending 2, 4, 8 = illegal (in order, but with an omitted object in the middle.)

@fluffy
Copy link
Contributor Author

fluffy commented Oct 23, 2024

Thanks Martin - i will try and rephrase. I also think that Alan and I were using gap differently so will get more specific with that too.

@fluffy
Copy link
Contributor Author

fluffy commented Nov 3, 2024

@afrind - Can you read the latest version of this and see if it matches your thinking. I think it should.

@afrind
Copy link
Collaborator

afrind commented Nov 7, 2024

Individual Comment:

Thanks for rewriting. I would prefer to see this expressed normatively and relatively short in terms of what a relay can and cannot write to a stream, and then we can add an examples or informative motivation.

I was trying to write something that fit the bill, but stumbled on this case.

sending 2, 4, 8 = illegal (in order, but with an omitted object in the middle.)

If a relay gets a subgroup with objects 2, 4, 6, 8, I expect it can send one of the following

Nothing
2
2, 4
2, 4, 6
2, 4, 6, 8

However, what is the relay gets

<GOAWAY>
2, 4 <RELIABLE_RESET>

It starts making a new subscription to the GOAWAY location, but doesn't close anything downstream. By the time it resubscribes it gets another stream for this subgroup:

8

Would the relay be allowed to send 2, 4, 8 on one stream downstream? If not, then what is it supposed to do?

  • FETCH to fill in the gap?
  • If it gets non-sequential object (even 6) it has no idea if it missed and should just reset all the downstream subgroup streams
    • If so, can it open a new subgroup stream containing just 8 or stop trying to forward this subgroup?
  • Something else?

@martinduke
Copy link
Contributor

martinduke commented Nov 7, 2024 via email

@fluffy
Copy link
Contributor Author

fluffy commented Nov 8, 2024

So in the case where the relay gets for group=77 subgroup=11

<GOAWAY>
2, 4 <RELIABLE_RESET>

after that it would not allowed for the publisher to resent that subsgroup with just

8

Instead it would need to send all the groups in the substream so it could be in order so it would send

2,4,6,8

@fluffy
Copy link
Contributor Author

fluffy commented Nov 8, 2024

What I want to get into the PR is something that helps people understand how this works and a bit on how the cases come up. Once we get there, we might need to tweek a few places in the draft with actual normative language.

@afrind
Copy link
Collaborator

afrind commented Nov 8, 2024

after that it would not allowed for the publisher to resent that subsgroup with just

8

I agree with that bit 100%. The example I gave was a new subscription that started at latest object. eg, maybe like this:

          /--- relay 1 ---\
Publisher                  relay3 ---> Subscriber 
          \--- relay 2 ---/

relay3 gets GOAWAY/2, 4<RELIABLE_RESET> from relay1 and a new ANNOUNCE from relay2. relay3 issues SUBSCRIBE(latest object) to relay2, and gets a stream with just 8 for group=77 subgroup=11.

Also, we should probably move this discussion to a PR or the mailing list rather than a PR where it's less discoverable.

What I want to get into the PR is something that helps people understand how this works and a bit on how the cases come up. Once we get there, we might need to tweek a few places in the draft with actual normative language.

I'll defer to the editor on the approach.

@fluffy
Copy link
Contributor Author

fluffy commented Nov 12, 2024

I don't know enough about reliable reset. Will web transport support that ?

@afrind
Copy link
Collaborator

afrind commented Nov 12, 2024

Will web transport support that

I think it has too, but we should double check. @vasilvv do you know?

@vasilvv
Copy link
Collaborator

vasilvv commented Nov 12, 2024

We don't expose it on the API level (since no one asked), but we could try adding it since it's mandatory to implement WebTransport itself.

Copy link

@TimEvens TimEvens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious if fetch is supposed to fill in the gaps here. I hope not as we would be reinventing a reliable [overlay] transport. I believe the expected behavior is to expect gaps and be okay with it. No fetch is required in this case to fill the gaps. This results in the relay having a cache with partial data, which is expected... but with fetch it gets interesting in that the fetch cannot be processed if there is any missing data... Might be worth clarifying that a bit more as it pertains to fetch request with gap data in cache on the relays.

@ianswett
Copy link
Collaborator

This is an editorial PR, and I'm not sure what issue we're solving with the text, so I'm going to close for now and we can re-open if needed.

@ianswett ianswett closed this Dec 11, 2024
@fluffy fluffy reopened this Feb 25, 2025
@fluffy fluffy requested review from suhasHere, englishm, wilaw, ianswett and martinduke and removed request for vasilvv and kixelated February 26, 2025 00:57
Copy link
Collaborator

@afrind afrind left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestions for typos

@zafergurel
Copy link
Contributor

zafergurel commented Feb 26, 2025

To detect a gap, the previous group id can be added by the original publisher to each group header. So, incrementing group ids or incrementing by one is not necessary.

For instance we have group ids: 1, 3, 5. Using the previous group ids, the relay can understand that there is not gap when it receives the groups in an arbitrary order such as 5, 3, 1. It can re-order them using the previous group ids as follows:

1 <- 3 <- 5

Even if the relay does not have 2 and 4 in the cache (which actually do not exist), the relay can understand there is no gap in the sequence above by looking at every previous group ids.

Please see #731 .

@ianswett ianswett closed this in #749 Mar 2, 2025
@ianswett ianswett closed this in d51514d Mar 2, 2025
ianswett added a commit that referenced this pull request Mar 2, 2025
The text that came out of the Denver interim plus a short editorial
sentence.

Closes #727
Closes #587
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants