Reuse WebCodec audio/video chunk definitions #101

Draft
wants to merge 2 commits into
base: main
Changes from 1 commit
74 changes: 51 additions & 23 deletions index.bs
@@ -19,9 +19,8 @@ spec:webidl; type:dfn; text:resolve
<pre class=biblio>
{
"WEB-CODECS": {
"href":
"https://github.com/WICG/web-codecs/blob/master/explainer.md",
"title": "Web Codecs explainer"
"href": "https://w3c.github.io/webcodecs/",
"title": "Web Codecs"
}
}
</pre>
@@ -139,8 +138,8 @@ The <dfn>readEncodedData</dfn> algorithm is given a |rtcObject| as parameter. It

The <dfn>writeEncodedData</dfn> algorithm is given a |rtcObject| as parameter and a |frame| as input. It is defined by running the following steps:
1. If |frame|.`[[owner]]` is not equal to |rtcObject|, abort these steps and return [=a promise resolved with=] undefined. A processor cannot create frames, or move frames between streams.
2. If the |frame|'s {{RTCEncodedVideoFrame/timestamp}} is equal to or larger than |rtcObject|.`[[lastReceivedFrameTimestamp]]`, abort these steps and return [=a promise resolved with=] undefined. A processor cannot reorder frames, although it may delay them or drop them.
3. Set |rtcObject|.`[[lastReceivedFrameTimestamp]]` to the |frame|'s {{RTCEncodedVideoFrame/timestamp}}.
2. If the |frame|'s {{EncodedMediaChunk/timestamp}} is equal to or larger than |rtcObject|.`[[lastReceivedFrameTimestamp]]`, abort these steps and return [=a promise resolved with=] undefined. A processor cannot reorder frames, although it may delay them or drop them.
3. Set |rtcObject|.`[[lastReceivedFrameTimestamp]]` to the |frame|'s {{EncodedMediaChunk/timestamp}}.
4. Enqueue the frame for processing as if it came directly from the encoded data source, by running one of the following steps:
* If |rtcObject| is a {{RTCRtpSender}}, enqueue it to |rtcObject|'s packetizer, to be processed [=in parallel=].
* If |rtcObject| is a {{RTCRtpReceiver}}, enqueue it to |rtcObject|'s decoder, to be processed [=in parallel=].
@@ -221,13 +220,13 @@ The SFrame transform algorithm, given |sframe| as a SFrameTransform object and |
3. If |frame|.`[[owner]]` is a {{RTCRtpReceiver}}, set |role| to 'decrypt'.
4. Let |data| be undefined.
5. If |frame| is a {{BufferSource}}, set |data| to |frame|.
6. If |frame| is a {{RTCEncodedAudioFrame}}, set |data| to |frame|.{{RTCEncodedAudioFrame/data}}
7. If |frame| is a {{RTCEncodedVideoFrame}}, set |data| to |frame|.{{RTCEncodedVideoFrame/data}}
6. If |frame| is a {{RTCEncodedAudioFrame}}, set |data| to |frame|.{{EncodedMediaChunk/data}}
7. If |frame| is a {{RTCEncodedVideoFrame}}, set |data| to |frame|.{{EncodedMediaChunk/data}}
8. If |data| is undefined, abort these steps.
9. Let |buffer| be the result of running the SFrame algorithm with |data| and |role| as parameters. This algorithm is defined by the <a href="https://datatracker.ietf.org/doc/draft-omara-sframe/">SFrame specification</a> and returns an {{ArrayBuffer}}.
10. If |frame| is a {{BufferSource}}, set |frame| to |buffer|.
11. If |frame| is a {{RTCEncodedAudioFrame}}, set |frame|.{{RTCEncodedAudioFrame/data}} to |buffer|.
12. If |frame| is a {{RTCEncodedVideoFrame}}, set |frame|.{{RTCEncodedVideoFrame/data}} to |buffer|.
11. If |frame| is a {{RTCEncodedAudioFrame}}, set |frame|.{{EncodedMediaChunk/data}} to |buffer|.
12. If |frame| is a {{RTCEncodedVideoFrame}}, set |frame|.{{EncodedMediaChunk/data}} to |buffer|.
13. [=ReadableStream/Enqueue=] |frame| in |sframe|.`[[transform]]`.
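
Steps 4 to 13 above can be sketched in TypeScript as follows. This is a minimal illustration under stated assumptions, not the spec's implementation: `AudioFrameSketch`, `VideoFrameSketch`, `placeholderSFrame`, `transformFrame` and the `enqueue` callback are hypothetical names, {{BufferSource}} is narrowed to a plain `ArrayBuffer`, the role is passed in rather than derived from `[[owner]]`, and the bit-flipping placeholder merely stands in for the real SFrame algorithm, which is defined only by the SFrame specification.

```typescript
// Hypothetical stand-ins for RTCEncodedAudioFrame / RTCEncodedVideoFrame.
// Only the writable `data` member matters for this sketch.
class AudioFrameSketch {
  constructor(public data: ArrayBuffer) {}
}
class VideoFrameSketch {
  constructor(public data: ArrayBuffer) {}
}

type Role = "encrypt" | "decrypt";
type TransformOutput = ArrayBuffer | AudioFrameSketch | VideoFrameSketch;

// Placeholder for the SFrame algorithm (assumption): returns a new buffer
// with every bit flipped so the sketch runs. It is NOT SFrame and ignores the role.
function placeholderSFrame(data: ArrayBuffer, _role: Role): ArrayBuffer {
  const flipped = new Uint8Array(data).map((b) => b ^ 0xff);
  const out = new ArrayBuffer(flipped.byteLength);
  new Uint8Array(out).set(flipped);
  return out;
}

function transformFrame(
  frame: unknown,
  role: Role,
  enqueue: (f: TransformOutput) => void,
): void {
  // Steps 4-7: pick the bytes to transform, depending on the kind of input.
  let data: ArrayBuffer | undefined = undefined;
  if (frame instanceof ArrayBuffer) data = frame;
  if (frame instanceof AudioFrameSketch) data = frame.data;
  if (frame instanceof VideoFrameSketch) data = frame.data;

  // Step 8: unknown input, nothing to do.
  if (data === undefined) return;

  // Step 9: run the (placeholder) SFrame algorithm.
  const buffer = placeholderSFrame(data, role);

  // Steps 10-13: a raw buffer is replaced by the result; a frame object keeps
  // its identity and only has its `data` swapped. Then enqueue.
  if (frame instanceof ArrayBuffer) {
    enqueue(buffer);
  } else if (frame instanceof AudioFrameSketch || frame instanceof VideoFrameSketch) {
    frame.data = buffer;
    enqueue(frame);
  }
}

// Usage: the video frame object itself is enqueued, carrying the new payload.
const queue: TransformOutput[] = [];
const payload = new ArrayBuffer(2);
new Uint8Array(payload).set([0x00, 0x0f]);
const videoFrame = new VideoFrameSketch(payload);
transformFrame(videoFrame, "encrypt", (f) => { queue.push(f); });
```

Steps 11 and 12 assign to `data`, so the sketch declares that member writable on both frame classes.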

## Methods ## {#sframe-transform-methods}
@@ -244,14 +243,51 @@ The <dfn method for="SFrameTransform">setEncryptionKey(|key|, |keyID|)</dfn> met
# RTCRtpScriptTransform # {#scriptTransform}

<pre class="idl">
// New enum for video frame types. Will eventually re-use the equivalent defined
// by WebCodecs.
enum RTCEncodedVideoFrameType {
"empty",
interface EncodedMediaChunk {
readonly attribute unsigned long long timestamp; // microseconds
readonly attribute ArrayBuffer data;
Collaborator Author

Now that I look at it, `data` is readonly here, but we want to be able to change it in RTCRtpSFrameTransform transforms.

};

// WebCodecs definitions with introduction of EncodedMediaChunk to more easily refer to timestamp and data.
Member

Shouldn't this be brought directly back to WebCodecs?

Collaborator Author
@youennf Apr 15, 2021

I would prefer that we first agree on the strategy, i.e. agree to reuse WebCodecs constructs, and then start discussing with WebCodecs how we can best do things.
With regard to EncodedMediaChunk, I did not think about it too hard; we might want to use a mixin instead of inheritance.

// They should be removed from this spec at some point.
[Exposed=(Window,DedicatedWorker)]
interface EncodedVideoChunk : EncodedMediaChunk {
constructor(EncodedVideoChunkInit init);
readonly attribute EncodedVideoChunkType type;
readonly attribute unsigned long long? duration; // microseconds
};

dictionary EncodedVideoChunkInit {
required EncodedVideoChunkType type;
required unsigned long long timestamp;
unsigned long long duration;
required BufferSource data;
};

enum EncodedVideoChunkType {
"key",
"delta",
};

[Exposed=(Window,DedicatedWorker)]
interface EncodedAudioChunk : EncodedMediaChunk {
constructor(EncodedAudioChunkInit init);
readonly attribute EncodedAudioChunkType type;
};

dictionary EncodedAudioChunkInit {
required EncodedAudioChunkType type;
required unsigned long long timestamp;
required BufferSource data;
};

enum EncodedAudioChunkType {
"key",
"delta",
};
</pre>
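
As a rough illustration of the hierarchy proposed in the IDL above, the base chunk and the video chunk can be mirrored in TypeScript. Everything here is a sketch under assumptions: the `Sketch`-suffixed names are hypothetical (chosen to avoid clashing with real WebCodecs globals), `unsigned long long` is represented as `number`, TypeScript's compile-time `readonly` stands in for the IDL `readonly attribute`, and copying the init bytes into a fresh `ArrayBuffer` is an assumption that the IDL itself does not state.

```typescript
type ChunkTypeSketch = "key" | "delta";

// Mirrors dictionary EncodedVideoChunkInit.
interface VideoChunkInitSketch {
  type: ChunkTypeSketch;
  timestamp: number; // microseconds
  duration?: number; // microseconds; not required in the dictionary
  data: ArrayBuffer | ArrayBufferView; // BufferSource
}

// Mirrors interface EncodedMediaChunk: shared timestamp and data.
class MediaChunkSketch {
  readonly timestamp: number;
  readonly data: ArrayBuffer;

  constructor(timestamp: number, source: ArrayBuffer | ArrayBufferView) {
    this.timestamp = timestamp;
    // Assumption: the chunk owns a copy of the bytes it was constructed with.
    const bytes = ArrayBuffer.isView(source)
      ? new Uint8Array(source.buffer, source.byteOffset, source.byteLength)
      : new Uint8Array(source);
    const copy = new ArrayBuffer(bytes.byteLength);
    new Uint8Array(copy).set(bytes);
    this.data = copy;
  }
}

// Mirrors interface EncodedVideoChunk : EncodedMediaChunk.
class VideoChunkSketch extends MediaChunkSketch {
  readonly type: ChunkTypeSketch;
  readonly duration: number | null; // nullable, as in the IDL

  constructor(init: VideoChunkInitSketch) {
    super(init.timestamp, init.data);
    this.type = init.type;
    this.duration = init.duration ?? null;
  }
}

// Usage: build a key chunk without a duration and read its attributes back.
const chunk = new VideoChunkSketch({
  type: "key",
  timestamp: 0,
  data: new Uint8Array([1, 2, 3]),
});
console.log(chunk.type, chunk.timestamp, chunk.duration, chunk.data.byteLength);
```

With `data` declared `readonly` on the base class, a statement such as `chunk.data = otherBuffer` is rejected by the TypeScript compiler, which is the same constraint the readonly IDL attribute would place on a transform that wants to replace the payload.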

<pre class="idl">
dictionary RTCEncodedVideoFrameMetadata {
long long frameId;
sequence&lt;long long&gt; dependencies;
@@ -263,13 +299,8 @@ dictionary RTCEncodedVideoFrameMetadata {
sequence&lt;long&gt; contributingSources;
};

// New interfaces to define encoded video and audio frames. Will eventually
// re-use or extend the equivalent defined in WebCodecs.
[Exposed=(Window,DedicatedWorker)]
interface RTCEncodedVideoFrame {
readonly attribute RTCEncodedVideoFrameType type;
readonly attribute unsigned long long timestamp;
attribute ArrayBuffer data;
interface RTCEncodedVideoFrame : EncodedVideoChunk {
RTCEncodedVideoFrameMetadata getMetadata();
};

@@ -279,13 +310,10 @@ dictionary RTCEncodedAudioFrameMetadata {
};

[Exposed=(Window,DedicatedWorker)]
interface RTCEncodedAudioFrame {
readonly attribute unsigned long long timestamp;
attribute ArrayBuffer data;
interface RTCEncodedAudioFrame : EncodedAudioChunk {
RTCEncodedAudioFrameMetadata getMetadata();
};


// New interfaces to expose JavaScript-based transforms.

[Exposed=DedicatedWorker]