add c608 mapping #823

dericed · 2024-04-10T17:39:28Z

For discussion, here is a pull request to add storage of EIA-608 data to Matroska. It's defined in a manner similar to (QuickTime)[https://developer.apple.com/documentation/quicktime-file-format/closed_captioning_sample_data/]. The QuickTime version is a bit vague so I added something to the extra data based on DASH to define the name of the caption channel and its language.

robUx4 · 2024-04-13T07:00:59Z

I guess this is one correct/portable way of doing it. As discussed in #375 it may also be good to extract the data into proper subtitle tracks.

Also should we add constraints on what kind of track (video, with specific resolutions) it can be attached to ?

robUx4 · 2024-04-13T06:57:43Z

block_additional_mappings/c608-captions.md

+
+```
+@value = (channel *3 [";" channel]) / (language *3[";" language])
+2 channel = channel-number "=" language


Are the numbers at the beginning of each line expected in the ABNF description ?

To answer myself: yes, these number are line numbers in the PDF. They should be removed here.

robUx4 · 2024-04-13T06:58:07Z

block_additional_mappings/c608-captions.md

+```
+@value = (channel *3 [";" channel]) / (language *3[";" language])
+2 channel = channel-number "=" language
+3 channel-number = CC1 | CC2 | CC3 | CC4


CC1/CC2/CC3/CC4 are missing a definition

Given the examples in the PDF they are litteral strings.

In a more recent version of the split PDF it says:

SCTE 214-1 [5], section 7.2.3, “Signalling CEA-608 caption service metadata”

In the current version of SCTE 214-1 that's section 8.2.3 and the BNF has been fixed:

@value = language / channel 0*3 [";" channel] channel = channel-number "=" language channel-number = “CC1” / “CC2” / “CC3” / “CC4” language = 3ALPHA ; language code per ISO 639-3

We should probably use that and also reference that source directly.

mbunkus · 2024-04-13T07:56:17Z

I'm strongly against storing CC data in this way. Repeating what I wrote in #375:

Personally I'd vote for "keep it subtitle, let CodecID speak for itself". We already integrate to many different formats under the type "subtitle", some text based, others are images. And closed captions fulfill largely the same role.

I don't consider their traditional way of transportation (embedded in the video track) to be relevant for our decision.

and

Well, what our track type "subtitles" transports can easily fill both roles, and it only depends on the content. Similar to how "audio" tracks can contain the whole dialog or only the director's comments. I don't see any reason to use a separate track type.

Block additions are meant to carry data required to or enhancing decoding the main block data in the same track. This means the video data (think of e.g. HDR information). However, CCs don't have anything to do with decoding video data at all.

There's no advantage to storing CC data alongside video data. All it does is creating new corner cases for software handling it. For example, mkvmerge would suddenly have to expose that data as some kind of virtual track outside of the regular track structure so that MKVToolNix GUI can present the CC data as what it is, continuous data timed to display at certain points = a track. Same for all plackbac programs out there.

My vote goes to storage as regular track with type subtitle & proper codec ID. I'd also be fine with storage as regular track with a new track type closed_captions or whatever.

dericed · 2024-04-21T19:09:33Z

I'm adding a subtitle track definition for C608 for comment. See 90e7b11. If there's consensus about this approach, then I'll remove the BlockAdditionalMapping version and rebase.

For storing the channel number I wanted to make a recommendation and suggested the LABEL tag, but more appropriate suggestions are welcome. Or perhaps this could just go into the private data.

JeromeMartinez · 2024-04-21T19:12:33Z

codec_specs.md

+
+Codec Name: CEA-608 Captions
+
+Description: Each Block of a S_C608 subtitle track stores the c608 data as a array of one or more octet pairs from one data channel of a CEA-608 data stream with each octet pair corresponding to a video frame. The "LABEL" Matroska tag is RECOMMENDED for storage the channel number of the CEA-608 (such as "CC1", "CC2", "CC3" or "CC4"). For more information about this format see [@!ANSI-CTA-608-E-S-2019].


CC1 & CC2 share field 1 and can not be separated.
CC3 & CC4 share field 2 and can not be separated.

IMO it would be better to indicate the field, or even better to say that for frames there should be 4 bytes (2 bytes of field 1 then 2 bytes of field 2) so we avoid a have to manage tags.

Actually we have also T1 to T4 and XDS, so the service name ("CC" thing) is really not relevant.

@JeromeMartinez I pushed an update.

JeromeMartinez · 2024-04-21T19:53:08Z

first channel number of the CEA-608 pair (expressed as either "CC1" or "CC2")

CC1 and CC2 are names of a service in the specs and it is technically possible to use e.g. only CC2 without using CC1 so I prefer just "field number of the CEA-608 pair (expressed as either "1" or "2")".

robUx4 · 2024-10-22T12:24:56Z

block_additional_mappings/c608-captions.md

+
+```
+@value = (channel *3 [";" channel]) / (language *3[";" language])
+2 channel = channel-number "=" language


To answer myself: yes, these number are line numbers in the PDF. They should be removed here.

robUx4 · 2024-10-22T12:25:19Z

block_additional_mappings/c608-captions.md

+
+### BlockAddIDExtraData
+
+BlockAddIDExtraData MAY contain a value as defined by Section 6.4.3.3 of [!@DASH-IF-IOP] that describes the caption services and language. An ABNF from [!@DASH-IF-IOP] for this value is restated here for convenience:


MAY should be normative

Suggested change

BlockAddIDExtraData MAY contain a value as defined by Section 6.4.3.3 of [!@DASH-IF-IOP] that describes the caption services and language. An ABNF from [!@DASH-IF-IOP] for this value is restated here for convenience:

BlockAddIDExtraData **MAY** contain a value as defined by Section 6.4.3.3 of [!@DASH-IF-IOP] that describes the caption services and language. An ABNF from [!@DASH-IF-IOP] for this value is restated here for convenience:

robUx4 · 2024-10-22T12:39:12Z

block_additional_mappings/c608-captions.md

+```
+@value = (channel *3 [";" channel]) / (language *3[";" language])
+2 channel = channel-number "=" language
+3 channel-number = CC1 | CC2 | CC3 | CC4


Given the examples in the PDF they are litteral strings.

In a more recent version of the split PDF it says:

SCTE 214-1 [5], section 7.2.3, “Signalling CEA-608 caption service metadata”

In the current version of SCTE 214-1 that's section 8.2.3 and the BNF has been fixed:

@value = language / channel 0*3 [";" channel] channel = channel-number "=" language channel-number = “CC1” / “CC2” / “CC3” / “CC4” language = 3ALPHA ; language code per ISO 639-3

We should probably use that and also reference that source directly.

robUx4 · 2024-12-22T14:30:28Z

marking this a v5 as the codec is not supported yet (or defined).

robUx4 · 2025-01-26T09:01:25Z

codec_specs.md

+
+Codec Name: CEA-608 Captions
+
+Description: Each Block of a S_C608 subtitle track stores the c608 data as a array of one or more octet pairs from one data channel of a CEA-608 data stream with each octet pair corresponding to a video frame. The "LABEL" Matroska tag is RECOMMENDED for storage the first channel number of the CEA-608 pair (expressed as either "CC1" or "CC2"). For more information about this format see [@!ANSI-CTA-608-E-S-2019].


RECOMMENDED should have "**" around it

robUx4 · 2025-01-26T09:01:45Z

block_additional_mappings/c608-captions.md

+
+### c608 Captions Description
+
+CEA-608 captions can be stored in `BlockMore Element` to associate the content of a Matroska Block (typically a video frame) with closed captioning data. Storing CEA-608 caption data requires that the corresponding Block SHALL NOT use Lacing.


SHALL NOT should have "**" around it.

add c608 mapping

2fecb92

robUx4 added codec mapping spec_codecs Codec Matroska spec document target labels Apr 13, 2024

robUx4 reviewed Apr 13, 2024

View reviewed changes

JeromeMartinez reviewed Apr 21, 2024

View reviewed changes

add S_C608 subtitle track

8eeee1b

dericed force-pushed the c608-blockadditionalmapping branch from 90e7b11 to 8eeee1b Compare April 21, 2024 19:44

robUx4 requested changes Oct 22, 2024

View reviewed changes

robUx4 added the matroska-v5 label Dec 22, 2024

robUx4 reviewed Jan 26, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

add c608 mapping #823

add c608 mapping #823

dericed commented Apr 10, 2024

robUx4 commented Apr 13, 2024

robUx4 Apr 13, 2024

robUx4 Oct 22, 2024

robUx4 Apr 13, 2024

robUx4 Oct 22, 2024

mbunkus commented Apr 13, 2024

dericed commented Apr 21, 2024

JeromeMartinez Apr 21, 2024 •

edited

Loading

JeromeMartinez Apr 21, 2024

dericed Apr 21, 2024

JeromeMartinez commented Apr 21, 2024

robUx4 Oct 22, 2024

robUx4 Oct 22, 2024

robUx4 Oct 22, 2024

robUx4 commented Dec 22, 2024

robUx4 Jan 26, 2025

robUx4 Jan 26, 2025


		Codec Name: CEA-608 Captions

		Description: Each Block of a S_C608 subtitle track stores the c608 data as a array of one or more octet pairs from one data channel of a CEA-608 data stream with each octet pair corresponding to a video frame. The "LABEL" Matroska tag is RECOMMENDED for storage the channel number of the CEA-608 (such as "CC1", "CC2", "CC3" or "CC4"). For more information about this format see [@!ANSI-CTA-608-E-S-2019].


		### BlockAddIDExtraData

		BlockAddIDExtraData MAY contain a value as defined by Section 6.4.3.3 of [!@DASH-IF-IOP] that describes the caption services and language. An ABNF from [!@DASH-IF-IOP] for this value is restated here for convenience:

	BlockAddIDExtraData MAY contain a value as defined by Section 6.4.3.3 of [!@DASH-IF-IOP] that describes the caption services and language. An ABNF from [!@DASH-IF-IOP] for this value is restated here for convenience:
	BlockAddIDExtraData MAY contain a value as defined by Section 6.4.3.3 of [!@DASH-IF-IOP] that describes the caption services and language. An ABNF from [!@DASH-IF-IOP] for this value is restated here for convenience:


		### c608 Captions Description

		CEA-608 captions can be stored in `BlockMore Element` to associate the content of a Matroska Block (typically a video frame) with closed captioning data. Storing CEA-608 caption data requires that the corresponding Block SHALL NOT use Lacing.

add c608 mapping #823

Are you sure you want to change the base?

add c608 mapping #823

Conversation

dericed commented Apr 10, 2024

robUx4 commented Apr 13, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mbunkus commented Apr 13, 2024

dericed commented Apr 21, 2024

JeromeMartinez Apr 21, 2024 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JeromeMartinez commented Apr 21, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

robUx4 commented Dec 22, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

JeromeMartinez Apr 21, 2024 •

edited

Loading