22562 – WebVTT: We need to define what fragments can be used on URI referencing VTT files

This is an archived snapshot of W3C's public Bugzilla bug tracker, decommissioned in April 2019.

Status:	NEW

Alias:	None

Product:	TextTracks CG
Classification:	Unclassified
Component:	WebVTT
Version:	unspecified
Hardware:	All All

Importance:	P2 enhancement
Target Milestone:	---
Assignee:	Silvia Pfeiffer
QA Contact:	This bug has no owner yet - up for the taking

Whiteboard:	v2

Duplicates (2):	24567 27326

Reported:	2013-07-03 21:55 UTC by David Singer
Modified:	2014-11-15 03:54 UTC
CC List:	6 users

Description David Singer 2013-07-03 21:55:43 UTC

In the case that the media and the VTT are separately referenced by URI, if someone wants to e.g. pre-seek the media by using a #fragment on the URI, they should be able to pre-seek the VTT as well.  It seems as if we should consider an annex of the spec. that defines what fragment syntax may be used with VTT resources (fragment syntax is 'owned' by the MIME type).

Comment 1 Silvia Pfeiffer 2014-02-06 20:11:02 UTC

*** Bug 24567 has been marked as a duplicate of this bug. ***

Comment 2 Silvia Pfeiffer 2014-02-06 20:16:18 UTC

This would go into the mime type registration, which is currently in section 8.1. I'd go with the Media Fragment URI spec: http://www.w3.org/TR/2012/PR-media-frags-20120315/#fragment-dimensions . I'd include the temporal and the id referencing mechanisms: #t= and #id=, where the first one would reference a list of cues that fall into the given time range, while the second is a reference to the first (all?) cues with the given identifier.
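
To make the proposal concrete, here is a minimal Python sketch of reading the two proposed dimensions off a VTT URL. It assumes the name=value syntax of the Media Fragments URI spec and handles only the npt time format; the function names parse_npt and parse_vtt_fragment are hypothetical, and nothing here is defined by the WebVTT spec.

```python
from urllib.parse import urldefrag, parse_qsl


def parse_npt(value):
    """Convert an NPT time such as '170', '2:50' or '2:50:00' to seconds."""
    seconds = 0.0
    for part in value.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def parse_vtt_fragment(url):
    """Split off the fragment and read the 't' and 'id' dimensions, if present."""
    base, fragment = urldefrag(url)
    result = {"base": base}
    # Assumption: dimensions are '&'-separated name=value pairs.
    for name, value in parse_qsl(fragment, keep_blank_values=True):
        if name == "t":
            if value.startswith("npt:"):
                value = value[len("npt:"):]
            start, _, end = value.partition(",")
            # A missing start means 0; a missing end means "until the end".
            result["t"] = (
                parse_npt(start) if start else 0.0,
                parse_npt(end) if end else None,
            )
        elif name == "id":
            result["id"] = value
    return result
```

For example, parse_vtt_fragment("file.vtt#id=foo") yields the base URL plus an "id" entry, which a consumer could then map onto cues in whatever way the registration ends up specifying.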

Seeing as we should do an IANA registration, does it make sense to move this to v1?

Comment 3 Philip Jägenstedt 2014-02-07 03:05:18 UTC

I don't understand this bug. The text track is synced with the media resource's timeline, what would a fragment on the text track URL actually do?

Comment 4 Pierre-Antoine Champin 2014-02-07 08:46:02 UTC

@Philip For me, the idea was not so much to *do* anything, but to be able to *talk* about a given cue from outside the Web-VTT file, so provide each (identified) cue with a full URI.

@Silvia I very much like the idea of reusing media fragments in this context!

Comment 5 Philip Jägenstedt 2014-02-07 08:58:53 UTC

(In reply to Pierre-Antoine Champin from comment #4)
> @Philip For me, the idea was not so much to *do* anything, but to be able to
> *talk* about a given cue from outside the Web-VTT file, so provide each
> (identified) cue with a full URI.

Can you elaborate a bit on what you'd like to do? It seems to me that if such a URI shouldn't cause any software to do anything, it doesn't need to be standardized. The most straightforward way of identifying a cue would seem to be using just the id in a context where the file URL is known and not bothering with global identifiers...

Comment 6 Pierre-Antoine Champin 2014-02-07 12:52:36 UTC

(In reply to Philip Jägenstedt from comment #5)
> Can you elaborate a bit on what you'd like to do? It seems to me that if
> such a URI shouldn't cause any software to do anything, it doesn't need to
> be standardized.

A use case that I have in mind would be to use RDF to say something about (annotate) an individual cue. E.g.:

- the text of this cue contains an error / a typo
- this cue has incorrect timestamps
- this cue is uttered by that person (URI of the person)

Of course, systems will *do* something with those annotations, my point was only that I'm not making any assumptions on those various actions.
However, to make them possible, I need a full URI to identify the fragment, whose semantics is standardized in order to be unambiguous.

> The most straightforward way of identifying a cue would
> seem to be using just the id in a context where the file URL is known and
> not bothering with global identifiers...

Well, prefixing the cue ID with the file URL and a hash is not *that* bothersome, and it is a straightforward way to make that context explicit.
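
As a rough illustration of that prefixing, the Python sketch below mints a global URI for a cue and uses it as the subject of an RDF-style statement. The helper names cue_uri and triple, the example.org URLs, and the predicate are all hypothetical; the bare "#<id>" form follows the wording of this comment rather than any agreed syntax.

```python
from urllib.parse import quote, urldefrag


def cue_uri(vtt_url, cue_id):
    """Return the file URL followed by '#' and the percent-encoded cue id."""
    base, _ = urldefrag(vtt_url)  # drop any fragment already on the URL
    return f"{base}#{quote(cue_id, safe='')}"


def triple(subject, predicate, obj):
    """Format one statement in an N-Triples-like form (URIs only)."""
    return f"<{subject}> <{predicate}> <{obj}> ."


subject = cue_uri("http://example.org/captions.vtt", "cue 12")
statement = triple(
    subject,
    "http://example.org/vocab#utteredBy",  # hypothetical predicate
    "http://example.org/people/alice",     # hypothetical person URI
)
```

Whether the fragment should be the bare identifier or an "id=" pair is exactly the kind of detail a MIME type registration would need to settle.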

Comment 7 Philip Jägenstedt 2014-02-07 15:35:49 UTC

(In reply to Pierre-Antoine Champin from comment #6)
> (In reply to Philip Jägenstedt from comment #5)
> > Can you elaborate a bit on what you'd like to do? It seems to me that if
> > such a URI shouldn't cause any software to do anything, it doesn't need to
> > be standardized.
> 
> A use case that I have in mind would be to use RDF to say something about
> (annotate) an individual cue. E.g.:
> 
> - the text of this cue contains an error / a typo
> - this cue has incorrect timestamps
> - this cue is uttered by that person (URI of the person)
> 
> Of course, systems will *do* something with those annotations, my point was
> only that I'm not making any assumptions on those various actions.
> However, to make them possible, I need a full URI to identify the
> fragment, whose semantics is standardized in order to be unambiguous.

These use cases sound rather hypothetical at this point, assuming you're not currently building a collaborative editing system for WebVTT. I think that in order to build such a collaborative system which is robust, you cannot depend on IDs at all, as they may be missing or duplicated. A more robust approach would be to have a custom format that's a superset of the WebVTT model. Otherwise I don't see how you're going to keep the extra data when cues are added/removed or the IDs are changed.

Unless implementations (browsers or standalone players) of WebVTT have some testable behavior for media fragments in <track src="file.vtt#id=foo"> I don't think we should add anything to the spec, i.e. I would prefer to WONTFIX this.

Silvia?

Comment 8 Pierre-Antoine Champin 2014-02-09 21:56:51 UTC

(In reply to Philip Jägenstedt from comment #7)
> > Of course, systems will *do* something with those annotations, my point was
> > only that I'm not making any assumptions on those various actions.
> > However, to make them possible, I need a full URI to identify the
> > fragment, whose semantics is standardized in order to be unambiguous.
> 
> These use cases sound rather hypothetical at this point, assuming you're not
> currently building a collaborative editing system for WebVTT. 

That's actually the kind of thing that we are aiming to do :)

> I think that
> in order to build such a collaborative system which is robust, you cannot
> depend on IDs at all, as they may be missing or duplicated.

would it be valid Web-VTT if there was a duplicate identifier??
If yes, then indeed they are bad candidates for fragments ids,
but then they should probably *not* be named "identifiers" in the first place!
If not, then I insist there should be a way to use them in a context-free way, see below... 

> (...)
> Unless implementations (browsers or standalone players) of WebVTT have some
> testable behavior for media fragments in <track src="file.vtt#id=foo"> I
> don't think we should add anything to the spec, i.e. I would prefer to
> WONTFIX this.

I understand your position regarding the testability of what is in the spec.
On the other hand, RFC3986 says

> The semantics of a fragment identifier are defined by the set of
> representations that might result from a retrieval action on the
> primary resource.  The fragment's format and resolution is therefore
> dependent on the media type [RFC2046] of a potentially retrieved
> representation.

So by *not* specifying it, you close the door once and for all to any kind of cue addressing.
On the other hand, by allowing fragment identifiers, and specifying only what they *mean* (rather than what behaviour this implies in such or such application), you leave the door open to extensibility.

Comment 9 Silvia Pfeiffer 2014-02-15 03:10:36 UTC

(In reply to Pierre-Antoine Champin from comment #8)
> (In reply to Philip Jägenstedt from comment #7)
> 
> > I think that
> > in order to build such a collaborative system which is robust, you cannot
> > depend on IDs at all, as they may be missing or duplicated.
> 
> would it be valid Web-VTT if there was a duplicate identifier??
> If yes, then indeed they are bad candidates for fragments ids,
> but then they should probably *not* be named "identifiers" in the first
> place!
> If not, then I insist there should be a way to use them in a context-free
> way, see below... 

You can depend on IDs in WebVTT files as much as you can in HTML files. They are supposed to be unique, but author mistakes will be tolerated - URLs would then just resolve to the first one and duplicates can't be addressed (just like in HTML).
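
The first-match behaviour described here could look roughly like the following Python sketch. It is deliberately simplified and is not the WebVTT parser algorithm: it assumes blocks are separated by blank lines and treats any block whose first or second line contains "-->" as a cue. The names cue_blocks, resolve_cue_id and SAMPLE are hypothetical.

```python
def cue_blocks(vtt_text):
    """Yield (identifier, timing_line, payload) for each cue block."""
    for block in vtt_text.replace("\r\n", "\n").split("\n\n"):
        lines = block.strip("\n").split("\n")
        if "-->" in lines[0]:
            # Cue without an identifier line.
            yield None, lines[0], "\n".join(lines[1:])
        elif len(lines) > 1 and "-->" in lines[1]:
            # Cue whose first line is its identifier.
            yield lines[0], lines[1], "\n".join(lines[2:])


def resolve_cue_id(vtt_text, wanted):
    """Return the first cue whose identifier equals `wanted`, else None."""
    for identifier, timing, payload in cue_blocks(vtt_text):
        if identifier == wanted:
            return identifier, timing, payload
    return None


SAMPLE = """WEBVTT

intro
00:00:01.000 --> 00:00:04.000
Hello

intro
00:00:05.000 --> 00:00:08.000
Duplicate identifier

00:00:09.000 --> 00:00:10.000
No identifier
"""
```

With SAMPLE, resolve_cue_id(SAMPLE, "intro") returns the cue starting at 00:00:01.000; the second "intro" cue is tolerated but cannot be reached by identifier.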


> > Unless implementations (browsers or standalone players) of WebVTT have some
> > testable behavior for media fragments in <track src="file.vtt#id=foo"> I
> > don't think we should add anything to the spec, i.e. I would prefer to
> > WONTFIX this.

I can imagine a situation with a very long video where you want to just retrieve a subpart of the video's timeline and the related subpart of the vtt files:

<video src="video.webm#t=npt:2:50:00,2:55:00">
  <track src="captions.vtt#t=npt:2:50:00,2:55:00">
</video>

That would allow using a reduced-size video and skipping to the right cues in the vtt file.

Though I agree that this is a pretty convoluted and unrealistic example, since you'd probably rather use a query parameter on the vtt file to have the server send a reduced size vtt file. (I assume byte range requests wouldn't be done on the vtt file no matter how large).
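
For illustration, a client-side selection of the cues for such a time range might be sketched in Python as below. It assumes the cues have already been parsed into start and end times in seconds, and it reads "fall into the given time range" as "overlap the half-open interval [start, end)", which is only one possible interpretation. The Cue type and cues_in_range are hypothetical names.

```python
from typing import List, NamedTuple, Optional


class Cue(NamedTuple):
    identifier: Optional[str]
    start: float  # seconds
    end: float    # seconds
    text: str


def cues_in_range(cues: List[Cue], start: float, end: float) -> List[Cue]:
    """Return the cues whose interval overlaps [start, end), in file order."""
    return [cue for cue in cues if cue.start < end and cue.end > start]


# 2:50:00 is 10200 s and 2:55:00 is 10500 s, as in the example above.
cues = [
    Cue("a", 10190.0, 10205.0, "straddles the start of the range"),
    Cue("b", 10300.0, 10310.0, "inside the range"),
    Cue("c", 10600.0, 10610.0, "after the range"),
]
selected = cues_in_range(cues, 10200.0, 10500.0)
```

A server-side query parameter, as suggested above, would apply the same filter before sending the file, so the choice is mostly about where the trimming happens.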


> I understand your position regarding the testability of what is in the spec.
> On the other hand, RFC3986 says
> 
> > The semantics of a fragment identifier are defined by the set of
> > representations that might result from a retrieval action on the
> > primary resource.  The fragment's format and resolution is therefore
> > dependent on the media type [RFC2046] of a potentially retrieved
> > representation.
> 
> So by *not* specifying it, you close the door once and for all to any kind
> of cue addressing.
> On the other hand, by allowing fragment identifiers, and specifying only
> what they *mean* (rather than what behaviour this implies in such or such
> application), you leave the door open to extensibility.

That's how I was looking at it: merely as an addition to the mime type registration to point people who want to do such URI addressing in the right direction. I can particularly think of annotation and archiving use cases that would want to deal with such things.

I guess we should ask for some more concrete examples and can leave it for now as a v2 feature. This is really only relevant once we register the mime type with IANA.

Comment 10 Silvia Pfeiffer 2014-11-15 03:54:11 UTC

*** Bug 27326 has been marked as a duplicate of this bug. ***