This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 15922 - <track> WebVTT: Add <lang ...> to WebVTT
Summary: <track> WebVTT: Add <lang ...> to WebVTT
Status: CLOSED FIXED
Alias: None
Product: TextTracks CG
Classification: Unclassified
Component: WebVTT
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: This bug has no owner yet - up for the taking
URL:
Whiteboard:
Keywords:
Depends on:
Blocks: 20853
Reported: 2012-02-07 06:20 UTC by Silvia Pfeiffer
Modified: 2013-02-02 07:08 UTC (History)
CC List: 5 users

Description Silvia Pfeiffer 2012-02-07 06:20:10 UTC
Sometimes the language changes mid-cue and a different font needs to be loaded. This can happen automatically if the language is marked up. Mixed-language cues should be supported, which means we need a way to mark up language inline.

The following approach was suggested by Ian in http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2011-December/034013.html :


  WEBVTT

  cue id
  00:00:00.000 --> 00:00:01.000
  <lang en>cue text says <lang fr>bonjour</lang></lang>


It would probably translate to <span lang="xx"></span> in HTML.
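
As a rough illustration of that mapping, here is a minimal Python sketch. The function name `lang_tags_to_html` is hypothetical, the tag matching is deliberately simplified (it ignores classes and any other WebVTT tags or escapes, which pass through untouched), and it is not taken from the spec or from any browser implementation.

```python
import re

# Simplified matcher for a <lang xx> start tag; an assumption for this sketch,
# not the tokenizer a real WebVTT parser would use.
_LANG_START = re.compile(r"<lang\s+([A-Za-z0-9-]+)\s*>")


def lang_tags_to_html(cue_text: str) -> str:
    """Rewrite <lang xx>...</lang> as <span lang="xx">...</span>."""
    # Turn each start tag into a span carrying the same language tag.
    with_spans = _LANG_START.sub(r'<span lang="\1">', cue_text)
    # End tags carry no data, so a plain string replacement is enough.
    return with_spans.replace("</lang>", "</span>")


print(lang_tags_to_html("<lang en>cue text says <lang fr>bonjour</lang></lang>"))
```

Nested tags come out as nested spans, so the inner language overrides the outer one in the same way nested lang attributes do in HTML.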
Comment 1 Ian 'Hixie' Hickson 2012-04-25 21:52:09 UTC
If we are to add language information to the format, there are four ways 
to do it: inline, cue-level, block-level (a section of the file, e.g. 
setting a default at different points in the file), and file-level.

Inline would look like this:

   WEBVTT

   cue id
   00:00:00.000 --> 00:00:01.000
   <lang en>cue text says <lang fr>bonjour</lang></lang>

File-level would look like this:

   WEBVTT
   language: fr

   cue id
   00:00:00.000 --> 00:00:01.000
   bonjour

I suppose we'd need both. I wouldn't propose cue-level or block-level.

How important is this for v1?
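
To make the file-level variant concrete, the following Python sketch reads a `language:` line from the header block, using the syntax proposed in the example above. That header syntax is only the proposal from this comment, not something the format is known to define, and the function name `file_level_language` is hypothetical.

```python
from typing import Optional


def file_level_language(vtt_text: str) -> Optional[str]:
    """Return the value of a proposed 'language:' header line, or None."""
    lines = vtt_text.splitlines()
    if not lines or not lines[0].strip().startswith("WEBVTT"):
        raise ValueError("not a WebVTT file")
    for line in lines[1:]:
        stripped = line.strip()
        if not stripped:
            # The first blank line ends the header block; cues follow.
            break
        key, sep, value = stripped.partition(":")
        if sep and key.strip().lower() == "language":
            return value.strip() or None
    return None


sample = """WEBVTT
language: fr

cue id
00:00:00.000 --> 00:00:01.000
bonjour
"""
print(file_level_language(sample))
```

A consumer could use the returned value as the default language for every cue and let inline `<lang>` tags override it where needed.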
Comment 2 Silvia Pfeiffer 2012-04-30 01:16:49 UTC
I agree, we want inline and file-level. Cue-level and block-level are not as relevant and can be achieved with inline markup.

About the importance: anyone wanting to represent mixed language cues will need it, so I think it's important enough for v1.

Adding inline seems a simple addition, so definitely v1.

Adding file-level seems to only be required outside of Web browsers, since for Web browsers we have @srclang. IIUC file-level depends on resolving the general structure of how to add file-wide metadata first, see bug 15851.
Comment 3 Philip Jägenstedt 2012-04-30 09:27:59 UTC
Is font switching the only use case? If so, can a real-world example be given where the language is needed to pick the correct font? (English/French is not such a case.) On the list the discussion has been mostly about CJK, but I've never seen subtitles in mixed CJK script, AFAICT names are always transliterated.
Comment 4 Silvia Pfeiffer 2012-05-01 03:38:32 UTC
(In reply to comment #3)
> Is font switching the only use case?

No. File-level markup can also be used for creating menus. File- and cue-level markup can be used to load the right speech synthesis engine, for example when the text is used for audio descriptions.

> If so, can a real-world example be given
> where the language is needed to pick the correct font? (English/French is not
> such a case.) On the list the discussion has been mostly about CJK, but I've
> never seen subtitles in mixed CJK script, AFAICT names are always
> transliterated.

I don't know much about i18n and fonts, but I think the use cases for @lang in HTML also apply here (http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#the-lang-and-xml:lang-attributes), which goes beyond just fonts.
Comment 5 Philip Jägenstedt 2012-05-02 08:39:23 UTC
(In reply to comment #4)
> (In reply to comment #3)
> > Is font switching the only use case?
> 
> No. File-level markup can also be used for creating menus. File- and cue-level
> markup can be used to load the right speech synthesis engine, for example when
> the text is used for audio descriptions.
> 
> > If so, can a real-world example be given
> > where the language is needed to pick the correct font? (English/French is not
> > such a case.) On the list the discussion has been mostly about CJK, but I've
> > never seen subtitles in mixed CJK script, AFAICT names are always
> > transliterated.
> 
> I don't know much about i18n and fonts, but I think the use cases for @lang in
> HTML also apply here
> (http://www.whatwg.org/specs/web-apps/current-work/multipage/elements.html#the-lang-and-xml:lang-attributes),
> which goes beyond just fonts.

The linked section says:

"User agents may use the element's language to determine proper processing or rendering (e.g. in the selection of appropriate fonts or pronunciations, for dictionary selection, or for the user interfaces of form controls such as date pickers)."

AFAICT, only font switching and pronunciation (for audio descriptions) could potentially make sense for WebVTT. The case for font switching is not compelling without real-world examples that would need it.

I don't know much about speech synth, but is it really the case that users want to hear foreign words in another voice or style than native words? As an example, I'd bet that many English speakers would not understand "Sichuan" or "Xi'an" and possibly even "Shanghai" if pronounced as Mandarin Chinese. In what kinds of contexts would language information be an improvement?
Comment 6 Janina Sajka 2012-06-02 23:26:40 UTC
It's much more than just fonts. The W3C I18N WG has published a Best Practices Note on the use of lang, http://www.w3.org/TR/i18n-html-tech-lang/. Speaking personally as a screen reader user, I always appreciate lang markup, as it gives my speech tech the opportunity to pronounce content correctly, something that's quite impossible when the default language's phonemes and pronunciation rules are applied to inline content in another language. It would seem to me that all the other reasons for lang markup still pertain, i.e. better indexing, better spell and grammar checking, etc. Consider, too, that correct indexing of caption/video-description content will be highly useful in allowing identification of exactly where in a video particular speech (actions, etc.) occurs. I submit that lang markup is as important in alternative media content as in any page content. The reasons are just as strong, imo.
Comment 7 Philip Jägenstedt 2012-06-11 09:01:02 UTC
Janina, can you comment on the last paragraph of comment 5?
Comment 8 Ian 'Hixie' Hickson 2012-07-24 16:05:45 UTC
Adding <lang> seems reasonable for v1. I'll start with that.
Comment 9 Ian 'Hixie' Hickson 2012-11-05 21:03:36 UTC
I've added <lang>. If we want to add file-wide language data, please file a separate bug, explaining the use cases.

Note that right now it doesn't default to the language used to link to the file, so by default :lang(...) can't match anything. If you think that should change, please file a bug on that too, with reasoning.
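
The consequence of having no default can be sketched in Python as follows. The function `text_runs_with_language` and its `default` parameter are hypothetical: `default=None` models the behaviour described above, while passing a value models the possible change of inheriting the language used to link to the file. The tag matching is simplified and is not the spec's parsing algorithm.

```python
import re
from typing import List, Optional, Tuple

# Simplified matcher for <lang xx> start tags and </lang> end tags.
_TOKEN = re.compile(r"<lang\s+([A-Za-z0-9-]+)\s*>|</lang>")


def text_runs_with_language(
    cue_text: str, default: Optional[str] = None
) -> List[Tuple[str, Optional[str]]]:
    """Split cue text into (text, language) runs; language may be None."""
    stack: List[Optional[str]] = [default]
    runs: List[Tuple[str, Optional[str]]] = []
    pos = 0
    for match in _TOKEN.finditer(cue_text):
        if match.start() > pos:
            # Text before this tag takes the innermost open language.
            runs.append((cue_text[pos:match.start()], stack[-1]))
        if match.group(1) is not None:
            stack.append(match.group(1))
        elif len(stack) > 1:
            # Never pop the default entry at the bottom of the stack.
            stack.pop()
        pos = match.end()
    if pos < len(cue_text):
        runs.append((cue_text[pos:], stack[-1]))
    return runs


print(text_runs_with_language("hello <lang fr>bonjour</lang>"))
print(text_runs_with_language("hello <lang fr>bonjour</lang>", default="en"))
```

With no default, the text outside `<lang>` has no language at all, which is why a `:lang(...)` selector has nothing to match there.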
Comment 10 contributor 2012-11-05 21:04:04 UTC
Checked in as WHATWG revision r7504.
Check-in comment: Add <lang> to WebVTT.
http://html5.org/tools/web-apps-tracker?from=7503&to=7504