This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 28266 - [webvtt] 6.2.1 processing model handling of bidi [I18N-ISSUE-432]
Summary: [webvtt] 6.2.1 processing model handling of bidi [I18N-ISSUE-432]
Status: RESOLVED FIXED
Alias: None
Product: TextTracks CG
Classification: Unclassified
Component: WebVTT
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: ---
Assignee: This bug has no owner yet - up for the taking
QA Contact: Web Media Text Tracks CG
URL:
Whiteboard: widereview, v1
Keywords:
Duplicates: 24129
Depends on:
Blocks:
 
Reported: 2015-03-22 00:18 UTC by Silvia Pfeiffer
Modified: 2016-10-11 18:22 UTC

Description Silvia Pfeiffer 2015-03-22 00:18:26 UTC
Feedback by Addison Phillips from W3C I18N group:
http://lists.w3.org/Archives/Public/public-tt/2015Mar/0065.html

I18N comment: https://www.w3.org/International/track/issues/432

6.2.1 Processing model
http://www.w3.org/TR/2014/WD-webvtt1-20141111/#h4_processing-model

In section 6.2.1 I find the instruction quoted below. This seems to be trying to determine the base direction, but I don't see other elements of implementing BIDI elsewhere in processing, such as using 'direction' to set the meaning of 'start' and 'end' and I don't see any discussion of handling directional runs in presentation (is that on a different level??)

--
Apply the Unicode Bidirectional Algorithm's Paragraph Level steps to the concatenation of the values of each WebVTT Text Object in nodes, in a pre-order, depth-first traversal, excluding WebVTT Ruby Text Objects and their descendants, to determine the paragraph embedding level of the first Unicode paragraph of the cue. [BIDI]
Note

Within a cue, paragraph boundaries are only denoted by Type B characters, such as U+000A LINE FEED (LF), U+0085 NEXT LINE (NEL), and U+2029 PARAGRAPH SEPARATOR. (This means each line of the cue is reordered as if it was a separate paragraph.)

If the paragraph embedding level determined in the previous step is even (the paragraph direction is left-to-right), let direction be 'ltr', otherwise, let it be 'rtl'.
--
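
For illustration, the quoted steps can be sketched in Python roughly as follows. This is a simplified sketch, not the spec's algorithm: it takes the cue text as a plain string (the traversal of WebVTT Text Objects and the exclusion of ruby text are omitted), it ignores the isolate handling in the Unicode paragraph-level rules, and the function names are invented for this example.

```python
import unicodedata


def first_paragraph(cue_text):
    """Return the text up to the first Type B (paragraph separator) character."""
    for index, ch in enumerate(cue_text):
        if unicodedata.bidirectional(ch) == "B":
            return cue_text[:index]
    return cue_text


def paragraph_level(paragraph):
    """First-strong detection: 1 if the first strong character is R or AL, else 0."""
    for ch in paragraph:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return 0
        if bidi_class in ("R", "AL"):
            return 1
    return 0  # no strong character found: default to level 0


def cue_direction(cue_text):
    """Map the paragraph embedding level of the first paragraph to 'ltr' or 'rtl'."""
    level = paragraph_level(first_paragraph(cue_text))
    return "ltr" if level % 2 == 0 else "rtl"
```

Note that only the first paragraph of the cue feeds into the result, which is why a later line in a different script does not change the computed direction.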

It would be helpful to the author to be able to set the default base direction for a whole WebVTT file to rtl.

It would also be helpful if the author could set the base direction for each cue explicitly, since the Unicode paragraph detection algorithm can be fooled by a paragraph that starts with a strong LTR character, but is actually a RTL paragraph (or vice versa), eg. "نشاط التدويل is how you say 'i18n Activity' in Arabic."
Comment 1 Philip Jägenstedt 2015-03-23 04:07:58 UTC
As it stands, the cue text itself is the only thing used to determine text directionality.

Are &lrm; and &rlm; insufficient to help the Unicode paragraph detection algorithm on a per-cue level?

For which kinds of captions would it be useful to have a file-wide direction default? Since cues are centered by default, it will only make a difference if one uses align:start or align:end on any cues, but if one wants all cues to be right-aligned (for the odd LTR cue in otherwise RTL captions) it seems like using align:right would be the thing to use.

Bug 15024 is somewhat related, if we have a per-cue setting then a DEFAULTS block could be used to override it.
Comment 2 Addison Phillips 2015-03-24 00:59:13 UTC
(In reply to Philip Jägenstedt from comment #1)
> As it stands, the cue text itself is the only thing used to determine text
> directionality.
> 
> Are &lrm; and &rlm; insufficient to help the Unicode paragraph detection
> algorithm on a per-cue level?

These are sufficient from the point of view that they set the text direction, but they require user intervention to create and insert them. Use of the bidi controls in this way is a serious impediment to RTL language users, since they must generally insert the markers for every string. This also makes it more difficult to, e.g., extract bidi text from other markup languages.

Would you find WebVTT acceptable to use if every time you entered an English or German cue string, you also had to insert an invisible control character or its entity just to ensure that it displayed properly?
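
To make the authoring burden concrete, here is a rough sketch of the kind of preprocessing an author would need if every cue line had to carry its own mark. It assumes a very simple file layout (a timing line containing "-->", payload lines, then a blank line) and does not handle NOTE, STYLE or REGION blocks; the function name is hypothetical.

```python
RLM = "\u200F"  # RIGHT-TO-LEFT MARK, the character behind &rlm;


def mark_cues_rtl(vtt_text):
    """Prefix every cue payload line with RLM (simplified WebVTT layout assumed)."""
    output = []
    in_payload = False
    for line in vtt_text.split("\n"):
        if "-->" in line:
            in_payload = True    # timing line: payload starts on the next line
            output.append(line)
        elif line == "":
            in_payload = False   # a blank line ends the cue
            output.append(line)
        elif in_payload:
            output.append(RLM + line)
        else:
            output.append(line)  # header, cue identifier, etc.
    return "\n".join(output)
```

Every payload line gains an invisible character that later editing tools must preserve, which is the maintenance cost described above.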

> 
> For which kinds of captions would it be useful to have a file-wide direction
> default? 

If the file is mainly in an RTL script language (such as Arabic or Hebrew), setting a file-wide direction is the most direct means of ensuring that all of the cues receive the correct base direction, layout, and so forth. This is much simpler than having to laboriously set the direction of every text element. It also sets "start" and "end" and other directionally affected presentational elements without reference to text. Other CSS styles that are directionally aware can also key off of the value.

> Since cues are centered by default, it will only make a difference
> if one uses align:start or align:end on any cues, but if one wants all cues
> to be right-aligned (for the odd LTR cue in otherwise RTL captions) it seems
> like using align:right would be the thing to use.

Well, no. The reason you have keywords "start" and "end" in the first place (rather than just "left" or "right") is so that RTL language users don't have to recreate and reset every single style to "the other side" of the display. If the base direction is RTL then align:start does what it says it does: aligns the text according to the starting (right) side of the display.
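
The relationship between the logical keywords and the physical side can be sketched as a small lookup. This is only an illustration for horizontal text; the function name and the restriction to these five keywords are assumptions of the example, not spec text.

```python
def resolve_alignment(align, direction):
    """Resolve a logical alignment keyword to a physical side for horizontal text."""
    if direction not in ("ltr", "rtl"):
        raise ValueError("direction must be 'ltr' or 'rtl'")
    if align in ("left", "right", "center"):
        return align  # physical keywords do not depend on direction
    if align == "start":
        return "left" if direction == "ltr" else "right"
    if align == "end":
        return "right" if direction == "ltr" else "left"
    raise ValueError("unknown alignment keyword: " + align)
```

With a base direction of rtl, a file that uses align:start throughout resolves to the right side without any cue being rewritten.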

It is also the case that setting base text direction helps even when "start" and "end" don't enter into the situation. The Unicode bidi algorithm needs help to determine the base direction in many cases and the presentation of a given cue may be wrong because the bidi algorithm hasn't been given the necessary help. Richard provided an example in the original bug text, but there are a myriad more where the presentation is wrong. Since most sets of cues will be in the same language and share a direction, setting a default is very helpful in this regard.

> 
> Bug 15024 is somewhat related, if we have a per-cue setting then a DEFAULTS
> block could be used to override it.
Comment 3 Philip Jägenstedt 2015-03-27 06:42:38 UTC
I am fully prepared to accept that there's a problem that needs to be solved here, but it would be very illuminating with one or a few examples of caption files along with the rendering that is desired. I'm guessing it would be something like captions in mainly Arabic, but a single cue somewhere with only Latin text. What I can't guess is a desired rendering that's hard to achieve without a file-wide directionality.

Note that my assumption here is that the text directionality only affects the position of the cue text and not how it is actually rendered. If this is not true, please correct me :)
Comment 4 Glenn Adams 2015-03-27 07:11:01 UTC
(In reply to Philip Jägenstedt from comment #3)
> I am fully prepared to accept that there's a problem that needs to be solved here, but
> it would be very illuminating with one or a few examples of caption files
> along with the rendering that is desired. I'm guessing it would be something
> like captions in mainly Arabic, but a single cue somewhere with only Latin
> text. What I can't guess is a desired rendering that's hard to achieve
> without a file-wide directionality.
> 
> Note that my assumption here is that the text directionality only affects
> the position of the cue text and not how it is actually rendered. If this is
> not true, please correct me :)

As Addison says, the Unicode Bidi Algorithm needs to know the default bidi level to apply to a "paragraph level". This information is needed to resolve the directionality of neutral or weakly directional characters, particularly at the beginning and end of the paragraph. Although there is an algorithm that chooses a default directionality in the absence of an external direction property, that algorithm often requires an explicit directionality to achieve the desired results.

So basically, it does determine how text is rendered, and not just alignment.
Comment 5 Richard Ishida 2015-03-27 08:31:18 UTC
Philip, this may help you: 
http://www.w3.org/International/articles/inline-bidi-markup/uba-basics

It explains in simple terms and with examples how the base direction (ie. the direction of the surrounding context) affects rendering.
Comment 6 Philip Jägenstedt 2015-03-27 10:36:53 UTC
Thanks, Richard. To sum it up, it's the difference in rendering between these two divs:

<div style="direction:rtl">bahrain مصر kuwait</div>
<div style="text-align:right">bahrain مصر kuwait</div>

Kind of obvious in hindsight, really.
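
The difference between the two divs can be sketched at word granularity. This is a heavily simplified model, not the Unicode Bidirectional Algorithm: it only handles whitespace-separated words, treats a word without a strong character as having the base direction, and the function names are made up for the example. It returns the words in the order they would appear from left to right on screen.

```python
import unicodedata


def word_direction(word):
    """Direction of the first strong character in the word, or None."""
    for ch in word:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return None


def visual_word_order(text, base):
    """Return words in left-to-right screen order for the given base direction."""
    runs = []  # consecutive words that share a direction
    for word in text.split():
        direction = word_direction(word) or base
        if runs and runs[-1][0] == direction:
            runs[-1][1].append(word)
        else:
            runs.append((direction, [word]))
    if base == "rtl":
        runs.reverse()  # runs are laid out starting from the right
    visual = []
    for direction, words in runs:
        visual.extend(reversed(words) if direction == "rtl" else words)
    return visual
```

Under a base direction of ltr the example keeps "bahrain" on the left; under rtl the same logical text puts "kuwait" on the left and "bahrain" on the right. Right alignment alone changes neither order.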

I would propose a VTTCue.direction with values "ltr", "rtl" and "auto".

(It looked tempting to merge VTTCue.vertical into VTTCue.direction, but these are separate things in CSS and the above bahrain case can be turned into vertical examples where text directionality matters.)

Then add the DEFAULTS block to have that setting on all cues by default.
Comment 7 Silvia Pfeiffer 2015-03-28 03:51:49 UTC
Philip, does that mean we also need an additional cue setting called "direction"?
Comment 8 Philip Jägenstedt 2015-03-30 05:54:16 UTC
Yes, a setting was my thinking.
Comment 9 Silvia Pfeiffer 2015-03-30 11:16:29 UTC
Makes sense to me.
Comment 10 David Singer 2015-05-19 00:01:41 UTC
do we also need an overall default direction?
Comment 11 Silvia Pfeiffer 2015-06-08 10:19:38 UTC
Thinking about this again:

* for the file-wide settings we are now introducing CSS style settings in a STYLE block, so that could cover this need (see bug 15023)

* for the cue-specific setting, writing &lrm; and &rlm; at the beginning of the cue would be shorter than having to write direction:lrm or direction:rlm as a cue setting

Is that sufficient to resolve this bug?
Comment 12 Philip Jägenstedt 2015-06-08 12:33:25 UTC
That would be nice:

STYLE
direction:rtl;

00:00.000 --> 00:10.000
bahrain مصر kuwait

Would that work, i18n folks?
Comment 13 Richard Ishida 2015-06-25 11:22:30 UTC
> * for the cue-specific setting, writing &lrm; and &rlm; at the beginning of the cue would be shorter than having to write direction:lrm or direction:rlm as a cue setting
>
> Is that sufficient to resolve this bug?

only if first-strong detection is used to establish the direction of the cue text.

Basically, 
(a) setting base direction via metadata (here the STYLE setting or direction:xxx on the cue) or 
(b) guessing base direction based on the first strong character, 
are both mutually exclusive (except for the case where the metadata is set to auto, which explicitly indicates that first strong should be used).

take the example Philip provided:

00:00.000 --> 00:10.000
bahrain مصر kuwait

if first-strong heuristics are being used to determine the direction, then putting &rlm; before bahrain in the source will indeed cause bahrain to appear to the right of the other text, since the first strong character is now RTL.

if, however, the base direction is already set by metadata (eg. STYLE direction:rtl;), the application would then ignore the first strong directional character to determine the base direction, since the base direction is already determined. If STYLE direction is set to ltr, therefore, bahrain will be displayed to the left of the other text, regardless of whether there is an &rlm; or not, because &rlm; doesn't set the base direction for a range of text.
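
The two cases above can be sketched as a precedence rule. The parameter names cue_setting and file_default are hypothetical stand-ins for a per-cue direction setting and a STYLE-level default, neither of which exists in WebVTT as discussed here; isolate handling is again ignored.

```python
import unicodedata


def first_strong_direction(text):
    """Return 'ltr' or 'rtl' from the first strong character, or None."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return None


def base_direction(text, cue_setting=None, file_default=None):
    """Declared metadata wins; first-strong is used only for 'auto' or no metadata."""
    declared = cue_setting or file_default or "auto"
    if declared in ("ltr", "rtl"):
        return declared  # the text, including any leading mark, is not consulted
    return first_strong_direction(text) or "ltr"
```

With file_default="ltr", a leading RLM (U+200F) has no effect on the base direction; without any metadata, the same mark makes the result rtl.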

(there are other, paired, Unicode control characters which would produce the expected effect by setting the base direction for a range of text, but it's not recommended to use them, especially for non-inline signals, for a variety of reasons. I have begun writing up some notes about bidi in plain text at http://r12a.github.io/docs/bidi-plain-text/ that are relevant to this.)

on the other hand, note that if STYLE contains, say, direction:rtl then the only time you'd have to add direction:ltr to the cue is when you want the ordering of directional runs or punctuation inside the cue to be different from what would be already produced with a base direction of rtl. This is not likely to be very frequent in monolingual content (although when needed it is important that it should be available).

(note, btw, that i wrote direction:ltr rather than direction:lrm)

does that help?
Comment 14 Silvia Pfeiffer 2015-06-27 10:02:32 UTC
(In reply to Richard Ishida from comment #13)
> > * for the cue-specific setting, writing &lrm; and &rlm; at the beginning of the cue would be shorter than having to write direction:lrm or direction:rlm as a cue setting
> >
> > Is that sufficient to resolve this bug?
> 
> only if first-strong detection is used to establish the direction of the cue
> text.

Trying to follow the logic. Are you saying that we are ok if we specify that when STYLE has no explicit setting, the first-strong heuristic is used?


> does that help?

Sort-of. I wasn't quite clear whether that was a "yes" or a "no"... :-)
Comment 15 Richard Ishida 2015-08-03 18:35:57 UTC
what i'm saying, or trying to say, badly :-), is that we are discussing two incompatible approaches, and the answer depends on which approach we take.


[1]
if i understand correctly, the current approach establishes the base direction of the lines of cue text by assuming that the text within a cue will behave as if CSS unicode-bidi: plain-text was applied, ie.
for each paragraph (ie. line in WebVTT), find the first strong character and set the base direction per the direction of that character.

in principle, this works for setting direction at the per-paragraph level unless you have
(a) a line that should be rtl, but starts with non-rtl characters (and vice versa),
(b) a line with no strong character (such as a telephone number) or a mixture of strong and non-strong characters (such as a MAC address) but that has to be ordered in a particular way.

authors would have to look for all such cases and add either &rlm; or &lrm; to the start of the line to create the desired display.
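
As a rough sketch of what "looking for all such cases" might involve, the following hypothetical helper flags lines whose first-strong result differs from the direction the author intends, including lines with no strong character at all. It cannot know the author's intent for a given line, so it only narrows down what needs a manual check.

```python
import unicodedata


def first_strong_direction(line):
    """Return 'ltr' or 'rtl' from the first strong character, or None."""
    for ch in line:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return None


def lines_needing_review(cue_text, intended):
    """List (line number, detected direction) for lines that first-strong gets wrong."""
    flagged = []
    for number, line in enumerate(cue_text.split("\n"), start=1):
        detected = first_strong_direction(line)
        if detected != intended:
            flagged.append((number, detected))
    return flagged
```

A detected value of None corresponds to case (b): a line such as a telephone number has nothing for the heuristic to work with.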


[1a]
actually, i'm not sure it's quite as simple as that, since much of the spec text seems to concern itself with the direction of the first line in the cue, with an implication that the direction determined from that will be applied to any remaining paragraphs in that cue. This would mean that if you had a line in English, the direction of that line would be rtl if the preceding line started with, say, Arabic. I'm struggling a little to see the bigger picture due to the complexity and algorithm-heavy nature of the spec, so apologies if i'm missing something.


[2]
if WebVTT instead adds the ability to say

STYLE
direction:rtl;

then the default base direction for the content is established by that statement, and all lines of cue text should get a base direction of rtl, regardless of their first-strong character, unless some lower level directive intervenes. The important thing to bear in mind is that this approach is incompatible with first-strong heuristics, and &rlm; or &lrm; at the start of the para are of no consequence.

When you have paragraphs/lines that should not have a direction of rtl (like those mentioned above) you need a way to change their base direction using some kind of metadata annotation, on a per paragraph basis.

one could probably easily enough allow for some metadata declaration at the cue level to change the direction of content, however it is actually necessary to be able to change the direction of content for any paragraph/line level, eg. it may be the second line in the cue that has to be set to ltr. Since lines in WebVTT cues are not bounded by markup, i'm not sure how one would do this using metadata/markup.

so what i'm saying is that, if we have the file-level declaration for direction, it has to come with some other mechanism for indicating the desired base direction for individual paragraphs.

any clearer that time?


btw, none of the above addresses the need to often define an inline range of text with a base direction different to that of the paragraph. That's another story.
Comment 16 Silvia Pfeiffer 2015-08-09 09:49:10 UTC
(In reply to Richard Ishida from comment #15)
> what i'm saying, or trying to say, badly :-), is that we are discussing two
> incompatible approaches, and the answer depends on which approach we take.
> 
> 
> [1]
> if i understand correctly, the current approach establishes the base
> direction of the lines of cue text by assuming that the text within a cue
> will behave as if CSS unicode-bidi: plain-text was applied, ie.
> for each paragraph (ie. line in WebVTT)

NOTE: it's not for a line in WebVTT, but for all lines in a cue (i.e. a paragraph)


>, find the first strong character and
> set the base direction per the direction of that character.
> 
> in principle, this works for setting direction at the per-paragraph level
> unless you have
> (a) a line that should be rtl, but starts with non-rtl characters (and vice
> versa),
> (b) a line with no strong character (such as a telephone number) or a
> mixture of strong and non-strong characters (such as a MAC address) but that
> has to be ordered in a particular way.
> 
> authors would have to look for all such cases and add either &rlm; or &lrm;
> to the start of the line to create the desired display.

(replace "line" with "paragraph" everywhere)
Yes, that's the idea.


> [1a]
> actually, i'm not sure it's quite as simple as that, since much of the spec
> text seems to concern itself with the direction of the first line in the
> cue, with an implication that the direction determined from that will be
> applied to any remaining paragraphs in that cue. This would mean that if you
> had a line in English, the direction of that line would be rtl if the
> preceding line started with, say, Arabic. I'm struggling a little to see the
> bigger picture due to the complexity and algorithm-heavy nature of the spec,
> so apologies if i'm missing something.

Just think of all the lines in a cue as a "paragraph" and apply directionality that way.


> [2]
> if WebVTT instead adds the ability to say
> 
> STYLE
> direction:rtl;
> 
> then the default base direction for the content is established by that
> statement, and all lines of cue text should get a base direction of rtl,
> regardless of their first-strong character, unless some lower level
> directive intervenes. The important thing to bear in mind is that this
> approach is incompatible with first-strong heuristics, and &rlm; or &lrm; at
> the start of the para are of no consequence.

Seeing as the first-strong heuristics apply to the whole cue (all of the lines), does that change your opinion?

> When you have paragraphs/lines that should not have a direction of rtl (like
> those mentioned above) you need a way to change their base direction using
> some kind of metadata annotation, on a per paragraph basis.

&lrm; and &rlm; can do that within a cue.

> one could probably easily enough allow for some metadata declaration at the
> cue level to change the direction of content, however it is actually
> necessary to be able to change the direction of content for any
> paragraph/line level, eg. it may be the second line in the cue that has to
> be set to ltr. Since lines in WebVTT cues are not bounded by markup, i'm not
> sure how one would do this using metadata/markup.

Lines in WebVTT cues are considered as a block, so they are bound. Also, you can use markup with a class span.

> so what i'm saying is that, if we have the file-level declaration for
> direction, it has to come with some other mechanism for indicating the
> desired base direction for individual paragraphs.
> 
> any clearer that time?

Yes, I understand now. But I think there's a misunderstanding about how it currently works.


> btw, none of the above addresses the need to often define an inline range of
> text with a base direction different to that of the paragraph. That's
> another story.

Class spans can work for that, too.
Comment 17 Silvia Pfeiffer 2015-09-30 22:59:28 UTC
After discussion with several internationalisation users at FOMS, we came to the conclusion that there is no immediate need for change. Explanations follow:

(In reply to Silvia Pfeiffer from comment #0)
> Feedback by Addison Phillips from W3C I18N group:
> http://lists.w3.org/Archives/Public/public-tt/2015Mar/0065.html
> 
> I18N comment: https://www.w3.org/International/track/issues/432
> 
> 6.2.1 Processing model
> http://www.w3.org/TR/2014/WD-webvtt1-20141111/#h4_processing-model
> 
<..>
> It would be helpful to the author to be able to set the default base
> direction for a whole WebVTT file to rtl.


There doesn't seem to be a need for this, since the text of the cue itself through UTF-8 characters already determines the directionality.


> It would also be helpful if the author could set the base direction for each
> cue explicitly, since the Unicode paragraph detection algorithm can be
> fooled by a paragraph that starts with a strong LTR character, but is
> actually a RTL paragraph (or vice versa), eg. "نشاط التدويل is how you say
> 'i18n Activity' in Arabic."

This problem has been acknowledged. However, there is already a means to address this by using the UTF-8 RLO, LRO, RLE and LRE characters. These do explicit directionality overrides in contrast to &lrm; and &rlm; which provide only hints to the algorithm.

WebVTT generally prefers the use of a single means of specifying directionality and prefers the use of UTF-8 characters to specify this over explicit markup. Therefore, we regard this issue as being addressed.
Comment 18 Silvia Pfeiffer 2015-09-30 23:04:49 UTC
*** Bug 24129 has been marked as a duplicate of this bug. ***
Comment 19 Glenn Adams 2015-10-01 02:16:47 UTC
(In reply to Silvia Pfeiffer from comment #17)
> After discussion with several internationalisation users at FOMS, we came to the
> conclusion that there is no immediate need for change. Explanations follow:
> 
> (In reply to Silvia Pfeiffer from comment #0)
> > Feedback by Addison Phillips from W3C I18N group:
> > http://lists.w3.org/Archives/Public/public-tt/2015Mar/0065.html
> > 
> > I18N comment: https://www.w3.org/International/track/issues/432
> > 
> > 6.2.1 Processing model
> > http://www.w3.org/TR/2014/WD-webvtt1-20141111/#h4_processing-model
> > 
> <..>
> > It would be helpful to the author to be able to set the default base
> > direction for a whole WebVTT file to rtl.
> 
> 
> There doesn't seem to be a need for this, since the text of the cue itself
> through UTF-8 characters already determines the directionality.

This is insufficient in general (but see more below). A cue may be a mixture of LTR and RTL strong directionality characters, in which case there is no generally acceptable way to determine the cue's default directionality, in which case an author specified property is required.

> 
> 
> > It would also be helpful if the author could set the base direction for each
> > cue explicitly, since the Unicode paragraph detection algorithm can be
> > fooled by a paragraph that starts with a strong LTR character, but is
> > actually a RTL paragraph (or vice versa), eg. "نشاط التدويل is how you say
> > 'i18n Activity' in Arabic."
> 
> This problem has been acknowledged. However, there is already a means to
> address this by using the UTF-8 RLO, LRO, RLE and LRE characters. These do
> explicit directionality overrides in contrast to &lrm; and &rlm; which
> provide only hints to the algorithm.

While it is true that each cue's text could be wrapped in a RLE/PDF or LRE/PDF pair, this does not actually affect the cue's default directionality, but merely the embedding level of the text so wrapped.

If there are style properties that apply to the cue as a whole, and those properties require the use of a default directionality to resolve their computed value, e.g., the computed value resolved from 'start', 'end', etc., then use of Bidi controls in the text will not suffice.
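
A small sketch may help show why wrapping does not change the detected paragraph direction. Python's unicodedata module reports the bidi class of each control: the marks LRM and RLM are strong characters (L and R), while the embedding and override controls have their own classes and are therefore not counted by a first-strong scan. The paragraph_level function is a simplified first-strong scan written for this example and ignores isolates.

```python
import unicodedata

CONTROLS = {
    "LRM": "\u200E", "RLM": "\u200F",
    "LRE": "\u202A", "RLE": "\u202B", "PDF": "\u202C",
    "LRO": "\u202D", "RLO": "\u202E",
}


def paragraph_level(text):
    """First-strong detection: 1 if the first strong character is R or AL, else 0."""
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return 0
        if bidi_class in ("R", "AL"):
            return 1
    return 0


# The same logical text, once wrapped in RLE ... PDF and once prefixed with RLM.
wrapped = CONTROLS["RLE"] + "hello \u0645\u0635\u0631" + CONTROLS["PDF"]
marked = CONTROLS["RLM"] + "hello \u0645\u0635\u0631"
```

Here paragraph_level(wrapped) is still decided by the "h" inside the embedding, whereas paragraph_level(marked) is decided by the RLM itself. Anything resolved from the paragraph direction, such as 'start' and 'end', would follow the same result.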

> 
> WebVTT generally prefers the use of a single means of specifying
> directionality and prefers the use of UTF-8 characters to specify this over
> explicit markup. Therefore, we regard this issue as being addressed.
Comment 20 Silvia Pfeiffer 2015-10-01 18:52:37 UTC
(In reply to Glenn Adams from comment #19)
>
> > > It would also be helpful if the author could set the base direction for each
> > > cue explicitly, since the Unicode paragraph detection algorithm can be
> > > fooled by a paragraph that starts with a strong LTR character, but is
> > > actually a RTL paragraph (or vice versa), eg. "نشاط التدويل is how you say
> > > 'i18n Activity' in Arabic."
> > 
> > This problem has been acknowledged. However, there is already a means to
> > address this by using the UTF-8 RLO, LRO, RLE and LRE characters. These do
> > explicit directionality overrides in contrast to &lrm; and &rlm; which
> > provide only hints to the algorithm.
> 
> While it is true that each cue's text could be wrapped in a RLE/PDF or
> LRE/PDF pair, this does not actually affect the cue's default
> directionality, but merely the embedding level of the text so wrapped.

If the resulting rendering is correct, what's the difference in it being called an overall "default directionality" and "the default direction of text on that embedding level"?


> If there are style properties that apply to the cue as a whole, and those
> properties require the use of a default directionality to resolve their
> computed value, e.g., the computed value resolved from 'start', 'end', etc.,
> then use of Bidi controls in the text will not suffice.

The writing direction in WebVTT is not determined from the cue text, but through cue settings. The resolution of 'start' and 'end' alignments and positioning is dependent on the paragraph directionality as determined by BIDI, which in fact is the direction of the paragraph embedding level, so we're good on that, too.
Comment 21 Glenn Adams 2015-10-01 19:04:22 UTC
(In reply to Silvia Pfeiffer from comment #20)
> (In reply to Glenn Adams from comment #19)
> >
> > > > It would also be helpful if the author could set the base direction for each
> > > > cue explicitly, since the Unicode paragraph detection algorithm can be
> > > > fooled by a paragraph that starts with a strong LTR character, but is
> > > > actually a RTL paragraph (or vice versa), eg. "نشاط التدويل is how you say
> > > > 'i18n Activity' in Arabic."
> > > 
> > > This problem has been acknowledged. However, there is already a means to
> > > address this by using the UTF-8 RLO, LRO, RLE and LRE characters. These do
> > > explicit directionality overrides in contrast to &lrm; and &rlm; which
> > > provide only hints to the algorithm.
> > 
> > While it is true that each cue's text could be wrapped in a RLE/PDF or
> > LRE/PDF pair, this does not actually affect the cue's default
> > directionality, but merely the embedding level of the text so wrapped.
> 
> If the resulting rendering is correct, what's the difference in it being
> called an overall "default directionality" and "the default direction of
> text on that embedding level"?
> 
> 
> > If there are style properties that apply to the cue as a whole, and those
> > properties require the use of a default directionality to resolve their
> > computed value, e.g., the computed value resolved from 'start', 'end', etc.,
> > then use of Bidi controls in the text will not suffice.
> 
> The writing direction in WebVTT is not determined from the cue text, but
> through cue settings. The resolution of 'start' and 'end' alignments and
> positioning is dependent on the paragraph directionality as determined by
> BIDI, which in fact is the direction of the paragraph embedding level, so
> we're good on that, too.

I'm afraid I don't agree. You are suggesting that the default bidi level for the paragraph is determined by the embedding level of the text content of the paragraph. However, that is not how the Unicode Bidi Algorithm works: the default bidi level of the paragraph is an input parameter to be used to determine the resolved levels of the paragraph's content, and not the other way around.
Comment 22 Silvia Pfeiffer 2015-10-01 22:34:10 UTC
(In reply to Glenn Adams from comment #21)
>
> > > If there are style properties that apply to the cue as a whole, and those
> > > properties require the use of a default directionality to resolve their
> > > computed value, e.g., the computed value resolved from 'start', 'end', etc.,
> > > then use of Bidi controls in the text will not suffice.
> > 
> > The writing direction in WebVTT is not determined from the cue text, but
> > through cue settings. The resolution of 'start' and 'end' alignments and
> > positioning is dependent on the paragraph directionality as determined by
> > BIDI, which in fact is the direction of the paragraph embedding level, so
> > we're good on that, too.
> 
> I'm afraid I don't agree. You are suggesting that the default bidi level for
> the paragraph is determined by the embedding level of the text content of
> the paragraph. However, that is not how the Unicode Bidi Algorithm works:
> the default bidi level of the paragraph is an input parameter to be used to
> determine the resolved levels of the paragraph's content, and not the other
> way around.

No, I'm not suggesting any such thing.

I'm saying that the default bidi level for the paragraph is irrelevant for WebVTT cues, because we have the writing direction.

If you read how the cue text alignment works:
http://dev.w3.org/html5/webvtt/#dfn-webvtt-cue-text-alignment
you find that it depends on the writing direction (determined by a cue setting) and on the BIDI paragraph direction (which according to http://www.unicode.org/reports/tr9/ is specified as follows: "The direction of the paragraph embedding level is called the paragraph direction.").
Comment 23 Glenn Adams 2015-10-02 00:52:06 UTC
(In reply to Silvia Pfeiffer from comment #22)
> (In reply to Glenn Adams from comment #21)
> >
> > > > If there are style properties that apply to the cue as a whole, and those
> > > > properties require the use of a default directionality to resolve their
> > > > computed value, e.g., the computed value resolved from 'start', 'end', etc.,
> > > > then use of Bidi controls in the text will not suffice.
> > > 
> > > The writing direction in WebVTT is not determined from the cue text, but
> > > through cue settings. The resolution of 'start' and 'end' alignments and
> > > positioning is dependent on the paragraph directionality as determined by
> > > BIDI, which in fact is the direction of the paragraph embedding level, so
> > > we're good on that, too.
> > 
> > I'm afraid I don't agree. You are suggesting that the default bidi level for
> > the paragraph is determined by the embedding level of the text content of
> > the paragraph. However, that is not how the Unicode Bidi Algorithm works:
> > the default bidi level of the paragraph is an input parameter to be used to
> > determine the resolved levels of the paragraph's content, and not the other
> > way around.
> 
> No, I'm not suggesting any such thing.
> 
> I'm saying that the default bidi level for the paragraph is irrelevant for
> WebVTT cues, because we have the writing direction.
> 
> If you read how the cue text alignment works:
> http://dev.w3.org/html5/webvtt/#dfn-webvtt-cue-text-alignment
> you find that it depends on the writing direction (determined by a cue
> setting) and on the BIDI paragraph direction (which according to
> http://www.unicode.org/reports/tr9/ is specified as follows: "The direction
> of the paragraph embedding level is called the paragraph direction.").

ok, if writing mode is specified by and applied to individual cues, as opposed to, say, specified by a region regardless of the cues associated with that region, then i would agree you can infer the default paragraph direction from the cue-specific writing mode;

however, if VTT allows specifying a writing mode on a region, and the cues obtain their writing mode from the region's writing mode, then there should be a way to override the region's writing mode on a cue by cue basis, at least with respect to overriding the inline directionality
Comment 24 Richard Ishida 2015-10-02 13:30:53 UTC
this post is going to get a little messy, but i suspect it is worth responding to your comments.  I've been making notes and examples to resummarise my understanding and clarify some remaining issues related to this heuristic approach, but I'd suggest that we plan a teleconference call to go over this.  It's likely that the ability to ask questions and adjust on the fly will significantly shorten the process.

(In reply to Silvia Pfeiffer from comment #16)
> (In reply to Richard Ishida from comment #15)

> > [1]
> > if i understand correctly, the current approach establishes the base
> > direction of the lines of cue text by assuming that the text within a cue
> > will behave as if CSS unicode-bidi: plain-text was applied, ie.
> > for each paragraph (ie. line in WebVTT)
> 
> NOTE: it's not for a line in WebVTT, but for all lines in a cue (i.e. a
> paragraph)
> 
> 
> >, find the first strong character and
> > set the base direction per the direction of that character.
> > 
> > in principle, this works for setting direction at the per-paragraph level
> > unless you have
> > (a) a line that should be rtl, but starts with non-rtl characters (and vice
> > versa),
> > (b) a line with no strong character (such as a telephone number) or a
> > mixture of strong and non-strong characters (such as a MAC address) but that
> > has to be ordered in a particular way.
> > 
> > authors would have to look for all such cases and add either &rlm; or &lrm;
> > to the start of the line to create the desired display.
> 
> (replace "line" with "paragraph" everywhere)
> Yes, that's the idea.
> 
> 
> > [1a]
> > actually, i'm not sure it's quite as simple as that, since much of the spec
> > text seems to concern itself with the direction of the first line in the
> > cue, with an implication that the direction determined from that will be
> > applied to any remaining paragraphs in that cue. This would mean that if you
> > had a line in English, the direction of that line would be rtl if the
> > preceding line started with, say, Arabic. I'm struggling a little to see the
> > bigger picture due to the complexity and algorithm-heavy nature of the spec,
> > so apologies if i'm missing something.
> 
> Just think of all the lines in a cue as a "paragraph" and apply
> directionality that way.

This is very different to the way the bidi algorithm works in Unicode and in CSS. When autodetecting the direction of some text, that direction is applied to the Unicode definition of a paragraph. Such a paragraph terminates with a line break.

This approach will also cause difficulties for multilingual cues, such as

00:18.000 --> 00:20.000
שלום עליכם!
Hello!

where the exclamation mark must appear to the left of the first line and to the right of the second.

The implementation will need to go out of its way to override this normal approach in order to restrict directional information to that of the first line only, and in doing so it will mean that authors will need to do something rather unusual to undo what the implementation did in cases such as the one above. I'm not sure why the spec makes the implementation do extra work to set the direction according to the first strong character in the cue rather than the first strong character for each line, which is what the UBA would normally do.
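
To make the difference concrete, here is a minimal Python sketch of first-strong detection applied once per line versus once for the whole cue. The helper name `first_strong_direction` is invented for illustration, and the scan is a simplification of UBA rules P2 and P3: it ignores directional isolates and only looks at the bidi class of each character.

```python
import unicodedata

def first_strong_direction(text, default="ltr"):
    """Return 'ltr' or 'rtl' based on the first strong character in text.

    Simplified: characters inside directional isolates are not skipped.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return default  # no strong character found

cue_text = "שלום עליכם!\nHello!"

# Per-line detection: each line is treated as its own paragraph.
per_line = [first_strong_direction(line) for line in cue_text.split("\n")]

# Per-cue detection: the first strong character of the cue decides for every line.
per_cue = first_strong_direction(cue_text)

print(per_line)  # ['rtl', 'ltr']
print(per_cue)   # 'rtl'
```

With per-line detection the English line gets an ltr base direction and its exclamation mark stays on the right; with per-cue detection both lines inherit rtl from the Hebrew line.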



> > [2]
> > if WebVTT instead adds the ability to say
> > 
> > STYLE
> > direction:rtl;
> > 
> > then the default base direction for the content is established by that
> > statement, and all lines of cue text should get a base direction of rtl,
> > regardless of their first-strong character, unless some lower level
> > directive intervenes. The important thing to bear in mind is that this
> > approach is incompatible with first-strong heuristics, and &rlm; or &lrm; at
> > the start of the para are of no consequence.
> 
> Seeing as the first-strong heuristics apply to the whole cue (all of the
> lines), does that change your opinion?

Not at all. Compare what happens in HTML, if that helps.  If you set dir=rtl on a div containing two p elements, the base direction for each of those p elements is set to rtl.  I can only assume that if the first-strong heuristics apply 'to the whole cue (all of the lines)', then the same base direction is propagated to all those lines, and any &rlm; or &lrm; has no effect.

> 
> > When you have paragraphs/lines that should not have a direction of rtl (like
> > those mentioned above) you need a way to change their base direction using
> > some kind of metadata annotation, on a per paragraph basis.
> 
> &lrm; and &rlm; can do that within a cue.

&lrm; and &rlm; can't set the base direction when it is set declaratively, since the initial strong character is not examined.  Try it in HTML. Make a p element, add some directionally-sensitive text and add/remove &rlm; or &lrm; to the start. There'll be no difference.  (Don't confuse RLM/LRM with RLI, RLE, LRI, etc.)
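
A small Python sketch of the same point, under the assumption that a declared direction simply takes precedence over first-strong detection (the function `base_direction` and its `declared` parameter are hypothetical, and isolates are again ignored):

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK, bidi class L
RLM = "\u200f"  # RIGHT-TO-LEFT MARK, bidi class R

def base_direction(text, declared=None):
    """Resolve the base direction of a paragraph.

    If a direction is declared (as with dir=rtl in HTML), the text is not
    examined at all; otherwise fall back to the first strong character.
    """
    if declared in ("ltr", "rtl"):
        return declared
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    return "ltr"

# With first-strong detection, a leading RLM flips the base direction.
print(base_direction(RLM + "Hello!"))                  # rtl
# With a declared direction, the same mark has no effect on it.
print(base_direction(RLM + "Hello!", declared="ltr"))  # ltr
```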

> 
> > one could probably easily enough allow for some metadata declaration at the
> > cue level to change the direction of content, however it is actually
> > necessary to be able to change the direction of content for any
> > paragraph/line level, eg. it may be the second line in the cue that has to
> > be set to ltr. Since lines in WebVTT cues are not bounded by markup, i'm not
> > sure how one would do this using metadata/markup.
> 
> Lines in WebVTT cues are considered as a block, so they are bound. Also, you
> can use markup with a class span.

see above
Comment 25 Simon Pieters 2015-10-30 04:09:32 UTC
https://github.com/w3c/webvtt/issues/227
Comment 26 Silvia Pfeiffer 2016-10-11 18:22:00 UTC
https://github.com/w3c/webvtt/pull/248