<?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<!DOCTYPE bugzilla SYSTEM "https://www.w3.org/Bugs/Public/page.cgi?id=bugzilla.dtd">

<bugzilla version="5.0.4"
          urlbase="https://www.w3.org/Bugs/Public/"
          
          maintainer="sysbot+bugzilla@w3.org"
>

    <bug>
          <bug_id>24100</bug_id>
          
          <creation_ts>2013-12-15 07:31:04 +0000</creation_ts>
          <short_desc>Bug in the HTML outline algorithm</short_desc>
          <delta_ts>2014-02-21 21:09:45 +0000</delta_ts>
          <reporter_accessible>1</reporter_accessible>
          <cclist_accessible>1</cclist_accessible>
          <classification_id>1</classification_id>
          <classification>Unclassified</classification>
          <product>WHATWG</product>
          <component>HTML</component>
          <version>unspecified</version>
          <rep_platform>PC</rep_platform>
          <op_sys>All</op_sys>
          <bug_status>RESOLVED</bug_status>
          <resolution>FIXED</resolution>
          
          
          <bug_file_loc></bug_file_loc>
          <status_whiteboard></status_whiteboard>
          <keywords></keywords>
          <priority>P2</priority>
          <bug_severity>normal</bug_severity>
          <target_milestone>Unsorted</target_milestone>
          
          
          <everconfirmed>1</everconfirmed>
          <reporter name="Michael[tm] Smith">mike</reporter>
          <assigned_to name="Ian &apos;Hixie&apos; Hickson">ian</assigned_to>
          <cc>faulkner.steve</cc>
    
    <cc>ian</cc>
    
    <cc>marc.hoyois</cc>
    
    <cc>mike</cc>
          
          <qa_contact>contributor</qa_contact>

      

      

      

          <comment_sort_order>oldest_to_newest</comment_sort_order>  
          <long_desc isprivate="0" >
    <commentid>97628</commentid>
    <comment_count>0</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2013-12-15 07:31:04 +0000</bug_when>
    <thetext>+++ This bug was initially created as a clone of Bug #24097 +++

Quoting Marc Hoyois&apos;s description from bug 24097:

[[
The determination of the *current section* when exiting a sectioning root is wrong and can lead to several weird behaviors, including an actual error. Here are a couple of examples.

## Example 1 (error)

&lt;body&gt;
&lt;h1&gt;A&lt;/h1&gt;
&lt;section&gt;&lt;/section&gt;
&lt;figure&gt;&lt;/figure&gt;
&lt;h2&gt;B&lt;/h2&gt;
&lt;/body&gt;

After exiting the sectioning root &lt;figure&gt;, the algorithm sets the current section to be the *deepest section* in the current outline, which is the section corresponding to the &lt;section&gt; element. Then, when entering &lt;h2&gt;, it will compare the rank of &lt;h2&gt; with the rank of the implied heading of that section, which is undefined.

## Example 2 (no error but nonsensical outline)

&lt;body&gt;
&lt;h1&gt;A&lt;/h1&gt;
&lt;section&gt;&lt;h1&gt;B&lt;/h1&gt;&lt;/section&gt;
&lt;figure&gt;&lt;/figure&gt;
&lt;h2&gt;C&lt;/h2&gt;
&lt;/body&gt;

In this case the algorithm produces the outline

1. A
   1.1. B
      1.1.1. C

If we remove the &lt;figure&gt; element, we get the correct outline:

1. A
   1.1. B
   1.2. C

## Solution

The problem is this: when exiting a sectioning root, the current section should be set to whichever section was current upon entering the root, but this is not always the deepest section. The algorithm could ask that the correct section be remembered, or else that section can be determined as follows (when exiting the sectioning root):

- let *current section* be the last section of the current outline
- if the last child section of *current section* exists and is an *implicit* section, then go to the step *finding the deepest child*, otherwise do nothing
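
A minimal sketch of the two steps above, in Python. The Section class, its implicit flag, and the function name are assumptions made for illustration; the spec defines none of them. The second step is read here as descending through last children for as long as they are implicit sections, which is only one interpretation of the wording.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List


@dataclass
class Section:
    heading: str
    # True when the section was implied by a heading element,
    # False when a sectioning content element created it.
    implicit: bool = False
    children: List[Section] = field(default_factory=list)


def section_after_root(outline):
    # Step 1: start from the last top-level section of the current outline.
    current = outline[-1]
    # Step 2: descend only while the last child section is implicit.
    while current.children and current.children[-1].implicit:
        current = current.children[-1]
    return current
```

In Examples 1 and 2 the last child of section A is the explicit section created by the section element, so under this reading A stays current after the figure is exited and the h2 is compared against the heading of A.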

## Another bug?

There is a related point which I&apos;m not sure is intended. Consider the document:

&lt;body&gt;
&lt;figure&gt;&lt;/figure&gt;
&lt;h1&gt;Title&lt;/h1&gt;
&lt;/body&gt;

The algorithm computes the outline:

1. Untitled document
2. Title

If sectioning roots are supposed to be &quot;invisible&quot; in the outline, then the outline should simply be

1. Title

If the latter is indeed the intended behavior, then the algorithm should not create an implied heading when entering a sectioning root.
]]</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>97642</commentid>
    <comment_count>1</comment_count>
    <who name="Michael[tm] Smith">mike</who>
    <bug_when>2013-12-16 07:57:51 +0000</bug_when>
    <thetext>*** Bug 24107 has been marked as a duplicate of this bug. ***</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>97697</commentid>
    <comment_count>2</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2013-12-16 22:11:06 +0000</bug_when>
    <thetext>For the record, the reason this logic exists at all (distinguishing sectioning roots from sectioning content) is so that this:

   &lt;h1&gt;...&lt;/h1&gt;
   &lt;h2&gt;...&lt;/h2&gt;
   &lt;figure&gt;&lt;/figure&gt;
   &lt;h3&gt;...&lt;/h3&gt;

...results in:

    h1 section
      h2 section
        (figure)
        h3 section

...while this:

   &lt;h1&gt;...&lt;/h1&gt;
   &lt;h2&gt;...&lt;/h2&gt;
   &lt;section&gt;&lt;/section&gt;
   &lt;h3&gt;...&lt;/h3&gt;

...results in:

    h1 section
      h2 section
      anon section
      h3 section

...in the outline.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>97699</commentid>
    <comment_count>3</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2013-12-16 22:17:31 +0000</bug_when>
    <thetext>Split off the second part to bug 24118.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>97700</commentid>
    <comment_count>4</comment_count>
    <who name="">contributor</who>
    <bug_when>2013-12-16 22:30:41 +0000</bug_when>
    <thetext>Checked in as WHATWG revision r8357.
Check-in comment: Make the outline algorithm easier to edit by making it all explicit steps and breaking out the (currently still identical) steps for entering sectioning content vs sectioning roots.
http://html5.org/tools/web-apps-tracker?from=8356&amp;to=8357</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>97703</commentid>
    <comment_count>5</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2013-12-16 22:44:26 +0000</bug_when>
    <thetext>Actually, never mind about splitting that off; I fixed bug 24118 at the same time as this one anyway. Heh. Thanks, Marc! Please do reopen this bug if it&apos;s not properly fixed (or bug 24118 if that part of it isn&apos;t fixed). Thanks!</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>97704</commentid>
    <comment_count>6</comment_count>
    <who name="">contributor</who>
    <bug_when>2013-12-16 22:44:28 +0000</bug_when>
    <thetext>Checked in as WHATWG revision r8358.
Check-in comment: Make the outline algorithm handle sectioning roots more sensibly
http://html5.org/tools/web-apps-tracker?from=8357&amp;to=8358</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>97731</commentid>
    <comment_count>7</comment_count>
    <who name="Marc Hoyois">marc.hoyois</who>
    <bug_when>2013-12-17 08:33:42 +0000</bug_when>
    <thetext>Everything looks good, except that you removed the penultimate step &quot;Associate current outline target with current section&quot; when entering a sectioning root. Without it a sectioning root ends up being associated with its *parent section* (which is null for &lt;body&gt;).</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>97983</commentid>
    <comment_count>8</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2014-01-03 22:34:14 +0000</bug_when>
    <thetext>You want it to be associated with its parent section, otherwise it disappears from the section it was a part of, which makes no sense (consider a &lt;blockquote&gt;; it&apos;s not a subsection, it&apos;s just a part of the section that happens to have its own outline). In the case of a root &lt;body&gt;, it gets associated with its own section because &quot;current section&quot; is set when you enter the &lt;body&gt; and is never unset.

No? Maybe I&apos;m missing something.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>97987</commentid>
    <comment_count>9</comment_count>
    <who name="Marc Hoyois">marc.hoyois</who>
    <bug_when>2014-01-04 00:32:52 +0000</bug_when>
    <thetext>You&apos;re right about &lt;body&gt;.

I see your point, but you could also argue the other way. If &lt;body&gt; is to be associated with the top section in the outline it creates, then you might expect the same for other sectioning roots. I guess it depends on what exactly is the purpose of these associations; the spec doesn&apos;t say.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>98001</commentid>
    <comment_count>10</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2014-01-04 22:01:50 +0000</bug_when>
    <thetext>The purpose is up to the implementation, but for example: if you had the element, which entry in the table of contents should you highlight? If you associate the &lt;blockquote&gt; with the sections of its internal outline only, there&apos;s no link from that outline to the parent outline. It&apos;s like you&apos;ve orphaned the element entirely. There&apos;d be no way to know what section the element was in:

   &lt;h1&gt;Aaa&lt;/h1&gt;
   &lt;h2&gt;Bbb&lt;/h2&gt;
   &lt;blockquote&gt;...&lt;/blockquote&gt;
   &lt;h2&gt;Ccc&lt;/h2&gt;

What section is the blockquote in?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>98006</commentid>
    <comment_count>11</comment_count>
    <who name="Marc Hoyois">marc.hoyois</who>
    <bug_when>2014-01-04 22:35:01 +0000</bug_when>
    <thetext>&gt; What section is the blockquote in?

I&apos;d say the answer depends on which outline you&apos;re looking at: it&apos;s in section Bbb of the main outline and it&apos;s also in the first section of another outline. But if each node must be associated with a single section of a single outline, that section should not depend on which root you run the algorithm from, as it currently does. The outline of the &lt;blockquote&gt; element will not be linked to the main outline either way, and the problem of figuring out which section to highlight also applies to any child of &lt;blockquote&gt;, so changing the section associated with &lt;blockquote&gt; does not solve it.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>98217</commentid>
    <comment_count>12</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2014-01-09 19:04:41 +0000</bug_when>
    <thetext>The children of the blockquote belong to the outline of the blockquote, and the blockquote itself belongs to the outline of the document. That way you can walk your way up the chain. If we associate the blockquote with the inner outline&apos;s section, then the chain is broken.
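
A rough sketch of that walk in Python, assuming each outline keeps a reference to the element it was created for. That back-reference, like every class and field name below, is hypothetical; the algorithm itself only produces sections and the element-to-section association.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Element:
    name: str
    section: Optional[Section] = None  # section this element is associated with


@dataclass(eq=False)
class Outline:
    owner: Element  # assumed back-reference to the element the outline is for
    sections: List[Section] = field(default_factory=list)


@dataclass(eq=False)
class Section:
    heading: str
    outline: Outline


def sections_up_the_chain(element):
    # Collect the sections passed through while walking
    # element to section, section to outline, outline to owner element.
    path = []
    while element.section is not None:
        section = element.section
        owner = section.outline.owner
        if owner is element:
            # The top root is associated with a section of its own outline.
            break
        path.append(section)
        element = owner
    return path
```

For a node inside a blockquote, this yields the section of the blockquote outline first and then the section of the document outline that contains the blockquote.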

I suppose we could have a 1:many association model, but I&apos;m not really sure what that would mean, exactly.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>98226</commentid>
    <comment_count>13</comment_count>
    <who name="Marc Hoyois">marc.hoyois</who>
    <bug_when>2014-01-09 20:46:04 +0000</bug_when>
    <thetext>&gt; That way you can walk your way up the chain.

I don&apos;t see how. You can&apos;t possibly link the outlines with a one-to-one association of nodes with sections.

With either model, a user agent that wants to figure out which section of the main outline a given node is in cannot do so using only the trees of sections and the node→section mapping. Given this, it seems more consistent to associate sectioning roots with a section in their own outline, since that&apos;s how it&apos;s done for the top root and for sectioning content elements.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>98386</commentid>
    <comment_count>14</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2014-01-14 00:13:04 +0000</bug_when>
    <thetext>You walk the outline chain by going element -&gt; section, section -&gt; outline, outline -&gt; element, loop.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>98390</commentid>
    <comment_count>15</comment_count>
    <who name="Marc Hoyois">marc.hoyois</who>
    <bug_when>2014-01-14 01:42:07 +0000</bug_when>
    <thetext>The new algorithm provides no way of getting the element from the outline.

Anyway, in the hope of moving things forward, let me suggest a couple of solutions. I&apos;m assuming the goal is to determine which section to highlight in a table of contents given an element.

Solution 1: the algorithm simply treats all sectioning roots (except the top one) and their descendants as generic nodes. That way the algorithm produces only one outline, that of the root given as input, and all nodes are associated with a section in that outline. This seems like a clean and practical solution.

Solution 2: leave the algorithm as is, but restore the &quot;associate node with section&quot; step as I proposed in comment #7. Read literally, each sectioning root now has an &quot;associated section&quot; (same as in the pre-December algorithm) as well as a &quot;parent section&quot;. Using this data you can walk up the chain.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>98449</commentid>
    <comment_count>16</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2014-01-14 19:16:37 +0000</bug_when>
    <thetext>The outline is the outline of the element. I don&apos;t understand what you mean.

I don&apos;t understand the problem that the solutions are attempting to solve. As far as I can tell, the issue in comment 7 isn&apos;t an issue. I thought what we were discussing is why it _is_ an issue. :-)</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>98752</commentid>
    <comment_count>17</comment_count>
    <who name="Marc Hoyois">marc.hoyois</who>
    <bug_when>2014-01-19 17:47:00 +0000</bug_when>
    <thetext>Say you&apos;re given some descendant of that &lt;blockquote&gt; element, and you want to highlight the section in the outline of &lt;body&gt; where the node belongs. The problem is that the output of the algorithm does not contain the necessary information to figure out which section that is. What you proposed in comment #14 assumes that outlines are some sort of object having the element as a property, but the algorithm does not define outlines as such: they&apos;re just trees of sections.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>98845</commentid>
    <comment_count>18</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2014-01-21 21:39:11 +0000</bug_when>
    <thetext>An outline is &quot;for a sectioning content element or a sectioning root element&quot; (quoting from the definition of &quot;outline&quot; at http://whatwg.org/html#outline ).

So if you get to an outline, you can go to the element for which it was created.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>99600</commentid>
    <comment_count>19</comment_count>
    <who name="Marc Hoyois">marc.hoyois</who>
    <bug_when>2014-02-03 06:21:57 +0000</bug_when>
    <thetext>I don&apos;t know what to say except to repeat the last sentence of comment #17. I doubt that any implementor would understand the sentence &quot;an outline is for a sectioning element&quot; as &quot;an outline must point to a sectioning element&quot;. (I&apos;m terribly busy at the moment, so I may not be very responsive.)</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>100041</commentid>
    <comment_count>20</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2014-02-07 18:17:05 +0000</bug_when>
    <thetext>I don&apos;t know what &quot;must point&quot; would mean. I mean, we can&apos;t very well give conformance criteria for the shapes of internal data structures.

But I think it&apos;s eminently reasonable to assume that if X is for Y, a property of X is that it is for Y, and a property of Y is that X is for it. The outline doesn&apos;t exist in isolation, it exists only in the context of the element for which it was created. Would the spec be more acceptable if I simply added the sentence &quot;The element for which the outline was created is said to be the outline&apos;s owner&quot;?</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>101143</commentid>
    <comment_count>21</comment_count>
    <who name="Ian &apos;Hixie&apos; Hickson">ian</who>
    <bug_when>2014-02-21 21:09:03 +0000</bug_when>
    <thetext>I&apos;ve added the sentence I suggested in comment 20. Reopen the bug if it&apos;s not enough or if I am still missing something here.</thetext>
  </long_desc><long_desc isprivate="0" >
    <commentid>101144</commentid>
    <comment_count>22</comment_count>
    <who name="">contributor</who>
    <bug_when>2014-02-21 21:09:45 +0000</bug_when>
    <thetext>Checked in as WHATWG revision r8499.
Check-in comment: Try to clarify that outlines are owned by elements.
http://html5.org/tools/web-apps-tracker?from=8498&amp;to=8499</thetext>
  </long_desc>
      
      

    </bug>

</bugzilla>