This is an archived snapshot of W3C's public Bugzilla bug tracker, decommissioned in April 2019.

Bug 19541 - Specification split marks are out of kilter
Summary: Specification split marks are out of kilter
Status: RESOLVED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: Editor tools
Version: unspecified
Hardware: All All
Importance: P2 blocker
Target Milestone: ---
Assignee: Robin Berjon
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-10-15 12:35 UTC by Robin Berjon
Modified: 2013-02-18 13:52 UTC

Description Robin Berjon 2012-10-15 12:35:38 UTC
I've written a tool to check that the START and END marks in the spec source are correct. You can find it as scripts/check-split-markers.js.

Currently it reports:

Consecutive END for dev-html at 118071 line 2730
Consecutive START for dev-html at 147695 line 3418
Consecutive END for w3c-html at 2774443 line 65146
Consecutive END for w3c-html at 3070938 line 72661
Consecutive END for w3c-html at 3109311 line 73695
Consecutive END for w3c-html at 3115561 line 73854
Consecutive END for dev-html at 3923610 line 94425
Consecutive START for dev-html at 4001247 line 96370
Consecutive END for dev-html at 4006880 line 96530
Consecutive START for dev-html at 4090450 line 98625
Consecutive END for dev-html at 4132792 line 99799
Consecutive START for dev-html at 4164965 line 100545
Consecutive END for w3c-html at 4894314 line 118216

For each line, the first number given is the character offset; the line number might be easier to use. When the tool reports a duplicate, it reports the second (or later) consecutive instance, so if you grep upwards for the same key you'll find the initial instance.
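
For illustration, here is a minimal sketch of this kind of check. It is not the code in scripts/check-split-markers.js. The marker syntax (`<!--START name-->` and `<!--END name-->`) and the names `MARKER` and `checkSplitMarkers` are assumptions made for the example; only the report format is taken from the output above.

```javascript
// Sketch only: the marker syntax below is an assumption, not taken from
// scripts/check-split-markers.js.
const MARKER = /<!--\s*(START|END)\s+([\w-]+)\s*-->/g;

// Returns one report line per marker that repeats the previous kind
// (START after START, or END after END) for the same marker name.
function checkSplitMarkers(source) {
  const lastSeen = new Map(); // marker name -> "START" or "END"
  const problems = [];
  for (const match of source.matchAll(MARKER)) {
    const [, kind, name] = match;
    if (lastSeen.get(name) === kind) {
      // 1-based line number: count the newlines before the match offset.
      const line = source.slice(0, match.index).split("\n").length;
      problems.push(
        `Consecutive ${kind} for ${name} at ${match.index} line ${line}`
      );
    }
    lastSeen.set(name, kind);
  }
  return problems;
}
```

Because state is kept per marker name, a dev-html START between two w3c-html END marks does not hide the duplicate. The sketch flags only consecutive duplicates; it does not check for a region left open at the end of the file.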

We need to fix each of those.

Note that the first dupe END for w3c-html corresponds to the section which is currently gobbling up all the subsequent spec sections in the generated draft. In other words, that's what's killing the splitter.
Comment 1 Silvia Pfeiffer 2012-10-15 13:21:03 UTC
Just a note to be careful with such a script - sometimes the START and END occur on the same line. I've fallen into this trap before.
Comment 2 Silvia Pfeiffer 2012-10-15 13:21:59 UTC
BTW: why are you following the "dev-html" markers and not the "w3c-html" markers?
Comment 3 Robin Berjon 2012-10-15 14:23:50 UTC
According to git bisect this has been broken for a very long time.

daeeed99059286753802acef79765a3d65badf50 is the first bad commit
commit daeeed99059286753802acef79765a3d65badf50
Author: ianh <ianh@340c8d12-0b0e-0410-8428-c7bf67bfef74>
Date:   Thu Jul 28 23:02:19 2011 +0000

    [e] (0) class=impl fix
    Fixing http://www.w3.org/Bugs/Public/show_bug.cgi?id=13441
    
    git-svn-id: http://svn.whatwg.org/webapps@6336 340c8d12-0b0e-0410-8428-c7bf67bfef74

:100644 100644 0b6434b45b382d58cc01f15ca706a933dde6c61b 87c299ed78622bf297f1dec208bdc38f3a51d273 M	complete.html
:100644 100644 28e2bdcf526be9b5f3977257b32206292c57ce90 f175239067d4b42416400aea1e049a13617e76a9 M	index
:100644 100644 8edb9be8fff8fc70802e9f0b2857b41303e1ad6c 5400bc6c65a846ba3160682adbef243addf62513 M	source
bisect run success

But the effects of this breakage were probably less noticeable initially.
Comment 4 Robin Berjon 2012-10-15 14:24:21 UTC
That was for w3c-html; for dev-html it has been broken since January 2011:

72c251847f02d90d2aee6371fc68c4beb6a4dae6 is the first bad commit
commit 72c251847f02d90d2aee6371fc68c4beb6a4dae6
Author: ianh <ianh@340c8d12-0b0e-0410-8428-c7bf67bfef74>
Date:   Fri Jan 7 00:40:20 2011 +0000

    [e] (0) web dev edition supporting changes
    
    git-svn-id: http://svn.whatwg.org/webapps@5746 340c8d12-0b0e-0410-8428-c7bf67bfef74

:100644 100644 58a888a7c6ff1ada22eb6944def8af454d7d657d ac7f8fdddd9d89d429b0667912ef15738bf4a839 M	complete.html
:100644 100644 00edf3a941ae821813015f7fda4dfee45188bada 250ded424ea87366640e89876978cccc89724cb1 M	index
:100644 100644 680786e1e3aa78e5704c11a4287ad3c5920af831 8e14d17556e14634ab210f1c45eb1fa44b4cf4a0 M	source
bisect run success
Comment 5 Robin Berjon 2012-10-15 14:27:33 UTC
(In reply to comment #2)
> BTW: why are you following the "dev-html" markers and not the "w3c-html"
> markers?

The script knows how to track any marker. Initially I had noticed such a problem with the 2dcontext markers and written this tool for that, but the regex didn't match "-", so I fixed the 2dcontext breakage without seeing these.
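
The effect of a name pattern that lacks "-" can be sketched as follows. Both patterns and the `<!--START name-->` marker syntax are hypothetical, chosen for the example; the actual regex in the tool is not shown in this bug.

```javascript
// Hypothetical patterns; the tool's real regex is not shown in this bug.
const withoutHyphen = /<!--(START|END) (\w+)-->/g;
const withHyphen = /<!--(START|END) ([\w-]+)-->/g;

// Collects the marker names that a given pattern finds in the source.
function markerNames(source, pattern) {
  return [...source.matchAll(pattern)].map((m) => m[2]);
}

const sample =
  "<!--START 2dcontext--><!--END 2dcontext-->" +
  "<!--START w3c-html--><!--END w3c-html-->";

console.log(markerNames(sample, withoutHyphen));
// [ '2dcontext', '2dcontext' ]
console.log(markerNames(sample, withHyphen));
// [ '2dcontext', '2dcontext', 'w3c-html', 'w3c-html' ]
```

With `\w+`, a name such as w3c-html stops matching at the hyphen, so the whole marker is skipped silently rather than reported as an error.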

The log output I show above includes all errors. You can now pass an argument to the tool to filter just for one marker type.

(In reply to comment #1)
> Just a note to be careful with such a script - sometimes the START and END
> occur on the same line. I've fallen into this trap before.

Yup, I remembered that when I wrote this. I'm using a //g regex instead of line-oriented processing because of that. I've hand-checked at least some of the broken lines and can confirm that they are indeed bugs.
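
To make the same-line case concrete, here is a small sketch comparing a naive line-oriented scan with a global regex scan. The function names, the pattern, and the `<!--START name-->` marker syntax are assumptions for illustration, not the tool's actual code.

```javascript
// Assumed marker syntax; not taken from the actual tool.
const MARKER_RE = /<!--(START|END) ([\w-]+)-->/;

// Naive line-oriented scan: sees at most one marker per line.
function scanPerLine(source) {
  return source.split("\n").flatMap((line) => {
    const m = line.match(MARKER_RE);
    return m ? [`${m[1]} ${m[2]}`] : [];
  });
}

// Global scan over the whole source: sees every marker, in order.
function scanGlobal(source) {
  const re = new RegExp(MARKER_RE.source, "g");
  return [...source.matchAll(re)].map((m) => `${m[1]} ${m[2]}`);
}

const line = "<p>x</p><!--END w3c-html--><!--START w3c-html-->\n";
console.log(scanPerLine(line)); // [ 'END w3c-html' ]
console.log(scanGlobal(line)); // [ 'END w3c-html', 'START w3c-html' ]
```

The per-line version drops the START here, so a later END would look like a consecutive END even though the source is correct.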
Comment 6 Robin Berjon 2012-10-15 14:53:59 UTC
As of 5918446 this is now correct (or so I hope) for w3c-html.

Fixing the dev-html problems is a fair bit more work though.
Comment 7 Edward O'Connor 2012-10-15 18:54:13 UTC
When adding or removing START or END directives, the `specedit-specs-at-point' command (in specedit.el) can be helpful: it tells you what specs are active at the current cursor position.
Comment 8 Robin Berjon 2013-02-18 13:52:28 UTC
This is correct for the cases that we care about.