This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 9071 - Handling of "[" in between-doctype-public-and-system-identifiers-state may not be ideal
Status: CLOSED WONTFIX
Alias: None
Product: HTML WG
Classification: Unclassified
Component: pre-LC1 HTML5 spec (editor: Ian Hickson)
Version: unspecified
Hardware: PC Linux
Importance: P3 normal
Target Milestone: ---
Assignee: Ian 'Hixie' Hickson
QA Contact: HTML WG Bugzilla archive list
URL:
Whiteboard: parsing
Keywords:
Duplicates: 9052
Depends on:
Blocks:
 
Reported: 2010-02-18 16:42 UTC by Philip Taylor
Modified: 2010-10-04 14:00 UTC

Description Philip Taylor 2010-02-18 16:42:16 UTC
Currently "[" in between-doctype-public-and-system-identifiers-state is handled like any other unrecognised character, and forces quirks mode.
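
To make the behaviour under discussion concrete, here is a minimal Python sketch of how a tokenizer might dispatch on one character in this state. Only the "anything else forces quirks" branch comes from the description above; the other branches are my reading of the HTML5 tokenizer, and the function name, the state labels, and the collapsing of the two quoted system-identifier states into one are simplifications made up for illustration.

```python
from typing import Optional, Tuple

WHITESPACE = "\t\n\f "  # tab, line feed, form feed, space


def between_public_and_system(ch: Optional[str]) -> Tuple[str, bool]:
    """Return (next_state, force_quirks) for one input character.

    `ch` is None at end of file. State names are illustrative labels,
    not identifiers taken from any real parser.
    """
    if ch is None:
        # EOF inside a DOCTYPE: the token is emitted with force-quirks on.
        return ("data", True)
    if ch in WHITESPACE:
        # Whitespace is skipped; stay in the same state.
        return ("between-public-and-system", False)
    if ch == ">":
        # End of the DOCTYPE with only a public identifier.
        return ("data", False)
    if ch in ('"', "'"):
        # Start of a quoted system identifier.
        return ("system-identifier", False)
    # Anything else, including "[", is treated as garbage:
    # set force-quirks and go to the bogus DOCTYPE state.
    return ("bogus-doctype", True)


for sample in ['"', ">", "[", "!"]:
    print(repr(sample), between_public_and_system(sample))
```

Under these assumptions "[" and "!" take exactly the same path, which is the behaviour this bug questions.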

Firefox appears to have special-casing for "[" here. Compare (in Firefox 3.6 with html5.enable off):

http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!doctype%20html%20public%20%22%22%20[!%22%C2%A3%24%25^%26*%28%29{}[]%3E%0A - "CSS1Compat"

http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!doctype%20html%20public%20%22%22%20!%22%C2%A3%24%25^%26*%28%29{}[]%3E%0A - "BackCompat"

HTML5's behaviour breaks the positioning of the map in http://www.freemanforman.co.uk/content/001_Area_Search/

But it also fixes the menu spacing and skip-link underlining in http://symptomresearch.nih.gov/grantopportunities.htm

(These are the only two sites I found, out of half a million pages.)

Perhaps more data is needed to determine which behaviour results in less breakage.
Comment 1 Philip Taylor 2010-02-18 16:48:20 UTC
(See http://lists.w3.org/Archives/Public/public-html/2010Feb/0657.html for context.)
Comment 2 Simon Pieters 2010-02-19 08:53:20 UTC
I think one broken page out of half a million is not worth worrying about -- it's easier to evang the site than change the spec and implementations.
Comment 3 Simon Pieters 2010-02-19 10:42:19 UTC
<Philip`> zcorpan: http://philip.html5.org/data/doctypes.html#%3c%21doctype_html_public_%22-%2f%2fw3c%2f%2fdtd_xhtml_1.0_transitional%2f%2fen%22_system_%22http%3a%2f%2fwww.w3.org%2ftr%2fxhtml1%2fdtd%2fxhtml1-transitional.dtd%22%2f%3e
<Philip`> zcorpan: http://philip.html5.org/data/doctypes.html#%3c%21doctype_html_public_%22-%2f%2fw3c%2f%2fdtd_html_4.01_transitional%2f%2fen%22_%3chtml%3e
<Philip`> zcorpan: http://philip.html5.org/data/doctypes.html#%3c%21doctype_html_public_%22-%2f%2fw3c%2f%2fdtd_html_4.01_transitional%2f%2fen%22_%2f%3e

There's other garbage to find there; it seems it's more common to include a bogus "system" or forget the doctype's ">" than to have "[" there. Might be useful to analyze these pages and see which mode they expect. (For instance it might be reasonable to go into bogus doctype state without setting force quirks.)
Comment 4 Simon Pieters 2010-02-20 00:22:54 UTC
http://philip.html5.org/data/doctype-with-bogus-after-pub-id.txt is more recent data.
Comment 5 Simon Pieters 2010-02-20 11:24:44 UTC
Analysis:


www.superhouston.net/Employment/99818/-Gold-Maids.html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"	
<html>	
needs quirks

www.holisticlocal.co.nz/business/search/keywords/feng+shui/btid/8
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"

    http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
doesn't matter

wvm.anonza.de/2/90728492931f0d68ef049d51f9c35d21/32/result/po_1.c_500.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" SYSTEM "http:www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
doesn't matter

www.indecon.com/iec_Web/expertise/communitydevelopment.asp
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" EAD>
needs quirks

btv.anonza.de/13/b804f95584ef3a06f24635da63fe05a1/4/result/c_8733.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" SYSTEM "http:www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
needs quirks

www.larchecommons.ca/fr/nouvelles/Deces_de_Claire_de_Miribel_1951_-_2008_2008-12-25
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
needs quirks or almost standards

www.agregat-zavod.ru/gucen/OTECH/T-330/9.php
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
<html>
needs quirks

londonbikers.com/galleries/image/1331/26311/infinity-motorcycles-night
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" http://www.w3.org/TR/html4/loose.dtd">
needs quirks or almost standards

www.zweitausendeins.de/stoebern/index.cfm?ArticleFocus=10&ord=1&alpha=1&key=Dokumentationen&CT=1
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" />
needs quirks

mobbest.ru/games/p2181p46.html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" loose.dtd>
needs quirks

www.pagina12.com.ar/diario/elmundo/4-65222-2006-04-06.html
<!DOCTYPE html 
	PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN
	"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
doesn't matter

sites.blockstar.com/867777744/styles/limedots.css.html
<!DOCTYPE html
	PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
	SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
doesn't matter

watch.cissac.net/pages1/item407471_13.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
doesn't matter

www.ms-plus.com/search.asp?id=9644
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
(note: U+3000)
doesn't matter

www.kucaicha.com/chanpin/gen2.htm
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
<html>
doesn't matter

www.anonza.de/1/783245697bff54613f7f3ce35238c752/4/detail/a_3247364.c_57.cr_1312757968.html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" SYSTEM "http:www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
doesn't matter

www.zweitausendeins.de/filmlexikon/?wert=46852&sucheNach=titel&load=2
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" />
needs quirks

www.hub-uk.com/family02/family0099.htm
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN2 "http://www.w3.org/TR/html4/loose.dtd">
needs quirks

www.bootsbau.net/contact.php?userid=ee734ebdec521d1c89abbc9dcc985361
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"/>
needs quirks

www.protic.org/index.shtml?apc=y1r030101--&x=563959&apc=y1r030101--&m=-
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
<html>
needs quirks

www.tribuntv.com/bank-asya-1-lig-18-hafta-toplu-sonuclar-haber2486.html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" />
doesn't matter

www.anonza.de/1/6c6aa9b7ce5964615df77a3f8ec7ce80/16/result/cr_741655008.c_119.cp_1.html?sr[sf]=date&sr[sd]=
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" SYSTEM "http:www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
doesn't matter

symptomresearch.nih.gov/chapter_8/sec4/cess4pg2.htm
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" []>
doesn't matter

www.kosel.com/nl/sh/livdpz.htm
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"\n "http://www.w3.org/TR/html4/loose.dtd">
doesn't matter

www.saturnpolska.com/krakow/serwis/poradnik_klienta/
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
doesn't matter

www.supermilwaukee.com/Grocery-Stores-and-Supermarkets/29351/-Big-Discount-Foods.htm
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"	
<html>
needs quirks

www.depmod.com/albums/some_great_reward/a0470.htm
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" � "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
(notes: U+0000)
doesn't matter

www.otebe.info/sonnik/freid/3/318.html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" 
<html>
doesn't matter

www.dopcast.de/tag/plan/newEpisodes.html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"-->
needs quirks

votigo.com/contests/showentry/16986?showH2HDetails=true
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd>
needs quirks or almost standards

213.153.169.41/BeksaWebeng/mainpage.aspx
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"

<html xmlns="http://www.w3.org/1999/xhtml">
needs quirks or almost standards

www.avionstamps.com/ambrowCart/unique/themes_list.php
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" http://www.w3.org/TR/html4/loose.dtd">
doesn't matter


...more URLs I didn't look at...
Comment 6 Simon Pieters 2010-02-20 11:29:56 UTC
Conclusion: The spec is good as is. Some pages had a standards mode or almost standards mode FPI followed by garbage, and expected quirks mode:

btv.anonza.de/13/b804f95584ef3a06f24635da63fe05a1/4/result/c_8733.html
www.bootsbau.net/contact.php?userid=ee734ebdec521d1c89abbc9dcc985361

...but no pages (of those I looked at) where the opposite is true.
Comment 7 Ian 'Hixie' Hickson 2010-02-20 22:27:21 UTC
I searched for pages matching these regexps in the Google index (basically any valid-looking doctype that contains an internal subset):

  /<!doctype\s+html\s+public\s+"[^"]+"\s*\[|)/i
  /<!doctype\s+html\s+public\s+'[^']+'\s*\[|)/i
  /<!doctype\s+html\s+system\s+"[^"]+"\s*\[|)/i
  /<!doctype\s+html\s+system\s+'[^']+'\s*\[|)/i
  /<!doctype\s+html\s+public\s+"[^"]+"\s+"[^"]+"\s*\[|)/i
  /<!doctype\s+html\s+public\s+'[^']+'\s+'[^']+'\s*\[|)/i
  /<!doctype\s+html\s+public\s+"[^"]+"\s+'[^']+'\s*\[|)/i
  /<!doctype\s+html\s+public\s+'[^']+'\s+"[^"]+"\s*\[|)/i


A lot of the pages that have an internal subset are from the "epages" e-commerce system:

http://www.btowstore.com/epages/Store3.sf/?ObjectPath=/Shops/Store3.Shop2250
http://cavaweb.es/epages/eb3502.sf/es_ES/?ObjectPath=/Shops/eb3502/Products/CH6
http://shop.cmsme.de/epages/62030644.sf/de_DE/?ObjectPath=./Categories/TERMINE
http://www.kissen-studio.de/
http://www.lahjamaailma.fi/Ukki-on-kova-jaetkae
http://www.legalize-love.com/


The list-of-companies.org site has a DOCTYPE they use on all their pages, but presumably it is better in quirks mode since half of their other pages start with something immediately before the DOCTYPE:

http://www.list-of-companies.org/Details/11311550/United_States/I-J_Auto_Shop/
http://fr.list-of-companies.org/Details/11311550/United_States/I-J_Auto_Shop/
http://www.list-of-companies.org/Details/10086450/China/Guangli_Machinery_Xinhui_Co_Ltd_/
http://gl.list-of-companies.org/Details/10086450/China/Guangli_Machinery_Xinhui_Co_Ltd_/


Here are some others I found:

http://www.dolomitiinfo.com/bellavista-cadore-comelico-superiore.t4i35371f1.aspx
http://www.dfa.ie/home/index.aspx?id=80791


I also found an actual XHTML page! (There's a bunch of them at this site.)

http://www.hindawi.com/floats/384010/figures/2008.384010.fig11.xht
Comment 8 Simon Pieters 2010-02-21 09:13:07 UTC
Your findings seem to all have the internal subset after the system identifier (where it doesn't affect the rendering mode per HTML5).
Comment 9 Ian 'Hixie' Hickson 2010-02-25 02:42:55 UTC
*** Bug 9052 has been marked as a duplicate of this bug. ***
Comment 10 Ian 'Hixie' Hickson 2010-02-25 02:57:41 UTC
I changed my regexps to only look at pages that match this:

   /<!doctype\s+html\s+public\s+"[^"]+"\s*\[/i
   /<!doctype\s+html\s+public\s+'[^']+'\s*\[/i

The results, looking for this data in the Google index, found that about 0.000125% of pages have this particular DOCTYPE pattern. Note, though, that this doesn't include DOCTYPEs that are simply bogus, e.g. that have a missing quote in the system identifier part, which Philip's data _does_ catch.
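
As a rough illustration (this is not the tooling that was run against the Google index), the narrowed patterns can be tried against doctypes quoted elsewhere in this bug. The Python sketch below assumes the single-quoted pattern is meant to end in `\[` like the double-quoted one; the helper name `has_bracket_after_public_id` is made up for the example.

```python
import re

# Double- and single-quoted public identifier followed directly by "[".
# The trailing \[ in the second pattern is an assumption, mirroring the first.
PATTERNS = [
    re.compile(r'<!doctype\s+html\s+public\s+"[^"]+"\s*\[', re.IGNORECASE),
    re.compile(r"<!doctype\s+html\s+public\s+'[^']+'\s*\[", re.IGNORECASE),
]


def has_bracket_after_public_id(markup: str) -> bool:
    """True if a "[" follows the public identifier (hypothetical helper)."""
    return any(p.search(markup) for p in PATTERNS)


# Doctypes copied from the analysis in this bug.
samples = [
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" []>',
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    '[url=http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">',
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" '
    '"http://www.w3.org/TR/html4/loose.dtd">',
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" />',
]

for doctype in samples:
    print(has_bracket_after_public_id(doctype), doctype)
```

The first two samples match; the last two do not, which is why DOCTYPEs with other kinds of garbage after the public identifier fall outside this count.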

Here's a random selection of some of the matching pages:

http://www.austinwyatt.co.uk/property-details-rpsMSE-AWE090138
http://www.bairstow-eves.co.uk/content/011_Legal_Information
http://www.bairstoweves.co.uk/content/008_Offices/
http://www.boekbesprekingen.nl/cgi-bin/auteur.cgi?auteur=311737&type=biografie
http://www.chappellandmatthews.co.uk/content/006_Information/001_HIPs/
http://www.countrywidescotland.co.uk/property-details-rpsCWN-ADR080332
http://www.daiwaint.co.jp/stock/Sheets/SK1.htm
http://www.diolla.ru/catalog/pharmacy/preparates/anticought/nose-drops/p_103129
http://www.entwistlegreen.co.uk/property-details-rpsBAD-RUN090794
http://europroject.pl/index.php?pid=3:15:44
http://www.frankinnes.co.uk/content/001_Contact_Us/001_Sales/
http://www.gallex.ch/gallex/1/141.41.html
http://www.gpees.co.uk/content/001_Search/004_New_Homes/
http://www.jazz-network.com/kumpf/p-lyrik.html
http://www.manncountrywide.co.uk/property-details-rpsMSE-CWS090264
http://symptomresearch.nih.gov/chapter_13/sec5/ckns5pg2.htm
http://oestjyllandsflyt.dk/privatflytning/flyttetilbud/
http://www.palmersnell.co.uk/content/002_To_Let/002_Lettings/
http://www.spencers.co.uk/content/002_To_Let/001_Lettings_Area_Search/
http://www.strattoncreber.co.uk/property-details-rpsSTC-REH090342
http://www.sugano-foods.co.jp/products2.html
http://runker_room.tripod.com/tiestalk/japped.htm
http://www.lpl.univ-aix.fr/projects/multext/CES/CES1.Annex7.html
http://www.winncom.com/moreinfo/item/5054-BSUR-LR-US/index.html

I'm leaning towards not changing the spec, based on the rarity of this and based on Simon's findings earlier in this bug.
Comment 11 Leif Halvard Silli 2010-02-25 04:30:29 UTC
(In reply to comment #2)
> I think one broken page out of half a million is not worth worrying about --
> it's easier to evang the site than change the spec and implementations.

It should be simple to change implementations: no change needs to be made, except in the Opera 10.5 beta and in Firefox's beta HTML5 rendering.

Comment 12 Leif Halvard Silli 2010-02-25 05:08:31 UTC
(In reply to comment #10)
> I changed my regexps to only look at pages that match this:
> 
>    /<!doctype\s+html\s+public\s+"[^"]+"\s*\[/i
>    /<!doctype\s+html\s+public\s+'[^']+'\s*\[/i
> 
> The results, looking for this data in the Google index, found about 0.000125%
> of pages have this particular DOCTYPE pattern. Note, though, that this
> doesn't include DOCTYPEs that are simply bogus, e.g. that have a missing quote
> in the system identifier part, which Philip's data _does_ catch.

In the debate that Philip and I had on public-html, Philip brought forward a page that used an HTML 4.0 (not 4.01) Transitional doctype, which triggered quirks mode _because of that_, although Philip at first thought the "[]" characters were the trigger.

And I think you have done the same thing here. E.g., it seems you picked a page from the same site that Philip mentioned:

HTML40Transitional: http://symptomresearch.nih.gov/chapter_13/sec5/ckns5pg2.htm

And these used a HTML3 variant:

HTML30: http://www.jazz-network.com/kumpf/p-lyrik.html
HTML32: http://www.daiwaint.co.jp/stock/Sheets/SK1.htm
HTML32: http://www.gallex.ch/gallex/1/141.41.html
HTML32: http://www.sugano-foods.co.jp/products2.html
HTML30: http://runker_room.tripod.com/tiestalk/japped.htm
HTML32: http://aune.lpl.univ-aix.fr/projects/multext/CES/CES1.Annex7.html

The following page has the doctype in the middle of the document and should therefore be in quirks mode, because in the DOM it has no doctype:

http://www.boekbesprekingen.nl/cgi-bin/auteur.cgi?auteur=311737&type=biografie

This page has a standards doctype where it should be, and an erroneous doctype in the middle of the document; thus, in the DOM, it _has_ a correct standards-triggering doctype:

http://www.diolla.ru/catalog/pharmacy/preparates/anticought/nose-drops/p_103129

ALL THE PAGES I HAVE MENTIONED ABOVE do not belong in the "bookkeeping" that you are trying to perform here, Ian. They are irrelevant to the issue that we are discussing.

--------

Now, there were very many XHTML pages amongst your selection of pages, and all of them contained a "[url=" instead of a quote mark in the system identifier part. Typically this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" [url=http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

(<http://www.gpees.co.uk/content/001_Search/004_New_Homes/>)

In IE8 this doctype does not trigger quirks mode, and it does not in Firefox either. The Opera 10.5 beta does go into quirks mode, but released versions of Opera do not.
Comment 13 Leif Halvard Silli 2010-02-25 05:17:22 UTC
(In reply to comment #12)
> (In reply to comment #10)


> Now, there were very many XHTML pages amongst your selection of pages, and all
> of them contained a "[url=" instead of a quote mark in the system identifier
> part. Typically this:
> 
> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
> [url=http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
> 
> (<http://www.gpees.co.uk/content/001_Search/004_New_Homes/>)
> 
> In IE8 this doctype does not trigger quirks. Firefox also not. Opera 10.5 beta
> yes. But not in released versions of Opera.

Sorry, what I said about Opera was not true. Opera 10 also lands in quirks mode here.
Comment 14 Leif Halvard Silli 2010-02-25 05:37:26 UTC
(In reply to comment #0)
> Currently "[" in between-doctype-public-and-system-identifiers-state is handled
> like any other unrecognised character, and forces quirks mode.
> 
> Firefox appears to have special-casing for "[" here. Compare (in Firefox 3.6
> with html5.enable off):
> 
> http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!doctype%20html%20public%20%22%22%20[!%22%C2%A3%24%25^%26*%28%29{}[]%3E%0A
> - "CSS1Compat"
> 
> http://software.hixie.ch/utilities/js/live-dom-viewer/?%3C!doctype%20html%20public%20%22%22%20!%22%C2%A3%24%25^%26*%28%29{}[]%3E%0A
> - "BackCompat"

What these two tests show w.r.t. Firefox is that the [] (square brackets) have a toggle effect. The math goes like this:

No-Quirks Doctype + [] = Quirks doctype.

    Example: 
<!doctype html public> + [] = <!doctype html public  []> = no quirks


Quirks Doctype + [] = No-Quirks

   Example:
<!doctype html> + [] = <!doctype html  []> = Quirks.

Wild guess: Mozilla at one point in time wanted to be able to turn quirks into no-quirks and vice versa?

BUT NOTE: Firefox looks for paired brackets. The reason why the second test lands in quirks is that there are no paired brackets. Opera 10.10 has the same issue. One cannot IMHO blame them for this.
Comment 15 Leif Halvard Silli 2010-02-25 06:06:30 UTC
(In reply to comment #14)
> (In reply to comment #0)

Sorry, I had some typos in the "math". And in addition to the several typos, it also seems like the toggling effect is in principle related to "" and not to []:

Quirks + "" = No-Quirks (<!doctype html public> + "" = <!doctype html public  "">)

No-Quirks + "" = Quirks (<!doctype html> + "" = <!doctype html "">)

The exception, where [] DOES play the role of a no-quirks trigger, is when it comes directly after the FPI of a doctype that is known to trigger quirks.

QUIRKS:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" >

NO-QUIRKS:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"  [] >

In summary: [] triggers no-quirks in Firefox when it comes directly after an FPI that would trigger quirks mode without it. Otherwise, [] doesn't affect quirks/no-quirks in Firefox.

This exception is nevertheless a very peculiar Firefox behaviour.
Comment 16 Simon Pieters 2010-02-25 22:17:42 UTC
(In reply to comment #10)
> I changed my regexps to only look at pages that match this:
> 
>    /<!doctype\s+html\s+public\s+"[^"]+"\s*\[/i
>    /<!doctype\s+html\s+public\s+'[^']+'\s*\[/i
> 
> The results, looking for this data in the Google index, found about 0.000125%
> of pages have this particular DOCTYPE pattern. Note, though, that this
> doesn't include DOCTYPEs that are simply bogus, e.g. that have a missing quote
> in the system identifier part, which Philip's data _does_ catch.
> 
> Here's a random selection of some of the matching pages:
> 
> http://www.austinwyatt.co.uk/property-details-rpsMSE-AWE090138
doesn't matter

> http://www.bairstow-eves.co.uk/content/011_Legal_Information
doesn't matter (doesn't seem to have css applied)

> http://www.bairstoweves.co.uk/content/008_Offices/
doesn't matter

> http://www.boekbesprekingen.nl/cgi-bin/auteur.cgi?auteur=311737&type=biografie
N/A

> http://www.chappellandmatthews.co.uk/content/006_Information/001_HIPs/
doesn't matter

> http://www.countrywidescotland.co.uk/property-details-rpsCWN-ADR080332
doesn't matter

> http://www.daiwaint.co.jp/stock/Sheets/SK1.htm
doesn't matter

> http://www.diolla.ru/catalog/pharmacy/preparates/anticought/nose-drops/p_103129
N/A

> http://www.entwistlegreen.co.uk/property-details-rpsBAD-RUN090794
doesn't matter

> http://europroject.pl/index.php?pid=3:15:44
slight layout change, but doesn't matter

> http://www.frankinnes.co.uk/content/001_Contact_Us/001_Sales/
doesn't matter

> http://www.gallex.ch/gallex/1/141.41.html
doesn't matter

> http://www.gpees.co.uk/content/001_Search/004_New_Homes/
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" [url=http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
has different spacing in standards mode and quirks mode. Can't tell which is intended.

> http://www.jazz-network.com/kumpf/p-lyrik.html
doesn't matter

> http://www.manncountrywide.co.uk/property-details-rpsMSE-CWS090264
doesn't matter

> http://symptomresearch.nih.gov/chapter_13/sec5/ckns5pg2.htm
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" []>
needs quirks

> http://oestjyllandsflyt.dk/privatflytning/flyttetilbud/
doesn't matter

> http://www.palmersnell.co.uk/content/002_To_Let/002_Lettings/
doesn't matter

> http://www.spencers.co.uk/content/002_To_Let/001_Lettings_Area_Search/
doesn't matter

> http://www.strattoncreber.co.uk/property-details-rpsSTC-REH090342
doesn't matter

> http://www.sugano-foods.co.jp/products2.html
N/A

> http://runker_room.tripod.com/tiestalk/japped.htm
doesn't matter

> http://www.lpl.univ-aix.fr/projects/multext/CES/CES1.Annex7.html
doesn't matter

> http://www.winncom.com/moreinfo/item/5054-BSUR-LR-US/index.html
N/A


Many of these seem to be based on the same template (the ones that have [url= in the doctype).


> I'm leaning towards not changing the spec, based on the rarity of this and
> based on Simon's findings earlier in this bug.

We could make "[" after the public identifier go into bogus doctype without setting force-quirks, while letting any other garbage character set force-quirks ("S" and "/" needed force-quirks from my earlier findings), and not regress compat. However, it's just one page of those analyzed that is affected by it (and for that page I couldn't tell whether it would actually be helped or not), so I would suggest to Avoid Needless Complexity.
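
For comparison, the alternative described above differs from the current handling in a single branch. The following Python sketch is only an illustration of that difference: the function name, the `special_case_bracket` flag, and the state label are hypothetical and not taken from the spec or from any implementation.

```python
from typing import Tuple


def handle_garbage_after_public_id(
    ch: str, special_case_bracket: bool
) -> Tuple[str, bool]:
    """Return (next_state, force_quirks) for an unexpected character.

    With special_case_bracket=False every garbage character forces quirks
    (the current handling). With special_case_bracket=True, "[" still goes
    to the bogus DOCTYPE state but leaves force-quirks off (the alternative
    floated in this comment).
    """
    if special_case_bracket and ch == "[":
        return ("bogus-doctype", False)
    return ("bogus-doctype", True)


# "S" and "/" are the garbage characters that needed force-quirks
# in the earlier analysis; "[" is the only one that would change.
for ch in ["[", "S", "/"]:
    current = handle_garbage_after_public_id(ch, special_case_bracket=False)
    proposed = handle_garbage_after_public_id(ch, special_case_bracket=True)
    print(repr(ch), "current:", current, "proposed:", proposed)
```

The extra branch is small, but it only changes the outcome for "[", which is the complexity-versus-benefit trade-off weighed above.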
Comment 17 Simon Pieters 2010-02-25 22:24:13 UTC
BTW, Firefox's current behavior of forcing standards mode for [] is incompatible with:

>> http://symptomresearch.nih.gov/chapter_13/sec5/ckns5pg2.htm
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" []>
> needs quirks

and

> But it also fixes the menu spacing and skip-link underlining in
> http://symptomresearch.nih.gov/grantopportunities.htm
Comment 18 Leif Halvard Silli 2010-02-26 00:14:40 UTC
(In reply to comment #17)
> BTW, Firefox's current behavior of forcing standards mode for [] is
> incompatible with:
> 
> >> http://symptomresearch.nih.gov/chapter_13/sec5/ckns5pg2.htm
> > <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" []>
> > needs quirks


Indeed, because it is HTML 4.0 Transitional (not 4.01 Transitional). I agree that this particular behaviour of Firefox should be ignored.

This example, and our common conclusion about Firefox's behaviour, IMHO supports the view that [] should be ignored w.r.t. quirks/no-quirks.
Comment 19 Philip Taylor 2010-04-01 10:36:43 UTC
Closing: Simon's analysis indicates this doesn't have a significant effect on legacy compatibility, the current handling is good enough, and it's better not to add new complexity that doesn't improve compatibility.