Bug 22999: Rules for omitting </p> don't match the parser
Reported: 2013-08-18 20:09:29 +0000
Last modified: 2013-11-25 18:43:37 +0000
Classification: Unclassified
Product: WHATWG
Component: HTML
Version: unspecified
Platform: Other
OS: other
Status: RESOLVED WORKSFORME
URL: http://www.whatwg.org/specs/web-apps/current-work/#optional-tags
Priority: P3
Severity: normal
Target milestone: Unsorted
Reporter: contributor
Assignee: ian
CC: ian, mathias, mike, zcorpan
QA contact: contributor

Comment 0 (id 92238), contributor, 2013-08-18 20:09:29 +0000

Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/syntax.html
Multipage: http://www.whatwg.org/C#optional-tags
Complete: http://www.whatwg.org/c#optional-tags
Referrer: http://www.whatwg.org/specs/web-apps/current-work/multipage/

Comment:
Rules for omitting </p> don't match the parser

Posted from: 90.230.218.37
User agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/29.0.1547.49 Safari/537.36 OPR/16.0.1196.45 (Edition Next)

Comment 1 (id 92239), zcorpan, 2013-08-18 20:21:39 +0000

[[
A p element's end tag may be omitted if the p element is immediately followed by an address, article, aside, blockquote, dir, div, dl, fieldset, footer, form, h1, h2, h3, h4, h5, h6, header, hgroup, hr, main, menu, nav, ol, p, pre, section, table, or ul, element,
]]

parser's cases that "close a p element":

A start tag whose tag name is one of: "address", "article", "aside", "blockquote", "center", "details", "dialog", "dir", "div", "dl", "fieldset", "figcaption", "figure", "footer", "header", "hgroup", "main", "menu", "nav", "ol", "p", "section", "summary", "ul"
A start tag whose tag name is one of: "h1", "h2", "h3", "h4", "h5", "h6"
A start tag whose tag name is one of: "pre", "listing"
A start tag whose tag name is "form"
A start tag whose tag name is "li"
A start tag whose tag name is one of: "dd", "dt"
A start tag whose tag name is "plaintext"
A start tag whose tag name is "table"
A start tag whose tag name is "hr"
A start tag whose tag name is "xmp"

That includes obsolete elements, but the current list has <dir> which is obsolete.
Someone writing a serializer that omits tags might want to know about the obsolete elements and maybe also li/dd/dt even though that doesn't happen in conforming content.

Comment 2 (id 92250), ian, 2013-08-19 04:15:57 +0000

Yeah, that's fair enough. Should probably include all the obsolete elements too.

Comment 3 (id 92475), ian, 2013-08-22 20:25:08 +0000

See also bug 23000 and bug 23001.

Comment 4 (id 95590), ian, 2013-10-30 23:33:18 +0000

*** Bug 23000 has been marked as a duplicate of this bug. ***

Comment 5 (id 95592), ian, 2013-10-30 23:33:20 +0000

*** Bug 23001 has been marked as a duplicate of this bug. ***

Comment 6 (id 95594), ian, 2013-10-30 23:50:40 +0000

Ok the things that these three bugs are suggesting are:

- add the non-conforming elements to the list of places you could omit </p>.
- add the non-conforming combinations of thead/tfoot/tbody to the list of places you can omit those tags
- add the non-conforming <head> elements to the list of elements before which you cannot omit <body>

The theory is that a conforming serialiser might omit the wrong tag if exposed to non-conforming input.

I think that makes the most sense for the third case. For the first two, it doesn't let you omit the tag in the non-conforming cases, but that's ok, right?

I think if we add this we should be explicit that these are non-conforming cases.

(in http://html5.org/r/8248 I made the conforming cases work)

Comment 7 (id 95600), zcorpan, 2013-10-31 08:19:46 +0000

(In reply to Ian 'Hixie' Hickson from comment #6)
> I think that makes the most sense for the third case. For the first two, it
> doesn't let you omit the tag in the non-conforming cases, but that's ok,
> right?

I guess it's ok from the point of view that it gets parsed correctly. But I still think it's unexpected to serialize the tag if it can be omitted and the user asked for tags to be omitted.

> I think if we add this we should be explicit that these are non-conforming
> cases.

Sure.

Comment 8 (id 95618), ian, 2013-10-31 17:33:27 +0000

Well, it's unexpected to be serialising a non-conforming output in the first place.
My concern is that if we say "You may omit the </p> if the element after a paragraph is a <listing> element", people will read that as "you may use the <listing> element".

The more I think about this the more I feel like we shouldn't mention the non-conforming cases at all. I don't really understand the value here. We've already told people that they cannot use <bgsound> in <body>. Why would we remind them that they shouldn't omit <body> if they start with <bgsound>? They're not allowed to do that, since they're not allowed to include <bgsound> in the first place. I mean, if the concern is just that using <bgsound> is going to result in a non-round-tripped DOM, shouldn't we also say that they should never use <isindex> and <image> tags? If we're happy saying that the current text, which does indeed say that you can't use <isindex> and <image>, is enough to avoid those problems, why isn't the same text enough to avoid the problems with <bgsound>? After all, the same text in fact makes <bgsound> non-conforming in the exact same way.

Comment 9 (id 95661), zcorpan, 2013-11-01 09:02:54 +0000

(In reply to Ian 'Hixie' Hickson from comment #8)
> Well, it's unexpected to be serialising a non-conforming output in the first
> place.

If the DOM is non-conforming, it seems quite expected that the serializer outputs something non-conforming, too.

> My concern is that if we say "You may omit the </p> if the element
> after a paragraph is a <listing> element", people will read that as "you may
> use the <listing> element".

So don't say that. We already agreed to be explicit about it being non-conforming.

> The more I think about this the more I feel like we shouldn't mention the
> non-conforming cases at all. I don't really understand the value here. We've
> already told people that they cannot use <bgsound> in <body>. Why would we
> remind them that they shouldn't omit <body> if they start with <bgsound>?
The value is that people can configure their serializer to omit tags and still have the result be parsed the same as if they didn't omit tags, even for non-conforming DOMs.

> They're not allowed to do that, since they're not allowed to include
> <bgsound> in the first place. I mean, if the concern is just that using
> <bgsound> is going to result in a non-round-tripped DOM, shouldn't we also
> say that they should never use <isindex> and <image> tags? If we're happy
> saying that the current text, which does indeed say that you can't use
> <isindex> and <image>, is enough to avoid those problems, why isn't the
> same text enough to avoid the problems with <bgsound>? After all, the same
> text in fact makes <bgsound> non-conforming in the exact same way.

<isindex> and <image> roundtrip the parse->serialize->parse fine. The DOM will be the same.

Comment 10 (id 95717), ian, 2013-11-01 23:01:27 +0000

(In reply to Simon Pieters from comment #9)
> <isindex> and <image> roundtrip the parse->serialize->parse fine. The DOM
> will be the same.

But they won't survive serialise->parse->serialise.

Comment 11 (id 96553), ian, 2013-11-19 22:12:33 +0000

(In reply to Simon Pieters from comment #9)
> (In reply to Ian 'Hixie' Hickson from comment #8)
> > Well, it's unexpected to be serialising a non-conforming output in the first
> > place.
>
> If the DOM is non-conforming, it seems quite expected that the
> serializer outputs something non-conforming, too.

It's not expected that the DOM be non-conforming in software that is outputting HTML. Indeed, it's non-conforming for the DOM to be non-conforming. :-)

> > The more I think about this the more I feel like we shouldn't mention the
> > non-conforming cases at all. I don't really understand the value here. We've
> > already told people that they cannot use <bgsound> in <body>. Why would we
> > remind them that they shouldn't omit <body> if they start with <bgsound>?
>
> The value is that people can configure their serializer to omit tags and
> still have the result be parsed the same as if they didn't omit tags, even
> for non-conforming DOMs.

There's no way you can guarantee a round-trippable DOM if you start with a non-conforming DOM. If your DOM starts, for example, with a comment that contains "-->", or with an <hr> which has children elements, or with a <div> element before the <head>, or any number of other weird cases, you're not going to round-trip.

I just don't see the value here.

Comment 12 (id 96562), zcorpan, 2013-11-19 22:48:27 +0000

(In reply to Ian 'Hixie' Hickson from comment #11)
> There's no way you can guarantee a round-trippable DOM if you start with a
> non-conforming DOM.

Right.

> If your DOM starts, for example, with a comment that
> contains "-->", or with an <hr> which has children elements, or with a <div>
> element before the <head>, or any number of other weird cases, you're not
> going to round-trip.

But the parsed result in those cases will be the same whether you omit optional tags or not.

Comment 13 (id 96708), ian, 2013-11-22 18:21:58 +0000

No it won't, not necessarily. As an extreme example, take this DOM:

   #document
    |
    +-- #comment: "--><plaintext>"
    |
    +-- <html>
         |
         +-- <head>
         |
         +-- <body>
              |
              +-- <div>

If you omit tags, the result of parsing will be this DOM:

   #document
    |
    +-- #comment: ""
    |
    +-- <plaintext>
         |
         +-- #text: "<div></div>"

If you don't omit tags, it'll be:

   #document
    |
    +-- #comment: ""
    |
    +-- <plaintext>
         |
         +-- #text: "<html><head></head><body><div></div></body></html>"

Comment 14 (id 96744), zcorpan, 2013-11-23 10:47:04 +0000

Hmm, yeah OK. Do as you wish.

Comment 15 (id 96799), ian, 2013-11-25 18:43:37 +0000

Ok. In that case, I'm closing this since I think the issues with conforming markup were fixed already.
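
Editorial illustration (not part of the original thread): the mismatch reported in comment 1 can be computed directly from the two lists quoted there. The following is a minimal Python sketch under stated assumptions. The tag names are copied from comment 1 as quoted in this bug (a 2013 snapshot), while the names SPEC_OMIT_BEFORE, PARSER_CLOSES_P and may_omit_p_end_tag are hypothetical and come from neither the spec nor any library. It models only the "immediately followed by" condition; the spec quote in comment 1 is truncated, so any further conditions are deliberately left out.

```python
# Tag names before which the spec text quoted in comment 1 allows </p>
# to be omitted. The quote is truncated there, so other conditions the
# spec may list are not modelled.
SPEC_OMIT_BEFORE = frozenset({
    "address", "article", "aside", "blockquote", "dir", "div", "dl",
    "fieldset", "footer", "form", "h1", "h2", "h3", "h4", "h5", "h6",
    "header", "hgroup", "hr", "main", "menu", "nav", "ol", "p", "pre",
    "section", "table", "ul",
})

# Start tags that make the parser "close a p element", as listed in
# comment 1 (includes obsolete and non-conforming cases).
PARSER_CLOSES_P = frozenset({
    "address", "article", "aside", "blockquote", "center", "details",
    "dialog", "dir", "div", "dl", "fieldset", "figcaption", "figure",
    "footer", "header", "hgroup", "main", "menu", "nav", "ol", "p",
    "section", "summary", "ul",
    "h1", "h2", "h3", "h4", "h5", "h6",
    "pre", "listing",
    "form", "li", "dd", "dt", "plaintext", "table", "hr", "xmp",
})


def may_omit_p_end_tag(next_tag, allow_nonconforming=False):
    """Hypothetical helper: may a serializer drop </p> before next_tag?

    With allow_nonconforming=False only the spec list is consulted; with
    True, every start tag that closes a p element in the parser counts.
    """
    allowed = PARSER_CLOSES_P if allow_nonconforming else SPEC_OMIT_BEFORE
    return next_tag.lower() in allowed


# Tags the parser handles but the quoted spec text does not mention,
# and the reverse (expected to be empty for the lists above).
missing_from_spec = sorted(PARSER_CLOSES_P - SPEC_OMIT_BEFORE)
extra_in_spec = sorted(SPEC_OMIT_BEFORE - PARSER_CLOSES_P)

if __name__ == "__main__":
    print("closes <p> in the parser but not in the spec list:", missing_from_spec)
    print("in the spec list but not closing <p> in the parser:", extra_in_spec)
```

A serializer that follows only the spec list would keep </p> before, say, <listing>, which still parses correctly. That is the behaviour comment 6 describes as acceptable and comment 7 calls unexpected.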