24799 – Text-content elements in HTML

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	RESOLVED WORKSFORME

Alias:	None

Product:	HTML Checker
Classification:	Unclassified
Component:	General
Version:	unspecified
Hardware:	PC Windows NT

Importance:	P2 normal
Target Milestone:	---
Assignee:	Michael[tm] Smith
QA Contact:	qa-dev tracking

Reported:	2014-02-25 13:08 UTC by Andrea Rendine
Modified:	2014-02-26 21:43 UTC
CC List:	2 users

Description Andrea Rendine 2014-02-25 13:08:18 UTC

The validator does not flag as incorrect markup present inside <title>, <textarea> and presumably other text-content elements in HTML document, as expected by the spec. But it does flag them so in XHTML documents.
Since the presence of markup inside these elements constitutes both a risk and a reason for different DOM trees in documents of either type, please correct it ASAP.

Comment 1 Michael[tm] Smith 2014-02-26 21:08:41 UTC

(In reply to Andrea Rendine from comment #0)
> The validator does not flag as incorrect markup present inside <title>,
> <textarea>

That's because inside those elements in text/html there is no such thing as markup. Any characters that look like markup within those elements are actually just text.

> and presumably other text-content elements in HTML document, as
> expected by the spec.

No, what's expected by the spec is that all characters in those elements are handled as text, not as markup.
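
As a rough illustration of the point above, here is a minimal Python sketch of how an HTML parser collects the contents of an element like <title> or <textarea>. It is not the validator's code: the function name `rcdata_text` is hypothetical, and the sketch ignores details such as the leading newline that is dropped in <textarea>. The idea is only that nothing except a matching end tag terminates the element, so tag-like and comment-like characters end up as text.

```python
import html
import re


def rcdata_text(source: str, tag: str) -> str:
    """Return the text a simplified HTML parser would collect for <tag>.

    Assumes `source` begins right after the start tag and that `tag`
    names an element whose content is text only (title, textarea).
    """
    # Only an end tag with the same name closes the element. The name
    # must be followed by whitespace, "/" or ">" to count as an end tag.
    end = re.search(r"</" + re.escape(tag) + r"(?=[\s/>])", source, re.IGNORECASE)
    raw = source if end is None else source[: end.start()]
    # Character references are still decoded; everything else is literal.
    return html.unescape(raw)


# The <b> tags and the comment are plain characters in the result.
text = rcdata_text("Hello <b>world</b> <!-- x --> &amp; more</title><p>", "title")
print(text)
```

Under these assumptions the result is a single run of text, with no child elements to report.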

> But it does flag them so in XHTML documents.

That's because in XML, characters that look like markup are in fact always markup. So in XML documents, markup inside <title> and <textarea> creates child elements, which those elements can't contain.
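
For comparison, a small sketch using Python's standard `xml.etree.ElementTree` as a stand-in for an XML parser (an assumption made for illustration; the validator uses its own XML toolchain). The same characters inside <title> now produce a real child element.

```python
import xml.etree.ElementTree as ET

# Parsed as XML, the <b>...</b> inside <title> becomes a child element.
title = ET.fromstring("<title>Hello <b>world</b></title>")

child_tags = [child.tag for child in title]
print(title.text)   # text before the first child element
print(child_tags)   # names of the child elements
```

Here `title` has one child element, `b`, which is what a checker for XHTML documents can then report as not allowed.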

> Since the presence of markup inside these elements constitutes both a risk
> and a reason for different DOM trees in documents of either type, please
> correct it ASAP.

No, please read the spec more carefully and take time to actually understand it.

The validator is following the spec here. So it sounds like you don't like the behavior that the spec requires. Which is to say, you don't like the way that browsers actually handle the contents of <title> and <textarea> elements.

Comment 2 Andrea Rendine 2014-02-26 21:23:54 UTC

I have read and clearly understood the spec, but thank you for your kind proposal.

That said, there is a slight difference in meaning between
[[The HTML parser treats markup inside iframe elements as text.]](iframe)
which EXPLICITLY states the behavior of UAs in regard to an element whose content is considered text no matter what characters are used, and
[[The IDL attribute text must return a concatenation of the contents of all the Text nodes that are children of the title element (ignoring any other nodes such as comments *or elements*), in tree order. On setting, it must act the same way as the textContent IDL attribute.]](title)
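
The getter described in that second quote can be sketched in Python with the standard `xml.dom.minidom` module. This is only an illustration under stated assumptions: the helper name `title_text` is hypothetical, and the title is parsed as XML here because that is one way (DOM scripting being another) to end up with a title element that has non-Text children.

```python
from xml.dom import Node, minidom


def title_text(title) -> str:
    """Concatenate the data of the Text nodes that are direct children
    of `title`, in tree order, skipping comments and elements."""
    return "".join(
        child.data for child in title.childNodes
        if child.nodeType == Node.TEXT_NODE
    )


doc = minidom.parseString("<title>Bug <!-- note -->24799 <i>draft</i>report</title>")
result = title_text(doc.documentElement)
print(result)  # prints "Bug 24799 report"
```

The comment and the <i> element, including the text inside <i>, are left out of the result.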

If you are right, then tell me what kind of element could ever be inside <title>. Or <textarea>, for that matter.
Or the 'text' IDL attribute is required to return something different than the actual title of the document (i.e. the one shown in the browser's window/tab)? First off, the browsers' 'text' IDL attribute implementation does not seem to honour the requested behaviour. Or the sentence "Text nodes that are children of the title element (ignoring any other nodes such as comments *or elements*)" is meant to be completed with the sentence "actually everything inside a <title> is text, except for comments (but it is not stated)".

Comment 3 Michael[tm] Smith 2014-02-26 21:37:33 UTC

(In reply to Andrea Rendine from comment #2)

> [[The IDL attribute text must return a concatenation of the contents of all
> the Text nodes that are children of the title element (ignoring any other
> nodes such as comments *or elements*), in tree order. On setting, it must
> act the same way as the textContent IDL attribute.]](title)

What possible relevance does that have to the behavior of the validator?

When you give a document to the validator to check, it is impossible for that document to ever have a title element that contains any child elements.

> If you are right, then tell me what kind of element could ever be inside
> <title>. Or <textarea>, for that matter.

You filed this bug against the validator. That question has zero relevance to the behavior of the validator. If you're curious about it, file a bug against the spec or something.

For the purposes of documents you check with the validator, no elements can ever be a child of <title> or <textarea>.

Again the statement you cite above has absolutely no relevance to the behavior of the validator. So I don't know why you want to continue a discussion about this here.

> Or the 'text' IDL attribute is required to return something different than
> the actual title of the document (i.e. the one shown in the browser's
> window/tab)? First off, the browsers' 'text' IDL attribute implementation
> does not seem to honour the requested behaviour. Or the sentence "Text nodes
> that are children of the title element (ignoring any other nodes such as
> comments *or elements*)" is meant to be completed with the sentence
> "actually everything inside a <title> is text, except for comments (but it
> is not stated)".

Again, you filed this bug report against the validator. But I've told you already there's no bug in the validator here. So at this point I have no idea what else you expect me to do in response to this.

Comment 4 Andrea Rendine 2014-02-26 21:43:46 UTC

So there are "elements in <title> for the purpose of IDL" and "elements in <title> for the purpose of validation". I see. I thought that elements are elements for every purpose. And that elements were not allowed in <title> for any purpose. Got it, I'll refer to the spec.