This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.
Specification: http://www.whatwg.org/specs/web-apps/current-work/multipage/text-level-semantics.html
Multipage: http://www.whatwg.org/C#head
Complete: http://www.whatwg.org/c#head
Comment: The <code> element should get a dedicated attribute for describing the computer language
Posted from: 92.79.191.201
User agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:11.0) Gecko/20100101 Firefox/11.0
I believe that reserving part of the value space of class names for a dedicated purpose is not a good way to implement standards. There should be a dedicated string-valued attribute for declaring the computer language used. The attribute should have a dedicated name, such as "clang" or "clanguage", and its values should be MIME types, so that XPath expressions can easily and reliably find the appropriate HTML <code> elements for syntax highlighting.
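For illustration only, here is a minimal Python sketch of the kind of lookup the proposal has in mind. It assumes the hypothetical "clang" attribute suggested above (it is not part of HTML) and a well-formed XML serialization of the page; the sample markup, the MIME type strings, and the helper name `find_code_by_mime` are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: "clang" is the attribute name proposed in this bug,
# not an attribute defined by HTML.
DOC = """<div>
  <p>Example: <code clang="text/x-python">print(1)</code></p>
  <p>Another: <code clang="text/css">a { color: red }</code></p>
  <p>Unmarked: <code>foo</code></p>
</div>"""


def find_code_by_mime(root, mime):
    """Return all <code> descendants whose clang attribute equals mime.

    Uses the XPath subset supported by ElementTree; mime must not
    contain a single quote, since it is pasted into the expression.
    """
    return root.findall(f".//code[@clang='{mime}']")


root = ET.fromstring(DOC)
python_blocks = [el.text for el in find_code_by_mime(root, "text/x-python")]
css_blocks = [el.text for el in find_code_by_mime(root, "text/css")]
print(python_blocks, css_blocks)
```

With a dedicated attribute the match is a plain equality test on one attribute value, which is the property the proposal is after.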
Axel: What's the use case?
Ian, I actually see three use cases here:
1. Users and applications would not need extra checks to test whether a class name assigned to a <code> element is prefixed with "language-". If a class name prefix were used, Web design applications such as DreamWeaver would need special code that scans the HTML for this prefix and warns the user not to use class names starting with "language-" on <code> elements, in order to avoid unexpected behaviour in particular display applications, e.g. web browsers.
2. With a dedicated attribute, code that implements syntax highlighting could select its parser filters from that attribute rather than through string manipulation, which would speed up processing and reduce the risk of errors.
3. It would provide a structured HTML language design and allow for robust, well-structured programming.
1 isn't a use case, since if there's no reason to mark the language, there's no reason to avoid the class names. It's just something that you have to worry about _if_ there is a use case. I disagree with the premise of 2; string manipulation of the class attribute is trivial and wouldn't affect performance or reliability, IMHO, at least not compared to a dedicated attribute which has its own costs. 3 isn't a use case, it's just a design philosophy. It would apply if there was a use case, but not if there wasn't. Implementing syntax highlighting is a use case, but it's not clear to me that it happens enough to warrant a dedicated attribute. Browsers haven't shown any interest in implementing dedicated syntax highlighting, and scripts can already do it fine as it is.
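To make the disagreement over point 2 concrete, the string handling in question can be sketched in a few lines of Python. This is only an illustration of the "language-" prefix technique discussed in this thread; the function name `language_from_class` and the sample class values are made up for the example.

```python
def language_from_class(class_attr, prefix="language-"):
    """Return the language named by the first prefixed class token, or None.

    class_attr is the raw value of a class attribute, i.e. a
    whitespace-separated list of class names.
    """
    for token in class_attr.split():
        # Ignore a bare "language-" token that names no language.
        if token.startswith(prefix) and len(token) > len(prefix):
            return token[len(prefix):]
    return None


print(language_from_class("example language-pascal"))
print(language_from_class("note"))
```

Whether this counts as trivial or as error-prone is exactly what the two sides of this thread disagree about.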
I disagree. Here's an example for #1:

<html>
<head><title></title>
<style>
.language-header {}
.language-item {}
.language-footer {}
</style>
</head>
<body>
<div>
<div class="language-header">The book is available in these languages:</div>
<div><code class="language-item">DE</code></div>
<div><code class="language-item">EN</code></div>
<div><code class="language-item">FR</code></div>
<div class="language-footer">(Not for resale)</div>
</div>
</body>
</html>

Regardless of the fact that there currently is no language called "item", a design application would always have to parse the html code for <code> elements having a class name beginning with "language-" and it would have to warn the user that some browsers might display unexpected results.
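A rough Python sketch of the check such a design application would need, using the standard library's html.parser, could look like this. The class name `CodeLanguageScanner` and the shortened sample markup are made up for the example; it is not taken from any existing tool.

```python
from html.parser import HTMLParser


class CodeLanguageScanner(HTMLParser):
    """Collect "language-" prefixed class names found on <code> elements."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # Only <code> elements matter; a <div> may use the prefix freely.
        if tag != "code":
            return
        # The value is None when the attribute is written without a value.
        class_attr = dict(attrs).get("class") or ""
        for token in class_attr.split():
            if token.startswith("language-"):
                self.found.append(token)


# Shortened version of the markup in the comment above.
SAMPLE = """
<div class="language-header">The book is available in these languages:</div>
<div><code class="language-item">DE</code></div>
<div><code class="language-item">EN</code></div>
<div class="language-footer">(Not for resale)</div>
"""

scanner = CodeLanguageScanner()
scanner.feed(SAMPLE)
print(scanner.found)
```

Only the two <code> elements are reported; the header and footer <div> elements use the same prefix but are skipped.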
No browsers will show unexpected results because no browsers will do anything with these class names. The language-* class names are just a suggested convention, they're not a defined semantic.
Yes, from today's point of view. But given the Firefox Web Developer menu items or the Internet Explorer F12 Developer Tools, there *may* be in the future. And if they don't, some future add-on might, and it could become a standard tool. So, still, using class names for dedicated purposes is a bad design decision for a worldwide standard.
Any such tools would be non-conforming. There's no way to tell what language a <code>'s contents are in without having coordinated with the page author. Now, if there are tools such as those you describe who want to implement that kind of thing, then that's a different matter, and we can at that point add such a feature. Are they interested in implementing such a feature?
> Any such tools would be non-conforming.

Yes, today. But not if you are going to define a mechanism for automatically determining a code's language, no matter how you do it. From that moment on, such tools are conforming by definition.

> There's no way to tell what language a <code>'s contents are in without having coordinated with the page author.

In the HTML5 description it reads: "authors who wish to mark code elements with the language used" ...

I don't understand the gap. So what's the mechanism described above for, then? Anything other than telling what language a <code>'s contents are in?

> Now, if there are tools such as those you describe who want to implement that kind of thing, then that's a different matter, and we can at that point add such a feature. Are they interested in implementing such a feature?

You're asking the wrong guy here. You should put this question to them. Just a few links:
http://code.google.com/p/google-code-prettify/
http://dense13.com/blog/2008/08/17/new-javascript-syntax-highlighter-shjs/
> In the HTML5 description it reads: "authors who wish to mark code elements
> with the language used" ...
>
> I don't understand the gap. So what's the mechanism described above for then?
> Anything else than telling what language a <code>'s contents are?

That's just documenting a possible way authors can mark this up for their own use. I've tried to make this clearer.

> Just a few links:
>
> http://code.google.com/p/google-code-prettify/
> http://dense13.com/blog/2008/08/17/new-javascript-syntax-highlighter-shjs/

These all seem to just be scripts that work within the page, so they don't need a standard way to do things; they just need to document a convention for the author to use.
Checked in as WHATWG revision r7682. Check-in comment: Clarify that this is not a convention, just a possible technique for the author's own use. http://html5.org/tools/web-apps-tracker?from=7681&to=7682
Please reopen this bug if there are use cases (that is, if someone wants to write software that needs to know the programming language of the contents of a <code> block and yet cannot coordinate with the author, e.g. because it's a browser or search engine and not a script that the author chooses and embeds).