This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 26951 - why do these examples of <html> lack the lang attribute?
Summary: why do these examples of <html> lack the lang attribute?
Status: RESOLVED FIXED
Alias: None
Product: HTML WG
Classification: Unclassified
Component: HTML5 spec
Version: unspecified
Hardware: Other other
Importance: P3 normal
Target Milestone: ---
Assignee: steve faulkner
QA Contact: HTML WG Bugzilla archive list
URL: https://html.spec.whatwg.org/#structu...
Whiteboard:
Keywords:
Depends on: 26942
Blocks:
Reported: 2014-10-02 09:06 UTC by steve faulkner
Modified: 2014-10-02 13:17 UTC (History)

Description steve faulkner 2014-10-02 09:06:28 UTC
+++ This bug was initially created as a clone of Bug #26942 +++

Specification: https://html.spec.whatwg.org/multipage/introduction.html
Multipage: https://html.spec.whatwg.org/multipage/#structure-of-this-specification
Complete: https://html.spec.whatwg.org/#structure-of-this-specification
Referrer: https://html.spec.whatwg.org/multipage/

Comment:
why do these examples of <html> lack the lang attribute?

Posted from: 24.22.56.84
User agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:32.0) Gecko/20100101 Firefox/32.0
Comment 1 steve faulkner 2014-10-02 09:37:43 UTC
Regardless of how many people do it, it's best practice and useful for user agents such as assistive technology (AT) that use the lang attribute to load the correct pronunciation dictionaries for a page. Thanks for calling this out, easily fixed. https://github.com/w3c/html/commit/fd501aa4b6167338bd994609e89d267fd2f1b422 

A grep of data from 2013 indicates that lang use is widespread: https://docs.google.com/spreadsheet/ccc?key=0AlVP5_A996c5dENJVkl4ZngxS0ZTZHVvbHdQYWQ2Zmc&usp=sharing
Comment 2 Simon Pieters 2014-10-02 10:07:10 UTC
It doesn't tell you how often lang is used correctly. lang="en" in particular is often used on non-English pages due to copy/paste from "best practice" examples...
Comment 3 steve faulkner 2014-10-02 10:12:03 UTC
(In reply to Simon Pieters from comment #2)
> It doesn't tell you how often lang is used correctly. lang="en" in
> particular is often used on non-English pages due to copy/paste from "best
> practice" examples...

I am in the process of looking at data to check usage, and will update or add advice as appropriate.
Comment 4 steve faulkner 2014-10-02 12:37:16 UTC
(In reply to Simon Pieters from comment #2)
> It doesn't tell you how often lang is used correctly. lang="en" in
> particular is often used on non-English pages due to copy/paste from "best
> practice" examples...

So I did some digging on the latest available data from webdevdata (around 100,000 pages) and found that approx. 1 in 3 pages (33,000) had at least one lang attribute. I manually perused the code of approx. 100 of those, looking at how it was used. I found that in approx. 95%+ of them the lang attribute correctly reflected the language of the page.
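
For readers who want to repeat this kind of check, here is a minimal sketch in Python of counting pages that carry at least one lang attribute. It is not the method used for the webdevdata figures above; the names LangCollector, lang_attributes and share_with_lang are hypothetical, and the pages are assumed to be already loaded as strings. Whether a declared language matches the page content still needs the manual review described above.

```python
from html.parser import HTMLParser


class LangCollector(HTMLParser):
    """Collects every lang attribute value found in a document (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.langs = []  # (tag, lang value) pairs in document order

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        for name, value in attrs:
            if name == "lang":
                self.langs.append((tag, value))


def lang_attributes(markup):
    """Return the (tag, lang) pairs declared in the given HTML string."""
    collector = LangCollector()
    collector.feed(markup)
    collector.close()
    return collector.langs


def share_with_lang(pages):
    """Fraction of pages that carry at least one lang attribute."""
    if not pages:
        return 0.0
    hits = sum(1 for page in pages if lang_attributes(page))
    return hits / len(pages)
```

Calling share_with_lang on a list of page sources gives the "1 in 3" style figure; lang_attributes on a single page shows which elements declare a language.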
Comment 5 Simon Pieters 2014-10-02 12:42:02 UTC
How many of those were non-English content?
Comment 6 steve faulkner 2014-10-02 12:42:44 UTC
(In reply to Simon Pieters from comment #5)
> How many of those were non-English content?

approx a 3rd
Comment 7 Simon Pieters 2014-10-02 12:58:45 UTC
So on github...

https://github.com/search?l=html&q=%28búsqueda+OR+nombre+OR+contraseña%29&ref=searchresults&type=Code&utf8=✓

4,341,523 Spanish HTML pages with <html lang...>

https://github.com/search?l=html&q="html+lang+en"+%28búsqueda+OR+nombre+OR+contraseña%29&ref=searchresults&type=Code&utf8=✓

4,142,691 of those (95%) specify <html lang=en>

https://github.com/search?l=html&q="html+lang+es"+%28búsqueda+OR+nombre+OR+contraseña%29&ref=searchresults&type=Code&utf8=✓

87,594 (2%) specify <html lang=es>
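
The percentages above can be reproduced from the quoted result counts. This is only a sketch of the arithmetic, taking the three GitHub search counts exactly as stated in this comment; the variable names and the share helper are illustrative.

```python
# Result counts as quoted in the comment above.
total_spanish_with_lang = 4_341_523
with_lang_en = 4_142_691
with_lang_es = 87_594


def share(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


en_share = share(with_lang_en, total_spanish_with_lang)  # 95.4
es_share = share(with_lang_es, total_spanish_with_lang)  # 2.0

# Pages matching neither lang=en nor lang=es (other or differently written values).
remaining = total_spanish_with_lang - with_lang_en - with_lang_es
remaining_share = share(remaining, total_spanish_with_lang)  # 2.6

print(en_share, es_share, remaining_share)
```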
Comment 8 Simon Pieters 2014-10-02 13:13:08 UTC
(In reply to Simon Pieters from comment #7)
> So on github...
> 
> https://github.com/search?l=html&q=%28búsqueda+OR+nombre+OR+contraseña%29&ref=searchresults&type=Code&utf8=✓

Sorry, wrong link.

https://github.com/search?l=html&q="html+lang"+%28búsqueda+OR+nombre+OR+contraseña%29&ref=searchresults&type=Code&utf8=✓
Comment 9 steve faulkner 2014-10-02 13:17:30 UTC
(In reply to Simon Pieters from comment #7)
> So on github...
> 
> https://github.com/search?l=html&q=%28búsqueda+OR+nombre+OR+contraseña%29&ref=searchresults&type=Code&utf8=✓
> 
> 4,341,523 Spanish HTML pages with <html lang...>
> 
> https://github.com/search?l=html&q="html+lang+en"+%28búsqueda+OR+nombre+OR+contraseña%29&ref=searchresults&type=Code&utf8=✓
> 
> 4,142,691 of those (95%) specify <html lang=en>
> 
> https://github.com/search?l=html&q="html+lang+es"+%28búsqueda+OR+nombre+OR+contraseña%29&ref=searchresults&type=Code&utf8=✓
> 
> 87,594 (2%) specify <html lang=es>

I am sure you can find all sorts of cruft on GitHub. I think it's more worthwhile to look at published pages actually used by the masses, rather than GitHub files generally only used/viewed by the person who created them.