This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019. Please see the home page for more details.

Bug 21553 - place further document-conformance constraints on use of <main>
Status: RESOLVED WONTFIX
Alias: None
Product: WHATWG
Classification: Unclassified
Component: HTML
Version: unspecified
Hardware: PC All
Importance: P2 normal
Target Milestone: Unsorted
Assignee: Ian 'Hixie' Hickson
QA Contact: contributor
 
Reported: 2013-04-02 17:53 UTC by Michael[tm] Smith
Modified: 2013-06-27 09:58 UTC

Description Michael[tm] Smith 2013-04-02 17:53:32 UTC
The current HTML LS spec allows the <main> element to be used anywhere where flow content is expected. But that seems too liberal. It would seem to make more sense to more strictly constrain documents to using <main> in a way such that it really is used only for something "main".

The W3C HTML spec places the following additional constraints on <main>:

1. Must not have any article, aside, footer, header or nav element ancestors.
2. Must not include more than one <main> element in a document.

Those strike me as sane constraints, and the validator (validator.nu backend) currently implements those additional constraints.
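
(Editorial illustration, not part of the original comment.) The two W3C constraints listed above can be sketched as a small document checker. This is only a rough sketch using Python's standard html.parser; the names MainChecker and check_main are hypothetical, the error strings are made up for the example, and it is not how the validator.nu backend implements the checks.

```python
from html.parser import HTMLParser

# Ancestors the W3C HTML spec forbids for <main>, as listed above.
FORBIDDEN_ANCESTORS = {"article", "aside", "footer", "header", "nav"}
# Void elements never get an end tag, so they are not pushed on the stack.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "source", "track", "wbr"}


class MainChecker(HTMLParser):
    """Hypothetical checker for the two constraints; illustration only."""

    def __init__(self):
        super().__init__()
        self.stack = []       # currently open elements, outermost first
        self.main_count = 0   # number of <main> start tags seen so far
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag == "main":
            self.main_count += 1
            if self.main_count > 1:
                self.errors.append("more than one <main> in the document")
            bad = [t for t in self.stack if t in FORBIDDEN_ANCESTORS]
            if bad:
                self.errors.append(f"<main> has forbidden ancestor <{bad[-1]}>")
        if tag not in VOID_ELEMENTS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open element, tolerating omitted end tags.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def check_main(html):
    """Return a list of constraint violations found in the markup."""
    checker = MainChecker()
    checker.feed(html)
    checker.close()
    return checker.errors
```

Under the more liberal rules argued for later in this bug, the "more than one" check and the "article" entry would simply be dropped from such a checker.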

I recognize that those requirements essentially constrain <main> to only being used to mark up the main content of a document -- rather than, say, also being used multiple times to mark up the main contents of parts of a document. But I have to say that I've never seen any commenters ask for a "main" element for marking up, say, the main part of a section or an article -- whereas I have seen many authors asking for a "main" element specifically for the purpose of marking up the main contents of a document, once per document (and have seen many existing documents that are marked up in such a way, with, e.g., id=main).
Comment 1 Ian 'Hixie' Hickson 2013-04-02 17:59:29 UTC
How about a blog where each article has a header, footer, and main portion?

People are going to use <main> just as a way to style the body of an article, the way they use <div class=main> or <div class=content> now, it doesn't make much sense to prevent it in <article> or to prevent there from being more than one, as far as I can tell. What harm does having more than one cause? It's not like there can be only one <h1>, and the UI for jumping to <h1> is basically the same as the UI for jumping to <main>, right?

Not having an <aside>, <footer>, <header>, or <nav> ancestor makes sense, though.
Comment 2 Michael[tm] Smith 2013-04-02 23:51:55 UTC
High-level comment: I'd suggest that we're better off starting out with stricter constraints on usage of <main> now (e.g., a requirement that <main> can be used only once per document). Then we can relax those constraints later if it becomes clear that authors are having trouble following the constraints, or are actively ignoring them.

It would be much more difficult to do the opposite: That is, it'd be difficult to change the spec to make the constraints stricter later, if we start out with more liberal usage rules now.

So I'd suggest we try converging on the constraints that are in the W3C HTML spec, and give it a try for a while, and if over time usage in the wild turns out to indicate we picked wrong initially, we can fix it by loosening up the constraints later.

(In reply to comment #1)
> How about a blog where each article has a header, footer, and main portion?
> 
> People are going to use <main> just as a way to style the body of an
> article, the way they use <div class=main> or <div class=content> now,

Well if people actually do that already with class=main, they can keep doing it and everything will work fine. But just because they are doing that I don't think it necessarily follows that they're going to replace that markup with <main> now that it's available.

And the thing is, do we want authors to do that? I'm asking. I mean, do you think it would be a good thing or a bad thing if they did that with <main>? IMHO, it'd be a bad thing and we don't want them to do it, and we want to discourage them from using it that way -- discourage it now, not wait until later.

So making that usage of <main> invalid and having the validator report it as an error would be a good way to discourage it and instead encourage them to use it only to mark up the main content of the overall document.


> it
> doesn't make much sense to prevent it in <article> or to prevent there from
> being more than one, as far as I can tell.

I guess I have to disagree and say that I think it does in fact make sense to try to prevent it from being used that way.

> What harm does having more than one cause?

If it's allowed to be used only once, it has special value as a mechanism uniquely for unambiguously indicating the main content of the overall document. If it's allowed to be used multiple times, then it loses the unique value, and what it's being used for instead becomes ambiguous.

> It's not like there can be only one <h1>, and the UI for jumping
> to <h1> is basically the same as the UI for jumping to <main>, right?

No, because HTML has never defined a constraint to say that H1 can only be used once per document. If it had, what you said might hold true. But because it never has, it doesn't.

In contrast, we have an opportunity with <main> to define it as document-wide unique now, before it starts to get used widely. And we have the opportunity to enforce that constraint in the validator so that authors actually follow it instead of ignoring it.

So I would suggest that we go with the stricter constraints for now, and give that a try, and plan to revisit this at some point later to see how authors end up actually using <main> in practice.

> Not having an <aside>, <footer>, <header>, or <nav> ancestor makes sense,
> though.

OK
Comment 3 Ian 'Hixie' Hickson 2013-04-11 23:05:29 UTC
> > How about a blog where each article has a header, footer, and main portion?
> > 
> > People are going to use <main> just as a way to style the body of an
> > article, the way they use <div class=main> or <div class=content> now,
> 
> Well if people actually do that already with class=main, they can keep doing
> it and everything will work fine. But just because they are doing that I
> don't think it necessarily follows that they're going to replace that markup
> with <main> now that it's available.

Why not? That's exactly what's happened with all the other new elements. People switched from <div class="post"> to <article>, from <div class="sidebar"> to <aside>, from <div class="nav"> to <nav>, etc.


> And the thing is, do we want authors to do that? I'm asking.

Sure, why not?

I think the element isn't compelling enough to have in the spec, much like <cite> or <samp>. But given that browsers support it, it seems reasonable, even desirable, to look at what authors are doing which could use an element with this name, and use it for that purpose. Much like with <cite> we looked at what people did (namely, give names of works) and then defined <cite> to fit that broad definition, it makes sense, IMHO, to look at what people do here (mark up the "content" part of articles, etc) and define <main> to fit that broad definition.


> I mean, do you
> think it would be a good thing or a bad thing if they did that with <main>?

I don't see any harm, if browsers support it.


> IMHO, it'd be a bad thing and we don't want them to do it, and we want to
> discourage them from using it that way

Why? What harm does it cause?


> So making that usage of <main> invalid and having the validator report it
> as an error would be a good way to discourage it and instead encourage them
> to use it only to mark up the main content of the overall document.

I don't understand what you mean by "main content of the overall document" if not the kinds of thing people are marking up with class="main".


> If it's allowed to be used only once, it has special value as a mechanism
> uniquely for unambiguously indicating the main content of the overall
> document.

The "main content" can be distributed across multiple parts of the document.

See, for instance, http://cnn.com/, or http://googlepublicpolicy.blogspot.com/, or http://www.w3.org/. Is there really only one range in the DOM that counts as "main" content? I think there's a number of places in all those documents where you'd want (as a user) to be able to jump to when asking for "main content".


> If it's allowed to be used multiple times, then it loses the unique
> value, and what it's being used for instead becomes ambiguous.

I don't really see what value there is in it being unique per document.


> > It's not like there can be only one <h1>, and the UI for jumping
> > to <h1> is basically the same as the UI for jumping to <main>, right?
> 
> No, because HTML has never defined a constraint to say that H1 can only be
> used once per document. If it had, what you said might hold true. But
> because it never has, it doesn't.

How is the UI different?


> In contrast, we have an opportunity with <main> to define it as
> document-wide unique now, before it starts to get used widely. And we have
> the opportunity to enforce that constraint in the validator so that authors
> actually follow it instead of ignoring it.

I agree that we have that opportunity, I just see no value in it.
Comment 4 contributor 2013-04-11 23:06:54 UTC
Checked in as WHATWG revision r7817.
Check-in comment: Restrict <main> from having <aside>, <footer>, <header>, or <nav> ancestors
http://html5.org/tools/web-apps-tracker?from=7816&to=7817
Comment 5 Michael[tm] Smith 2013-04-13 22:47:26 UTC
(In reply to comment #3)
> > Well if people actually do that already with class=main, they can keep doing
> > it and everything will work fine. But just because they are doing that I
> > don't think it necessarily follows that they're going to replace that markup
> > with <main> now that it's available.
> 
> Why not? That's exactly what's happened with all the other new elements.
> People switched from <div class="post"> to <article>, from <div
> class="sidebar"> to <aside>, from <div class="nav"> to <nav>, etc.

I guess the difference is that it's not clear to me that people actually use class=main multiple times in the same document, or even if they do, if they also use id=main and separately from whatever they have marked up with class=main. In which case I can imagine that they'd switch to using <main> for whatever they were using id=main for previously, and just leave the class=main stuff as it is.
 
> > And the thing is, do we want authors to do that? I'm asking.
> 
> Sure, why not?

Well, I guess it comes down to whether it's useful to have a single element that uniquely identifies the main content of a document. A lot of people seem to believe that it is. Which is what actually led to the element being added to implementations and to the spec. From what I recall in the discussions at least, what most of the commenters were asking for was a new element to mark up the main content of a document -- not an element to mark up the main content of each major part of a document.

> I think the element isn't compelling enough to have in the spec, much like
> <cite> or <samp>. But given that browsers support it, it seems reasonable,
> even desirable, to look at what authors are doing which could use an
> element with this name, and use it for that purpose. Much like with <cite>
> we looked at what people did (namely, give names of works) and then defined
> <cite> to fit that broad definition,

Actually, that's not strictly what you did at all. There are a lot of people out there who also use <cite> to mark up the names of people, not works. You know that quite well. And they asked you to (re)define the authoring-conformance requirements for <cite> such that using it to mark up the names of people was conforming. However, you decided to keep the conformance requirements more strict, and make it non-conforming to use <cite> to mark up names of people. (And FWIW, I happen to agree with you that the spec should not make it conforming for marking up names of people.)

So yeah the case with <main> is very much like that case with <cite> -- it's just not like it in the way you're asserting. Instead it's like <cite> in that we have an opportunity to make the conformance requirements on <main> stricter than what some authors might like for them to be, and not just base the requirements for <main> on how some authors would prefer to (mis)use it.

> it makes sense, IMHO, to look at what
> people do here (mark up the "content" part of articles, etc) and define
> <main> to fit that broad definition.

Sure it makes sense to do that. But why does it make any more sense to have stricter conformance requirements for <cite> than what some (many) authors actually lean toward using it for in practice, but not do the same thing for <main>?

> > I mean, do you
> > think it would be a good thing or a bad thing if they did that with <main>?
> 
> I don't see any harm, if browsers support it.

In what way would the browser support for handling of <main> need to be any different if a document used <main> multiple times instead of a single time? The only relevant browser handling I'm aware of is what browsers expose to the accessibility tree, and in that case they expose it with the "main" landmark role, which as I understand is meant to be a role that only occurs once per document.

So as you've pointed out yourself, encouraging authors to use <main> multiple times in a document breaks the expectations for AT about what "main" content is. 

> > IMHO, it'd be a bad thing and we don't want them to do it, and we want to
> > discourage them from using it that way
> 
> Why? What harm does it cause?

I think the principal harm it can cause is that having <main> multiple times in a document can result in a worse user experience for AT users.

> > So making that usage of <main> invalid and having the validator report it
> > as an error would be a good way to discourage it and instead encourage them
> > to use it only to mark up the main content of the overall document.
> 
> I don't understand what you mean by "main content of the overall document"
> if not the kinds of thing people are marking up with class="main".

id=main, not class=main.

> > If it's allowed to be used only once, it has special value as a mechanism
> > uniquely for unambiguously indicating the main content of the overall
> > document.
> 
> The "main content" can be distributed across multiple parts of the document.

Oh really? Do you reckon most Web authors see it that way? Based on the markup examples I've seen and the comments I've seen in the discussion about this element over the years, they don't in fact see it that way. Or at least they don't mark up their content in such a way as to indicate that the so-called "main" content is distributed across multiple parts of a document.

> See, for instance, http://cnn.com/, or
> http://googlepublicpolicy.blogspot.com/, or http://www.w3.org/. Is there
> really only one range in the DOM that counts as "main" content? I think
> there's a number of places in all those documents where you'd want (as a
> user) to be able to jump to when asking for "main content".

What places, exactly? For each of those pages, I would think that there is one single part that somebody would want to mark up with <main>. How much that part contains is up to the author to decide. For example, given a page with three-column layout, for some authors, the main content might be all three columns. For some other authors, it might just be the middle column.

To look at a couple of the examples you gave:

In the case of the Google Public Policy Blog, there's a part that's marked up there now as <div id="main-wrapper"> that contains just the middle column, and that's within a <div id="content-wrapper"> (which also contains the sidebar on the right). And note it's using the id attribute: id=main-wrapper, not class=main-wrapper.

In the case of cnn.com, there's a <div id="cnn_maintoplive"> that marks up the contents of the middle column only, and there's a <div id="cnn_maintopt1"> that marks up the contents of the middle column plus the left sidebar, and then above that, there's a <div id="cnn_maincntnr"> that contains all the page contents after the header. But those three different divs are not marking up multiple parts of the document as the "main" part of the document. I mean, the author of that page is not going to replace each of them with a <main> element. Instead the author is going to replace only one of those divs with the <main> element. Which div, I don't know. But I suspect in the case of this particular page, it would be the top <div id="cnn_maincntnr">. Yeah, that's designating the main content of the page more broadly than the Google Public Policy Blog page does, but that's fine. It's up to authors to determine what single part of their document constitutes the main part.

So anyway, those examples pretty much illustrate my point: Documents generally already have one part that the authors want to designate as the main part of the document, and authors are typically already using a single div with a unique id to do that. And all they'd want to do is replace that one div with the <main> element.

As far as I can see, none of those examples illustrate at all your hypothetical case of pages having markup to indicate that the main content is distributed across multiple parts of a document rather than being completely contained within one single part that could be marked up with a single <main> element.

> > If it's allowed to be used multiple times, then it loses the unique
> > value, and what it's being used for instead becomes ambiguous.
> 
> I don't really see what value there is in it being unique per document.

What value is there in the Google Public Policy Blog having a <div id="main-wrapper"> that is unique in the document and contains what the author is designating as the main content?

> > > It's not like there can be only one <h1>, and the UI for jumping
> > > to <h1> is basically the same as the UI for jumping to <main>, right?
> > 
> > No, because HTML has never defined a constraint to say that H1 can only be
> > used once per document. If it had, what you said might hold true. But
> > because it never has, it doesn't.
> 
> How is the UI different?

Sorry, I don't actually know what UI you mean. And I also don't understand what that UI might have to do with this discussion. Yeah, true, it's not like there can be only one <h1>, sure. But there can be only one <div id="main-wrapper">. Well, or should be, of course. And that one <div id="main-wrapper"> can be replaced with <main>. That's what seems relevant to me here.

> > In contrast, we have an opportunity with <main> to define it as
> > document-wide unique now, before it starts to get used widely. And we have
> > the opportunity to enforce that constraint in the validator so that authors
> > actually follow it instead of ignoring it.
> 
> I agree that we have that opportunity, I just see no value in it.

The value in it is that, for example, sites like the Google Public Policy Blog can replace the unique-per-document main-content-indicators like <div id="main-wrapper"> with a <main> element, and AT can then automatically recognize that as the main landmark, and I guess indexers and such can recognize it as the single chunk of content that the author considers the main content of the page, etc., etc.
Comment 6 Ian 'Hixie' Hickson 2013-04-14 22:46:14 UTC
(Haven't read the previous comment yet, just wanted to add the following example for my earlier comments. Sorry if this is now irrelevant.)

This document is an example of one where there might be two "main" elements:

http://www.csmonitor.com/USA/DC-Decoder/Decoder-Wire/2013/0414/If-babies-had-guns-they-wouldn-t-be-aborted.-Is-Rep.-Steve-Stockman-serious

Search in the source for "nextParagraph".
Comment 7 Michael[tm] Smith 2013-04-14 23:32:25 UTC
(In reply to comment #6)
> (Haven't read the previous comment yet, just wanted to add the following
> example for my earlier comments. Sorry if this is now irrelevant.)
> 
> This document is an example of one where there might be two "main" elements:
> 
> http://www.csmonitor.com/USA/DC-Decoder/Decoder-Wire/2013/0414/If-babies-had-
> guns-they-wouldn-t-be-aborted.-Is-Rep.-Steve-Stockman-serious
> 
> Search in the source for "nextParagraph".

I believe the author of that page would not want to mark up the page with two main elements. I think instead they would either

A. Replace the current <div id="mainColumn"> element with <main>, and keep the <a name="nextParagraph"></a> and <a class="hide" href="#nextParagraph">Skip to next paragraph</a> markup as-is.

B. Keep the current <div id="mainColumn"> element as-is, and wrap a <main> element around the part where <a name="nextParagraph"></a> occurs.

I don't think the author would want to mark both places as the "main" content of the pages.

Anyway, even this example doesn't illustrate your hypothetical case of 'The "main content" can be distributed across multiple parts of the document.' Even in this example, it's not a case of the main content being distributed. Instead it's only a case of there being a parent content that conceptually might be considered the main content, and then some child content of that parent that might be considered the main content. And it just depends on which the author chooses to designate as such -- it's just a matter of a choice of granularity, not at all a case where there are parallel/sibling sections or something that could both be considered the main content.
Comment 8 Ian 'Hixie' Hickson 2013-04-15 02:47:23 UTC
As I see it, there's two steps here:

1. Since we have an element named <main>, are there any purposes to which it can usefully be put? In particular, do authors mark parts of documents up in a way that could be called "main"; do ATs have, or could they have, a user interface driven from an element named "main" that would be useful to users?

2. Once we have a definition, what kinds of authoring mistakes might happen that we can catch with a validator?

So, step 1:

What could "main" mean for authors? Well, authors mark up certain parts of their documents as being "main content". For example, http://blogs.wsj.com/scene/ uses class="postContent" to mark up what is currently equivalent to <header> and what currently doesn't have a dedicated element in HTML, but can be roughly considered "the main content" of each <article> (currently marked up as <li> with class="postitem"). This is relatively common; for example, w3.org has a class="event expand_block" for its <article>-analogue, a class="headline" for its <header>-analogue, and a class="description" for its main article content.

Authors also mark up parts of their document that are much broader, as well as bits in between. For example, http://cnn.com/ has a dozen elements overall with "main" somewhere in their ID, and yet more with it in their class — they have <div>s at pretty much every level of their hierarchy that are "main" in some sense or another; and there's no reason to really limit ourselves to elements with "main" in their name here, there's lots of elements with "content" and "post" and "area" that would be equally valid candidates for filling in the role of "main" element.

(These two pages are just examples, but similar results can be found in many pages. I studied this in some depth a few years ago.)
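
(Editorial illustration, not part of the original comment.) The kind of survey described above, looking for elements with "main" somewhere in their ID or class, can be roughly approximated with a few lines of Python. This is a sketch under stated assumptions: MainLikeFinder and find_main_like are hypothetical names, and a plain substring match is only a guess at how such a study might be done; it is not the method actually used for the earlier study.

```python
from html.parser import HTMLParser


class MainLikeFinder(HTMLParser):
    """Collects (tag, attribute, value) for ids/classes containing a keyword."""

    def __init__(self, keyword="main"):
        super().__init__()
        self.keyword = keyword
        self.hits = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            # Only id and class are inspected; valueless attributes are skipped.
            if name in ("id", "class") and value and self.keyword in value.lower():
                self.hits.append((tag, name, value))


def find_main_like(html, keyword="main"):
    """Return every element whose id or class mentions the keyword."""
    finder = MainLikeFinder(keyword)
    finder.feed(html)
    finder.close()
    return finder.hits
```

Running the same function with other keywords such as "content", "post", or "area" gives the broader candidate set mentioned above.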

If we were designing the language from scratch, I would conclude that authors didn't have a very specific need here, and that each author could use the meaningless <div> with an appropriate set of class names for their own purposes. However, we have a <main> element, so we can take at least one of the uses above and make it available.

Which one we pick depends on what other uses we can put the element to, and that leads us to the question about ATs.

The closest thing to something related to a "main" element that I could see making sense in an AT's UI is some form of navigation UI. There's two kinds of navigation UIs that I can see; jumping to a single place in the document, and jumping backwards and forwards, or in a ring, around a list of places in the document. An example of the former would be jumping to the top of the document; an example of the latter would be walking the document outline by jumping to headings.

In practice, the former is a subset of the latter: jumping to one spot is the same thing as jumping backwards and forwards in a ring of one element.
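
(Editorial illustration, not part of the original comment.) The point that a single jump target is just a ring of one can be sketched as follows. The function next_in_ring and its list-of-landmarks model are hypothetical; no real AT exposes this API.

```python
def next_in_ring(landmarks, current, step=1):
    """Return the index reached by moving `step` places around the ring.

    `landmarks` is any non-empty list of jump targets; `current` is the
    index of the target the user is on, or None if on none of them yet.
    """
    if not landmarks:
        raise ValueError("no landmarks to jump to")
    if current is None:
        return 0
    # Modulo wraps in both directions, so step=-1 walks backwards.
    return (current + step) % len(landmarks)
```

With a one-element list, every forward or backward move lands on index 0 again, which is the "jump to a single place" behaviour.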

So putting these together: what do we have in the document that could be described as "main" and would benefit from being somewhere the user can jump to?

Well, to make it useful it has to be somewhat predictable, and that means the definition has to be a bit more specific than "anything an author might call 'main' today", given how widely "main" is used (see the CNN example above). Possible candidate definitions include:

 - One part of the document that the author most wants the user to see
 - The parts of the document that are not duplicated on other pages (not boilerplate)
 - The parts of the document that are the reason the page exists
 - The dominant contents of the document

The last three are more or less equivalent definitions. The first is a subset of those definitions — that is, if you pick the dominant contents of the document, and there's more than one, pretty much by definition one of them will be the part of the document that the author most wants the user to see.

So IMHO it makes sense to go with "the dominant contents of the document", since it both matches one of the things that authors mark up, and matches one of the interfaces that it makes sense to expose, and fits the element's name.


> I guess the difference is that it's not clear to me that people actually use
> class=main multiple times in the same document, or even if they do, if they
> also use id=main and separately from whatever they have marked up with
> class=main.

Well, just look at the CNN file. There's two id=main elements (both <script> elements), and a dozen or more elements that could legitimately be considered, based on their ID or class, to be spiritual twins with an element named <main>.

I don't think CNN is particularly special there. They were literally the first site I looked at, picked at random, when I was commenting the other day.


> In which case I can imagine that they'd switch to using <main>
> for whatever they were using id=main for previously, and just leave the
> class=main stuff as it is.

Given that CNN uses id="main" for a <script>, I don't think so. ;-)


> Well, I guess it comes down to whether it's useful to have a single element
> that uniquely identifies the main content of a document. A lot of people seem
> to believe that it is.

A lot of people believed the world was flat. That's what we call argumentum ad populum. :-)


> Which is what actually led to the element being
> added to implementations and to the spec.

It's what led to it being added to implementations, sure. It being in implementations is why the spec has it, though, not because people believe it's a good idea. (The arguments in favour of <main> are pretty weak, IMHO.)


> From what I recall in the
> discussions at least, what most of the commenters were asking for was a new
> element to mark up the main content of a document -- not an element to mark up
> the main content of each major part of a document.

I don't understand the distinction. The "main content of a document" can be in multiple places, indeed it often is. It's rarely contiguous.


> > Much like with <cite> we looked at what people did (namely, give names
> > of works) and then defined <cite> to fit that broad definition,
> 
> Actually, that's not strictly what you did at all.

Sure, it's an oversimplification. In practice we did much the same as the two steps at the start of this comment. (Not to relitigate <cite>, but for completeness: usage fell into several categories: citations, quotations, works in general, names in general, italics in general; its legacy typographic default was italics; there were already elements for quotations and italics; people names don't generally want to be italics; the use cases for citations specifically weren't handled by just one element, and there were other solutions like microdata or microformats for those anyway; and that basically left it being used for names of works in general.)


> Sure it makes sense to do that. But why does it make any more sense to have
> stricter conformance requirements for <cite> than what some (many) authors
> actually lean toward using it for in practice, but not do the same thing for
> <main>?

Actually <cite> went from narrow (citations only in 1998) to quite general (names of works in general now). It could be even more general, but there has to be a balance struck between uselessly narrow and uselessly general.

Similarly with <main>, as discussed at the top of this comment, there has to be a balance struck between the narrow and the general. We could be really narrow — only allowed to be used for the first paragraph of the content of the page, say. Or only allowed to be used for discussing the river Main. We could be really general — allowed to be used for anything that <div> can be used for today, say. The key is to find the point in the middle that makes it useful to a lot of authors, while giving it a purpose useful to users.


> In what way would the browser support for handling of <main> need to be any
> different if a document used <main> multiple times instead of a single time?

Well the extent of the browser's UI for this feature is landmark navigation, as far as I can tell. So if we only supported one, the landmark navigation UI would only jump to the first one when used, even if the user was already in it. If we supported multiple landmarks, then the UI could jump from main to main.
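
(Editorial illustration, not part of the original comment.) The difference between the two behaviours described above can be sketched like this. The model is hypothetical: main_offsets stands for the sorted document positions of <main> start tags and cursor for the user's current position; jump_to_main is not a real browser or AT function.

```python
from bisect import bisect_right


def jump_to_main(main_offsets, cursor, allow_multiple):
    """Return the offset a "jump to main" command would land on, or None."""
    if not main_offsets:
        return None
    if not allow_multiple:
        # Single-landmark UI: always the first one, even if already inside it.
        return main_offsets[0]
    # Multi-landmark UI: the next <main> after the cursor, wrapping around.
    i = bisect_right(main_offsets, cursor)
    return main_offsets[i % len(main_offsets)]
```

For example, with main elements at offsets 100, 500, and 900 and the cursor at 150, the single-landmark UI returns to 100 while the multi-landmark UI moves on to 500.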


> So as you've pointed out yourself, encouraging authors to use <main>
> multiple times in a document breaks the expectations for AT about what
> "main" content is. 

Not sure to what you refer here.


> I think the principal harm it can cause is that having <main> multiple times
> in a document can result in a worse user experience for AT users.

In what way?

We're talking about the exact same UI as jumping between <h1>s, and that's supported fine as far as I can tell.


> > I don't understand what you mean by "main content of the overall document"
> > if not the kinds of thing people are marking up with class="main".
> 
> id=main, not class=main.

Some authors use id="", some use class=""; for these purposes, they're much the same thing.


> > The "main content" can be distributed across multiple parts of the document.
> 
> Oh really? Do you reckon most Web authors see it that way?

Most assuredly.


> Based on the
> markup examples I've seen and the comments I've seen in the discussion about
> this element over the years, they don't in fact see it that way.

Well, I've given several counter-examples in this bug alone. I don't know what I could do to demonstrate this more conclusively. I agree that anecdotal evidence here is not compelling.


> In the case of the Google Public Policy Blog, there's a part that's marked
> up there now as <div id="main-wrapper"> that contains just the middle
> column, and that's within a <div id="content-wrapper"> (which also contains
> the sidebar on the right). And note it's using the id attribute:
> id=main-wrapper, not class=main-wrapper. 

The interesting content on that page is marked up with class="post-body". (Actually, even that includes uninteresting content — the date and byline are in there too.)

The element that's the first child of the one you mention with id=main-wrapper is, amusingly, a <div> with id="main" _and_ class="main". But it doesn't seem like a particularly good candidate for role=main, even if we assume there's only one on the page — it contains permalinks, navigation links, bylines, all kinds of uninteresting content.

This is why personally I think the whole role=main/<main> thing is wildly misguided. A far more effective way of marking up pages would be to use the other elements like <header>, <nav>, <footer>, etc, and then have the user agent provide a UI that skipped past strings of uninteresting content. But if we're going to have a <main>, the least we can do is actually allow authors to mark up the multiple parts of the document that would correspond to removing the uninteresting part — in this case, the post-body (or a subset thereof).


> Yeah, that's designating the main content of the page
> more broadly than the Google Public Policy Blog page does, but that's fine.

Is it? It seems to defeat the entire point of the "main" AT UI.


> It's up to authors to determine what single part of their document
> constitutes the main part.

If it's up to authors, why can't authors have more than one?


> What value is there in the Google Public Policy Blog having a <div
> id="main-wrapper"> that is unique in the document and contains what the
> author is designating at the main content?

None outside styling, currently. Notice that they have multiple such elements, as you pointed out: one with id=content-wrapper, one with id=main-wrapper, one with id=main, one with class=blog-posts, one with class=post-body, etc. Right now, they're all just for styling, and most are nested inside each other.


> > > > It's not like there can be only one <h1>, and the UI for jumping
> > > > to <h1> is basically the same as the UI for jumping to <main>, right?
> > > 
> > > No, because HTML has never defined a constraint to say that H1 can only be
> > > used once per document. If it had, what you said might hold true. But
> > > because it never has, it doesn't.
> > 
> > How is the UI different?
> 
> Sorry, I don't actually know what UI you mean.

I wrote above "the UI for jumping to <h1> is basically the same as the UI for jumping to <main>, right", and you said "no", so I'm asking in what way it's different, if it's not the same as you claim.

The relevance of this UI is that this UI is the entire point of why AT advocates pushed <main> into browsers.


> The value in it is that, for example, sites like the Google Public Policy
> Blog can replace the unique-per-document main-content-indicators like <div
> id="main-wrapper"> with a <main> element, and AT can then automatically
> recognize that as the main landmark, and I guess indexers and such can
> recognize it as the single chunk of content that the author considers the
> main content of the page, etc., etc.

But why is there value in letting page authors do this for id="content-wrapper" or the one class="main", and not in letting them do this for the multiple class="post-body" elements or multiple class="main" elements?

Why is there value in the UI for jumping to the landmark only being able to jump to one, rather than being able to navigate around the whole document, if the document has multiple places that contain interesting content intermixed with less interesting content?

Comment 9 Michael[tm] Smith 2013-04-15 08:53:02 UTC
(In reply to comment #8)
> What could "main" mean for authors. Well, authors mark up certain parts of
> their documents up as being "main content". For example,
> http://blogs.wsj.com/scene/ uses class="postContent" to mark up what is
> currently equivalent to <header> and what currently doesn't have a dedicated
> element in HTML, but can be roughly considered "the main content" of each
> <article> (currently marked up as <li> with class="postitem".)

Actually, as far as I can see, the child content of what that page has marked up with <li class=postitem> is not the equivalent of "main content" of an <article>, it is in fact exactly the equivalent of an article. So if the author wanted to take advantage of specific HTML elements to mark it up, they could just use <li class=postitem><article>. They wouldn't need to use <main> there, nor would they probably want to. <article> would clearly be a better fit.

> This is
> relatively common, for example w3.org has a class="event expand_block" for
> its <article>-analogue, a class="headline" for its <header>-analogue, and a
> class="description" for its main article content.

Again there, it could just be changed to use <article>. There'd be no value to using <main> there.

So for the examples you've cited so far, we'd actually want to encourage authors to use <article>, if they were to use anything at all, and not to use <main>. That would be a best practice that would help to promote some consistency in the way that authors mark up this kind of content. And the way we could encourage that is by requiring that <main> only be used once per document, so that the choice for authors would then be unambiguous: use <article>. In fact I could have the validator emit supplemental guidance along those lines, if it finds <main> being used multiple times; e.g., "The <main> element must be used only once per document, to mark up the main content of the overall document. To mark up other parts of documents, consider using the <article> element or another appropriate element instead."
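
For illustration only, here is a minimal sketch of the pattern this paragraph recommends, assuming a hypothetical blog index page (the structure and headings are illustrative, not taken from any of the sites discussed). A single <main> wraps the page's primary content, each post is an <article>, and the <main> has no article, aside, footer, header or nav ancestors:

```html
<body>
  <header> ... </header>
  <main>
    <article>
      <h2> ... </h2>
      ...
    </article>
    <article>
      <h2> ... </h2>
      ...
    </article>
  </main>
  <footer> ... </footer>
</body>
```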

> Authors also mark up parts of their document that are much broader, as well
> as bits in between. For example, http://cnn.com/ has a dozen elements
> overall with "main" somewhere in their ID, and yet more with it in their
> class — they have <div>s at pretty much every level of their hierarchy that
> are "main" in some sense or another; and there's no reason to really limit
> ourselves to elements with "main" in their name here, there's lots of
> elements with "content" and "post" and "area" that would be equally valid
> candidates for filling in the role of "main" element.

No, I don't think most people looking objectively at the markup of the cnn.com site would say that those other things you mention would be 'equally valid candidates for filling in the role of "main" element'. I think somebody looking at cnn.com with an open mind would pretty quickly conclude that the main content is as I described it in comment 5.

> The closest thing to something related to a "main" element that I could see
> making sense in an AT's UI is some form of navigation UI. There's two kind
> of navigation UIs that I can see; jumping to a single place in the document,

Yes, that as I understand it at least is what the "main" landmark is for.

> and jumping backwards and forwards, or in a ring, around a list of places in
> the document. An example of the former would be jumping to the top of the
> document;

Yeah

> an example of the latter would be walking the document outline by
> jumping to headings.

And that would be useful but it would be a case that has nothing to do with the "main" landmark. Not as far as I understand it at least.

> In practice, the former is a subset of the latter: jumping to one spot is
> the same thing as jumping backwards and forwards in a ring of one element.
> 
> So putting these together: where in the document would we have that could be
> described as "main" and would benefit from being somewhere the user can jump
> to?
> 
> Well, to make it useful it has to be somewhat predictable, and that means
> the definition has to be a bit more specific than "anything an author might
> call 'main' today", given how widely "main" is used (see the CNN example
> above). Possible candidate definitions include:
> 
>  - One part of the document that the author most wants the user to see
>  - The parts of the document that are not duplicated on other pages (not
> boilerplate)
>  - The parts of the document that are the reason the page exists
>  - The dominant contents of the document
> 
> The last three are more or less equivalent definitions.

Yeah, agreed. And I think all of those describe, in different words, what most people generally already agree is the "main" content of the document, and I think that is what the "main" landmark is for: A single element in the document that contains the main content for the document overall. I don't think there's much confusion in the authoring community about that.

> The first is a
> subset of those definitions — that is, if you pick the dominant contents of
> the document, and there's more than one,

More than one what? More than one element that represents the "dominant contents of the document"? If so, I have yet to see in this discussion an actual document that has more than one element that represents the dominant contents of a document. I think in practice the way authors mark up content that contains multiple important things is to put a "main" wrapper equivalent around all of them, with headings for each. So then the AT user can use the "main" landmark to get to the beginning of that block of content, and then can cycle through the headings within that. AT users would not need you to mark up each subpart with a <main> element or something else in order to enable them to cycle through them, once they've gotten to the start of the main content.

> pretty much by definition one of
> them will be the part of the document that the author most wants the user to
> see.

OK, yeah, agreed. Though I would not necessarily word it as being the one part that the author most wants the user to see. Instead it's the one part where the author would want to start the user out as the main part of the document. Once the user gets there, they can then get more granular in navigating to specific subsections. In that sense, <main> and the "main" landmark are more just about *orienting* the user to the main content of the document; that is, giving them a more specific starting point from which to continue further navigation, rather than them needing to start out with the <body> element.

> So IMHO it makes sense to go with "the dominant contents of the document",
> since it both matches one of the things that authors mark up, and matches
> one of the interfaces that it makes sense to expose, and fits the element's
> name.

Yeah, agreed.

> > I guess the difference is that it's not clear to me that people actually use
> > class=main multiple times in the same document, or even if they do, if they
> > also use id=main and separately from whatever they have marked up with
> > class=main.
> 
> Well, just look at the CNN file. There's two id=main elements (both <script>
> elements), and a dozen or more elements that could legitimately be
> considered, based on their ID or class, to be spiritual twins with an
> element named <main>.

I concede there is a lot of (over)use of variations on "main" in the markup of that page. I do not agree that most of them could legitimately be considered spiritual twins with an element named <main>. I think you need to not base your assessment so much on their ID or class values, but instead base it just on common-sense analysis of the structure of the document. I don't think authors want to take every instance of some ID or class value that happens to have "main" as a substring in its value, and replace it with a <main> element. They would get no value from that.

> > In which case I can imagine that they'd switch to using <main>
> > for whatever they were using id=main for previously, and just leave the
> > class=main stuff as it is.
> 
> Given that CNN uses id="main" for a <script>, I don't think so. ;-)

Yeah, I concede that in the case of cnn.com, it would not be id=main that they'd want to replace with <main>. Instead as I already pointed out in comment 5, it would be either the <div id="cnn_maintoplive"> element or the <div id="cnn_maintopt1"> element. Clearly.

> > From what I recall in the
> > discussions at least, what most of the commenters were asking for was a new
> > element to mark up the main content of document -- not an element to mark up
> > the main content of each major part of document.
> 
> I don't understand the distinction. The "main content of document" can be in
> multiple places, indeed it often is. It's rarely contiguous.

I hear you saying that but I've yet to see you give an example of a page where the main content is distributed across multiple places in the document. Instead, documents generally have one section that, in document order, is the element that marks the *start* of the main content. Then within that main content, you may have multiple articles or whatever -- contiguous. That is the common pattern that documents follow. What the <main> element does is that it orients users for further navigation, by giving the users a way to get to that main starting point.

I will concede that it's *possible* some documents may have content distributed in such a way that you can't identify one single part as the start of the main content. But I would argue that such documents then in fact don't have "main" content and their authors would not want to mark them up with <main> anyway. Not every document needs to use <main>.

> Similarly with <main>, as discussed at the top of this comment, there has to
> be a balance struck between the narrow and the general. We could be really
> narrow — only allowed to be used for the first paragraph of the content of
> the page, say.

That would obviously not be appropriate, because the first paragraph of the content of the page might not actually be the main content. That would be overly prescriptive for no real purpose.

> Or only allowed to be used for discussing the river Main.

Which would plainly just be silly.

I'm not suggesting we be overly prescriptive for <main> for no real purpose or for a silly purpose. I'm suggesting we constrain it to use for what matches the conceptual model that people already have for that one part of a document that represents the main content of the document.

> We
> could be really general — allowed to be used for anything that <div> can be
> used for today, say. The key is to find the point in the middle that makes
> it useful to a lot of authors, while giving it a purpose useful to users.

I think constraining <main> to being used only once per document is in fact that point in the middle between the narrow and the general, and making it useful to both users and authors. It's in the middle because we are not constraining authors about what they choose to mark up as the main content -- it doesn't need to be the first paragraph, nor something discussing the river Main. They are free to use it for whatever the one part is that they want to indicate is the start of the main content.

Again, I will concede that it's possible some documents may have content distributed in such a way that you can't identify one single part as the start of the main content. But I would argue that such documents then in fact don't have "main" content and their authors would not want to mark them up with <main> anyway. Again, not every document needs to use <main>. Only the ones for which it's actually a good fit should.

> > I think the principal harm it can cause is that having <main> multiple times
> > in a document can result in a worse user experience for AT users.
> 
> In what way?

In that <main> maps in the accessibility tree to the "main" landmark. And the semantics of the "main" landmark are that it's meant to be used only once per document. So if a document contains multiple <main> instances, that breaks those semantics.

> This is why personally I think the whole role=main/<main> thing is wildly
> misguided. A far more effective way of marking up pages would be to use the
> other elements like <header>, <nav>, <footer>, etc, and then have the user
> agent provide a UI that skipped past strings of uninteresting content.

When do you imagine UAs would ever get around to implementing that? We could use that kind of argument to avoid adding all kinds of stuff. Pragmatically speaking I think it's clear that it's not a great idea to bet on a strategy of UAs all implementing some kind of UI magic to do this kind of stuff. Who knows, maybe they will eventually, but for now we have to work with what we can get done practically, and what others have expressed support for, and have implemented support for. And in the case of marking up main contents, we have browser support that exposes <main> as the "main" landmark, and we have AT support that tries to do something useful as far as processing that landmark info and providing navigation for it to AT users.

> > It's up to authors to determine what single part of their document
> > constitutes the main part.
> 
> If it's up to authors, why can't authors have more than one?

Because first of all, I don't think most authors want to have more than one. And second, because if it occurs more than once then it degrades its value as a landmark for AT.

> I wrote above "the UI for jumping to <h1> is basically the same as the UI
> for jumping to <main>, right", and you said "no", so I'm asking in what way
> it's different, if it's not the same as you claim.
> 
> The relevance of this UI is that this UI is the entire point of why AT
> advocates pushed <main> into browsers.

OK

> > The value in it is that, for example, sites like the Google Public Policy
> > Blog can replace the unique-per-document main-content-indicators like <div
> > id="main-wrapper"> with a <main> element, and AT can then automatically
> > recognize that as the main landmark, and I guess indexers and such can
> > recognize it as the single chunk of content that the author considers the
> > main content of the page, etc., etc.
> 
> But why is there value in letting page authors do this for
> id="content-wrapper" or the one class="main", and not in letting them do
> this for the multiple class="post-body" elements or multiple class="main"
> elements?

They can do both. They just don't need to use <main> for both, nor would they likely want to. There are other elements they can use -- <article>, for example -- which don't carry the AT semantics of being a landmark.

> Why is there value in the UI for jumping to the landmark only being able to
> jump to one, rather than being able to navigate around the whole document,
> if the document has multiple places that contain interesting content
> intermixed with less interesting content?

Having <main> to mark up the main landmark does not prevent authors from providing other means for users to jump to other parts of the document. Obviously, not everything that you want the user to be able to jump to needs to be marked up with <main>.

Comment 10 Ian 'Hixie' Hickson 2013-04-15 20:12:02 UTC
I think you are approaching this with a pre-determined meaning for the word "main", and are assuming that this is the same meaning that everyone else has for the word. I don't think that's a viable approach.

Even if we constrain ourselves to the ARIA definition of the "main" role, we have a wide latitude in precisely pinning down the meaning of the element. The ARIA role defines "main content" as "the content that is directly related to or expands upon the central topic of the document"; it is described as being an "alternative for "skip to main content" links". Even ARIA allows there to be multiple elements with the "main" role if there is a good reason for it — and the scoping for this is not the document, but elements with the document or application role; it explicitly says that a single document can have multiple elements with the "main" role. (The text saying this is a little confused because it at one point refers to "document nodes" where it really means "elements with a document role", as far as I can tell.)


> Again there, it could just be changed to use <article>. There'd be no value
> to using <main> there.

If we assume there can be value to <main> at all, I don't see why there wouldn't be value to <main> inside <article>, as in:

   <article>
     <header> ... </header>
     <main> ... </main>
     <footer> ... </footer>
   </article>

This certainly matches the overall markup structure of many pages, including those we've examined specifically in this bug. We don't need the <main> element for this; in the past I've advocated this instead:

   <article>
     <header> ... </header>
     <div class=main> ... </div> <!-- or class=content -->
     <footer> ... </footer>
   </article>

...or similar, and encouraged people to actually use:

   <article>
     <header> ... </header>
     ...
     <footer> ... </footer>
   </article>

...but if we have <main>, it seems like the perfect fit for this.

<article> and <main> seem orthogonal, like <article> and <header>.


> No, I don't think most people looking objectively at the markup of the
> cnn.com site would say that those other things you mention would be 'equally
> valid candidates for filling in the role of "main" element'. I think somebody
> looking at cnn.com with an open mind would pretty quickly conclude that the
> main content is as I described it in comment 5.

Given the element <main> *with no definition*, and with the explicit task of *finding* a definition for it, I don't see why we'd quickly be able to decide which of the 12+ "main"-related things in cnn.com would be the best match.

I think if you are quickly reaching a conclusion about which one is the right match, that you are not looking at this, as you put it, "with an open mind".


> > The closest thing to something related to a "main" element that I could see
> > making sense in an AT's UI is some form of navigation UI. There's two kind
> > of navigation UIs that I can see; jumping to a single place in the document,
> 
> Yes, that as I understand it at least is what the "main" landmark is for.
> 
> > and jumping backwards and forwards, or in a ring, around a list of places in
> > the document. An example of the former would be jumping to the top of the
> > document;
> 
> Yeah
> 
> > an example of the latter would be walking the document outline by
> > jumping to headings.
> 
> And that would be useful but it would be a case that has nothing to do with
> the "main" landmark. Not as far as I understand it at least.

On what are you basing this? ARIA explicitly says that there can be multiple elements with role=main in a Document, so presumably the UI _has_ to be one that can navigate back and forth in a ring, or a tree, or similar. Such UIs are hardly foreign to the user, tabbing focus around works that way, jumping around headings works that way, even pressing "home" and "end" on Mac works that way.

Even if we only allow one <main> per Document, you can nest Documents in frames, so the UI would probably still have to handle it.
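
As a rough illustration of this point (a sketch only; the file name embedded.html is a hypothetical placeholder): even under a one-<main>-per-document rule, a page that embeds another document in a frame may end up presenting two main landmarks to whatever UI walks the combined content.

```html
<!-- outer document -->
<body>
  <main>
    ...
    <iframe src="embedded.html"></iframe>
  </main>
</body>

<!-- embedded.html: a separate document with its own single <main> -->
<body>
  <main> ... </main>
</body>
```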

There's only one thing, to my knowledge, that we limit to a one-per-document basis in HTML: IDs. There's a very pragmatic reason for that: their entire purpose is to be able to uniquely identify elements. Even <html> and <head> and <body> can occur multiple times (in a compound document whose host language allows HTML documents, e.g. maybe SVG).


> >  - One part of the document that the author most wants the user to see
> >  - The parts of the document that are not duplicated on other pages (not
> >    boilerplate)
> >  - The parts of the document that are the reason the page exists
> >  - The dominant contents of the document
> > 
> > The last three are more or less equivalent definitions.
> 
> Yeah, agreed. And I think all of those describe, in different words, what
> most people generally already agree is the "main" content of the document,
> and I think that is what the "main" landmark is for: A single element in the
> document that contains the main content for the document overall. I don't
> think there's much confusion in the authoring community about that.

All of the above except the first could appear multiple times in a document.


> > The first is a
> > subset of those definitions — that is, if you pick the dominant contents of
> > the document, and there's more than one,
> 
> More than one what? More than one element that represents the "dominant
> contents of the document"?

Right.


> If so, I have yet to see in this discussion an
> actual document that has more than one element that represents the dominant
> contents of a document.

If you view "the main content of the document" as something that can be large, e.g. containing every blog post on a blog home page, then it makes sense that you'd only see one, because you'd just view half the page as being the main bit. But if you consider <main> as being potentially useful for marking up the inside of each <article>, then obviously there can easily be more than one <main> per page. The question is, which is more helpful to users and to authors?

My argument is that for users, there's at least as much benefit, and probably more, to being able to jump to both the first key point of data on the page, and to jump then to the next one (e.g. jumping past the sidebar as in the "nextParagraph" example above), and so on. This is consistent with the ARIA spec saying that role=main is an "alternative for "skip to main content" links" (in this case, the link is "skip to next paragraph", but the idea is identical).

And for authors, I think it's clear that an element that replaces one <div> is not as useful as an element that replaces four <div>s. Consider this simplified structure which is basically what the blogs above, and even CNN if you squint enough (it's not obvious which parts of that page should be <article>s rather than <nav>s), are equivalent to:

   <body>
     <header> ... </header>
     <div class=a>
       <article>
         <header> ... </header>
         <div class=b> ... </div>
         <footer> ... </footer>
       </article>    
       <article>
         <header> ... </header>
         <div class=b> ... </div>
         <footer> ... </footer>
       </article>
       ...
     </div>
     <footer> ... </footer>
   </body>

Why would it be more useful to provide a dedicated element for the class=a div rather than for the class=b divs? I think it's pretty clear that if we have to make a choice, the dedicated element to remove more class=""es is the more helpful choice.
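
For comparison, the two candidate replacements could be sketched as follows (an illustration of the argument above, not markup from any of the pages discussed). Replacing the class=a div yields one <main> per page; replacing the class=b divs yields one <main> per <article>, which is the usage that the W3C HTML constraints cited in the bug description would disallow.

```html
<!-- Option A: a single <main> replacing the class=a div -->
<main>
  <article>
    <header> ... </header>
    <div class=b> ... </div>
    <footer> ... </footer>
  </article>
  ...
</main>

<!-- Option B: a <main> replacing each class=b div -->
<div class=a>
  <article>
    <header> ... </header>
    <main> ... </main>
    <footer> ... </footer>
  </article>
  ...
</div>
```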


> I think in practice the way the authors mark up
> content that contains multiple important things is to put a "main" wrapper
> equivalent around all of them, with headings for each. So then the AT user
> can use the "main" landmark to get to the beginning of that block of
> content, and then can cycle through the headings within that.

That seems like a net loss relative to being able to navigate amongst those items, especially if they are interspersed with other elements.


> Instead it's the one part
> where the author would want to start the user out as the main part of the
> document.

I don't think most authors will even think of the AT part of this, to be honest. If we wanted to provide a single location from which to begin, though, we wouldn't want to use <main>, since it has contents. We'd want to use a void element. It makes no sense to use an element that has a range to mark up a point. It only raises the question of where to put the end. (If we were to use <main> to mark up a point, one would need to ask why we're not asking for authors to write "<main></main>", with no content.)


> Once the user gets there, they can then get more granular in
> navigating to specific subsections. In that sense, <main> and the "main"
> landmark are more just about *orienting* the user to the main content of the
> document; that is, giving them a more specific starting point from which to
> continue further navigation, rather than them needing to start out with the
> <body> element.

In this regard, they seem redundant with heading navigation, IMHO. But that's another story.


> I will concede that it's *possible* some documents may have content
> distributed in such a way that you can't identify one single part as the
> start of the main content. But I would argue that such documents then in
> fact don't have "main" content and their authors would not want to mark them
> up with <main> anyway. Not every document needs to use <main>.

Why would they not want to use <main>?


> I'm not suggesting we be overly prescriptive for <main> for no real purpose
> or for a silly purpose. I'm suggesting we constrain it to use for what
> matches the conceptual model that people already have for that one part
> of a document that represents the main content of the document.

If it's the conceptual model people have, then they'll only use one <main> per document regardless of what the validator says. But I don't think you're right on this point; pages have multiple class="body", class="main", class="content", etc, <div>s today. In fact, pretty much every subpart with a header and a footer ends up having a <div> for its "main" contents.


> > This is why personally I think the whole role=main/<main> thing is wildly
> > misguided. A far more effective way of marking up pages would be to use the
> > other elements like <header>, <nav>, <footer>, etc, and then have the user
> > agent provide a UI that skipped past strings of uninteresting content.
> 
> When do you imagine UAs would ever get around to implementing that?

There's lots of low-hanging fruits that AT vendors don't seem to pick.


> We could use that kind of argument to avoid adding all kinds of stuff.

Indeed. We do. We should.


> They can do both. They just don't need to use <main> for both, nor would
> they likely want to. There are other elements they can use -- <article>, for
> example -- which don't carry the AT semantics of being a landmark.

There's nothing about landmark roles that forces them to be unique per document. For example, "navigation" (<nav>'s role) is a landmark role, and there's nothing requiring there to be only one <nav> per document. Similarly, <aside>'s role is a landmark role.
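
A small sketch of that point (the aria-label values are illustrative assumptions): a document like the following contains two navigation landmarks and one complementary landmark, and nothing in HTML or ARIA asks the author to merge them.

```html
<body>
  <nav aria-label="Site"> ... </nav>
  <main> ... </main>
  <aside> ... </aside>
  <nav aria-label="Footer links"> ... </nav>
</body>
```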

Comment 11 Michael[tm] Smith 2013-04-16 05:59:55 UTC
(In reply to comment #10)
> I think you are approaching this with a pre-determined meaning for the word
> "main", and are assuming that this is the same meaning that everyone else
> has for the word. I don't think that's a viable approach.

I think my concept of what "main" means in this context is boringly conventional and matches the concept that most other people seem to have of it. On the other hand, your concept of it seems fairly idiosyncratic.

But I guess that's all neither here nor there. 

> Even if we constrain ourselves to the ARIA definition of the "main" role, we
> have a wide latitude in precisely pinning down the meaning of the element.
> The ARIA role defines "main content" as "the content that is directly
> related to or expands upon the central topic of the document"; it is
> described as being an "alternative for "skip to main content" links". Even
> ARIA allows there to be multiple elements with the "main" role if there is a
> good reason for it

The relevant statement I find in the ARIA document that states guidance on this most clearly is:

  "Within any document or application, the author SHOULD mark no more than one element with the main role."

Yeah, I realize RFC 2119 "should" is often (usually) interpreted as making the qualification "unless there is a good reason".

> — and the scoping for this is not the document, but
> elements with the document or application role; it explicitly says that a
> single document can have multiple elements with the "main" role. (The text
> saying this is a little confused because it at one point refers to "document
> nodes" where it really means "elements with a document role", as far as I
> can tell.)

The other relevant part I find there is

  "Because document and application elements can be nested in the DOM, they may have multiple main elements as DOM descendants"

As you worded it, it's saying that it's possible that a document can/may have multiple main elements as DOM descendants. It's not saying that's OK or permitted; it's just stating a fact. It's not trying to contradict the earlier guidance it provided that "the author SHOULD mark no more than one element with the main role."; that remains the best-practice requirement it is asserting.
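
One reading of that ARIA sentence, sketched as markup (this is an assumption about what the spec text describes, not an example taken from ARIA itself): the outer document has two elements with the main role among its DOM descendants, but each is associated with a different document scope, so each scope still has no more than one.

```html
<body>
  <div role="main"> ... </div>
  <div role="document">
    <div role="main"> ... </div>
  </div>
</body>
```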

> > Again there, it could just be changed to use <article>. There'd be no value
> > to using <main> there.
> 
> If we assume there can be value to <main> at all, I don't see why there
> wouldn't be value to <main> inside <article>, as in:
> 
>    <article>
>      <header> ... </header>
>      <main> ... </main>
>      <footer> ... </footer>
>    </article>
> 
> This certainly matches the overall markup structure of many pages, including
> those we've examined specifically in this bug. We don't need the <main>
> element for this; in the past I've advocated this instead:
> 
>    <article>
>      <header> ... </header>
>      <div class=main> ... </div> <!-- or class=content -->
>      <footer> ... </footer>
>    </article>
> 
> ...or similar, and encouraged people to actually use:
> 
>    <article>
>      <header> ... </header>
>      ...
>      <footer> ... </footer>
>    </article>

Right. So they can just keep doing that.

> ...but if we have <main>, it seems like the perfect fit for this.

Except that the <main> element is bound to the ARIA "main" role, and the ARIA spec says that authors should not use the main role more than once per document. So encouraging the use of <main> within <article> violates the best-practice requirements in the ARIA spec, and it's not a perfect fit in practice. If the ARIA spec did not exist and browsers didn't map <main> to some particular role for AT that is intended to occur only once per document, then yeah, it might be a perfect fit. But given the actual reality we find ourselves in, it's not perfect at all.

> <article> and <main> seem orthogonal, like <article> and <header>.

Yeah. Among other reasons because it's meant to be OK to use <article> multiple times in a document, and having <article> multiple times in a document doesn't violate the best-practice requirements in any other spec. Whereas, having <main> multiple times in a document does. Having it just once does not.


> > No, I don't think most people looking objectively at the markup of the
> > cnn.com site would say that those other things you mention would be 'equally
> > valid candidates for filling in the role of "main" element'. I think somebody
> > looking at cnn.com with an open mind would pretty quickly conclude that the
> > main content is as I described it in comment 5.
> 
> Given the element <main> *with no definition*, and with the explicit task of
> *finding* a definition for it,

It's not right to arbitrarily assign semantics to an element when browsers implement support for that element in such a way that its semantics are bound to requirements in another spec; in this case, the ARIA spec. It's not right to ignore the requirements in another spec unless the requirements are really bogus. I at least do not see the use-main-once-per-document best-practice requirement in the ARIA spec as bogus.

> I don't see why we'd quickly be able to
> decide which of the 12+ "main"-related things in cnn.com would be the best
> match.

And I don't see why you're not able to see that those 12+ things are not what most people other than you would ever spend time trying to assert are the main content of the document.

> I think if you are quickly reaching a conclusion about which one is the
> right match, that you are not looking at this, as you put it, "with an open
> mind".

Fair enough. I guess my mind is constrained by trying to see it from a common-sense perspective rather than overcomplicating it unnecessarily. 

> > > The closest thing to something related to a "main" element that I could see
> > > making sense in an AT's UI is some form of navigation UI. There's two kind
> > > of navigation UIs that I can see; jumping to a single place in the document,
> > 
> > Yes, that as I understand it at least is what the "main" landmark is for.
> > 
> > > and jumping backwards and forwards, or in a ring, around a list of places in
> > > the document. An example of the former would be jumping to the top of the
> > > document;
> > 
> > Yeah
> > 
> > > an example of the latter would be walking the document outline by
> > > jumping to headings.
> > 
> > And that would be useful but it would be a case that has nothing to do with
> > the "main" landmark. Not as far as I understand it at least.
> 
> On what are you basing this? ARIA explicitly says that there can be multiple
> elements with role=main in a Document, so presumably the UI _has_ to be one
> that can navigate back and forth in a ring, or a tree, or similar.

OK, yeah, I suppose that if it's possible for a document to have multiple main roles, then yeah, AT software ideally does need to be able to handle that case. Whether it does so elegantly in practice is another matter. It still seems to me that if the default *user* expectation is that a document is likely to have only one main role, then we should be encouraging authors to try to meet that user expectation, instead of violating it.

> Such UIs
> are hardly foreign to the user, tabbing focus around works that way, jumping
> around headings works that way, even pressing "home" and "end" on Mac works
> that way.
> 
> Even if we only allow one <main> per Document, you can nest Documents in
> frames, so the UI would probably still have to handle it.

Yeah, agreed that the UI should handle it. But it does not necessarily follow from that that we should cave and say, "Well, proper AT needs to handle multiple main roles anyway, so we should just give up and not try to get authors to use it only once per document as users would normally expect to see it."

> There's only one thing, to my knowledge, that we limit to a one-per-document
> basis in HTML: IDs. There's a very pragmatic reason for that; their entire
> purpose is to be able to uniquely identify elements. Even <html> and <head>
> and <body> can occur multiple times (in a compound document whose host
> language allows HTML documents, e.g. maybe SVG).
> 
> > >  - One part of the document that the author most wants the user to see
> > >  - The parts of the document that are not duplicated on other pages (not
> > >    boilerplate)
> > >  - The parts of the document that are the reason the page exists
> > >  - The dominant contents of the document
> > > 
> > > The last three are more or less equivalent definitions.
> > 
> > Yeah, agreed. And I think all of those describe, in different words, what
> > most people generally already agree is the "main" content of the document,
> > and I think that is what the "main" landmark is for: A single element in the
> > document that contains the main content for the document overall. I don't
> > think there's much confusion in the authoring community about that.
> 
> All of the above except the first could appear multiple times in a document.
> 
> 
> > > The first is a
> > > subset of those definitions — that is, if you pick the dominant contents of
> > > the document, and there's more than one,
> > 
> > More than one what? More than one element that represents the "dominant
> > contents of the document"?
> 
> Right.
> 
> 
> > If so, I have yet to see in this discussion an
> > actual document that has more than one element that represents the dominant
> > contents of a document.
> 
> If you view "the main content of the document" as something that can be
> large, e.g. containing every blog post on a blog home page, then it makes
> sense that you'd only see one, because you'd just view half the page as
> being the main bit. But if you consider <main> as being potentially useful
> for marking up the inside of each <article>, then obviously there can easily
> be more than one <main> per page. The question is, which is more helpful to
> users and to authors?
> 
> My argument is that for users, there's at least as much benefit, and
> probably more, to being able to jump to both the first key point of data on
> the page, and to jump then to the next one (e.g. jumping past the sidebar as
> in the "nextParagraph" example above), and so on. This is consistent with
> the ARIA spec saying that role=main is an "alternative for "skip to main
> content" links" (in this case, the link is "skip to next paragraph", but the
> idea is identical).

But it's not consistent with the spirit of the ARIA spec, nor with the explicit requirement in the spec that authors should use main only once per document.

> And for authors, I think it's clear that an element that replaces one <div>
> is not as useful as an element that replaces four <div>s. Consider this
> simplified structure which is basically what the blogs above, and even CNN
> if you squint enough (it's not obvious which parts of that page should be
> <article>s rather than <nav>s), are equivalent to:
> 
>    <body>
>      <header> ... </header>
>      <div class=a>
>        <article>
>          <header> ... </header>
>          <div class=b> ... </div>
>          <footer> ... </footer>
>        </article>    
>        <article>
>          <header> ... </header>
>          <div class=b> ... </div>
>          <footer> ... </footer>
>        </article>
>        ...
>      </div>
>      <footer> ... </footer>
>    </body>
> 
> Why would it be more useful to provide a dedicated element for the class=a
> div rather than for the class=b divs? I think it's pretty clear that if we
> have to make a choice, the dedicated element to remove more class=""es is
> the more helpful choice.

It doesn't have to be the same element for both cases. ARIA and AT provide a number of different means for markup that allows users to navigate around documents.

> > I think in practice the way the authors mark up
> > content that contains multiple important things is to put a "main" wrapper
> > equivalent around all of them, with headings for each. So then the AT user
> > can use the "main" landmark to get to the beginning of that block of
> > content, and then can cycle through the headings within that.
> 
> That seems like a net loss relative to being able to navigate amongst those
> items, especially if they are interspersed with other elements.

You're not losing the ability to navigate amongst multiple items. There are other ways to mark it up that don't require using the same element for everything.
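As a minimal sketch of the kind of markup being described here (the title and heading text are hypothetical, and the "..." placeholders follow the snippets quoted elsewhere in this bug), a page could use one <main> wrapper as the single landmark and leave navigation among the individual items to <article> elements and headings:

```html
<!DOCTYPE html>
<title>Blog index (hypothetical example)</title>
<body>
  <header> ... </header>
  <!-- The only element on the page that maps to the "main" landmark -->
  <main>
    <article>
      <h1>First post</h1>
      ...
    </article>
    <article>
      <h1>Second post</h1>
      ...
    </article>
  </main>
  <footer> ... </footer>
</body>
```

Under this arrangement the user would jump once to the "main" landmark and then step through the posts by heading; whether a given AT product exposes it exactly that way is an assumption, not something established in this bug.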

> > Instead it's the one part
> > where the author would want to start the user out as the main part of the
> > document.
> 
> I don't think most authors will even think of the AT part of this, to be
> honest. If we wanted to provide a single location from which to begin,
> though, we wouldn't want to use <main>, since it has contents. We'd want to
> use a void element. It makes no sense to use an element that has a range to
> mark up a point. It only raises the question of where to put the end. (If we
> were to use <main> to mark up a point, one would need to ask why we're not
> asking for authors to write "<main></main>", with no content.)
> 
> > Once the user gets there, they can then get more granular in
> > navigating to specific subsections. In that sense, <main> and the "main"
> > landmark are more just about *orienting* the user to the main content of the
> > document; that is, giving them a more specific starting point from which to
> > continue further navigation, rather than them needing to start out with the
> > <body> element.
> 
> In this regard, they seem redundant with heading navigation, IMHO. But
> that's another story.
> 
> 
> > I will concede that it's *possible* some documents may have content
> > distributed in such a way that you can't identify one single part as the
> > start of the main content. But I would argue that such documents then in
> > fact don't have "main" content and their authors would not want to mark them
> > up with <main> anyway. Not every document needs to use <main>.
> 
> Why would they not want to use <main>?

Because they have other means they've already been using that work fine. So they don't need main for those uses.

> > I'm not suggesting we be overly prescriptive for <main> for no real purpose
> > or for a silly purpose. I'm suggesting we constrain it to use for what
> > matches the conceptual model that people already have for that one part
> > of a document that represents the main content of the document.
> 
> If it's the conceptual model people have, then they'll only use one <main>
> per document regardless of what the validator says. But I don't think you're
> right on this point; pages have multiple class="body", class="main",
> class="content", etc, <div>s today. In fact, pretty much every subpart with
> a header and a footer ends up having a <div> for its "main" contents.
> 
> > > This is why personally I think the whole role=main/<main> thing is wildly
> > > misguided. A far more effective way of marking up pages would be to use the
> > > other elements like <header>, <nav>, <footer>, etc, and then have the user
> > > agent provide a UI that skipped past strings of uninteresting content.
> > 
> > When do you imagine UAs would ever get around to implementing that?
> 
> There's lots of low-hanging fruits that AT vendors don't seem to pick.

100% agreed with you there. But that is not something it's in our power to fix.

> > We could use that kind of argument to avoiding adding all kinds of stuff.
> 
> Indeed. We do. We should.

I don't think we should if the end result in practice is going to be a degradation in user experience for AT users. That is punishing the users for the sins of the AT vendors.

> > They can do both. They just don't need to use <main> for both, nor would
> > they likely want to. There are other elements they can use -- <article>, for
> > example -- which don't carry the AT semantics of being a landmark.
> 
> There's nothing about landmark roles that forces them to be unique per
> document.

Nothing except the fact that the ARIA spec says that they should be unique per document.

> For example, "navigation" (<nav>'s role) is a landmark role, and
> there's nothing requiring there to be only one <nav> per document.
> Similarly, <aside>'s role is a landmark role.

I didn't know that actually. It would seem to me those should not be landmarks, then.
Comment 12 Ian 'Hixie' Hickson 2013-04-26 23:02:27 UTC
> The relevant statement I find in the ARIA document that states guidance on
> this most clearly is:
> 
>   "Within any document or application, the author SHOULD mark no more than
> one element with the main role."
> 
> Yeah, I realize RFC 2119 "should" is often (usually) interpreted as making
> the qualification "unless there is a good reason".

Your quotation misses the most important detail here, which is that the words "document" and "application" in that sentence refer to the _roles_ with those names. So the following, for example, is completely conforming to ARIA, even if you pretend the "SHOULD" is a "MUST":

  <!DOCTYPE HTML>
  <title>Example</title>
  <body>
   <div role="document">
     <div role="main"> </div>
   </div>
   <div role="document">
     <div role="main"> </div>
   </div>
   <div role="document">
     <div role="main"> </div>
   </div>
  </body>
 

>   "Because document and application elements can be nested in the DOM, they
> may have multiple main elements as DOM descendants"
> 
> As you worded it, it's saying that it's possible that document can/may have
> multiple main elements as DOM descendants. It's not saying that's OK or
> permitted; it's just stating a fact.

"MAY" is an RFC2119 term meaning it's permissible, not just a statement of fact.


> It's not trying to contradict the
> earlier guidance it provided that "the author SHOULD mark no more than one
> element with the main role."; that remains the best-practice requirement it
> is asserting. 

No, it's reiterating the stuff about document and application roles you quoted.


> OK, yeah, I suppose that if it's possible for a document to have multiple
> main roles, then yeah, AT software ideally does need to be able to handle that case.

It's not just possible, it's allowed.

It's allowed within one document (a document with many role=document elements, each with a role=main).

It's allowed within one browsing context (a document containing iframes each containing a role=main).

So ATs have to support this already, not just as a failure mode but as a native feature. It's not like we'd be leading users into a less-well-supported suboptimal failure mode.
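A minimal sketch of the second case, assuming the nested documents are supplied inline through the iframe srcdoc attribute (the titles are hypothetical and the "..." placeholders stand for real content):

```html
<!DOCTYPE html>
<title>Host page (hypothetical example)</title>
<body>
  <!-- Each frame holds its own nested Document with its own role=main -->
  <iframe srcdoc="<!DOCTYPE html><title>Frame A</title><div role=main> ... </div>"></iframe>
  <iframe srcdoc="<!DOCTYPE html><title>Frame B</title><div role=main> ... </div>"></iframe>
</body>
```

Each nested Document here has only one element with the main role, yet a user moving through the host page still encounters two of them. How any particular AT presents that is not shown by this sketch.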


> Yeah, agreed that the UI should handle it. But it does not necessarily
> follow from that that we should cave and say, "Well, proper AT needs to
> handle multiple main roles anyway, so we should just give up and not try to
> get authors to use it only once per document as users would normally expect
> to see it."

I don't think it's caving. Quite the opposite. I see absolutely no value in limiting documents to just one role=main or one <main>. We're not caving, we're pushing back on a pointless limitation (one that authors are going to ignore anyway).


> > And for authors, I think it's clear that an element that replaces one <div>
> > is not as useful as an element that replaces four <div>s. Consider this
> > simplified structure which is basically what the blogs above, and even CNN
> > if you squint enough (it's not obvious which parts of that page should be
> > <article>s rather than <nav>s), are equivalent to:
> > 
> >    <body>
> >      <header> ... </header>
> >      <div class=a>
> >        <article>
> >          <header> ... </header>
> >          <div class=b> ... </div>
> >          <footer> ... </footer>
> >        </article>    
> >        <article>
> >          <header> ... </header>
> >          <div class=b> ... </div>
> >          <footer> ... </footer>
> >        </article>
> >        ...
> >      </div>
> >      <footer> ... </footer>
> >    </body>
> > 
> > Why would it be more useful to provide a dedicated element for the class=a
> > div rather than for the class=b divs? I think it's pretty clear that if we
> > have to make a choice, the dedicated element to remove more class=""es is
> > the more helpful choice.
> 
> It doesn't have to be the same element for both cases.

Sure. But I'm saying if we have the opportunity to either provide a dedicated element for the class=a div or provide one for the class=b divs, why would we pick the less-used case?

Even if we look at this from the ARIA/landmark point of view, it's less useful to _users_ to have the class=a div marked up than to have the class=b divs marked up.


> I don't think we should if the end result in practice is going to be a
> degradation in user experience for AT users. That is punishing the users for the
> sins of the AT vendors.

I agree. But having <main> as a replacement for the class=b divs above _enhances_ the experience for AT users relative to using it for the class=a div, it doesn't degrade it. I don't understand why you think it would degrade it.
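For concreteness, here is a sketch of the simplified structure quoted above with <main> substituted for the class=b divs and the class=a div left as a plain <div>. This only illustrates the proposal in this comment; under the two W3C HTML constraints listed in this bug's description (no <article> ancestor, and no more than one <main> per document) the same markup would be reported as an error.

```html
<body>
  <header> ... </header>
  <div class=a>
    <article>
      <header> ... </header>
      <!-- was: <div class=b> -->
      <main> ... </main>
      <footer> ... </footer>
    </article>
    <article>
      <header> ... </header>
      <!-- was: <div class=b> -->
      <main> ... </main>
      <footer> ... </footer>
    </article>
    ...
  </div>
  <footer> ... </footer>
</body>
```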


> > For example, "navigation" (<nav>'s role) is a landmark role, and
> > there's nothing requiring there to be only one <nav> per document.
> > Similarly, <aside>'s role is a landmark role.
> 
> I didn't know that actually. It would seem to me those should not be
> landmarks, then.

There's nothing in ARIA that says landmark roles have to be unique per Document (as opposed to role=document). Not for role=main (<main>), not for role=navigation (<nav>), not for role=complementary (<aside>). And as far as I can see, all can have the same UI.
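A minimal sketch of a page with repeated landmark elements, using the element-to-role mappings named in this comment (the title and the comments inside the markup are hypothetical):

```html
<!DOCTYPE html>
<title>Repeated landmarks (hypothetical example)</title>
<body>
  <!-- role=navigation: site-wide links -->
  <nav> ... </nav>
  <article>
    ...
    <!-- role=complementary: related material for this article -->
    <aside> ... </aside>
  </article>
  <!-- role=complementary: page-level sidebar -->
  <aside> ... </aside>
  <!-- role=navigation: a second set of links, e.g. pagination -->
  <nav> ... </nav>
</body>
```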
Comment 13 Ian 'Hixie' Hickson 2013-06-06 19:56:02 UTC
(feel free to reopen if you disagree with the previous comment)
Comment 14 steve faulkner 2013-06-27 09:58:21 UTC
(In reply to comment #13)
> (feel free to reopen if you disagree with the previous comment)

Not asking to re-open, as what the WHATWG spec says is your domain.

But I'm interested to know where you pulled this pearl from:

"it's less useful to _users_ to have the class=a div marked up than to have the class=b divs marked up."

Which users? The users I have talked to do not want more landmarks; they generally want fewer landmarks, so that each one indicates a large chunk of the page. They already have ways to navigate and interact with smaller chunks via headings, articles, sections, lists, p, blockquote, etc.

Leonie Watson explains how landmarks help in this video:
http://www.youtube.com/watch?feature=player_embedded&v=IhWMou12_Vk