11562 – address tag definition has no relation to its name

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	RESOLVED NEEDSINFO

Alias:	None

Product:	HTML WG
Classification:	Unclassified
Component:	LC1 HTML5 spec
Version:	unspecified
Hardware:	All All

Importance:	P2 normal
Target Milestone:	---
Assignee:	Ian 'Hixie' Hickson
QA Contact:	HTML WG Bugzilla archive list

Reported:	2010-12-16 12:50 UTC by arieh glazer
Modified:	2011-08-04 05:16 UTC
CC List:	6 users

Description arieh glazer 2010-12-16 12:50:24 UTC

Hey
I wanted to comment on the definition of the address tag. 
I recommend that the address tag be used to describe an address or location, virtual or physical.
Following are the reasons why I think this is important.
There are 3 main issues:

1. HTML needs a tag to specify a location, as this is a common and meaningful kind of information. It is absurd that the only way for us as developers to do this is through 3rd party data structures, such as microformats or microdata.

2. As I see it, literally speaking, the address tag indicates its content has something to do with an address, or a location. Making it mean something else entirely is completely counter-intuitive.

3. The specification tells us that the tag should be used for contact information. That has nothing to do with the word "address", which makes the document very vague. A "contact" tag would be much more appropriate.

Another important point is that the phrase "contact information" is a little too vague. It can easily stretch to a point where any existing address can be considered contact information.

As a side note, I have seen the tag used as a location indicator on many occasions, and have seen search engines recognize it as such. If we take the H5 philosophy of describing what is as well as innovating, the address tag should be allowed to contain addresses, as it already does in reality.

Comment 1 Benjamin Hawkes-Lewis 2010-12-16 16:00:46 UTC

(In reply to comment #0) 
> 1. HTML needs a tag to specify a location, as this is a common meaningful
> information.

Why?

> It is absurd that the only way for us as developers to do this is by
> 3rd party data structures, such as microformats or microdata.

Why?

> 3. The specification tells us that the tag should be used for contact
> information. That has nothing to do with the word "address", which makes the
> document very vague. A "contact" tag would be much more appropriate.

What is the ultimate difference between "contact information" and "an address or
location, virtual or physical", in your view?

> Another important point is that the phrase "contact information" is a little
> too vague. It can easily stretch to a point where any existing address can be
> considered a contact information.

The key is not that it is "contact information" but that it is contact information for an author responsible for the document or section.

> As a side note, I have seen the tag used as a location indicator on many
> accounts,

Do you have data that it is used to mean "an address or
location, virtual or physical" more often than contact information for authors?

> and have seen search engines recognize it as such.

Which search engines? Can you prove that they are recognizing it as "an address or
location, virtual or physical" as opposed to author contact information? In particular, how do you know they are recognizing the element rather than the contents of the element (i.e. just picking up on text that looks like a postcode or whatever)?

Comment 2 arieh glazer 2010-12-16 17:18:53 UTC

(In reply to comment #1)
> (In reply to comment #0) 
> > 1. HTML needs a tag to specify a location, as this is a common meaningful
> > information.
> 
> Why?

As I said - it seems like a very common information type that, IMO, should be markupable.


> > It is absurd that the only way for us as developers to do this is by
> > 3rd party data structures, such as microformats or microdata.
> 
> Why?

Mostly because neither MF nor MD is official, so they are not as reliable as a true, standard markup tag, and they are not a part of HTML. It feels strange that such generic content would need to use "outside" markup.

> > 3. The specification tells us that the tag should be used for contact
> > information. That has nothing to do with the word "address", which makes the
> > document very vague. A "contact" tag would be much more appropriate.
> 
> What is the ultimate difference between "contact information" and "an address
> or
> location, virtual or physical", in your view?

In most discussions on the subject that I've participated in, such markup as:
<li class='game'>
   <h3>Some1 vs Some2</h3>
  <address>Some city</address>
</li>
was considered off the spec. As I mentioned, I too find the definition so vague that it hardly matters, but then why is "contact information" a better definition than "address or location", which is much better suited to a tag named "address"?
 
 
> > Another important point is that the phrase "contact information" is a little
> > too vague. It can easily stretch to a point where any existing address can be
> > considered a contact information.
> 
> The key is not that it is "contact information" but that it is a contact
> information for an author responsible for the document or section.

First of all, the new specs say only "The address element represents contact information." http://dev.w3.org/html5/markup/address.html
Second, why is that type of information more generic and better suited than a generic address? As I see it, it makes more sense for the tag to be used for generic address/contact info, rather than as a very specific, less useful "contact the owner of the document" tag (which can be expressed by many other means, such as title, rel and other MF/MD, which are for special-case scenarios).
IMO, HTML should be as generic as possible, so as to allow a large set of different valid markups.


> > As a side note, I have seen the tag used as a location indicator on many
> > accounts,
> 
> Do you have data that it is used to mean "an address or
> location, virtual or physical" more often than contact information for authors?

Yes, as you can see with the example above. 
 
> > and have seen search engines recognize it as such.
> 
> Which search engines? Can you prove that they are recognizing it as a "an
> address or
> location, virtual or physical" as opposed to author contact information? In
> particular, how do you know they are recognizing the element rather than the
> contents of the element (i.e. just picking up on text that looks like a
> postcode or whatever)?

Obviously I have no such proof. It is as likely that they simply extract text structures and analyze the raw text. But I have seen the tag used for other situations (as mentioned above) and indexed properly throughout the years. In fact, I only recently found out that I have been using it off the specs...
But even if we drop the above statement, I still feel my point is valid and should at least be considered.

Comment 3 Benjamin Hawkes-Lewis 2010-12-16 18:17:17 UTC

(In reply to comment #2)
> > > 1. HTML needs a tag to specify a location, as this is a common meaningful
> > > information.
> > 
> > Why?
> 
> as I said - it seems like a very common information type, that IMO should be
> markupable.

Verb and noun and sentence are even more common units of information and we
don't have markup to distinguish those.

You need to work out what problem you're trying to solve, before we can
consider possible solutions such as changing the meaning of "address":

http://wiki.whatwg.org/wiki/FAQ#Is_there_a_process_for_adding_new_features_to_a_specification.3F

Please note that HTML already includes a native way to unambiguously mark up
locations using hyperlinks:

For example:

<a href="http://example.com">Destination description</a>

<a href="mailto:somebody@example.com">somebody@example.com</a>

<a href="tel:+358-555-1234567">Tel: +358-555-1234567</a>

<a href="fax:+358.555.1234567">Fax: +358.555.1234567</a>

<a href="geo:48.2010,16.3695,183">Vienna, Austria</a>

See also:

   * http://tools.ietf.org/html/rfc6068 for mailto

   * http://tools.ietf.org/html/rfc5341 for tel and fax

   * http://tools.ietf.org/html/rfc5870 for geo

> both MF and MD are not official, and thus are not as reliable as
> a true markup tag that is standard, and they are not a part of HTML. It feels
> strange that such a generic content would need to use "outside" markup.

Microdata and HTML+RDFa (another option for annotating HTML5 with additional
semantics) have the same "standard" status as HTML5: they are on track to
becoming W3C Recommendations:

http://www.w3.org/TR/microdata/

http://www.w3.org/TR/rdfa-in-html/

I think it's ironic that you are arguing we should *change* the definition of
the "address" element from author contact information - the definition it has
had since 1993 - to simply mean any old address on the basis that HTML is a
reliable standard!

http://www.w3.org/MarkUp/draft-ietf-iiir-html-01.txt

http://www.w3.org/MarkUp/html-spec/html-spec_5.html#SEC5.5.3

http://www.w3.org/TR/html401/struct/global.html#h-7.5.6

> > What is the ultimate difference between "contact information" and "an address
> > or location, virtual or physical", in your view?
> 
> In most discussions I've participated on the subject, such markup as:
> <li class='game'>
>    <h3>Some1 vs Some2</h3>
>   <address>Some city</address>
> </li>
> was considered off the spec.

Correctly so. That is not author contact information, it's just a location.

> > The key is not that it is "contact information" but that it is a contact
> > information for an author responsible for the document or section.
> 
> first of all, the new specs say only "The address element represents contact
> information." http://dev.w3.org/html5/markup/address.html

First, read down on that page and you'll see it clarifies:

"If an address element applies to a body element, then it represents contact
information for the document as a whole. If an address element applies to a
section of a document, then it represents contact information for that section
only."

Second, the "new specs" you're quoting are intended as a mere "non-normative
reference".

http://dev.w3.org/html5/markup/Overview.html#toc

The actual normative specification is clearer:

"The address element represents the contact information for its nearest article
or body element ancestor. If that is the body element, then the contact
information applies to the document as a whole."

http://dev.w3.org/html5/spec/sections.html#the-address-element
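
As a rough illustration of that rule (the names and email addresses below are made up), the first address element here gives contact information for its article only, while the second, whose nearest such ancestor is body, applies to the document as a whole:

```html
<body>
  <article>
    <h1>Match report</h1>
    <p>Report text.</p>
    <!-- Nearest article/body ancestor is the article:
         contact information for this article only. -->
    <address>
      Written by <a href="mailto:reporter@example.com">A. Reporter</a>
    </address>
  </article>
  <!-- Nearest article/body ancestor is body:
       contact information for the document as a whole. -->
  <address>
    Site contact: <a href="mailto:webmaster@example.com">webmaster@example.com</a>
  </address>
</body>
```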

Since you're proposing a change to the normative specification, I'll change the
Component field so that your bug gets put into the queue of the correct editor,
if that's okay. :)

> 2nd of all - why is that type of information more generic and more suited
> than a generic address

It's more suited since that has *always* been the definition of "address".

> As I see it, it makes more sense that the tag should be used as a generic
> address/contact info, rather than a very specific, less useful "contact the
> owner of the document" tag (which can be expressed by many other means, such
> as title, rel and other MF/MD which are for special cases scenarios).  IMO,
> HTML should be as generic as possible, as to allow a large set of different
> valid markups..

Precisely because HTML is supposed to be a reliable standard, we should be wary
of arbitrarily changing the semantics of its elements and attributes.

The name of an element/attribute alone is a very poor reason to change its
semantics.

> > > As a side note, I have seen the tag used as a location indicator on many
> > > accounts,
> > 
> > Do you have data that it is used to mean "an address or location, virtual
> > or physical" more often than contact information for authors?
> 
> Yes, as you can see with the example above. 

No, that's an example of it being used incorrectly. It's not data showing that
it is used incorrectly more often than correctly in the web corpus.
Incidentally, it wouldn't particularly surprise me if it were used more often
incorrectly, but we should be careful to make decisions based on actual data.
 
> > > and have seen search engines recognize it as such.
> > 
> > Which search engines? Can you prove that they are recognizing it as a "an
> > address or location, virtual or physical" as opposed to author contact
> > information? In particular, how do you know they are recognizing the
> > element rather than the contents of the element (i.e. just picking up on
> > text that looks like a postcode or whatever)?
> 
> Obviously I have no such proof. It is as likely that they simply extract text
> structures and analyze the raw text. But I have seen the tag used for other
> situations (as mentioned above) and indexed properly throughout the years. In
> fact, I only recently found out that I have been using it off the specs...

I think we can conclude the claim "search engines recognize it as such" is
unsubstantiated and should not be taken into account.

Comment 4 arieh glazer 2010-12-16 18:36:31 UTC

> Please note that HTML already includes a native way to unambiguously mark up
> locations using hyperlinks:
> 
> For example:
> 
> <a href="http://example.com">Destination description</a>
> 
> <a href="mailto:somebody@example.com">somebody@example.com</a>
> 
> <a href="tel:+358-555-1234567">Tel: +358-555-1234567</a>
> 
> <a href="fax:+358.555.1234567">Fax: +358.555.1234567</a>
> 
> <a href="geo:48.2010,16.3695,183">Vienna, Austria</a>
> 
> See also:
> 
>    * http://tools.ietf.org/html/rfc6068 for mailto
> 
>    * http://tools.ietf.org/html/rfc5341 for tel and fax
> 
>    * http://tools.ietf.org/html/rfc5870 for geo

I was not aware of these. These seem like a valid use, though my point was about specifically marking up a location that is not exact (or it would have been easy to use it as contact information). It is for that use case that I'm arguing address should be allowed to be used.

> 
> Microdata and HTML+RDFa (another option for annotating HTML5 with additional
> semantics) have the same "standard" status as HTML5: they are on track to
> becoming W3C Recommendations:
> 
> http://www.w3.org/TR/microdata/
> 
> http://www.w3.org/TR/rdfa-in-html/
> 
> I think it's ironic that you are arguing we should *change* the definition of
> the "address" element from author contact information - the definition it has
> had since 1993 - to simply mean any old address on the basis that HTML is a
> reliable standard!
> 
> http://www.w3.org/MarkUp/draft-ietf-iiir-html-01.txt
> 
> http://www.w3.org/MarkUp/html-spec/html-spec_5.html#SEC5.5.3
> 
> http://www.w3.org/TR/html401/struct/global.html#h-7.5.6

Technically this is correct, but as address is already a part of HTML, its support already exists, unlike the RDFa/MD specs, which are unsupported by some major browsers and are not even promised to see support. Using RDFa correctly on a page means no current IE version can open it (other than declaring the doctype - I mean the ability to use the content as XML).

> > In most discussions I've participated on the subject, such markup as:
> > <li class='game'>
> >    <h3>Some1 vs Some2</h3>
> >   <address>Some city</address>
> > </li>
> > was considered off the spec.
> 
> Correctly so. That is not author contact information, it's just a location.
>
That much I understand. My point is that it should be within the spec (that's basically my entire case).
 

> It's more suited since that has *always* been the definition of "address".
> 
> > As I see it, it makes more sense that the tag should be used as a generic
> > address/contact info, rather than a very specific, less useful "contact the
> > owner of the document" tag (which can be expressed by many other means, such
> > as title, rel and other MF/MD which are for special cases scenarios).  IMO,
> > HTML should be as generic as possible, as to allow a large set of different
> > valid markups..
> 
> Precisely because HTML is supposed to be a reliable standard, we should be wary
> of arbitrarily changing the semantics of its elements and attributes.
> 

I am not suggesting to change the semantics of the element, just to broaden it. Saying it can also have another use case doesn't cancel the old one. And as I understand, the _original_ meaning did include my use case. It was dropped later on (I think on H3).

> The name of an element/attribute alone is a very poor reason to change its
> semantics.
> 

This depends. I guess it's more of a philosophical question, but IMO html should be readable and understandable to humans as much as machines. But my point is that although the current use case vaguely falls into the semantics of the word "address", the semantics should include a broader spectrum of meanings (I know - repeating myself unnecessarily)  

> Incidentally, it wouldn't particularly surprise me if it were used more often
> incorrectly, but we should be careful to make decisions based on actual data.

Agreed - we should be careful. But it is still a point to consider. Or rather - we should consider why that is. If the reason is one that doesn't break support or behavior, and can be explained by the lack of a better solution, it could be a valid point.
 
> I think we can conclude the claim "search engines recognize it as such" is
> unsubstantiated and should not be taken into account.

Agreed.

Comment 5 Benjamin Hawkes-Lewis 2010-12-16 19:05:46 UTC

(In reply to comment #4)
> These seem like a valid use, though my point was
> about specifically marking up a location that is not exact (or it would have
> been easy to use it as a contact information).

Please note geo allows you to specify imprecise locations using:

http://tools.ietf.org/html/rfc5870#section-3.4.3

For example, you could specify a city by getting its center then specifying its approximate radius as the value of the "u" parameter. Granted, that probably isn't something authors are going to do.
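
A sketch of what that might look like, assuming a radius of about 10 km for the city (the "u" value is in meters, and the figure here is only a guess for illustration):

```html
<!-- "u" is the RFC 5870 location uncertainty parameter, in meters;
     10000 is an illustrative value, not a measured one -->
<a href="geo:48.2010,16.3695;u=10000">Vienna, Austria</a>
```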

> as the address is already a part of HTML, its
> support already exists, unlike the RDFa/MD specs which are unsupported by some
> major browsers, and are not even promised to see support.

What sort of "support" are you worried browsers might not supply?

All popular browsers do with "address" is italicise it.

> Using RDFa
> correctly on a page means no current IE version can open it (other than
> declaring the doctype - I mean the ability to use the content as XML).

Not so.

Currently, there is plenty of deployed markup using RDFa in text/html, thanks to Facebook's adoption of Open Graph:

http://developers.facebook.com/docs/opengraph

HTML+RDFa aims to standardize the use of RDFa in text/html, on top of HTML5 (which standardizes the parsing of text/html).
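
For illustration, Open Graph markup in a text/html page looks roughly like the following. The property names come from the Open Graph protocol as it is documented today; the content values and the page itself are placeholders, and the exact namespace declaration has varied between versions of that documentation:

```html
<!DOCTYPE html>
<html prefix="og: http://ogp.me/ns#">
  <head>
    <title>Example page</title>
    <!-- RDFa "property" attributes on meta elements, served as text/html -->
    <meta property="og:title" content="Example page">
    <meta property="og:type" content="website">
    <meta property="og:url" content="http://example.com/page">
  </head>
  <body>
    <p>Page content.</p>
  </body>
</html>
```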

> > Precisely because HTML is supposed to be a reliable standard, we should be wary
> > of arbitrarily changing the semantics of its elements and attributes.
> > 
> 
> I am not suggesting to change the semantics of the element, just to broaden it.
> Saying it can also have another use case doesn't cancel the old one.

It's not necessarily safe to broaden semantics.

Imagine if your UA provided a menu command that would open up an email to the author of the current page based on looking for an email address in the "address" element.

If we broadened the definition to allow "address" to include any virtual location, the UA could end up sending feedback to the email address of a person who had no responsibility for the document.

Similarly, imagine you were aggregating citation and authorship information across the web based on "cite" and "address". If you used "address" as the source of your authorship information and we broadened the definition, you'd end up with lots of false positive author attributions thanks to addresses of people who were not authors.

Note these examples are purely illustrative of the dangers of broadening semantics; I'm not saying any software actually does this or even that, given the misuse of "address", that it would be feasible to do this in practice.

> And as I understand, the _original_ meaning did include my use case. It was dropped
> later on (I think on H3).

By my reading, the idea of author contact information was there from the start, but the definitions did get clearer. At any rate, this has been the standard definition since 1997.

> > The name of an element/attribute alone is a very poor reason to change its
> > semantics.

> This depends. I guess it's more of a philosophical question, but IMO html
> should be readable and understandable to humans as much as machines.

Naming things is famously one of the hard things in computer science and lots of things in HTML are badly named ("cite" was apparently meant to mean "work title", many think it means "quotation").

If we were starting from scratch, I'd agree that "address" is a bad choice for a name for this element. But now that it's been part of the language since 1997 (or earlier, depending on your interpretation), backwards compatibility trumps better naming - all other things being equal.

All other things may not be equal in this case though.

Comment 6 arieh glazer 2010-12-16 19:29:05 UTC

(In reply to comment #5)
The geo: URI spec looks really cool, and I am already on my way to implementing it in some of my projects. But as you say, it is far from generic, and it is in no way easy to embed in dynamic content generators (such as CMSs) compared to the address tag.

> All popular browsers do with "address" is italicise it.

All popular browsers nowadays do not do much with any element other than style it (putting anchors and form elements aside). As I understand it, H5 is a lot about creating better semantic markup (most added elements have no style).
The only important part IMO is that out of the box, this will not break browser behavior even a bit.

> > Using RDFa
> > correctly on a page means no current IE version can open it (other than
> > declaring the doctype - I mean the ability to use the content as XML).
> 
> Not so.
> 
> Currently, there is plenty of deployed markup using RDFa in text/html, thanks
> to Facebook's adoption of Open Graph:
> 
> http://developers.facebook.com/docs/opengraph
> 
> HTML+RDFa aims to standardize the use of RDFa in text/html, on top of HTML5
> (which standardizes the parsing of text/html).
> 
I was not aware of this. Last time I stumbled upon RDFa it required text/xml+html for adding more namespaces (which broke IEs).
I am aware that H5 brings together a lot of good and rich solutions for a very wide variety of problems via MD and RDFa.
My thoughts were about fixing what I understand as a problem in a definition, by adding more use cases to an existing element.

> > > Precisely because HTML is supposed to be a reliable standard, we should be wary
> > > of arbitrarily changing the semantics of its elements and attributes.
> > > 
> > 
> > I am not suggesting to change the semantics of the element, just to broaden it.
> > Saying it can also have another use case doesn't cancel the old one.
> 
> It's not necessarily safe to broaden semantics.
> 
> Imagine if your UA provided a menu command that would open up an email to the
> author of the current page based on looking for an email address in the
> "address" element.
> 
> If we broadened the definition to allow "address" to include any virtual
> location, the UA could end up sending feedback to the email address of a person
> who had no responsibility for the document.
> 
> Similarly, imagine you were aggregating citation and authorship information
> across the web based on "cite" and "address". If you used "address" as the
> source of your authorship information and we broadened the definition, you'd
> end up with lots of false positive author attributions thanks to addresses of
> people who were not authors.
> 
> Note these examples are purely illustrative of the dangers of broadening
> semantics; I'm not saying any software actually does this or even that, given
> the misuse of "address", that it would be feasible to do this in practice.
>

I accept the above is a valid reason for not fixing the problem, but IMO this should at least be checked. If the change does break various software, that would be a reason for not accepting my suggestion.
But, as you say yourself, this usage is already in the wild (albeit non-standard), and thus the change won't "break the web" any more than it is already broken.
 
> > And as I understand, the _original_ meaning did include my use case. It was dropped
> > later on (I think on H3).
> 
> By my reading, the idea of author contact information was there from the start,
> but the definitions did get clearer. At any rate, this has been the standard
> definition since 1997.
> 

There have been other changes to the specs that changed the semantics of other elements - such as b and i - which, if implemented by generic software, would create a much larger problem than adding more definitions to the address tag.
(If it is not clear - my point is that the case is valid, not that the above shouldn't have been touched, as I am sure a lot of discussion went into the above-mentioned topic, which I am not very familiar with.)

 
> Naming things is famously one of the hard things in computer science and lots
> of things in HTML are badly named ("cite" was apparently meant to mean "work
> title", many think it means "quotation").
> 
> All other things may not be equal in this case though.

Name alone is indeed not a valid reason. I believe I have made my point in a way that exceeds naming.

Comment 7 Toby Inkster 2010-12-18 23:34:05 UTC

(In reply to comment #6)
> I have not been aware of this. Last time I stumbled upon RDFa it required
> text/xml+html for adding more namespaces (which broke IEs). 

The XHTML+RDFa spec has never required the "application/xhtml+xml" media type. In fact it doesn't make any explicit media type requirements. It is, however, built upon XHTML 1.1, which according to http://www.w3.org/TR/xhtml-media-types/ may be labeled as "text/html" provided it follows certain guidelines.

The HTML+RDFa spec, which is still just a draft, is truly HTML (not XHTML) based.

Comment 8 Ian 'Hixie' Hickson 2011-01-11 19:32:57 UTC

I don't understand what problem this bug describes, other than the known issue of <address> being a terrible name for its semantic, but that's a historical problem we inherited from HTML4, which inherited from even earlier versions.

EDITOR'S RESPONSE: This is an Editor's Response to your comment. If you are satisfied with this response, please change the state of this bug to CLOSED. If you have additional information and would like the editor to reconsider, please reopen this bug. If you would like to escalate the issue to the full HTML Working Group, please add the TrackerRequest keyword to this bug, and suggest title and text for the tracker issue; or you may create a tracker issue yourself, if you are able to do so. For more details, see this document:
   http://dev.w3.org/html5/decision-policy/decision-policy.html

Status: Did Not Understand Request
Change Description: no spec change
Rationale: We need a clear description of a problem before we can solve a problem.

Comment 9 Michael[tm] Smith 2011-08-04 05:16:28 UTC

mass-move component to LC1