5744 – Improved Fragment Identifiers

This is an archived snapshot of W3C's public bugzilla bug tracker, decommissioned in April 2019.

Status:	CLOSED WONTFIX

Alias:	None

Product:	HTML WG
Classification:	Unclassified
Component:	pre-LC1 HTML5 spec (editor: Ian Hickson)
Version:	unspecified
Hardware:	PC Windows XP

Importance:	P2 normal
Target Milestone:	---
Assignee:	Michael[tm] Smith
QA Contact:	HTML WG Bugzilla archive list

URL:	http://lists.w3.org/Archives/Public/p...
Reported:	2008-06-13 01:27 UTC by Erik Wilde
Modified:	2011-01-26 20:28 UTC
CC List:	10 users

Description Erik Wilde 2008-06-13 01:27:24 UTC

The recently published HTML 5 draft does not change anything regarding HTML fragment identifiers. They are still limited to IDs only (with <a name=""> as alternative for backwards-compatibility). This means that any reference into an HTML page depends on how the page is using IDs.

But wouldn't HTML 5 be a wonderful opportunity to bring a little bit more hypermedia back to the Web? XML had XLink and XPointer. Both were failures for a number of reasons, but I am still a big fan of trying to make the Web more hypermedia-like. So why not learn from XPointer and try to give HTML 5 a more practical and useful set of fragment identification methods than just IDs?

The whole fragment identification idea is a classic chicken and egg problem. Why use them when they're not supported? Why support them when they're not used? We had a lot of remarks like that when we worked on fragment identifiers for plain text files, but I still believe it is good to have mechanisms like that. Assume Firefox had a feature where you just moused over a paragraph, right-clicked, and then you could send an email with a pointer to that paragraph. If the receiver had Firefox, the browser would scroll to and highlight that paragraph. I am still convinced a lot of people would find such a feature pretty useful. And things would not break in another browser, users would simply not get the scroll/highlight behavior.

While I am convinced that HTML 5 would be the right point in time to introduce such an improved fragment identification method and try to fix the fact that few people use HTML fragment identification, I am not really sure how to best do it. My guess is there should be three basic ways of identifying fragments:

* IDs: For backwards compatibility, IDs (and <a name="">) should be supported. It would be what XPointer called barenames or shorthands.

* Child Sequences: Similar to XPointer's child sequence, there should be one in HTML 5, which could either start at the page body, or at an ID. The fragment identifier #warning/2/3 would identify the third child of the second child of the id=warning element.

* Character Pointers: Should there also be a way to point to a position? Maybe defined by counting characters in the page's string value? Hard to tell, but this is where XPointer definitely went over the top and was never finished, because it even tried to define arbitrary ranges, which is really hard to do.
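
To make the child sequence idea concrete, here is a minimal sketch in Python. It rests on several assumptions that are not part of any spec: the page is well-formed enough to parse as XML, steps are 1-based and count element children only, and an empty root (as in #/2/3) means "start at the page body". The names parse_fragment and resolve_fragment are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_fragment(fragment):
    """Split '#warning/2/3' into ('warning', [2, 3]); '#warning' gives ('warning', [])."""
    root_id, *steps = fragment.lstrip("#").split("/")
    return root_id, [int(step) for step in steps]


def resolve_fragment(document, fragment):
    """Return the element a fragment points at, or None if it does not resolve."""
    root_id, steps = parse_fragment(fragment)
    if root_id:
        # Start at the element carrying the given id.
        node = next((el for el in document.iter() if el.get("id") == root_id), None)
    else:
        # Assumed convention: an empty root ('#/2/3') starts at the page body.
        node = document.find("body")
    for step in steps:
        if node is None:
            return None
        children = list(node)  # element children only; text nodes are not counted
        # Steps are 1-based, as in XPointer child sequences.
        node = children[step - 1] if 1 <= step <= len(children) else None
    return node


page = ET.fromstring(
    "<html><body>"
    "<div id='warning'>"
    "<p>first</p>"
    "<ul><li>a</li><li>b</li><li>c</li></ul>"
    "</div>"
    "</body></html>"
)

target = resolve_fragment(page, "#warning/2/3")
print(target.tag, target.text)  # li c
```

A fragment that does not resolve simply yields None, which matches the idea that a browser without a match would fall back to showing the page without scrolling.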

Maybe just IDs and child sequences could do the trick? There also should be a well-defined behavior for browsers, so that a user instructing a browser to create a fragment identifier could be sure that it will always be rooted at the nearest ID, to make it less likely to break. I am sure there are many more details to figure out, but I am curious whether anybody else thinks this could become a pretty useful addition to how HTML can be used.
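
The "rooted at the nearest ID" behavior could look roughly like the following Python sketch. It assumes a well-formed page parsed with ElementTree, 1-based steps over element children, and a fallback to the body element (written as an empty root) when no ancestor has an id; make_fragment is a hypothetical name, and the target is assumed to lie inside the body.

```python
import xml.etree.ElementTree as ET


def make_fragment(document, target):
    """Build a child-sequence fragment for target, rooted at its nearest id."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parents = {child: parent for parent in document.iter() for child in parent}
    steps = []
    node = target
    # Walk upwards until an element with an id, the body, or the root is reached.
    while node.get("id") is None and node.tag != "body" and node in parents:
        parent = parents[node]
        steps.append(list(parent).index(node) + 1)  # 1-based position among siblings
        node = parent
    steps.reverse()
    root = node.get("id") or ""  # empty root: assumed to mean "start at body"
    return "#" + "/".join([root] + [str(step) for step in steps])


page = ET.fromstring(
    "<html><body>"
    "<div id='intro'><p>hello</p></div>"
    "<div id='warning'><p>first</p><ul><li>a</li><li>b</li><li>c</li></ul></div>"
    "</body></html>"
)

third_item = page.find(".//ul")[2]
print(make_fragment(page, third_item))  # #warning/2/3
```

Rooting at the nearest id keeps the path short, so edits elsewhere on the page are less likely to break the link than with a path counted from the body.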

And please don't even ask about how to handle situations where CSS is hiding parts of the document, maybe dynamically, or even worse, where scripting code is changing the document's DOM. It would be necessary to have well-defined behavior for all possible situations, but my guess is that for the majority of static Web pages, fragment identification in a rather simple form would already be pretty useful as a way to better communicate about Web content.

Comment 1 Ian 'Hixie' Hickson 2008-06-13 06:57:26 UTC

What problem are we solving here? Is giving a fragment identifier into a document really something that causes difficulties? Most people seem to deal fine with just saying "Look at bla on this page" with a URI without a fragment identifier, no?

It seems like if this was really a problem, people would have been doing things to work around it, as they do with many other limitations of the Web platform, but in this case I really see nobody working to index into pages better. What evidence of the need is there?

Even if the problem exists, though, and is worth solving, why is XPointer not good enough? We can easily redefine XPointer to work for HTML as well as XML, since HTML5 defines text/html HTML in the same terms as XML-based HTML.

Are user agents willing to actually implement this?

Incidentally, I recommend reading:
   http://wiki.whatwg.org/wiki/FAQ#Is_there_a_process_for_adding_new_features_to_the_spec.3F

Comment 2 Erik Wilde 2008-06-13 17:42:43 UTC

"What problem are we solving here? Is giving a fragment identifier into a
document really something that causes difficulties? Most people seem to deal
fine with just saying "Look at bla on this page" with a URI without a fragment
identifier, no?"

that's correct, but kind of a tautology, because nowadays it's the only option that people have. whether people are dealing "fine" or not is kind of hard to say, but it is hard to build better tools (such as a browser providing the capability to create more specific links) when the spec does not support that because only @id elements can be used as fragment identifiers.

"It seems like if this was really a problem, people would have been doing things
to work around it, as they do with many other limitations of the Web platform,
but in this case I really see nobody working to index into pages better. What
evidence of the need is there?"

http://www.codedread.com/fxpointer/ is an attempt to do something about it, but it may not be the evidence you are looking for. it is hard to imagine forums full of "HTML should have better fragment identification" posts, so what kind of evidence are you looking for?

"Even if the problem exists, though, and is worth solving, why is XPointer not
good enough? We can easily redefine XPointer to work for HTML as well as XML,
since HTML5 defines text/html HTML in the same terms as XML-based HTML."

xpointer may be good enough, even though i would argue that it's not a good spec. but i think looking at how to do it would be a second step, the first step would be to decide that yes, html5 should have better fragment identification, specifically one that does not depend on @ids.

"Are user agents willing to actually implement this?"

i don't know. the simpler it is (and i think it should be simpler), the easier it can be and probably will be implemented. and i think the html5 spec could and should explicitly encourage implementors to support fragment identifiers, both through a way of constructing them (creating a link while browsing a page), and through interpreting them (scrolling to the fragment and applying the CSS :target pseudo-class formatting).

Comment 3 Ian 'Hixie' Hickson 2008-06-13 20:08:14 UTC

> that's correct, but kind of a tautology, because nowadays it's the only option
> that people have. whether people are dealing "fine" or not is kind of hard to
> say, but it is hard to build better tools (such as a browser providing the
> capability to create more specific links) when the spec does not support that
> because only @id elements can be used as fragment identifiers.

Historically, Web authors have found incredibly ingenious ways of working around the slightest limitation when there's something they want to solve. Plugins get developed (e.g. Flash video) to fill holes in the specs, people develop massive widget libraries to get around the lack of native widgets, etc. It is rare that a feature is needed without lots of people finding a workaround and using it.


> http://www.codedread.com/fxpointer/ is an attempt to do something about it

Yeah, that's the kind of thing I mean. Does it have many users?


>> Are user agents willing to actually implement this?
> i don't know

Getting browsers to be ok with implementing something is one of the first things we have to do.


I guess I'm not convinced that there is a real need here, and that even if there is a need, that it's not already solved by XPointer. We shouldn't be reinventing the wheel just because we're not sure we like the current spec -- we should work with that spec to make it better.

So in conclusion I recommend approaching the XPointer group and asking them to make the improvements you feel it needs, possibly simplifying it if necessary, or explicitly saying it should work with HTML if that isn't already the case.

If you disagree with this conclusion, please either show what information I overlooked in reaching my conclusion, or, if you agree with the facts but disagree with the interpretation of the facts, raise this issue with one of our chairs. Thanks!

Comment 4 Erik Wilde 2008-06-13 22:54:39 UTC

(In reply to comment #3)
> > http://www.codedread.com/fxpointer/ is an attempt to do something about it
> Yeah, that's the kind of thing I mean. Does it have many users?

we'll see. the author is on the public-html list and will probably reply soon...

> I guess I'm not convinced that there is a real need here, and that even if
> there is a need, that it's not already solved by XPointer. We shouldn't be
> reinventing the wheel just because we're not sure we like the current spec --
> we should work with that spec to make it better.

xpointer is a half-finished set of specs that was basically abandoned when the xml linking working group disappeared. people seem to think xpointer is something finished and readily available that could simply be reused - i don't think this is the case.

> So in conclusion I recommend approaching the XPointer group and asking them to
> make the improvements you feel it needs, possibly simplifying it if necessary,
> or explicitly saying it should work with HTML if that isn't already the case.

there is no xpointer working group. the xml linking working group disappeared a number of years ago.

> If you disagree with this conclusion, please either show what information I
> overlooked in reaching my conclusion, or, if you agree with the facts but
> disagree with the interpretation of the facts, raise this issue with one of our
> chairs. Thanks!

so one conclusion of yours was that you don't think something like that is necessary. it is hard to argue against that. the other conclusion was that maybe xpointer should be adopted. here i would say that (a) xpointer is a half-ready set of specifications, and (b) it is not being developed anymore, so if we want better fragment identification beyond @ids, we have to do it ourselves.

Comment 5 Ian 'Hixie' Hickson 2008-06-14 08:57:00 UTC

It's relatively easy to show that a feature is necessary -- provide evidence that people are working around the lack of the feature. If they're not, then the feature probably isn't necessary. This isn't really that much of a judgement call -- it's usually pretty clear when a feature is missing or not.

If the XPointer work was abandoned, that's even more evidence that this kind of thing isn't especially wanted. If you think it's needed, I'd say the best way forward is to reopen that work item. That's independent of HTML5 -- it should work regardless of the markup language, be it HTML, SVG, or whatever.

I'm marking this WONTFIX again. Please don't change the resolution (as it breaks the issue accounting mechanisms I have in my workflow), unless you have substantial new information that I've missed (in which case simply reopen the bug). Please don't reopen the bug without adding substantial new information or pointing clearly to what information I have missed. If you disagree with my conclusion but don't think I've missed any information, please raise the issue with the chairs, so that they can determine whether to override me. Thanks!

Comment 6 Rob Burns 2008-06-14 10:16:04 UTC

> I guess I'm not convinced that there is a real need here, and that even if
> there is a need, that it's not already solved by XPointer. We shouldn't be
> reinventing the wheel just because we're not sure we like the current spec --
> we should work with that spec to make it better.

The need here arises because:

  authors often have a need to reference a specific section of another document in a persistent or semi-persistent way. However, the authors of the referenced document may not provide sufficient id attributes to meet the needs of referencing authors. This happens quite frequently. My most recent case was a need to reference a piece of the HTML5 draft, but there was no nearby fragment id to use and the nearest id came in the next chapter or section (keep in mind here that this is a persistent document since I was linking to a version snapshot of the draft).

  other times authors do not check the uniqueness of ids and so id referencing is broken (for example, see http://krijnhoetmer.nl/irc-logs/whatwg/20080603#l-323)

  it is considered best practice to provide accurate and precise referencing of other people's work for both users' and authors' benefit

  perhaps it is somewhat beyond the scope of the HTML WG to provide a URL pointer mechanism, but there are certainly steps a WG as pivotal as HTML can take to address the situation, such as 1) calling for a resurrection of a URL pointer WG, 2) recommending XPointer support in HTML UAs, just to name two

Having said all that I agree that more specifics are needed of what is missing from XPointer, and some analysis of where XPointer went wrong before we can address it in the HTML WG.

Ideally, a fragment should be identified with a URL by either: 1) id reference; 2) a path of named siblings; 3) a path of indexed siblings; or 4) a combination of all three. Some simple syntax drawn from XPointer/XPath and included in the HTML5 spec might be all that is necessary to achieve that. Most UAs that would need to implement this already have some XPath capabilities built in.

Comment 7 Erik Wilde 2008-06-14 15:38:26 UTC

(In reply to comment #5)
> It's relatively easy to show that a feature is necessary -- provide evidence
> that people are working around the lack of the feature. If they're not, then
> the feature probably isn't necessary. This isn't really that much of a
> judgement call -- it's usually pretty clear when a feature is missing or not.

i think you are applying the wrong logic here. for pure authoring purposes, looking at whether widgets or libraries have been created works well, because these will be loaded dynamically with a page and authors can do a lot by developing these workarounds.

for the thing i am talking about, two things are different:

* the person trying to create a link to a fragment of HTML does not have write access to the document, so there is no possibility to go the usual route of developing workarounds as widgets/libraries.

* even if there is a local workaround (like the extension supporting xpointer), it only makes sense if this is installed (!) on both sides, the usual dynamic loading of scripting does not work here.

so i think that before saying that the lack of widgets/libraries is a proof that the proposed extension is not necessary, it is important to realize that this approach simply does not work for this particular problem. so using this kind of "check" to decide whether a feature is necessary in this case may not be the best foundation for a decision.

fragment identifiers are a very typical chicken-and-egg problem. why create them when the other side cannot understand them? why implement them when nobody understands them? thus, creating better fragment identification would be an explicit decision to try to improve the hypermedia capabilities of the web, allowing people to better link to things. the web's hypermedia capabilities often have not been at the core of web development ("why link when you can search?"), but i think HTML is important enough (after all, the "H" stands for hypertext) to at least consider a modest proposal to enable better hyperlinking on the web, rather than dismiss it out of general principle.

Comment 8 Ian 'Hixie' Hickson 2008-06-14 18:49:29 UTC

Where there's a need, people try to fill it. Sometimes entire working groups grow up around a need to fill it. When those working groups stop work before they're done, that usually means the need isn't big enough.

In any case, this really is out of scope for HTML5. You could create something like XPointer totally independently of HTML, we don't need to specify it in HTML5.

Comment 9 Ian 'Hixie' Hickson 2008-06-14 18:50:36 UTC

(If you disagree, please tell the chairs. Sorry to be blunt, but merely repeating previously made points in this bug doesn't work, since if I've already taken something into account, it is unlikely that taking it into account again will change my mind. Merely disagreeing with me doesn't introduce more information.)

Comment 10 Julian Reschke 2008-06-14 19:00:21 UTC

(In reply to comment #8)
> In any case, this really is out of scope for HTML5. You could create something
> like XPointer totally independently of HTML, we don't need to specify it in
> HTML5.

Defining fragment identifiers for text/html totally is in scope. Even if XPointer was done and good and everybody loved it, HTML5 still would need to define how it applies to the text/html serialization.

Comment 11 Ian 'Hixie' Hickson 2008-06-14 19:30:36 UTC

No, it would be in scope for the MIME type RFC, but that's a separate problem, and could be done by either group (or both).

Comment 12 Julian Reschke 2008-06-14 19:49:22 UTC

(In reply to comment #11)
> No, it would be in scope for the MIME type RFC, but that's a separate problem,
> and could be done by either group (or both).

The HTML WG defines the format of text/html, thus also should define fragment identifiers for it (I don't care in which document or in which SDO, but it's certainly the job of those who define text/html).

Comment 13 Ian 'Hixie' Hickson 2008-06-14 19:59:03 UTC

Well in that case we're still back to the issue that there doesn't seem to be much demand for this. :-)

Comment 14 Julian Reschke 2008-06-14 20:22:45 UTC

(In reply to comment #13)
> Well in that case we're still back to the issue that there doesn't seem to be
> much demand for this. :-)

Aha :-)

I think improvements in linking to parts of HTML documents would be great. Not sure *how* best to do this (XPointer? XPointer subset? Something much simpler?).

Comment 15 Ian 'Hixie' Hickson 2008-06-14 20:33:41 UTC

I think improvements to a lot of things would be great. :-) But we have to concentrate on things that have demand and that browsers are willing to implement.

Comment 16 Julian Reschke 2008-06-14 21:44:54 UTC

I think it's pretty clear that there is *some* demand.

Comment 17 Erik Wilde 2008-06-14 22:43:03 UTC

(In reply to comment #16)
> I think it's pretty clear that there is *some* demand.

i really think that if you define demand as "people are implementing javascript libraries to emulate this feature", more or less by definition this cannot happen for a scenario like fragment identifiers.

and there is "demand" (more loosely defined) for HTML being a better document format. tons of PDF (as terribly bad as it is as a document format) are still being produced because of basic HTML problems, such as not printing very well, or not being able to point to document parts (which in PDF can be done by pointing to a page; even PDF had fragment identifiers for this). CSS is (very very slowly) going in that direction with the advanced layout and paged media modules. better fragment identification would be another piece in that puzzle.

Comment 18 Ian 'Hixie' Hickson 2008-06-15 02:37:18 UTC

I don't think I've ever seen anyone point to an arbitrary paragraph or page in a PDF. Do you have an example of a page doing that?

I recommend reading:
   http://wiki.whatwg.org/wiki/FAQ#Is_there_a_process_for_adding_new_features_to_the_spec.3F

We have to show a need, and show a desire to implement, before we can really come up with a proposal. Just saying that there is "clearly a need" isn't data showing a need. So far the only evidence I've seen presented that there is a need here is a working group that tried to address the issue and apparently gave up, and a Firefox extension whose name comes up with under a hundred hits on Google. This is not exactly a resounding base of support.

Comment 19 Erik Wilde 2008-06-15 02:45:54 UTC

(In reply to comment #18)
> I don't think I've ever seen anyone point to an arbitrary paragraph or page in
> a PDF. Do you have an example of a page doing that?

other than me doing it i cannot point to one off the top of my head, and what people probably do much more often is use a pdf and point to it and then say in plain text "read page 42". that i am seeing all the time, but i don't have statistics on it.

> I recommend reading:
> http://wiki.whatwg.org/wiki/FAQ#Is_there_a_process_for_adding_new_features_to_the_spec.3F

i did read that. it starts with use cases. this is what i am trying to do here.

> We have to show a need, and show a desire to implement, before we can really
> come up with a proposal. Just saying that there is "clearly a need" isn't data
> showing a need. So far the only evidence I've seen presented that there is a
> need here is a working group that tried to address the issue and apparently
> gave up, and a Firefox extension whose name comes up with under a hundred hits
> on Google. This is not exactly a resounding base of support.

xpointer was not finished because the xml linking group stopped working, and that happened because xml took off as a back-end format, and not in the way expected by the w3c originally: as a format directly used by browsers. this is not any proof that users and browsers could not benefit from better fragment identification, it only shows that xml did not take off as the user-facing format it was expected to become.

i am wondering, though, why you are not addressing my argument that fragment identification inherently needs cooperating peers, so if you ask for proof where people are fixing the problem on their end (as you can do it with stuff that can be fixed with scripting by page authors missing html language features), more or less by definition it will be impossible to come up with anything that is likely to satisfy you.

Comment 20 Ian 'Hixie' Hickson 2008-06-15 03:26:43 UTC

> [...] use a pdf and point to it and then say in plain text "read page 42".

Sure, but that's the same thing as people saying "look at this page and read the section numbered 3.4", which happens a lot even when the page in question has IDs and thus wouldn't need it (i.e. when fragment identifiers would be enough). (This actually suggests a technical solution might not be useful here.)


> i am wondering, though, why you are not addressing my argument that fragment
> identification inherently needs cooperating peers

There are plenty of examples where problems that need cooperating peers have still flourished. At the extreme we have Flash, where a single vendor decided there was a problem space (animations, videos) to fill, and filled it, with insane penetration numbers. Similarly with MathML, where some authors are using it even though most clients don't support it -- there are plugins that provide it.

The problem could also be worked around from the other side. For example, if this was a need that many users had, I would expect CMS tools like WordPress to automatically assign IDs to every paragraph and "br" element. I know of a few people who go out of their way to do this (I have done it in the past myself), but they are all theoreticians, people who understand the benefits we could derive from this, if only people cared enough to use it.
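
As a rough illustration of what such automatic ID assignment could look like (this is not how WordPress or any other CMS actually does it), here is a minimal Python sketch. It assumes well-formed markup and a hypothetical "p-N" naming scheme; assign_ids is an invented name.

```python
import xml.etree.ElementTree as ET


def assign_ids(document, tags=("p", "br"), prefix="p-"):
    """Give every matching element without an id a generated, unique one."""
    used = {el.get("id") for el in document.iter() if el.get("id") is not None}
    counter = 0
    for el in document.iter():
        if el.tag in tags and el.get("id") is None:
            counter += 1
            # Skip generated names that collide with ids the author already used.
            while f"{prefix}{counter}" in used:
                counter += 1
            el.set("id", f"{prefix}{counter}")
            used.add(f"{prefix}{counter}")
    return document


page = ET.fromstring(
    "<html><body><p>one</p><p id='p-2'>two</p><p>three</p></body></html>"
)
assign_ids(page)
print([p.get("id") for p in page.iter("p")])  # ['p-1', 'p-2', 'p-3']
```

Counter-based names like these shift when a paragraph is inserted earlier in the page, so a real tool would probably need to store the generated id with the content rather than recompute it.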

One can imagine many ways that people might have worked around the lack of this feature. But in practice, they haven't, at least not insofar as I have seen. I don't disagree that the idea is a good one, and that (assuming we could resolve the brittleness issue) it would have huge potential to those who used it, but we can't just go around solving problems we like. We have to focus on the problems that people are actually running into. We don't have infinite resources, and our resources are likely better spent on more important things, like video, like maths, like duplex long-term connections.

Comment 21 Erik Wilde 2008-06-15 04:13:23 UTC

(In reply to comment #20)
> > [...] use a pdf and point to it and then say in plain text "read page 42".
> Sure, but that's the same thing as people saying "look at this page and read
> the section numbered 3.4", which happens a lot even when the page in question
> has IDs and thus wouldn't need it (i.e. when fragment identifiers would be
> enough). (This actually suggests a technical solution might not be useful
> here.)

of course a technical solution *alone* is not useful. browsers have to support that, so that users can right-click on a fragment and say "create link to paragraph". the fact that browsers today don't do this is because fragment identification relies on @id, which are often not there. if there is a reliable and generally applicable technical foundation, you can build a good user interface for it. both things must be there to make it work.

> > i am wondering, though, why you are not addressing my argument that fragment
> > identification inherently needs cooperating peers
> There are plenty of examples where problems that need cooperating peers have
> still flourished. At the extreme we have Flash, where a single vendor decided
> there was a problem space (animations, videos) to fill, and filled it, with
> insane penetration numbers. Similarly with MathML, where some authors are using
> it even though most clients don't support it -- there are plugins that provide
> it.

your examples all talk about *authors* wanting to do something, and then they can do various things (plug-ins, scripting). fragment identifiers require somebody (not the page author) to create a link with an identifier, and then this information must be used by the recipient of the link. this is just a different scenario from the author-centered scenarios you are describing.

> The problem could also be worked around from the other side. For example, if
> this was a need that many users had, I would expect CMS tools like WordPress to
> automatically assign IDs to every paragraph and "br" element. I know of a few
> people who go out of their way to do this (I have done it in the past myself),
> but they are all theoreticians, people who understand the benefits we could
> derive from this, if only people cared enough to use it.

i do know these tools and i like them. but they introduce site-specific ways of creating links to these fragments (if they do support creation at all). a browser could implement a mechanism that would work for all web pages, which is a very different thing.

> One can imagine many ways that people might have worked around the lack of this
> feature. But in practice, they haven't, at least not insofar as I have seen. I
> don't disagree that the idea is a good one, and that (assuming we could resolve
> the brittleness issue) it would have huge potential to those who used it, but
> we can't just go around solving problems we like. We have to focus on the
> problems that people are actually running into. We don't have infinite
> resources, and our resources are likely better spent on more important things,
> like video, like maths, like duplex long-term connections.

i suspect the "we" here is the pluralis majestatis. so far, nobody has said that they don't want this, and (admittedly few) people said that it would be at least worth considering. this is a great opportunity to fix something that's clearly not useful in html4, and it can be done with a moderate amount of effort. i am not asking you to develop this yourself. but what i am asking for, however, is that you consider something like that seriously, with the appropriate scenarios in mind, and with some sort of cost/benefit analysis. html5 should not be only for authors, it should also be for users.

html5 is a great way to improve the web, and the web should still be viewed as a hypermedia system, at least that is my motivation for this thread. "why link when you can search" might be the motto of the past 10 years, and it will remain dominant, but as web and hopefully hypermedia experts, we should look a little bit further than that. why search when you can link.

Comment 22 Julian Reschke 2008-06-15 08:10:46 UTC

(In reply to comment #10)
> (In reply to comment #8)
> > In any case, this really is out of scope for HTML5. You could create something
> > like XPointer totally independently of HTML, we don't need to specify it in
> > HTML5.
> 
> Defining fragment identifiers for text/html totally is in scope. Even if
> XPointer was done and good and everybody loved it, HTML5 still would need to
> define how it applies to the text/html serialization.

OK, so...

- fragment identifier syntax depends on media formats

- RFC 2854 currently defines fragment identifiers for text/html, based on the HTML 4.01 spec 

- RFC 3236 currently defines fragment identifiers for application/xhtml+xml, based on RFC 3023 (XML media types)

- There's also NOTE-xhtml-media-types-20020801 which probably should be updated when HTML5 is ready

- it would be undesirable to have fragment identifiers becoming incompatible between text/html and application/xhtml+xml, so if something is changed, it probably would have to update *both* RFC 2854 and RFC 3236.

Comment 23 Ian 'Hixie' Hickson 2008-06-15 09:15:04 UTC

> browsers have to support that

Is there any evidence that browsers are interested in providing this?


> your examples all talk about *authors* wanting to do something [...]

Examples of things users might want to do: searching across multiple sites -- someone provided that. Annotating sites -- someone provided that. Chatting with other visitors while at a site -- someone provided that. Browser vendors are also a good proxy for user desires, as they do user studies to determine what they should work on (this is especially true in this newly ultra-competitive market). Some features are probably worth adding to HTML5, others aren't. We can determine where to concentrate by looking at what features users are using.

My point here is just that we need to set our priorities based on clear research, not on our desires.


> > The problem could also be worked around from the other side. For example, if
> > this was a need that many users had, I would expect CMS tools like WordPress to
> > automatically assign IDs to every paragraph and "br" element. I know of a few
> > people who go out of their way to do this (I have done it in the past myself),
> > but they are all theoreticians, people who understand the benefits we could
> > derive from this, if only people cared enough to use it.
> 
> i do know these tools and i like them. but they introduce site-specific ways of
> creating links to these fragments (if they do support creation at all). a
> browser could implement a mechanism that would work for all web pages, which is
> a very different thing.

My point is that if there was really a pent up need for this feature, people would be using these half-assed measures and asking for better solutions. In practice, few people are even using the half-assed measures. This is evidence suggesting that this is not a high-priority feature.

That doesn't mean it's not a great idea, just that it's not what we should be concentrating on right now.

Having said that, I would encourage you to spec something independently, and try to get browsers to implement it. I'm happy to provide the hooks on the HTML5 side (in particular in the text/html MIME type registration) as needed. I do not believe this is a feature that is HTML5-specific.


> i suspect the "we" here is the pluralis majestatis.

I mean "we" as in "the Web community as a whole", in particular those of us actually writing the specs, writing the test suites, reviewing the specs and test suites, writing the implementations, testing the implementations, etc.


> so far, nobody has said that they don't want this

Sure, if it were free then it'd be great.


> i am not asking you to develop this yourself. but what i am asking for,
> however, is that you consider something like that seriously, with the
> appropriate scenarios in mind, and with some sort of cost/benefit analysis.

I consider _all_ feedback seriously, with careful study of the costs and benefits. This is my full-time job. I basically do nothing else than study proposals all day long.


> html5 should not be only for authors, it should also be for users.

HTML5 should be _primarily_ for users.

It's not clear that many users want this.



Regarding the RFCs: updating the RFCs is a separate open issue. We need to register and update several MIME types as part of HTML5, but I'm waiting for the spec to be ready before doing that, as we don't want the MIME types to be ready before the language.

Comment 24 Julian Reschke 2008-06-16 14:05:14 UTC

I would argue that the state "WONTFIX" is not appropriate here.

BugZilla defines this as:

  "The problem described is a bug which will never be fixed."

...which IMHO clearly does not apply here; at least until the working group makes a decision not to consider it.

Comment 25 Ian 'Hixie' Hickson 2008-06-16 20:47:47 UTC

Ok...

Comment 26 Ian 'Hixie' Hickson 2008-06-16 20:48:31 UTC

...reassigning to Mike for arbitration.

Comment 27 Rob Burns 2008-06-17 19:56:46 UTC

I added a corresponding wiki page for this bug. Please feel free to add or change as you see fit.

http://esw.w3.org/topic/HTML/DocFragPointer

Comment 28 Michael[tm] Smith 2008-06-21 01:40:24 UTC

(In reply to comment #26)
> ...reassigning to Mike for arbitration.

I will be closing this issue out as far as bugzilla discussion of it goes.

But note that does not in any way mean that this is somehow the terminal point in discussion of the issue. It simply reflects the fact that after quite a bit of discussion within bugzilla and an analysis of the issue by the editor, it seems clear that we do not yet at this point have a definitive mandate for spec'ing this feature out and including it in the HTML5 draft.

For one thing (and this is perhaps the most important reason) it is not yet clear that we can expect any kind of commitment at all from browser vendors to consider implementing this if we were to spec it out. It's also not clear that HTML5 would even be the appropriate place to spec it (my personal opinion, fwiw, is that it would not be).

So I think the next best step in the lifecycle of this issue is for Erik (or anyone else with a strong interest in seeing this get spec'ed and implemented) to take the appeal directly to implementors -- for example, by posting a message to the public-html and perhaps to other lists specifically asking browser vendors and other implementors to provide feedback on it.

That is not to say that browser vendors and other implementors are the only stakeholders whose views are important. It is just acknowledging the fact that feature proposals that have not yet shown a reasonable likelihood of actually getting implemented are not proposals that we can as a group afford to invest a lot of time in. In particular, the time and attention of the editor are a key asset for the group, and we need to be very careful about not misusing that.

Comment 29 Michael[tm] Smith 2008-06-21 01:41:50 UTC

Closing this. See my previous comment.

Comment 30 Erik Wilde 2008-06-21 03:51:38 UTC

(In reply to comment #28)
> It's also not clear that
> HTML5 would even be the appropriate place to spec it (my personal opinion,
> fwiw, is that it would not be).

it is part of html 4.01:

http://www.w3.org/TR/html401/intro/intro.html#h-2.1.2

it is part of the current html 5 draft:

http://www.w3.org/TR/2008/WD-html5-20080610/history.html#scroll-to-fragid

i think it would be bad design to not include fragment identifiers in the html5 spec (in their html4 form or an improved form). html is a document format intended for publishing and navigating hypertext, and factoring out fragment identifiers into the media type registration only would send a very clear message that they are regarded as add-on, and not as an integral part of the spec.

Comment 31 Michael[tm] Smith 2008-06-21 05:43:08 UTC

(In reply to comment #30)
> (In reply to comment #28)
> > It's also not clear that
> > HTML5 would even be the appropriate place to spec it (my personal opinion,
> > fwiw, is that it would not be).
> 
> it is part of html 4.01:
> 
> http://www.w3.org/TR/html401/intro/intro.html#h-2.1.2

In the years that the HTML5 work has been going on, we have explicitly avoided using the HTML 4.01 spec as any kind of precedent for this kind of stuff. The HTML 4.01 spec is of a completely different era, one that occurred long before we knew what we know now, and it is now widely agreed that the HTML 4.01 spec is in no way an adequate model of proper specification for defining HTML user-agent conformance criteria.

> it is part of the current html 5 draft:
> 
> http://www.w3.org/TR/2008/WD-html5-20080610/history.html#scroll-to-fragid

The HTML5 draft normatively references RFC 3987 for the actual definition of what a fragment id is. In particular, as far as I can see, the HTML5 draft doesn't redefine fragment IDs nor say anything additional about their nature than what is already specified in RFC 3987. The HTML5 draft simply specifies what UA behavior should be with respect to the fragment IDs defined in RFC 3987.
 
> i think it would be bad design to not include fragment identifiers in the html5
> spec (in their html4 form or an improved form).

If that is the case, and you really feel strongly about it, I suggest taking that to public-html list for further discussion. It's not clear to me at least what value there would be in the spec trying to say anything more about what fragment IDs actually are than what is said about them in RFC 3987. But I will admit that I may be missing something here.

> html is a document format
> intended for publishing and navigating hypertext, and factoring out fragment
> identifiers into the media type registration only would send a very clear
> message that they are regarded as add-on, and not as an integral part of the
> spec. 

I can't say that I agree at all with that assessment. If we were to try to define in the HTML5 spec itself everything that is an integral part of publishing and navigating hypertext, we would end up with a much bigger spec than the gigantic one we already have and which many in the community think is way too big as is.

I will concede that it is a fact that there are cases where the HTML5 draft does actually redefine or refine definitions of some things already defined in other specifications. But for most of those cases, the intent at least is to specify how browsers actually handle those cases -- in cases where browsers do something different than what those other specs say they should (and there are a number of such cases).

The spec for fragment IDs does not seem to be one of those cases. If browsers treat fragment IDs as defined in RFC 3987 then there is no value in the HTML5 draft providing its own separate definition of them.

Comment 32 Michael[tm] Smith 2008-06-21 05:55:37 UTC

(In reply to comment #31)
> It's not clear to me at least
> what value there would be in the spec trying to say anything more about what
> fragment IDs actually are than what is said about them in RFC 3987.

To be clear and more precise, what I meant here is that I don't see that there would be any value in the HTML5 draft saying anything more about existing *RFC 3987* fragment IDs than what is said about them in RFC 3987 itself.

I would personally like to see better fragment IDs than the rudely simplistic ones that RFC 3987 defines and that we have all been limited to for all these years. In particular, the value proposition for being able to have URLs[1] with fragment IDs that can point to parts of text/plain documents seems *blazingly obvious* to me. But I'm not the one who needs to be convinced, because I'm not the one who's going to need to implement it.

  --Mike

[1] yes, fwiw, I'm choosing the word "URLs" here on purpose because I like it better than URI and IRI and there are slightly more people in the outside world who actually understand what it means (compared to the significantly insignificant number outside our little world who have any idea what a URI or IRI is)

Comment 33 Erik Wilde 2008-06-21 05:59:22 UTC

(In reply to comment #31)
> The HTML5 draft normatively references RFC 3987 for the actual definition of
> what a fragment id is. In particular, as far as I can see, the HTML5 draft
> doesn't redefine fragment IDs nor say anything additional about their nature
> than what is already specified in RFC 3987. The HTML5 draft simply specifies
> what UA behavior should be with respect to the fragment IDs defined in RFC
> 3987.

rfc 3987 defines the syntax for a fragment identifier. what that part means
(i.e., actually identifies) is specific for a media type and must be defined
for that media type. this can be done in either the media type registration, or
in the media type definition itself. i think those are the formal rules
regarding fragment identifiers (as i recall them).

> > i think it would be bad design to not include fragment identifiers in the html5
> > spec (in their html4 form or an improved form).
> If that is the case, and you really feel strongly about it, I suggest taking
> that to public-html list for further discussion. It's not clear to me at least
> what value there would be in the spec trying to say anything more about what
> fragment IDs actually are than what is said about them in RFC 3987. But I will
> admit that I may be missing something here.

a media type has to say what is actually identified by the string found after
the '#'. rfc 3987 only says that anything after that character is a fragment
identifier. html4 for example says "search for */@id or a/@name with that value
and that's the fragment."
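
as a rough illustration of that html4 rule, here is a minimal sketch in python. it assumes the simplified rule as stated above (any element's id, or an a element's name) and uses only the standard library parser; the names AnchorFinder and resolve_fragment are made up for this example and do not come from any spec.

```python
from html.parser import HTMLParser


class AnchorFinder(HTMLParser):
    """Remember the first element that an HTML4-style fragment would select."""

    def __init__(self, fragment):
        super().__init__()
        self.fragment = fragment
        self.match = None  # (tag, attribute name) of the first hit, if any

    def handle_starttag(self, tag, attrs):
        if self.match is not None:
            return  # keep only the first matching element
        for name, value in attrs:
            is_id = name == "id"                              # */@id
            is_anchor_name = tag == "a" and name == "name"    # a/@name
            if (is_id or is_anchor_name) and value == self.fragment:
                self.match = (tag, name)
                return


def resolve_fragment(html_text, fragment):
    """Return (tag, attribute) for the identified element, or None."""
    finder = AnchorFinder(fragment)
    finder.feed(html_text)
    finder.close()
    return finder.match


page = '<p id="intro">hi</p><a name="top">x</a><p>no id here</p>'
print(resolve_fragment(page, "intro"))    # ('p', 'id')
print(resolve_fragment(page, "top"))      # ('a', 'name')
print(resolve_fragment(page, "missing"))  # None
```

the third paragraph in the sample page cannot be addressed at all, which is the limitation this bug is about.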

> > html is a document format
> > intended for publishing and navigating hypertext, and factoring out fragment
> > identifiers into the media type registration only would send a very clear
> > message that they are regarded as add-on, and not as an integral part of the
> > spec. 
> I can't say that I agree at all with that assessment. If we were to try to
> define in the HTML5 spec itself everything that is an integral part of publishing
> and navigating hypertext, we would end up with a much bigger spec than the gigantic one
> we already have and which many in the community think is way too big as is.

i think this again demonstrates my old school hypermedia background. the key
aspects of an hypermedia system are document formats and linking, and linking
must be specified for outgoing and incoming links. to me, that really is the
very core of hypermedia, and saying that "yes, we do specify outgoing links but
we leave incoming links to a secondary document, the media type registration"
to me would look like sending a bad signal. but that's just my opinion.

> I will concede that it is a fact that there are cases where the HTML5 draft
> does actually redefine or refine definitions of some things already defined
> in other specifications. But for most of those cases, the intent at least is to
> specify how browsers actually handle those cases -- in cases where browsers do
> something different than what those other specs say they should (and there are
> a number of such cases).
> The spec for fragment IDs does not seem to be one of those cases. If browsers
> treat fragment IDs as defined in RFC 3987 then there is no value in the HTML5
> draft providing its own separate definition of them.

there is no other spec for fragment identifiers. there is the old html4 spec,
where it is an integral part of the spec. there is xpointer, which is
unfinished and has been designed for xml. other than that, there is nothing
that could be referenced. if it is decided that fragment identifiers should be
improved in html5, it must be described in html5. if they should stay as in
html4, this also must be said in html5 (or its media type registration).

Comment 34 Erik Wilde 2008-06-21 06:06:11 UTC

(In reply to comment #32)
> (In reply to comment #31)
> > It's not clear to me at least
> > what value there would be in the spec trying to say anything more about what
> > fragment IDs actually are than what is said about them in RFC 3987.
> To be clear and more precise, what I meant here is that I don't see that there
> would be any value in the HTML5 draft saying anything more about existing *RFC
> 3987* fragment IDs than what is said about them in RFC 3987 itself.

i am confused now. rfc 3987 just defines a syntax for uri/iri, nothing else. fragment identifier semantics are defined per media type.

> I would personally like to see better fragment IDs than the rudely simplistic
> ones that RFC 3987 defines and that we have all been limited to for all these
> years. In particular, the value proposition for being able to have URLs[1] with
> fragment IDs that can point to parts of text/plain documents seems *blazingly
> obvious* to me. But I'm not the one who needs to be convinced, because I'm not
> the one who's going to need to implement it.

plain text fragment identifiers: http://tools.ietf.org/html/rfc5147
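
to give an idea of what such a fragment identifier looks like in practice, here is a minimal python sketch. it assumes only the two-number line range form (for example line=10,20, which identifies lines 11 to 20 because positions are counted between lines, starting at zero) and ignores the character scheme and the integrity checks; select_lines is a made-up helper name, not something defined by the rfc.

```python
import re

# Only the "line=<start>,<end>" form is handled in this sketch.
_LINE_RANGE = re.compile(r"line=(\d+),(\d+)")


def select_lines(text, fragment):
    """Return the lines selected by a 'line=<start>,<end>' fragment, else None."""
    m = _LINE_RANGE.fullmatch(fragment)
    if m is None:
        return None  # not a form this sketch understands
    start, end = int(m.group(1)), int(m.group(2))
    if start > end:
        return None
    # Position n sits after the n-th line, so the range maps onto a slice.
    return text.splitlines()[start:end]


sample = "a\nb\nc\nd\ne"
print(select_lines(sample, "line=1,3"))  # ['b', 'c']
print(select_lines(sample, "intro"))     # None
```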

there once, a long time ago, was an attempt to define a generic syntax for fragment identifiers, are you referring to that idea? the idea was that if there was a common framework for fragment identifiers, they (or parts of them) could be reused across media types. nice idea, but it went nowhere, because it was impossible to predict what kind of mechanism different media types would like to have as fragment identification. finding that draft would require advanced googling, it is pretty old and did not live very long.... long live google! here it is:

http://www.openhealth.org/RDDL/fragment-syntax

but like i said, this died very quickly, it was shouted down from all corners, if i recall correctly. i liked the idea, but had to agree that it probably would have been hard to actually define a useful syntax.

Comment 35 Erik Wilde 2008-06-21 06:15:40 UTC

(In reply to comment #34)
> (In reply to comment #32)
> > To be clear and more precise, what I meant here is that I don't see that there
> > would be any value in the HTML5 draft saying anything more about existing *RFC
> > 3987* fragment IDs than what is said about them in RFC 3987 itself.
> 
> i am confused now. rfc 3987 just defines a syntax for uri/iri, nothing else.
> fragment identifier semantics are defined per media type.

i was not as clear here as i should have been: rfc 3987 just says there is a random string after the '#'. a media type then defines additional syntax constraints (an xml name for html4, the more complicated keyword/parenthesis-based syntax of xpointer, something like page=42 for PDF) and defines semantics. for rfc 3987, the fragment identifier is totally opaque, it is not a name or an @id or anything, it is just a string.
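
a small python sketch of that split, under heavy simplification: the generic uri layer only separates the string after the '#', and a per-media-type step decides what it means. interpret_fragment and its rules are illustrative assumptions, not definitions taken from any registration.

```python
from urllib.parse import urldefrag


def interpret_fragment(media_type, fragment):
    """Return a (kind, value) pair; the rules are simplified illustrations."""
    if media_type == "text/html":
        # html4 style: the fragment names an element
        return ("element-name", fragment)
    if media_type == "application/pdf" and fragment.startswith("page="):
        page = fragment[len("page="):]
        if page.isdigit():
            return ("page", int(page))
    # Without media type rules the fragment stays an opaque string.
    return ("opaque", fragment)


# The generic layer: everything after '#' is just a string.
base, fragment = urldefrag("http://example.org/doc#page=42")
print(base, fragment)

# The same string means different things for different media types.
print(interpret_fragment("application/pdf", fragment))
print(interpret_fragment("text/html", fragment))
```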

Comment 36 Julian Reschke 2008-06-21 11:08:26 UTC

(In reply to comment #33)
> ...
> there is no other spec for fragment identifiers. there is the old html4 spec,
> where it is an integral part of the spec. there is xpointer, which is
> unfinished and has been designed for xml. other than that, there is nothing
> that could be referenced. if it is decided that fragment identifiers should be
> improved in html5, it must be described in html5. if they should stay as in
> html4, this also must be said in html5 (or its media type registration).
> ...

Actually, there are, both in IETF land (for text/html and application/xhtml+xml), and in W3C land:

- RFC 2854 currently defines fragment identifiers for text/html, based on the
HTML 4.01 spec 

- RFC 3236 currently defines fragment identifiers for application/xhtml+xml,
based on RFC 3023 (XML media types)

- There's also NOTE-xhtml-media-types-20020801 which probably should be updated
when HTML5 is ready

(<http://www.w3.org/Bugs/Public/show_bug.cgi?id=5744#c22>)

The question is whether it's in scope for us to update these specs (and yes, I think it is).

Comment 37 Rob Burns 2008-06-21 17:52:26 UTC

(in reply to comment #32)
I'm not sure why this bug would be closed as a wontfix. There has been ample evidence provided here that we have a use case that needs addressing. Your reasoning here completely reverses our WG's priority of constituencies: placing the editor and the implementors above the needs of users and authors.

The only issues raised against this were misunderstandings about the scope of HTML5 (both its chartered scope and the current scope of the document). To me it would be better to take these sorts of issues to the WG as a whole to decide what to do (that's all I'll say on process here, but since you opened a process discussion on the bug, I thought someone should respond to it).

Comment 38 Michael[tm] Smith 2008-06-21 21:08:29 UTC

(In reply to comment #36)
> (In reply to comment #33)
> > ...
> > there is no other spec for fragment identifiers. there is the old html4 spec,
> Actually, there are, both in IETF land (for text/html and
> application/xhtml+xml), and in W3C land:
> 
> - RFC 2854 currently defines fragment identifiers for text/html, based on the
> HTML 4.01 spec 
> 
> - RFC 3236 currently defines fragment identifiers for application/xhtml+xml,
> based on RFC 3023 (XML media types)
> 
> - There's also NOTE-xhtml-media-types-20020801 which probably should be updated
> when HTML5 is ready
> 
> (<http://www.w3.org/Bugs/Public/show_bug.cgi?id=5744#c22>)
> 
> The question is whether it's in scope for us to update these specs (and yes, I
> think it is). 

OK, then that's something actionable. Julian, can you please raise an issue in the group Tracker for this?

Comment 39 Michael[tm] Smith 2008-06-21 21:38:59 UTC

(In reply to comment #38)
> (In reply to comment #36)
> > (In reply to comment #33)
> > > ...
> > > there is no other spec for fragment identifiers. there is the old html4 spec,
> > Actually, there are, both in IETF land (for text/html and
> > application/xhtml+xml), and in W3C land:
> > 
> > - RFC 2854 currently defines fragment identifiers for text/html, based on the
> > HTML 4.01 spec 
> > 
> > - RFC 3236 currently defines fragment identifiers for application/xhtml+xml,
> > based on RFC 3023 (XML media types)
> > 
> > - There's also NOTE-xhtml-media-types-20020801 which probably should be updated
> > when HTML5 is ready
> > 
> > (<http://www.w3.org/Bugs/Public/show_bug.cgi?id=5744#c22>)
> > 
> > The question is whether it's in scope for us to update these specs (and yes, I
> > think it is). 
> 
> OK, then that's something actionable. Julian, can you please raise an issue in
> the group Tracker for this?

To be precise, what I meant to ask was, could you please raise an issue in the group tracker stating, "Need to update RFC 2854, RFC 3236, and NOTE-xhtml-media-types-20020801 when HTML5 is ready".

Comment 40 Erik Wilde 2008-06-22 01:20:23 UTC

(In reply to comment #36)

sorry to be so nitpicking, but i think this is important, and thus want to make things as clear as possible. i think it is important because there is no example here where fragment identifiers were actually defined outside of the language spec; in html4 they are in the spec, and for xhtml and xml, technically there are no fragment identifiers defined at all.

> - RFC 2854 currently defines fragment identifiers for text/html, based on the
> HTML 4.01 spec 

rfc 2854 does not define fragment identifiers, it mentions them and points to the html4 spec, where they are defined:

"For documents labeled as text/html, the fragment identifier designates the correspondingly named element; any element may be named with the "id" attribute, and A, APPLET, FRAME, IFRAME, IMG and MAP elements may be named with a "name" attribute.  This is described in detail in [HTML40] section 12."

i don't think this is particularly well-written, but i think it shows that the actual definition is part of html4.

> - RFC 3236 currently defines fragment identifiers for application/xhtml+xml,
> based on RFC 3023 (XML media types)

the text in section 3 is a bit confusing. it points to rfc 2854, but does not say whether this means that this mechanism also applies to application/xhtml+xml, but probably not (that's how i read it). it then points to rfc 3023, mentioning xpointer.

btw, rfc 3023 does not define any fragment identifiers, it simply points (in a clearly non-normative way, i would say) to the (at that time) ongoing development of xpointer.

what this means to me is that for application/xhtml+xml resources, html fragment identifiers don't work (at least the ones pointing to a/@name; */@id would in theory be covered by xpointer, but that is not really specified in rfc 3023). is that intentional?

> - There's also NOTE-xhtml-media-types-20020801 which probably should be updated
> when HTML5 is ready

this one does not say anything about fragment identifiers at all.

it seems to me that the issue that technically, html4 fragment identifiers are not defined for the application/xhtml+xml media type, may just be an oversight.

Comment 41 Julian Reschke 2008-06-23 07:09:12 UTC

(In reply to comment #40)
> sorry to be so nitpicking, but i think this is important, and thus want to make
> things as clear as possible. i think it is important because there is no
> example here where fragment identifiers were actually defined outside of the
> language spec; in html4 they are in the spec, and for xhtml and xml,
> technically there are no fragment identifiers defined at all.

Understood, and thanks for the clarification.

Comment 42 Nick Levinson 2009-02-22 09:04:08 UTC

I think something like this is needed, and I think I found the beginning of a solution. I opened issue 6610 (http://www.w3.org/Bugs/Public/show_bug.cgi?id=6610). But if it gets shot down there, I'll probably drop it.

Thanks.

-- 
Nick

Comment 43 Ms2ger 2011-01-26 20:28:50 UTC

*** Bug 11864 has been marked as a duplicate of this bug. ***