See also: IRC log
<scribe> scribe: Ashok
<DKA> First draft of "product" page for privacy drafts: http://www.w3.org/2001/tag/products/PrivacyFriendlyWeb.html
<johnk> http://appliedlife.blogspot.com/2009/08/markup-languages-family-tree.html - I did this when I was reading the HTML5 spec last year
Noah: Norm is chairing the XML/HTML unification taskforce
... Issue-120 on HTML is on distributed extensibility
... there is also an issue on RDFa prefixes
Norm: We constituted the taskforce with a mixture of XML and HTML folks
<noah> Norm's blog entry on the state of play in the HTML/XML Unification subgroup: http://norman.walsh.name/2011/02/08/html-xml
Norm: started to figure out what the problem was ... didn't get very far
... then started on usecases
<noah> Use cases wiki: http://www.w3.org/wiki/HTML_XML_Use_Cases
Norm: We can discuss the usecases
LM: They are usecase categories
... you say XML Toolchain but there are many flavors of toolchains with different requirements
... I don't see roundtripping
Norm: Roundtripping was something we talked about but did not make it as a usecase
LM: Some may not think some of these use cases are important ... relating them to successful use may be very helpful
ht: Kai Scheppe from Deutsche Telekom AG talked about how XHTML had been very helpful
... discusses another commercial usecase
<masinter> I think our feedback is that going down to get more concrete examples would increase credibility
ht: Such commercial usecases would be useful
<masinter> HTML is not good for data scraping....
ht: Many colleagues scrape data and waste lots of time with HTML ... XHTML is much better for them
LM: (discusses use case details) -- for example analysis and extraction, looking for keywords, summarization
Noah: Norm, could you talk about the mindset of the group and where it is going
<masinter> different detailed use cases have different requirements...e.g., "scraping" might have performance requirements, while those of "processing" care about fidelity
<masinter> round-tripping has even higher requirement for fidelity beyond import + export
Noah: Says group members ready to leave
... if we refine usecases that may convince some people to stay and work on the issue
LM: We need to solicit additional requirements from more real (esp commercial) users
Norm: Roundtripping may be a new usecase
LM: usecase is starting with HTML, doing some XML processing and then emitting HTML
Tim: The common DOM does not work because you don't add new TBody elements
<Zakim> timbl, you wanted to wonder about scripts
LM: Using an XML toolchain to produce HTML -- new usecase
<noah> TBL: If the task force just nourishes and maintains the concept of polyglot, that would be very useful
Norm: The HTML folks were quick to reject the Polyglot spec as too brittle ... too strict about angle brackets etc.
<noah> Norm: Polyglot is perceived as fragile for the same reasons as any XML, I.e. too strict about perfect syntax
<masinter> (note "race to the bottom" from ht)
<noah> Noah: I don't buy that, because I think the #1 use case for polyglot is for people who are using XML tool chains or are happy to produce "perfect" syntax, but whose users require content served text/html...
<noah> ...so, they want a spec that tells them just what they can and can't put into that perfect syntax and have it work right when served text/html
ht: Argument is that producing polyglot is hard, so once someone starts using a single language everyone goes to that -- race to the bottom
<masinter> I think "use XML toolchain to produce HTML" is the most common use case in the industry, and that polyglot is likely the most appropriate direction for them
LM: Task force might recommend changes to HTML spec, e.g., options for API to the DOM ... e.g. not failing in some way
... or include some guidance about what not to use
ht: That's the polyglot document
LM: No, it can have unbalanced brackets but does not use some features
Norm: I think there is a single DOM
<noah> New use case wiki page (very rough): http://www.w3.org/wiki/HTML_XML_Use_Case_08
<masinter> document.write is a leading example
Tim: For many people the DOM is an API ... supports the same methods
Noah: XML and HTML processors working on the DOM
<Zakim> timbl, you wanted to talk about race to the top
ht: There is html in conversation and the html out conversation
Tim: It is easy to produce polyglot documents ... avoids document.write
... run it through tidy ... if you produce polyglot you get 2 sets of people using it ... html folks and xml folks
... so there will be a 'race to the top'
Noah: Sympathetic to polyglot
<masinter> polyglot is useful for use cases that weren't in the set of use cases written up
Noah: useful for simple cases ... what about using external libraries, etc.
... these may use document.write
... so does Polyglot apply in these cases
<masinter> "document.write" isn't the entire set of things that are "HTML specific DOM operations", but it's a good poster child for it
Norm: The vast majority of Web docs are produced using string concatenation and their authors don't want to run tidy
LM: People may be discounting Polyglot because they are not looking at right usecases
Noah: Added usecase 8
<masinter> the task force should be looking at creating a document that is acceptable to the W3C and web community... their local agreement is ok
Noah: Should we invest in improving the Polyglot document
Norm: I thought the Polyglot document went as far as it could
LM: There is a large community of people with toolchains who need to be satisfied
Norm: The taskforce will produce a report and that will be reviewed
... I was unable to persuade people to make technical changes
Noah: Talks about the taskforce and people's motivations
<masinter> A good faith participation in a task force would be to agree on a problem statement for the task force.
larry: What is the task?
Norm: It proved to be difficult to state the problem
... so people moved on to usecases
LM: Now that you have usecases, are you going to try to define the problem again?
Noah: The tone of the taskforce has been constructive
LM: My experience is that when you are at loggerheads, bring in more people
... bring in people who need the solution
Noah: Will the real users come to the taskforce and explain their usecases?
LM: Document in the report where there is not consensus and why
Norm: Usecase number 4 is the most bizarre
<masinter> the XML -> (XML/HTML polyglot ) -> XML or HTML tool chain
<masinter> and the use case of "scraping" as a kind of consuming
Noah: Some folks claim no changes are needed ... HTML is the answer and XML is not helpful
Norm: I think taskforce has gone as well as it could
... no usecase has convinced the HTML folks that they need to change
Peter: What changes are you thinking of
<noah> NW: Even the script hack can be useful.
<noah> TBL: What's the script hack?
<noah> NW: <script type="application/xml"> plus a shim that finds that stuff in the DOM and parses the XML
<noah> NW: The XQuery folks are actually doing this.
<noah> NW: On good days, you can almost imagine this is acceptable.
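A minimal sketch of the "script hack" Norm describes, transposed from a browser shim to Python for illustration. It assumes the XML island is carried in a `<script type="application/xml">` element, as in the minutes; the class name `XmlIslandFinder`, the helper `find_xml_islands`, and the sample page are hypothetical. A real shim would walk the browser DOM in JavaScript instead.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class XmlIslandFinder(HTMLParser):
    """Collect and parse the content of <script type="application/xml"> elements."""

    def __init__(self):
        super().__init__()
        self._in_island = False
        self._buffer = []
        self.islands = []  # parsed XML root elements, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/xml":
            self._in_island = True
            self._buffer = []

    def handle_data(self, data):
        # Script content is delivered as raw text, possibly in several chunks.
        if self._in_island:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_island:
            self._in_island = False
            # Hand the collected text to a conforming XML parser.
            self.islands.append(ET.fromstring("".join(self._buffer).strip()))


def find_xml_islands(html_text):
    finder = XmlIslandFinder()
    finder.feed(html_text)
    finder.close()
    return finder.islands


# Hypothetical page used only for illustration.
page = """<!DOCTYPE html>
<html><body>
<script type="application/xml">
<greeting lang="en">hello</greeting>
</script>
<p>Ordinary HTML content.</p>
</body></html>"""

islands = find_xml_islands(page)
print(islands[0].tag, islands[0].get("lang"), islands[0].text)
```

The HTML parser treats the script body as opaque text, so the XML survives untouched until the shim hands it to an XML parser. One limitation of this sketch: an island that itself contains the text `</script` would end the element early.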
Noah: For running XQuery in the browser
LM: The thing that will cause change is serious users
Norm: Now that many browsers ship with XHTML support you can just use XHTML
Noah: People have different perspectives ... worried about different users
<Zakim> ht, you wanted to make the XSLT-in-the-browser point
ht: I'm concerned that people say that the XML to HTML problem is the same as anything to HTML
<masinter> xml & xslt use case is important
ht: so why do we have XSLT in the browser
... Use script tag to put not HTML stuff in HTML
<masinter> XML as constituted part
<Zakim> timbl, you wanted to ask about FBML
ht: what is the real substantive value of XML as how data gets on the web
Tim: Asks about FBML ... adds tags to HTML
<johnk> FBML: http://developers.facebook.com/docs/reference/fbml/
Tim: Talk about lack of modularity in CSS
<masinter> many IETF specs use XML for interchange, and need presentation... would like to make sure those use cases are represented
Dan: Activity streams and other social network specs are XML-based
<masinter> XML + XSLT might be more important than XHTML?
Norm: XML has failed only in the client otherwise very useful and widely used
... some pressure to move to JSON
<DKA> Ostatus specification I mentioned: http://ostatus.org/sites/default/files/ostatus-1.0-draft-2-specification.html
<DKA> To be brought in as an input into http://www.w3.org/2005/Incubator/federatedsocialweb/
<masinter> (1) task force should agree to "change proposals" to HTML spec that encompass the proposed solutions as "best practice", perhaps by making reference to task force report.
<DKA> Leveraging (XML) activity streams spec: http://activitystrea.ms/
<masinter> (2) question about XML + XSLT vs. XHTML in priority
<Zakim> noah, you wanted to answer Henry
Noah: I don't think XSL will come and go because of the taskforce
... many apps would break if clients dropped support for it, at least in the immediate future.
<Zakim> masinter, you wanted to note that perspective, "best practice" recommendations are important
<noah> Noah: to be clearer, what I said is that XSLT won't go away in the browsers, and that's for the right reasons, I.e., it would break lots of existing deployed software if XSLT were removed.
<noah> Noah: maybe or maybe not there would be enough future value to motivate keeping it if there weren't such compatibility issues, but I believe it will stay if only for compatibility, at least for a while. Just my opinion.
LM: You will come up with best practices. These should be pointed to by the HTML spec
Norm: Do you think there is stuff in HTML spec that contradicts what the taskforce says? That would be interesting.
... and much tougher area
LM: Perhaps your charter should be: look at usecases and recommend best practices
Norm: I think I can get the taskforce to agree to that
<timbl> (Suppose you parse XML to a JS object, not a DOM ... how close is XML to JSON anyway? You have to decide whether element contents are going to be null, or a string, or a list (mixed content). Certainly the problem of mapping to RDF is a common problem, and a common mapping language would probably work.)
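The decision Tim points at (null, string, or list for element contents) can be sketched as below. This is one possible mapping, not a mapping proposed in the meeting; the function name `element_content` and the sample markup are hypothetical. The sketch drops attributes and trims whitespace, so it is lossy.

```python
import json
import xml.etree.ElementTree as ET


def element_content(elem):
    """Map an element's content to None, a string, or a list (mixed content)."""
    children = list(elem)
    text = (elem.text or "").strip()
    if not children:
        # Empty element -> None; text-only element -> string.
        return text or None
    # Element with children -> list interleaving text runs and child objects.
    parts = [text] if text else []
    for child in children:
        parts.append({child.tag: element_content(child)})
        tail = (child.tail or "").strip()
        if tail:
            parts.append(tail)
    return parts


doc = ET.fromstring("<p>Hello <b>big</b> world<br/></p>")
as_object = {doc.tag: element_content(doc)}
print(json.dumps(as_object))
```

The three return shapes are exactly the three cases in Tim's remark. A consumer of the resulting JSON has to test the type of every value, which is part of why XML and JSON are not as close as they first look.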
<ht> The XMLHttpRequest CR draft http://www.w3.org/TR/XMLHttpRequest/#document-response-entity-body does still 'privilege' XML, as parsed per the XML specs
Break for 20 minutes
Noah: Describes background of issue -- decentralized extensibility in HTML
... the HTML WG did a survey that was officially of the WG membership, but the TAG also sent a note.
<noah> HTML WG held a survey, TAG input at http://lists.w3.org/Archives/Public/www-tag/2010Oct/0033.html
<noah> HTML WG Chairs' decision: http://lists.w3.org/Archives/Public/public-html/2011Feb/0085.html
Noah: The chairs chose the "no changes" option
The note says they looked for evidence that decentralized extensibility was important and did not find enough
scribe: they will look at new evidence
<noah> The main decentralized extensibility issue is http://www.w3.org/html/wg/tracker/issues/41
<noah> There is also http://www.w3.org/html/wg/tracker/issues/120 on prefixing, especially for RDFa
scribe: they say use RDFa without prefix mechanism
Noah: Back to HTML WG issue 41
Working thru mail from HTML WG re. the decision
Noah: The TAG discussed all the proposals and decided to back the "like SVG" proposal
ht: It is a qualified version of the Microsoft proposal
Tim: Re. Uncontested Observations. We did not argue for removal of existing extensibility points
... existing extensibility points have serious architectural limitations
... <object> is horrible ... would not use this to add a new form of bold
LM: Users do often understand relation between prefixes and namespaces ... some may find this confusing
Dan: Maybe we should pick our battles with HTML WG
... put our energies into the taskforce
JohnK: Not useful to go thru the email point by point
... we want ability to add attributes with prefixes without any approval
Tim: Some people argue that if you add a namespace that is bad
... they don't have a model of special user communities of browser users
JohnK: Asks whether architectural arguments are not self-evident
<Zakim> masinter, you wanted to talk about process
Could we just list these arguments
LM: I see no point in TAG responding to HTML WG at this point
... we can advise the Director how to respond to the appeal
... better to let the HTML document get to Last Call
<johnk> johnk's specific potential architectural issues: "What we mean when we say distributed extensibility"
<johnk> arguments for:
<johnk> * that it should be possible for anyone to define their own markup
<johnk> extensions (and the syntactic/semantic "meaning" of said extensions)
<johnk> without permission from anyone else
<johnk> * that we should encourage these extensions to be publicly (not
<johnk> "proprietarily") available without the permission of the HTML WG
<johnk> counter-argument: encourages proprietary extensions to HTML?
<Zakim> noah, you wanted to talk about possible response
LM: It is in their charter "encouraged to find extensibility mechanisms"
<masinter> "The HTML WG is encouraged to provide a mechanism to permit independently developed vocabularies such as Internationalization Tag Set (ITS), Ruby, and RDFa to be mixed into HTML documents. Whether this occurs through the extensibility mechanism of XML, whether it is also allowed in the classic HTML serialization, and whether it uses the DTD and Schema modularization techniques, is for the HTML WG to determine."
Noah: Worth looking at how much decentralized extensibility is already in the spec...
Noah: I think it allows decentralized extensibility
... what it does not have is a mechanism to avoid name collisions.
... If I come up with a new element I cannot put it in a namespace if I'm using the text/html serialization. I can write a document that is an "applicable specification" describing the element; if used, the element will appear in the DOM.
Noah: So, you do have distributed extensibility ... what you don't have is a mechanism for preventing collisions
<Zakim> ht, you wanted to support the pick our fights proposition
Noah: So, if we can agree that's the state of play, then we can decide whether we wish to raise any additional concerns.
<masinter> and also http://tools.ietf.org/html/bcp125
ht: I agree with Dan and Larry in saying that there is no point in pursuing the opportunity for pushback that is in this note
<noah> Noah: you also are, and I can see the arguments on both sides of this, losing the ability to "follow your nose" to find the pertinent specs when some random document is encountered, and that document uses applicable specs. You can't in general find the specs from the document.
<noah> Noah: with namespaces, whatever their other problems, you can.
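A small sketch of the two properties Noah attributes to namespaces: collision avoidance, and a URI in the document that a reader can follow toward the relevant spec. It uses XML namespaces, which are available in the XML serialization but not in text/html. The namespace URIs, element names, and the helper `namespace_of` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs minted by two independent communities.
BOOKS_NS = "http://example.org/ns/books"
PEOPLE_NS = "http://example.com/ns/people"

doc = ET.fromstring(
    f'<record xmlns:b="{BOOKS_NS}" xmlns:p="{PEOPLE_NS}">'
    "<b:title>A Book</b:title>"
    "<p:title>Editor</p:title>"
    "</record>"
)

# Both extensions chose the local name "title", but the expanded names differ.
tags = [child.tag for child in doc]
book_title = doc.find(f"{{{BOOKS_NS}}}title").text
person_title = doc.find(f"{{{PEOPLE_NS}}}title").text


def namespace_of(tag):
    """Return the namespace URI of an ElementTree tag, or None if unqualified."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else None


# The namespace URI is recoverable from the document itself ("follow your nose").
print([namespace_of(tag) for tag in tags])
```

An unprefixed `title` element gives `namespace_of` nothing to return, which corresponds to the "applicable specification" situation described above: the element appears in the DOM, but the document does not say which spec defines it.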
<Zakim> masinter, you wanted to talk about http://tools.ietf.org/html/draft-iab-extension-recs-05
LM: We could respond to IETF document on extensibility ... brings in a broader perspective
<noah> Hmm, Larry says HTML is a protocol "sort of". Well, yes sort of, but I'm more familiar with the "protocols & formats formulation". HTML is more a format, and I don't think the versioning considerations for formats are in general the same as for protocols.
LM: we could look at their arguments and see if they apply to HTML
... some new evidence to bear on the process
... Another related document
Noah: Looks like 4775 is recommending registries
Discussion about registries
scribe: and whether they help or hinder distributed extensibility
<noah> From: http://tools.ietf.org/html/rfc4775
<noah> " An extension is often likely to make use of additional values added
<noah> to an existing IANA registry (in many cases, simply by adding a new
<noah> "TLV" (type-length-value) field). It is essential that such new
<noah> values are properly registered by the applicable procedures,"
<masinter> the power struggle is part of it "who has control"
<masinter> but the power struggle is confounded by the technical issues
Discussion of how extensibility really works
LM: HTML decision narrow ... there were no acceptable proposals
Tim: We are trying to provide a solution for the little guy ... URLs are easy to mint
<Zakim> noah, you wanted to do a logistics & time check
<trackbot> ACTION-120 -- Dan Connolly to review of "Usage Patterns For Client-Side URL parameters" , preferably this week -- due 2008-03-20 -- CLOSED
Tim: create little community of browser users
<Zakim> ht, you wanted to mention the Accessibility parallel
<trackbot> ISSUE-120 does not exist
ht: Since HTML WG have resolved Issue 41 this can wait
... you can send mail asking if we can wait on 120
<ht> In terms of thinking about advising the Director as we come up to a Process milestone at which objections wrt DistrExtens may be on the agenda, Tim's point about standing up for the little guy reminded me of a possible parallel with I18N and Accessibility -- Director's Review is the point at which unrepresented constituencies are considered
<ht> Candidate small languages for use in distr. exten. : XForms, XMP, FBML (Facebook Markup Language, now deprecated), CML (Chemical Markup Language), [Music?]
<Norm> There is a music markup language, Michael Kay brought it up as an example
<ht> I think the plugin support is already there
<masinter> scribe: masinter
<scribe> scribenick: masinter
ht: What is the goal of this activity?
noah: goal is to help this task force be successful
norm: want to go through use case in more detail
... if there are specific use cases that aren't satisfied, especially interesting
ht: how many such parsers are there?
norm: I believe there are 2 or 3. Henri in Java, Sam in Ruby, someone else....
ht: when I looked a few months ago, there was no tool that did what I needed, which was 'error recovery'
... this "Solution" is at least misleading. "Truth in advertising"
larry: Henry said he found NONE. If there is NONE, it might mean that it is impossible. A solution that requires something 'impossible' isn't a solution.
noah: if parsers are needed, then ones that are needed will get built.
johnk: there isn't enough need for stand-alone parsers for them to have been extracted from browsers.
tim: I rewrote problem statement, and edited it into the "Discussion" tag
<johnk> johnk: it hasn't yet been determined that there is enough need for a standalone HTML5 parser such that there is a clear need to separate it from other software (such as browser)
tim: I took out some of the derogatory comments that were garbage ("race to the top" vs. "race to the bottom")
... I would like a ringing endorsement of polyglot to come out of this task force.
norm: that isn't polyglot... the mapping of HTML into XML because there is an XML document that has the same DOM as the HTML
tim: the requirement to accept polyglot on the priority
larry: there are really at least three sub-categories here (HTML -> XML tool chain)
... (1) extract, analyze (2) round-trip (3) ...
norm: Use case 2: (looking at http://www.w3.org/wiki/Talk:HTML_XML_Use_Case_02)
tim: you need to put something in the examples to make it clear that this is not "XHTML" but XML in general, e.g., docbook
norm: not sure that this is a real use case, not a lot of enthusiasm for this
(looking at http://www.w3.org/wiki/Talk:HTML_XML_Use_Case_03 now)
larry: in #2, separate 'browser' from 'non-browser'
Examples are things like documentation
larry: copy/paste and clipboard thing is a separate use case
tim: I'm impressed that copy/paste from web to email works
... table from web page into mail message and it works
norm: I expect there are techniques that will let that work
... oxygen does a whole bunch of work to make that work
tim: thinking about the RDF case... you get a piece of HTML in the middle of RDF so that works
... if you do any form of escaping, in general there is no expectation that if you put some escaped CDATA in the XML that it has any meaning, and no expectation... this happens in RSS
norm: of the two, the escaped text is far less effective
... I noticed in the Twitter API that the identity of the submitter is escaped HTML
tim: Microsoft's odata ("almost linked data") when you get a feed it's an RSSFeed
ht: ((missed example))
<Norm> In Atom, HTML markup is sometimes escaped and sometimes not, using a type attribute to distinguish between them.
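The Atom distinction Norm mentions can be sketched as follows: with `type="html"` the markup travels as escaped text, with `type="xhtml"` it travels as real XML inside an XHTML `div`. The sample feed and the helper `describe` are hypothetical; only the `type` values `text`, `html`, and `xhtml` come from the Atom format.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
XHTML = "http://www.w3.org/1999/xhtml"

# Hypothetical feed: the same paragraph carried three different ways.
feed = ET.fromstring(f"""
<feed xmlns="{ATOM}">
  <entry><content type="html">&lt;p&gt;Hello &lt;b&gt;world&lt;/b&gt;&lt;/p&gt;</content></entry>
  <entry><content type="xhtml"><div xmlns="{XHTML}"><p>Hello <b>world</b></p></div></content></entry>
  <entry><content type="text">Hello world</content></entry>
</feed>
""".strip())


def describe(content):
    """Return (type, payload): a string for text/html, element names for xhtml."""
    kind = content.get("type", "text")
    if kind == "xhtml":
        # Inline markup is part of the XML tree and can be walked directly.
        div = content.find(f"{{{XHTML}}}div")
        return kind, [el.tag.split("}")[1] for el in div.iter() if el is not div]
    # For "html" the XML parser only undoes one level of escaping; the result
    # is still a string that needs a separate HTML parser to become a tree.
    return kind, content.text


contents = feed.findall(f"{{{ATOM}}}entry/{{{ATOM}}}content")
results = [describe(c) for c in contents]
for kind, payload in results:
    print(kind, payload)
```

The `type` attribute is what lets a consumer tell the cases apart. Without it, a string such as `<p>Hello</p>` is ambiguous between literal text and markup, which is the weakness of escaping discussed above.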
<ht> Is it expected that this will work: <object type="application/xml" data="data:,<hello xml:lang='en'>world</hello>" /> ?
noah: couldn't introduce a new tag other than 'script'
henry: in polyglot, need CDATA in script, if you need polyglot and use <> in script
or use data:application/xml,<hello ....
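A minimal sketch of why Henry says polyglot needs CDATA when a script contains `<`. An XML parser rejects the bare script, while the CDATA-wrapped one is well formed. The `//` comment markers around the CDATA delimiters are a common authoring idiom assumed here, not something stated in the minutes, and the script bodies are hypothetical.

```python
import xml.etree.ElementTree as ET

# Inline script using "<" as an operator, as it would be written for text/html.
bare = "<script>if (a < b) { run(); }</script>"

# The same script with a CDATA section, hidden from JavaScript by // comments.
wrapped = "<script>//<![CDATA[\nif (a < b) { run(); }\n//]]></script>"


def is_well_formed(markup):
    """Return True if a conforming XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(bare), is_well_formed(wrapped))
script_text = ET.fromstring(wrapped).text
```

After XML parsing, `script_text` still contains the `a < b` comparison, with the `//` markers left behind as harmless JavaScript comments.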
now looking at http://www.w3.org/wiki/Talk:HTML_XML_Use_Case_04
(discussion of XML5 document)
norm: XML community could take this up....
noah: discussion of robustness principle
... you should have the same burden to be conservative in what you said
dana: observation: people use string concatenation to produce HTML because to do otherwise wouldn't be satisfactory for performance reason... that's the implicit reason, and they are prone to error
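A small sketch of the error-proneness Dana refers to: naive concatenation passes markup-significant characters straight through, while escaping the text or building a tree does not. All function names and the sample text are hypothetical, and the sketch makes no claim about relative performance.

```python
import html
import xml.etree.ElementTree as ET


def concat_paragraph(text):
    """Naive string concatenation: the caller's text is inserted unescaped."""
    return "<p>" + text + "</p>"


def escaped_paragraph(text):
    """Still concatenation, but the text is escaped first."""
    return "<p>" + html.escape(text) + "</p>"


def tree_paragraph(text):
    """Build an element and let the serializer do the escaping."""
    p = ET.Element("p")
    p.text = text
    return ET.tostring(p, encoding="unicode")


def is_well_formed(markup):
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


user_text = "Tom & Jerry <3"
for build in (concat_paragraph, escaped_paragraph, tree_paragraph):
    print(build.__name__, is_well_formed(build(user_text)))
```

Only the first builder produces output that an XML toolchain would reject. An HTML parser would recover from it, which is one reason such output goes unnoticed.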
tim: related use case: jQuery. jQuery allows you to parse .navigate + something that looks like xquery (it isn't xquery but looks like it, or css selectors) + insert things (looks like HTML), there is no reason that it actually could use implicit tags on close tags, they could do all kinds of things, the critical thing is to get the code to all fit on one line or one page
... in cases where people are stuffing strings in... for things that stuff in little bits of syntax (Turtle example), in those cases, it is a nice situation where xml tools could give people an ability in their scripting
(have been looking at http://www.w3.org/wiki/HTML_XML_Use_Case_05)
now looking at http://www.w3.org/wiki/HTML_XML_Use_Case_06
"dead use case", a lot like use case 1
no one was prepared to stand up to do this
larry: separation between situations where things render, vs. things are auxiliary data
noah: what some subgroups don't like is "stop on first error"
... the XML Recommendation doesn't provide any interpretation or mappings for such documents, other than to say that they are indeed not well formed. One could imagine revisions to the XML Rec., or other specs, that would provide such interpretations, and that would support error recovery. I'm told that XML5 is an attempt in that direction.
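The contrast Noah describes can be sketched with the Python standard library: an XML parser reports the first well-formedness error and yields nothing, while an HTML tokenizer keeps going. The sample markup and the names `TagCollector` and `xml_error` are hypothetical, and `html.parser` is only a lenient tokenizer, not an implementation of the HTML5 parsing algorithm.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical input: unquoted attribute value, unclosed <br> and <p>.
sloppy = "<p class=note>unquoted attribute<br>and an unclosed paragraph"


class TagCollector(HTMLParser):
    """Record every start tag the lenient HTML tokenizer reports."""

    def __init__(self):
        super().__init__()
        self.start_tags = []

    def handle_starttag(self, tag, attrs):
        self.start_tags.append((tag, dict(attrs)))


def xml_error(markup):
    """Return the XML parser's error message, or None if the input is well formed."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as err:
        return str(err)
    return None


collector = TagCollector()
collector.feed(sloppy)
collector.close()

print("HTML tokenizer saw:", collector.start_tags)
print("XML parser said:", xml_error(sloppy))
```

The XML side gives the caller an error and no tree at all, which is the "stop on first error" behavior some subgroups object to. Any recovery would have to be specified outside the current XML Recommendation.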
larry: this is a kind of social engineering through spec writing that is difficult to accomplish without consensus on the goal and agreement to abide by it. Social engineering is to get senders to be conservative in what they send by having some conservative receivers that they are likely to test against.
larry: have to get agreement to do social engineering in the first place, and that the goal of having conservative senders is an important goal
noah: is it really doing the fixup you want or not?
... have the specs enable you to turn off when you want to
... how often or with how much noise or smoke would be a debate you'd have to have
... Keep in mind that a main goal for XML was for exchanging mission critical data — for that, silent recovery is not the right approach.
ht: in the first two years, the idea that we were building XML for machine-to-machine communication was not at the forefront. It was about getting information in front of humans, and the 'error handling' was there because the arms race of forgiving viewers was harmful
... the motivation was to end the "arms race" of fixup by saying "no one will do fixup"
... that's opposite of what we're doing now, which is to say "everyone will do the same kind of fixup"
noah: could go to the community to see if there are some XML fixups that would be useful
ashok: ask the user, flag it, how aggressive a fixup, mash HTML5 fixup
peter: I have no problem with relaxing some of the rules of XML, but I wouldn't like to go all the way of tag fixup, such as happens in HTML. Leave XHTML being an XML application with all of the XML rules.
... all you're doing is allowing people to write bad XML
noah: will more people use this if we do this?
tim: too much of a pain typing the quotes around the attributes... for some of those things where there is absolutely no ambiguity, perhaps we could relax the rules.
noah: we should go only as far as necessary to get widespread adoption, vs. abandonment.
larry: 7 isn't really a use case, it's a proposed solution looking for use cases. my claim is that the proposal doesn't actually seem to solve any known problem
looking at http://www.w3.org/wiki/HTML_XML_Use_Case_08
norm: this wasn't there earlier, should have been, because task force talked about it. "Right" answer is that XML tools should grow an HTML output method
(Larry points out again that 'round trip' is more than 'consume and produce' because round trip may have more requirements for preservation )
<DKA> Scribe: Dan
<DKA> ScribeNick: DKA
Norm: You're not likely to be using CDATA in script elements.
... it doesn't work if you use script elements...
Henry: A normal xml serializer would never use cdata sections...
... In all the use that many of us make of xmlspec dtd - you must use output-mode=html - because this produces <p></p> when you have empty paragraphs. Because if you produce <p/> this [messes up most browsers.]
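The difference Henry describes between XML and HTML output methods can be sketched with Python's ElementTree serializer, used here as an analogue of an XSLT processor's output method rather than as the xmlspec toolchain itself. The sample tree is hypothetical.

```python
import xml.etree.ElementTree as ET

body = ET.Element("body")
ET.SubElement(body, "p")   # an empty paragraph
ET.SubElement(body, "br")  # a void element in HTML

# XML output: empty elements are written in the self-closing short form.
as_xml = ET.tostring(body, encoding="unicode", method="xml")

# HTML output: the empty paragraph gets an explicit end tag, the void
# element gets none.
as_html = ET.tostring(body, encoding="unicode", method="html")

print(as_xml)
print(as_html)
```

The same tree yields two different serializations, and only the HTML one is safe to serve as text/html, where a self-closed `p` is read as an unclosed start tag.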
<scribe> Scribe: masinter
<scribe> ScribeNick: masinter
noah: Norm, have you gotten useful feedback from us?
norm: I got useful feedback. I'll go back into the minutes, lots of cases for making use cases more detailed. No one has said I've gone off in entirely the wrong direction....
... the trajectory the task force is going to land, I have no idea what to do next....
<noah> LM: I think our role here is to figure out what the TAG should do given where the taskforce stands.
<noah> LM: I think part of our role is to help those who have a stake in XML to be more easily heard in this process. A lot don't feel they've been heard. These use cases are the vehicle.
<noah> LM: I can see that doing more can be frustrating, but I believe that someone has to do a lot more.
<noah> NW: I'm not at all unwilling to do more work, I do keep asking >what< you want me to do.
<noah> LM: I would ask Roy... (discussion tails off)
<noah> LM: Roy has an XML toolchain, and his review might be interesting.
<noah> NW: I'll break out the use cases and try to figure good candidates to provide feedback on each.
<noah> NM: You could somewhat publicly ask people for review.
<noah> NW: Prefer to do it after the report's a bit cleaner -- I don't want to be responsible for people misunderstanding the wiki in its current form
(discussion of process)
dka: in spirit of providing feedback, worth saying "kudos for doing this", amazing you've managed to make the progress you have
<noah> DKA: Major kudos to Norm for doing what is in many ways a thankless job. There's a lot of good progress here. I support publishing as a TAG note or something like that, once baked.
dka: Not only a browser group, to consider 'what changes should be considered for XML as well', we need to really believe that, to think about how this stuff could be put into place
norm: James did MicroXML and John Cowan has picked this up and is producing this group. Liam did agree to put something in XML Core; they may add something into their charter revision about this.
... XML5 is an attempt to say how XML as it exists might work better, while MicroXML might be 'how to make XML smaller'; things like "namespaces aren't special"
... maybe James was thinking there might be some movement from the HTML side.
noah: how relevant will this be practically?
norm: microXML might be interesting, would like to know more what problems it solves
<Zakim> ht, you wanted to say something more about templating
ht: in terms of looking for concrete use cases, the phrase "templating" does describe some tooling that I've observed ... (XForms is a partial example of this), a successive refinement approach to producing web pages.
... there are some architectures out there that work that way... it's a mixture of HTML and proprietary markup, that push it through (not a pipeline, an iterate-to-fixed-point processing step) until it gets to the point where there is nothing left but XHTML....
there is a requirement that HTML5 make it not any harder to produce (polyglot) HTML output that way than it is today
there are a lot of systems that now support IE6....
ht: maybe it is already the case that polyglot HTML5 is not harder than producing XHTML 1.1 polyglot
<ht> One example of this is the Factonomy (www.factonomy.com) Framework
<jar> on break now.
<trackbot> ISSUE-63 -- Metadata Architecture for the Web -- open
<trackbot> ACTION-282 -- Jonathan Rees to draft a finding on metadata architecture. -- due 2011-04-01 -- OPEN
jar: slide 6.... not getting consensus
... RDFa, tooling might be different, all the deployed stuff will be called into question
... slide 7 interoperability issue: same name used for two different things
... another example, 'wants'
ht: Facebook's 'likes'... one person likes the page, one person likes the screwdriver
jar: creative commons 'licenses' is clearly a problem, 'likes' or 'wants' are less
... slide 9.... new uri scheme, foaf...
... slide 9 second line shows 6 alternatives for notation
<ht> (Discussion about RDF about="" and the status of Same Document Reference)
<noah> Hmm, from http://www.apps.ietf.org/rfc/rfc3986.html#sec-4.4
<noah> "When a same-document reference is dereferenced for a retrieval action, the target of that reference is defined to be within the same entity (representation, document, or message) as the reference; therefore, a dereference should not result in a new retrieval action. "
<noah> That doesn't quite say: "The null reference identifies the same resource as the URI used to retrieve the document." Sort of an odd construction. Why? Does this matter?
JAR: I think the best way to get consensus around this is to take it to REC track.... is this a task force thing? is it an objective?
<ht> Because not all s-d-rs are null references
tim: this broke out on the linked open data list
<noah> I'm not hung up on the null part, I'm hung up on the "target is defined to be within"
<noah> That doesn't say what the URI(s) identify.
<ht> Right -- the 'within' is there because the target of "#foo" is not the target of the base URI
<noah> Yes, but it doesn't mention the resource, it mentions the representation, which is very odd.
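A minimal sketch of the same-document reference test from RFC 3986 section 4.4: a reference qualifies if, once resolved, it is identical to the base URI aside from any fragment. The function name `is_same_document` and the example.org URIs are hypothetical, and the comparison is a plain string comparison with no URI normalization.

```python
from urllib.parse import urldefrag, urljoin


def is_same_document(reference, base):
    """True if the resolved reference equals the base URI, ignoring fragments."""
    target = urljoin(base, reference)
    return urldefrag(target).url == urldefrag(base).url


base = "http://example.org/doc"

checks = {
    "": is_same_document("", base),          # the null reference
    "#foo": is_same_document("#foo", base),  # fragment-only reference
    "other": is_same_document("other", base),
    "http://example.org/doc#bar": is_same_document("http://example.org/doc#bar", base),
}
print(checks)
```

The null reference and `#foo` both pass the test, which matches ht's point that not all same-document references are null references. The sketch says nothing about what resource either one identifies, which is the question under discussion.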
tim: linked open data list has many people who have joined recently. Looking at that, there was some real pain expressed ... when you are producing linked data for a bunch of abstract things, it's a pain to have to do 303 all the time, and using hash wasn't satisfactory
... two things to do, "Hash is beautiful", or "add a 208"
<noah> We don't usually say that a URI identifies something within the representation, except in very unusual edge cases.
<noah> (We do in particular cases where the media type spec says it does.)
<ht> Yes, that reference/resource distinction is not well-respected here
jar: the TAG should engage on the linked open data list, or invite them to discuss it on the TAG list
<Norm> Hashes are problematic if the number of items in the document is very large.
<ht> Let's look at HTTP-bis
<noah> But if it's not well respected, then what does the above mean?
<noah> More to the point, does it matter that we straighten this out in the context of the discussion that JAR is leading?
<ht> I don't think
<noah> Hmm. OK.
jar: is the tag willing to engage in good faith process intended to get editor's draft
<ht> This is the answer, noah: "When a same-document reference is dereferenced for a retrieval action"
<ht> retrieval actions _are_ about representations
ashok: there are other stakeholders
... I would like "those guys" part of the discussion
noah: I think Jonathan means "Recommendation"
<ht> I agree that "is within" is bad -- it should have used wording that said "is related to in the same way that a full use of the baseURI plus #... if any is related"
<noah> JAR: right Noah, I'm proposing a formal W3C Recommendation produced using the full W3C process
noah: we had agreed to push this forward as a Rec, and then dropped the ball?
(scribe uncertain what the topic is)
ht: we have precedent for issuing documents on the rec track. We should do that with the content Jonathan is presenting to us.
tim: question is, are there alternatives for solving the problem?
jar: there are three alternatives: engage on LOD, do an architectural rec, form a new working group
<noah> ACTION: Noah to figure out where we stand with http://www.w3.org/TR/2006/WD-namespaceState-20060329/ on the rec track [recorded in http://www.w3.org/2011/02/09-tagmem-irc]
<trackbot> Created ACTION-521 - Figure out where we stand with http://www.w3.org/TR/2006/WD-namespaceState-20060329/ on the rec track [on Noah Mendelsohn - due 2011-02-16].
<noah> ACTION-521 Due 2011-03-01
<trackbot> ACTION-521 Figure out where we stand with http://www.w3.org/TR/2006/WD-namespaceState-20060329/ on the rec track due date now 2011-03-01
<noah> HT: We should do an architectural rec.
larry: if the topic is as broad as JAR's presentation, i would favor a new working group
tim: the TAG could do a focused 'nut' of the core element of httpRange-14
noah: the right thing to do would be to set off on the road of doing that in the tag
... if this is worth the effort at all, set off down the road to engage the right community, have to watch IP issues
... that's the place where they or we would go on
ashok: should this be a separate mailing list?
noah: at some point we should put out an announcement, hey we're working on this
<noah> Noah: Jonathan, are you willing to actually play the leadership role in taking this down a REC track.
<noah> JAR: Yes, if the group is willing to provide reviews, or at least stay out of the way.
JAR is showing draft which might become a rec
larry: I would be more comfortable with a working group with a charter around metadata architecture, partly because I know people I would like to get to participate, who would not follow a www-tag discussion
tim: (re jar slide 15) WebArch covers this
jar: someone else holding Nadia responsible for someone else using Dirk's URI referentially
... slide 16, (why these questions are useless)
... slide 17: segue to persistence
<trackbot> ACTION-201 -- Jonathan Rees to report on status of AWWSW discussions -- due 2011-01-25 -- PENDINGREVIEW
<trackbot> ACTION-201 -- Jonathan Rees to report on status of AWWSW discussions -- due 2011-01-25 -- OPEN
<noah> ACTION-201 Due 2011-03-07
<trackbot> ACTION-201 Report on status of AWWSW discussions due date now 2011-03-07
<trackbot> ACTION-478 -- Jonathan Rees to prepare a first draft of a finding on persistence of references, to be based on decision tree from Oct. F2F Due: 2010-01-31 -- due 2011-01-31 -- PENDINGREVIEW
jar: if you take the problem as a reference to a document, that reliably refers to some document, and you want it to work 100 years into the future....
... ... and you want that computational agent to be able to resolve it
ht: ... and the tree was an analysis of the failures?
jar: several functions: publisher producing the document; one who assigns identifier; one who archives the document for a long time; one who looks up a reference
... the 19th century view is that the description is written out in natural language (publisher, title, author, date), but "not machine friendly"
... if they're actionable, then someone can track these down
ht: the reliability of the citeseer parser for database is 70%
... datapoint... that's just correctly identifying what the parts are
<timbl> ... just parsing a reference
jar: Hybrid approach... is the hybrid approach good enough?
<Ashok> LM: don't like the term 'human-friendly' here
larry: (2) Hybrid is between (1) and all the rest
LM: "Not a URI" means a structured reference
... note there was early IETF work on "URC" which was attribute/value pairs for identifying
jar: if you write a URI, you have to have some faith that the scheme registrations are reliable
larry: date + URI (not embedded in a duri)
jar: (going through steps)
... "update all web clients" is a miracle
tim: you could install plugins in your client
lm: "not actionable" is "not actionable today"
tim: people will provide ways of resolving
ht: i own a couple of the domain names necessary for 'info' to be dereferenced
larry: note there were urn resolution protocols
jar: lsid was another example, it was never maintained
larry: xmp.iid and xmp.did in http://www.adobe.com/content/dam/Adobe/en/devnet/xmp/pdfs/DynamicMediaXMPPartnerGuide.pdf#page=19
jar: whether the http: scheme as specified is suitable for this purpose
... in the case where persistence matters, you can trust the domain owner
larry points to http://larry.masinter.net/9909-twist.pdf
<noah> Jonathan is discussing: http://www.w3.org/2001/tag/2011/02/intervention.html
jar: was on the phone two weeks ago with Dan Connolly on "ownership"
larry: Jefferson's Moose book has an interesting history about top level domain ownership
... see http://jeffersonsmoose.org/
noah: (discussion about security, DNS cache poisoning, etc.)
larry: you've identified several different roles, and each node in the tree needs to be evaluated around impact to those roles... may need to also add 'bad guys' and other players
<ht> Re the earlier aside about info:, when I explored this and its proposed (partial?) resolution mechanism, I discovered a) a dependence on certain sub-domains of the info TLD and b) the fact that several of these were either un-'owned' or in non-appropriate hands. Since then I have 'owned' lccn.info and oclcnum.info, having unsuccessfully tried to get Stu Weibel to take them on
<ht> My registration of them expires again in a few months. . .
jar: what matters is the person who writes a URI, and the person who wants to read the document, and everything else is infrastructure
larry: archivist is necessary and sufficient.... that is, if there are no archives, having long-term identifiers isn't very useful; if there is an archive, then whatever they are doing can be used for long-term identifiers
ht: this might turn into a requirement for infrastructure
jar: hypothesis: it would make a difference to get the DNS root manager to admit that some part of the DNS space had some kind of persistence characteristic, or was contractually held to one
tim: one way to abandon DNS is to set up an alternative root
jar: then you have to convince the entire world to use that alternate root. There is no communication between Alice and Bob to indicate that they use that alternative root, unless you use another URI scheme
tim: if it's just insurance, you could make a file, and distributed by bittorrent...
jar: what if ICANN agrees that '.arc' is agreed to be (something)
... what else do i need to add to this story for the next draft
ht: I need to take the old document to see if the risks it identifies and the goals are all covered here
jar: there are lots of ways of bailing out of this?
ht: information science communities have different attitudes to doi
tim: what's interesting, what you want is security in the long term, having more than one solution in parallel is interesting
jar: i imagine some kind of metadata lo
<ht> LM: Put a GUID in the document, and let search be the retrieval mechanism
<ht> JAR: Vulnerable to spoofing
<ht> HST: Use a checksum
<ht> LM: Right, use MD5 as the GUID
<ht> HST: What does the URI look like
lm: every administrative system ends
jar: the binomial system has had, in 250 years, only 10 disputes
(discussion of conflicts over defining documents for species)
noah: (banking systems -- there's a method of correcting anything that is wrong)
jar: my point is that there are systems that are relatively free of authority, that are outside of any system of authority
<ht> I note that the pblm with using a checksum is that it violates a fundamental principle of archiving, which is to keep your content usable by rolling it forward
<ht> In the old days, that meant from paper to microfilm to microfiche
<ht> now it means electronic format evolution
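(Editorial sketch, not part of the discussion: a small illustration of the checksum-as-identifier idea and of ht's objection to it. The "urn:md5:" form is purely hypothetical, since the minutes do not record an answer to "What does the URI look like", and the sample text is invented. The sketch shows that the same bytes always give the same identifier, while rolling the same content forward to a new encoding gives a different one.)

```python
import hashlib


def checksum_id(content: bytes) -> str:
    """Build a content-derived identifier from the MD5 digest of the bytes.

    The 'urn:md5:' prefix is a hypothetical form chosen for this sketch.
    """
    return "urn:md5:" + hashlib.md5(content).hexdigest()


text = "Café minutes, 9 February 2011"  # invented sample content

# The archived original, stored in an older encoding.
original = text.encode("latin-1")
# The same text after being rolled forward to a newer encoding.
migrated = text.encode("utf-8")

id_original = checksum_id(original)
id_migrated = checksum_id(migrated)

print(id_original)
print(id_migrated)
print("identifier survives migration:", id_original == id_migrated)
```

Because the identifier is tied to the exact bytes, an archive that migrates formats would need some additional record linking the old identifier to the new representation.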
<trackbot> ACTION-478 -- Jonathan Rees to prepare a first draft of a finding on persistence of references, to be based on decision tree from Oct. F2F Due: 2010-01-31 -- due 2011-01-31 -- PENDINGREVIEW
<jar> masinter said " I don't think a system can be simultaneously X, Y, and scalable"
lm: administrative, scalable, and stable
... the bigger it is, the more likely it is it will fail sooner
<trackbot> ACTION-478 -- Jonathan Rees to prepare a second draft of a finding on persistence of references, to be based on decision tree from Oct. F2F Due: 2010-01-31 -- due 2011-01-31 -- OPEN
<jar> masinter, you have just restated zooko's triangle http://en.wikipedia.org/wiki/Zooko's_triangle
<lm> jar, no, zooko's triangle is 'secure, memorable, global' and that's a different set of things
<lm> jar, mine is: "requires administration" and "scalable" => "not reliable"
<jar> bitcoin might show a way to escape it (I'm told... need to research this)
<noah> ACTION-478 Due 2011-03-22
<trackbot> ACTION-478 Prepare a second draft of a finding on persistence of references, to be based on decision tree from Oct. F2F Due: 2010-01-31 due date now 2011-03-22
<trackbot> ACTION-477 -- Henry S. Thompson to organize meeting on persistence of domains -- due 2011-03-15 -- OPEN
<noah> HT: Leave it, still working on it.
Noah: RESOLUTION: The June F2F will be in Cambridge 6-8 June 2011
Noah: So, the meeting that was to have been 5-7 June 2011 is now scheduled 1 day earlier.