W3C Technical Architecture Group Face to Face Meeting - 1st June 2007 (Friday am) -- 1 Jun 2007

<edit> ScribeNick: Rhys

Agenda

SW: Describes agenda

NM: Half an hour for Virtual worlds discussion

SW: Asks about versioning

Self Describing Web

NM: Asks that people read the draft now

<edit> ScribeNick: _Rhys

<Noah> We are going to discuss draft on Self-Describing Web at http://www.w3.org/2001/tag/doc/selfDescribingDocuments.html

<Noah> For the record, the dated URI for this is: http://www.w3.org/2001/tag/doc/selfDescribingDocuments-2007-05-24.html

<DanC_lap> superfluous comma. s/information,/information/

<DanC_lap> (self-reference in the abstract... I've never been fond of that. "This finding ...")

<DanC_lap> hmm... "used should themselves be described in machine readable form" ... I was expecting that the examples that followed would be turing-complete stuff, like browser extensions or plug-ins. hmm.

<timbl> Grammar: all bullets should match the same production. "Supporting ad-hoc exploration is a goal of the Web." is not a sentence but the ones before and above are

<timbl> (label point 3 as "context independence"?)

<DanC_lap> "This finding addresses TAG issue XXXX (to be opened)" hmm... who is our customer here? who are we advising/helping?

<timbl> s/What may be less clear is that//

<DanC_lap> "The Web is global" ... hmm... the web scales up to global scale, and there is one global web, but the web also scales down, and [the rest of this comment is easier said with a whiteboard]

<DanC_lap> under "three characteristics that distinguish it from many other shared information spaces", I don't see how the 3rd one is novel to the web

<dorchard> There's an implication that XML without namespaces is not self-describing, yet that was alleged of XML from the start. Perhaps self-describing but not on the self-describing web?

<Rhys> RL: Not sure that the good practice in section 2 automatically follows from the previous paragraph. It feels like I want to say that just using widely deployed standards doesn't automatically guarantee the kind of semantic completeness you desire

<timbl> "Furthermore, when such documents are linked together, the Web as a whole can support reliable, ad hoc discovery of information." is great in the abstract but non sequitur in the intro.

<DanC_lap> s/It seems fairly obvious that//

<Rhys> RL: Section 3 seems to have echoes of the "top down interpretation of XML docs" from Henry's elaboration of infosets paper

<timbl> Section 2. Second para wanders off I think, diverts us from the path.

<Rhys> RL: last para of 3 s/discusses/discuss

<DanC_lap> the 1st GPN, "... widely deployed standards." seems out of place. The para above it argues for good titles and such.

<DanC_lap> it might fit better after "The simplest way to achieve this is if the document is encoded using widely deployed standards and conventions."

<timbl> I would prefer, rather than using English and words and 26 chars, to give examples of actual formats used on the web and standardized by W3C, such as HTML, SVG, PNG, (JPEG), RDF.

<DanC_lap> "bits (octets) " er... rather be "bytes (octets)"?

<Rhys> RL: Section 4 beginning "The RDDL document in turn... " I merely note that processing instructions in XML documents can achieve processing at the user agent and wonder if there is anything to say here about that or to contrast the use of GRDDL for this as opposed to XML PI.

<timbl> For "The Content-Type header is generally the appropriate means " write "The Content-Type header is THE ONE AND ONLY means"

<DanC_lap> in section 3, I was pretty lost until I got to "In short, a user agent can work step by step, starting with knowledge of the HTTP protocol and its headers, to determine the full intended interpretation of this example representation. ". I suggest moving that up, and maybe using an itemized list or something.
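Illustrative sketch, not taken from the draft or from the meeting: the step-by-step interpretation quoted above could look roughly like this. The function name `describe`, its return shape and the sample inputs are assumptions made for illustration only. Step 1 reads the media type from the Content-Type header value; step 2, for XML media types only, reads the namespace name of the root element.

```python
import xml.etree.ElementTree as ET


def describe(content_type, body):
    """Return (media_type, root_namespace) for a representation.

    Hypothetical helper: it only covers the first two steps
    (media type, then root element namespace) and nothing further.
    """
    # Step 1: the media type is the part of the header value before any parameters.
    media_type = content_type.split(";", 1)[0].strip().lower()
    is_xml = media_type in ("text/xml", "application/xml") or media_type.endswith("+xml")
    if not is_xml:
        return media_type, None

    # Step 2: for XML, look at the namespace name of the root element.
    root = ET.fromstring(body)
    # ElementTree spells a namespaced tag as "{namespace}local".
    if root.tag.startswith("{"):
        return media_type, root.tag[1:].split("}", 1)[0]
    return media_type, None
```

A root element with no namespace, such as `<part-number/>`, yields no namespace name here, so the sketch has nothing further to follow for it.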

<Rhys> RL: Good practice on RDFa: Some language designers [:-)] might want to do something similar with RDFa for new languages. Can we extend the practice or add another one to cover that?

<timbl> I don't feel that this finding makes it clear that there is ONE defined and common algorithm for following these steps of dereferencing a URI.

<DanC_lap> well, not THE ONE AND ONLY means; it's the standard means. I think it's worth putting some more orange cones in our media type finding to acknowledge certain exceptions.

<timbl> (One constantly changing)

<Rhys> RL: Good practice on using GRDDL with XML: Where an XML language (XHTML 2 family for example) already has facilities for explicitly linking semweb information we perhaps don't need to specify GRDDL

<DanC_lap> this GPN relies too much on context; it's hardly worth putting in a box: "Web resource representations SHOULD, to the extent practical, be self-describing."

<Noah> s/mandate GRDDL/specify GRDDL/

<DanC_lap> "Dynamic discovery of specifications is necessary because of the ever changing nature of the information on the Web" seems like the conclusion of an argument, but it comes at the beginning of section 4

<timbl> After section 4, para 1, a para would be useful that says that in these cases, it is good to use a common data model and syntax, RDF, so that the custom-specific knowledge necessary is delegated to an ontology, and much of the serializing and deserializing and even querying and visualization can be done by generic systems.

<timbl> I want a diagram in here

<DanC_lap> I prefer to not refer to the group except in the status section. "The W3C TAG is currently working on a finding that will ..."

<DanC_lap> I think in the 2001/vcard-rdf/3.0# namespace, fn is capitalized.

<timbl> "A user agent processing an XML document can retrieve representations of the namespaces used in that document, and can use that retrieved information to determine how to correctly process the XML markup. " MUST mention here that you should cache stuff for very long times, and not look up every time
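Illustrative sketch, not taken from the draft or from the meeting: one minimal way to avoid looking up a namespace document every time is to keep retrieved documents in a cache keyed by namespace URI. The class name `NamespaceDocumentCache` and the injected `fetch` callable are hypothetical. The cache below is in-memory and never expires entries, which is an assumption of the sketch and not a recommendation.

```python
class NamespaceDocumentCache:
    """Remember namespace documents so each URI is fetched at most once."""

    def __init__(self, fetch):
        # `fetch` is any callable taking a namespace URI and returning
        # its document; it is injected so the sketch needs no network.
        self._fetch = fetch
        self._documents = {}

    def get(self, namespace_uri):
        # Only call `fetch` the first time a namespace URI is seen.
        if namespace_uri not in self._documents:
            self._documents[namespace_uri] = self._fetch(namespace_uri)
        return self._documents[namespace_uri]
```

Usage: `cache = NamespaceDocumentCache(my_fetch)` followed by repeated `cache.get(uri)` calls, where `my_fetch` stands for whatever retrieval function the user agent already has.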

<timbl> "Most likely, the finding will recommend the use of [RDDL] as a preferred means of providing machine readable documentation of namespaces. " EXCEPT FOR RDF SYSTEMS

<timbl> s/in an important sense, //

<timbl> Refer to the N3 primer I wonder as well as the RDF one? hmmmm

<timbl> s/(typically the value/(the value/

<timbl> s/means of describing particular uses of RDF/means of describing particular predicates and classes/

<timbl> needs an intro to what a class is too I think

<timbl> s/sameAS/sameAs

<timbl> Example here: Painter and Creator.

<timbl> Dirk/Nadia example: Norm and are on the hook for the long version of that

<timbl> could ref it and summarize it

<Noah> OK

<timbl> s/self-describing HTML/data in HTML

<edit> ScribeNick: Rhys

NM: First question: is this approach ok, and do we just need to pick through the details? Then also, could we pick out the comments in the IRC log?

SW: suggests a quick trip round the table for initial impressions

NM: Would like not to get rat-holed

DC: Not sure who the audience is. The 'boring first part' is too long and there is not enough in the second part

Dan, could you write what you'd like to see in the minutes for that?

<DanC_lap> I wrote more above. What should go in the minute is what the meeting heard, not what I said or meant to say.

TBL: First part is less interesting because it's too simple. Don't worry about using English, focus on the standards of the web. Diagram would be good.

<DanC_lap> (TimBL just said "this is for web developers" which is starting to answer my "who is our audience" question, but isn't specific enough)

NM: Is the split ok? 1- standards are good. 2- there is an algorithm

TBL: Chapter 4: Ok, makes sense. In RDF you should give an example of simple inference. TBL describes a possible example

TVR: Nothing much to add

<timbl> Like using Inverse Functional property.
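Illustrative sketch, not taken from the draft or from the meeting: the kind of simple inference an inverse functional property allows is that two subjects sharing a value for that property denote the same thing. The function name, the tuple representation of triples and the sample data are assumptions; treating `foaf:mbox` as inverse functional follows the FOAF vocabulary.

```python
def merge_by_inverse_functional(triples, predicate):
    """Return the sets of subjects that share a value for `predicate`.

    `triples` is an iterable of (subject, predicate, object) tuples.
    Each returned set is a group of subjects that can be inferred
    to be the same individual.
    """
    by_value = {}
    for subject, pred, obj in triples:
        if pred == predicate:
            by_value.setdefault(obj, set()).add(subject)
    # Only values seen with more than one subject tell us anything new.
    return [subjects for subjects in by_value.values() if len(subjects) > 1]


# Hypothetical data: two blank nodes with the same mailbox.
EXAMPLE_TRIPLES = [
    ("_:a", "foaf:mbox", "mailto:someone@example.org"),
    ("_:b", "foaf:mbox", "mailto:someone@example.org"),
    ("_:c", "foaf:mbox", "mailto:other@example.org"),
]
```

With the sample data, `merge_by_inverse_functional(EXAMPLE_TRIPLES, "foaf:mbox")` groups `_:a` and `_:b` and leaves `_:c` alone.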

SW: Agree with Dan. Not sure about the message and audience.

NM: Do you mean that you are not clear about why we are doing this finding

SW: Yes
... Worry about the particular use of MUST, SHOULD etc. in the good practices. Looks like there are no principles at the moment
... Not all GPNs are actually good practices

<timbl> - This explains how the web works. It explains how to extend it, where the flexibility points are. You introduce a new (URI scheme, content type, encoding type, XML namespace, RDF ontology) ... should understand after reading the document that it is better to do the higher level things than the lower.

TVR: Would like to see some coverage of Atom publishing protocol and how Google uses them
... For rels that are defined, they use the standard ones. For Google-specific ones, they use qualified rels

SW: sec 4.2 Good practice about contributing to the self describing web doesn't sound like a GPN
... Doesn't sound like a practice

NM: Thought we had others like this
... Agrees it's a problem.

<raman> for a GData specific term, it uses:

<timbl> - This document also documents flexibility points (rel-nofollow) which are not self-describing or whose use is not standard and well-documented or universally deployed. Also classes in microformats

HT: I found the NM example unhelpful in the first couple of sections.
... However an img/svg example in a few lines could help by showing what else there could be

HT: It's in section 3
... Term qualified name is dodgy, expanded names would be better. Could also trim the tutorial nature for this part
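Illustrative sketch, not taken from the draft or from the meeting: the difference between a qualified name and an expanded name can be seen by parsing two documents that use different prefixes for the same namespace. The namespace URI `http://example.org/parts` and the helper name `expanded_name` are hypothetical.

```python
import xml.etree.ElementTree as ET


def expanded_name(xml_text):
    """Return (namespace name, local name) of the root element."""
    # ElementTree spells a namespaced tag as "{namespace}local".
    tag = ET.fromstring(xml_text).tag
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return None, tag


# The qualified names differ (p:part-number vs q:part-number),
# but both prefixes are bound to the same namespace name.
DOC_P = '<p:part-number xmlns:p="http://example.org/parts"/>'
DOC_Q = '<q:part-number xmlns:q="http://example.org/parts"/>'
```

Both `expanded_name(DOC_P)` and `expanded_name(DOC_Q)` give the same pair, while an element with no namespace, such as `<part-number/>`, gives `None` as its namespace name.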

NM:

<DanC_lap> TVR's example is a *counter-example*. Google doesn't provide representations for http://schemas.google.com/g/2005

HT: Basically this is fine, and there are some changes needed.

<raman> I know we provide the schemas somewhere;-) but we don't provide it on the namespace url

HT: 4.2 First sentence true only because of the word data. It's too small and I missed it! Think it needs more emphasis by editorial methods.

<DanC_lap> (noah is taking notes-to-self; I'd appreciate a copy of those, perhaps for the meeting record)

HT: End of 4.3 GPN about RDF seems to conflict with earlier statements about HTML being self-describing. Title may be misleading

<Stuart> From earlier in the log: RL: Not sure that the good practice in section 2 automatically follows from the previous paragraph. It feels like I want to say that just using widely deployed standards doesn't automatically guarantee the kind of semantic completeness you desire

DO: Agreed with previous comments on 2 and 3

<DanC_lap> ("the algorithm" is in /TR/webarch already, no?)

DO: Having 'the' algorithm in section 3
... In self describing XML documents (sec 4), it seemed interesting that self-describing XML seems to need namespaces.

<DanC_lap> (indeed, <part-number> isn't self-describing in the sense of Web Architecture)

DO: Should say whether or not this is true. Dan said that XML wasn't self-describing, it was self-similar

NM: For the definition of self describing that I've used in this doc, I don't think that XML is self describing

DO: Then I think you should define that. Based on the explanation, I might still have a problem with that
... Other readers might too
... Interesting that RDDL is emphasized, because of lack of deployment

NM: What I hear you say is that RDDL is not essential for self description. There may be other ways. Describe that it should be true, and then describe RDDL as a way to do it, but it might not be the only way.

TBL: Reasonable to show the example.

NM: Position the RDDL as a way to achieve it

DO: Yes

TBL: But not for RDF

NM: This is the subject of namespace-8

TBL: But you need to have a coherent story here
... Describes the technology approach

NM: Describes the way that namespace-8 and this finding fit together

DO: 4.2 I have a problem: RDF playing the preferred role for the self describing web is not something I can live with
... There is a ton of stuff out there for self description based on XML and schema

TBL/TVR/DO discuss merits of XML/Schema and RDF approaches to self description

NM: I think I had a request from the TAG to ensure that use of triples was the preferred way to do self description

<timbl> TBL: For RDF systems, the namespace documents are in RDF.

<DanC_lap> (I don't remember these instructions to noah; I'd appreciate a pointer to a record)

NM: I'd like a clear instruction about this from the TAG as a whole, because I'm hearing both.

TBL: Why not actually give the story of both RDF and XML/Schema and show the pros and cons

<DanC_lap> (a microformats example would be a counter-example.)

DO: 4.3 and RDFa was helpful, and I'm looking forward to a RDDL example and a microformats example
... We do need to say something about microformats and that they probably are ok.

TVR: Need to factor out the business of triples from the business of getting them into bits

<timbl> For any level of (meta^n) discussion, it is (sometimes^n) necessary to have a (meta^(n+1)) discussion.

TVR: Problem is that when you say RDF, people hear RDF-XML and all the baggage rather than triples

NM: Seems to be that most of the room says that it's the right story but there are problems, rather than being the wrong message

SW: What is the message of the document?

NM: The web aspires to achieve certain things about exploring and combining information in various ways, and that to do this a combination of linking and self description is very valuable.
... Will describe this and show how it is possible to do this.

<Zakim> DanC_lap, you wanted to note that RDFa is only consistent with a self-describing web if the HTML spec is rev'd to reference it.

TBL: Propose that the goal should be that at the end the reader should know to use self description and to use high level not low level descriptions

DC: Who are we writing for?

TBL: People about to write new ontologies, people about to create a new rel='...'
... Extend by making a new ontology rather than a new XML language.

HT: That's Tim's comment but I would be unhappy if that is how it makes it into the document because it's too broad brush

NM: I thought the audience was those plus almost anyone publishing content on the web.

DC: Microformats community says don't reinvent the wheel, and we should treat them as part of the audience
... Agree with their view of not reinventing the wheel, but not how they achieve that.
... The writing won't appeal to that kind of audience

DO: There is an old-school view of how the web works now, and also the new-school view of where this is going in future

NM: I have a view that the audience is very broad. Everyone creating a page should be asking themselves whether they should be adding RDFa to a page they are creating

DC: But virtually no one actually even thinks about whether they should put lang=en on pages

NM: Ok, actually I took it as implicit that this happens via the tools creators
... Objective is to get the content on the web in general better.

DC: Maybe the term self-describing could get in the way.
... The clearest way to talk about this might be from the point of URI. Microformats don't do this. Others do.

SW: When we get to Unicode, what counts as self describing at these very fine grained levels?

General agreement that there will be a point at which everyone just has to agree

<DanC_lap> (saying it [don't label it text/html] again here is... maybe counter-productive)

NM: Sometimes, the answer is that the stuff is so simple that you just know. These low level formats are the bootstrap points for the use of these higher level technologies
... Intentionally not trying to do all the turtles. At some point you have to say 'let there be turtles'

HT: TBL said that you should create an on
... I'd be unhappy to see this go into the document. There are situations where creating a new XML language is the right thing to do

TBL: Sometimes there are benefits in creating new languages, but where a new language fits the RDF data model you should do an ontology

NM: So the interesting question is when to do XML vs. RDF and is the XML/GRDDL approach second class or is it a first class answer

TBL: Second class

<DanC_lap> [-- discussion is curtailed, noting substantial disagreement -- ]

HT: Thinks that this is a genuine disagreement point between us.
... Example is xproc specification. Reason is that there are lots of tools out there for it and clean mechanism for describing what is and is not valid

DC: All true for RDF

HT: Not true. 1) can't use an XSLT, 2) Can't prevalidate (e.g. via pellet) and 3) Can't use an XML parser as a front end to the implementation
... RDF syntax is very flexible, which is good, but it prevents you writing a stylesheet

TBL: Why would you want to do that?

HT: So that a human being can get a readable version of the information tailored to that information and audience
... Can produce an SVG diagram from an xproc, which is hugely valuable. Possible with a constrained XML vocabulary but not with a particular RDF graph

TBL: Writes a table of xml vs RDF and shows equivalent of parsing, stylesheet and xslt for the two technologies

Scribe notes that we need a picture of the table that is on the board

TBL: describes the table

HT: So does that mean the W3C should never create a new markup language?

TBL: No, there are cases where new markup should be used.

HT: I think xproc is an example. For example because sequence is important and

SW: adds treehugger and rdftwig as entries and DC moves them to stylesheet row

TBL: asks if DC knows of an XSLT with SPARQL embedded

NM: Want to add to the list of reasons for using XML. Today, it could be a question of cost, since the XML parsers and processors are available today. This might tempt people to use XML today. Not offering an opinion, just that if a customer asked today, the cost is something that needs to be considered.
... Also the tools are production quality today.

<DanC_lap> [yes, I think it would have been better for XML Schema to follow DSD, i.e. the RDF-based submission, though the cost of RDF was pretty high back then. It would have been premature standardization. But heck... that didn't seem to stop XSD in many other ways.]

NM: I would like balanced input on this particular topic. However, I did hear two people who seemed not to see value in proceeding. So should we continue with this?

DC: I just want to pick an audience and focus on them. Needs surgery before I would hand it to the audience I envisage

NM: I think the audience I envisage is broader than the one Dan envisages
... I would add a comment to that effect to the document

DC: How do we get to them?

NM: we go to the product groups in the big companies who create the tools that create this stuff.

SW: Want a pithy message, but didn't find it.

NM: Do you think it's in there but unclear, or that it just is not there?

SW: I think it's encouragement to use self describing approaches

NM: About right and then to give the specific guidance.

TBL, SW, RL, confirm that they think NM should continue

DO: Would like to see a bit more on XML in this finding.

NM: Can you give an example?

DO: Take an example and show how to progress it using both GRDDL and XML

NM: I think that a lot of people would see that as trying to help the RDF community.

DO: could be that the level of self description in RDF is deeper than in XML
... I see more in the table around programming languages rather than just XSLT, and so the happy face should be a happy face.

<DanC_lap> indeed, the main point of this finding should be uri-based extensibility vs just using <part-number />

TBL: Microformats have to be well known to be useful

NM: Microformats go through a single distribution point.

DC: Draft doesn't say that RDFa will become part of the HTML specification.

NM: Should I write about both paths?

DC: Yes

<DanC_lap> 2nd the proposal to thank the hosts

SW: Extend our thanks to Google for hospitality and meeting support

Applause

SW: Thanks to Raman

<Norm> Safe travels everyone

<ht_google> rssagent, bye
