See also: IRC log
<edit> ScribeNick: Rhys
SW: Describes agenda
NM: Half an hour for Virtual worlds discussion
SW: Asks about versioning
NM: Asks that people read the draft now
<edit> ScribeNick: _Rhys
<Noah> We are going to discuss draft on Self-Describing Web at http://www.w3.org/2001/tag/doc/selfDescribingDocuments.html
<Noah> For the record, the dated URI for this is: http://www.w3.org/2001/tag/doc/selfDescribingDocuments-2007-05-24.html
<DanC_lap> superfluous comma. s/information,/information/
<DanC_lap> (self-reference in the abstract... I've never been fond of that. "This finding ...")
<DanC_lap> hmm... "used should themselves be described in machine readable form" ... I was expecting that the examples that followed would be turing-complete stuff, like browser extensions or plug-ins. hmm.
<timbl> Grammar: all bullets should match the same production. "Supporting ad-hoc exploration is a goal of the Web." is not a sentence but the ones before and above are
<timbl> (label point 3 into "context independence"?)
<DanC_lap> "This finding addresses TAG issue XXXX (to be opened)" hmm... who is our customer here? who are we advising/helping?
<timbl> s/What may be less clear is that//
<DanC_lap> "The Web is global" ... hmm... the web scales up to global scale, and there is one global web, but the web also scales down, and [the rest of this comment is easier said with a whiteboard]
<DanC_lap> under "three characteristics that distinguish it from many other shared information spaces", I don't see how the 3rd one is novel to the web
<dorchard> There's an implication that XML without namespaces is not self-describing, yet that was alleged of XML from the start. Perhaps self-describing but not on the self-describing web?
<Rhys> RL: Not sure that the good practice in section 2 automatically follows from the previous paragraph. It feels like I want to say that just using widely deployed standards doesn't automatically guarantee the kind of semantic completeness you desire
<timbl> "Furthermore, when such documents are linked together, the Web as a whole can support reliable, ad hoc discovery of information." is great in the abstract but non sequitur in the intro.
<DanC_lap> s/It seems fairly obvious that//
<Rhys> RL: Section 3 seems to have echoes of the "top down interpretation of XML docs" from Henry's elaboration of infosets paper
<timbl> Section 2. Second para wanders off I think, diverts us from the path.
<Rhys> RL: last para of 3 s/discusses/discuss
<DanC_lap> the 1st GPN, "... widely deployed standards." seems out of place. The para above it argues for good titles and such.
<DanC_lap> it might fit better after "The simplest way to achieve this is if the document is encoded using widely deployed standards and conventions."
<timbl> I would prefer, rather than using English and words and 26 chars, to give examples of actual formats used on the web and standardized by W3C, such as HTML, SVG, PNG, (JPEG), RDF.
<DanC_lap> "bits (octets) " er... rather be "bytes (octets)"?
<Rhys> RL: Section 4 beginning "The RDDL document in turn... " I merely note that processing instructions in XML documents can achieve processing at the user agent and wonder if there is anything to say here about that or to contrast the use of GRDDL for this as opposed to XML PI.
<timbl> For "The Content-Type header is generally the appropriate means " write "The Content-Type header is THE ONE AND ONLY means"
<DanC_lap> in section 3, I was pretty lost until I got to "In short, a user agent can work step by step, starting with knowledge of the HTTP protocol and its headers, to determine the full intended interpretation of this example representation. ". I suggest moving that up, and maybe using an itemized list or something.
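The step-by-step interpretation quoted above can be illustrated with a minimal sketch. It assumes a user agent that starts from the HTTP headers, reads the media type, and for XML media types reads the namespace of the root element. The function name `interpret` and the returned list of "specifications consulted" are hypothetical, chosen only for illustration; the draft does not define them.

```python
import xml.etree.ElementTree as ET


def interpret(headers, body):
    """Return the chain of identifiers a user agent could follow, in order.

    Hypothetical helper: starts from HTTP, then the media type, then
    (for XML media types only) the namespace URI of the root element.
    """
    steps = ["HTTP"]
    media_type = headers.get("Content-Type", "").split(";")[0].strip().lower()
    if not media_type:
        return steps  # no media type, so nothing further is described
    steps.append(media_type)
    if media_type.endswith("/xml") or media_type.endswith("+xml"):
        root = ET.fromstring(body)
        # ElementTree renders a namespaced tag as "{namespace-uri}local-name".
        if root.tag.startswith("{"):
            steps.append(root.tag[1:].split("}")[0])
    return steps
```

In this sketch an XML document whose root element has no namespace stops at the media type, which matches the point made in the discussion that such markup gives a user agent nothing further to follow.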
<Rhys> RL: Good practice on RDFa: Some language designers [:-)] might want to do something similar with RDFa for new languages. Can we extend the practice or add another one to cover that?
<timbl> I don't feel that this finding makes it clear that there is ONE defined and common algorithm for following these steps of dereferencing a URI.
<DanC_lap> well, not THE ONE AND ONLY means; it's the standard means. I think it's worth putting some more orange cones in our media type finding to acknowledge certain exceptions.
<timbl> (One constantly changing)
<Rhys> RL: Good practice on using GRDDL with XML: Where an XML language (XHTML 2 family for example) already has facilities for explicitly linking semweb information we perhaps don't need to specify GRDDL
<DanC_lap> this GPN relies too much on context; it's hardly worth putting in a box: "Web resource representations SHOULD, to the extent practical, be self-describing."
<Noah> s/mandate GRDDL/specify GRDDL/
<DanC_lap> "Dynamic discovery of specifications is necessary because of the ever changing nature of the information on the Web" seems like the conclusion of an argument, but it comes at the beginning of section 4
<timbl> After section 4, para 1, a para would be useful that says that in these cases, it is good to use a common data model and syntax, RDF, so that the custom-specific knowledge necessary is delegated to an ontology, and much of the serializing and deserializing and even querying and visualization can be done by generic systems.
<timbl> I want a diagram in here
<DanC_lap> I prefer to not refer to the group except in the status section. "The W3C TAG is currently working on a finding that will ..."
<DanC_lap> I think in the 2001/vcard-rdf/3.0# namespace, fn is capitalized.
<timbl> "A user agent processing an XML document can retrieve representations of the namespaces used in that document, and can use that retrieved information to determine how to correctly process the XML markup. " MUST mention here that you should cache stuff for very long times, and not look up every time
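The caching advice above can be sketched as follows. This is a minimal illustration, assuming an injected `fetch` callable that retrieves a namespace document; the class name `NamespaceCache` and the 30-day default lifetime are assumptions made for the example and do not come from the draft or from any specification.

```python
import time


class NamespaceCache:
    """Hypothetical long-lived cache for namespace documents."""

    def __init__(self, fetch, max_age_seconds=30 * 24 * 3600, clock=time.time):
        self._fetch = fetch            # callable: namespace URI -> document
        self._max_age = max_age_seconds  # assumed lifetime, not a standard value
        self._clock = clock
        self._entries = {}             # namespace URI -> (fetched_at, document)

    def get(self, namespace_uri):
        now = self._clock()
        entry = self._entries.get(namespace_uri)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1]            # fresh enough: no network lookup
        document = self._fetch(namespace_uri)
        self._entries[namespace_uri] = (now, document)
        return document
```

A real user agent would more likely honour HTTP cache headers than a fixed lifetime; the fixed value here only keeps the sketch short.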
<timbl> "Most likely, the finding will recommend the use of [RDDL] as a preferred means of providing machine readable documentation of namespaces. " EXCEPT FOR RDF SYSTEMS
<timbl> s/in an important sense, //
<timbl> Refer to the N3 primer I wonder as well as the RDF one? hmmmm
<timbl> s/(typically the value/(the value/
<timbl> s/means of describing particular uses of RDF/means of describing particular predicates and classes/
<timbl> needs an intro to what a class is too I think
<timbl> s/sameAS/sameAs
<timbl> Example here: Painter and Creator.
<timbl> Dirk/Nadia example: Norm and are on the hook for the long version of that
<timbl> could ref it and summarize
<timbl> it
<Noah> OK
<timbl> s/self-describing HTML/data in HTML
<edit> ScribeNick: Rhys
NM: First question: is this approach ok, so that we just need to pick through the details? Then, could we also pick out the comments in the IRC log?
SW: suggests a quick trip round the table for initial impressions
NM: Would like not to get rat-holed
DC: Not sure who the audience is. The 'boring first part' is too long and there is not enough in the second part
Dan, could you write what you'd like to see in the minutes for that?
<DanC_lap> I wrote more above. What should go in the minute is what the meeting heard, not what I said or meant to say.
TBL: First part is less interesting because it's too simple. Don't worry about using English, focus on the standards of the web. Diagram would be good.
<DanC_lap> (TimBL just said "this is for web devopers" which is starting to answer my "who is our audience" question, but isn't specific enough)
NM: Is the split ok? 1- standards are good. 2- there is an algorithm
TBL: Chapter 4: Ok, makes sense. In RDF you should give an example of simple inference. TBL describes a possible example
TVR: Nothing much to add
<timbl> Like using Inverse Functional property.
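A minimal sketch of the kind of simple inference suggested above, using an inverse functional property: if two subjects share the same value for such a property, they can be inferred to denote the same thing. The function name `smush` and the tuple representation of triples are assumptions for illustration. The property `foaf:mbox` used in the usage note is declared inverse functional in the FOAF vocabulary, but the example data is invented.

```python
def smush(triples, inverse_functional):
    """Infer owl:sameAs triples from shared inverse-functional property values.

    triples: iterable of (subject, property, value) tuples.
    inverse_functional: set of property names treated as inverse functional.
    """
    seen = {}         # (property, value) -> first subject seen with that pair
    inferred = set()
    for subject, prop, value in triples:
        if prop not in inverse_functional:
            continue
        key = (prop, value)
        if key in seen and seen[key] != subject:
            inferred.add((seen[key], "owl:sameAs", subject))
        else:
            seen.setdefault(key, subject)
    return inferred
```

For example, given two blank nodes `_:a` and `_:b` that both have `foaf:mbox` `mailto:x@example.org`, the sketch infers that `_:a` and `_:b` are the same resource.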
SW: Agree with Dan. Not sure about the message and audience.
NM: Do you mean that you are not clear about why we are doing this finding
SW: Yes
... Worry about the particular use of MUST, SHOULD etc. in the good practices. Looks like there are no principles at the moment
... Not all GPNs are actually good practices
<timbl> - This explains how the web works. It explains how to extend it, where the flexibility points are. You introduce a new (URI scheme, content type, encoding type, XML namespace, RDF ontology) ... should understand after reading the document that it is better to do the higher level things than the lower.
TVR: Would like to see some coverage of Atom publishing protocol and how Google uses them
... For rels that are defined, they use the standard ones. For Google-specific ones, they use qualified rels
SW: sec 4.2 Good practice about contributing to the self describing web doesn't sound like a GPN
... Doesn't sound like a practice
NM: Thought we had others like this
... Agrees it's a problem.
<raman> for a GData specific term, it uses:
<timbl> - This document also documents flexibility points (rel-nofollow) which are not self-describing or whose use is not standard and well-documented or universally deployed. Also classes in microformats
<raman> <category scheme='http://schemas.google.com/g/2005#kind' term='http://schemas.google.com/photos/2007#album'>
<raman> </category>
HT: I found the NM example unhelpful in the first couple of sections.
... However an img/svg example in a few lines could help by showing what else there could be
<raman> For a link tag to a feed we use <link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href=' http://picasaweb.google.com/data/feed/api/user/tv.raman.tv/albumid/5067768426093754961'>
<raman> </link>
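The distinction between standard rel values and URI-qualified ones, as in the link example above, can be sketched as follows. The function name `qualified_rels` is hypothetical, and the sketch only identifies rel values that are absolute http(s) URIs; it does not dereference them, so it says nothing about whether a representation is actually served at that URI.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse


def qualified_rels(xml_text):
    """Return rel attribute values that are absolute http(s) URIs."""
    root = ET.fromstring(xml_text)
    # root.iter() visits the root element and all of its descendants.
    rels = [el.get("rel") for el in root.iter() if el.get("rel")]
    return [rel for rel in rels if urlparse(rel).scheme in ("http", "https")]
```

A short token such as `alternate` is left out of the result because it relies on a registry rather than on a URI that a user agent could follow.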
HT: It's in section 3
... Term qualified name is dodgy, expanded names would be better. Could also trim the tutorial nature for this part
NM:
<DanC_lap> TVR's example is a *counter-example*. Google doesn't provide representations for http://schemas.google.com/g/2005
HT: Basically this is fine, and there are some changes needed.
<raman> I know we provide the schemas somewhere;-) but we don't provide it on the namespace url
HT: 4.2 First sentence true only because of the word data. It's too small and I missed it! Think it needs more emphasis by editorial methods.
<DanC_lap> (noah is taking notes-to-self; I'd appreciate a copy of those, perhaps for the meeting record)
HT: End of 4.3 GPN about RDF seems to conflict with earlier statements about HTML being self-describing. Title may be misleading
<Stuart> From earlier in the log: RL: Not sure that the good practice in section 2 automatically follows from the previous paragraph. It feels like I want to say that just using widely deployed standards doesn't automatically guarantee the kind of semantic completeness you desire
DO: Agreed with previous comments on 2 and 3
<DanC_lap> ("the algorithm" is in /TR/webarch already, no?)
DO: Having 'the' algorithm in section 3
... In self-describing XML documents (sec 4), it seemed interesting that self-describing XML seems to need namespaces.
<DanC_lap> (indeed, <part-number> isn't self-describing in the sense of Web Architecture)
DO: Should say whether or not this is true. Dan said that XML wasn't self-describing, it was self-similar
NM: For the definition of self describing that I've used in this doc, I don't think that XML is self describing
DO: Then I think you should define that. Based on the explanation, I might still have a problem with that
... Other readers might too
... Interesting that RDDL is emphasized, because of lack of deployment
NM: What I hear you say is that RDDL is not essential for self description. There may be other ways. Describe that it should be true, and then describe RDDL as a way to do it, but it might not be the only way.
TBL: Reasonable to show the example.
NM: Position the RDDL as a way to achieve it
DO: Yes
TBL: But not for RDF
NM: This is the subject of namespace-8
TBL: But you need to have a coherent story here
... Describes the technology approach
NM: Describes the way that namespace-8 and this finding fit together
DO: 4.2 I have a problem: RDF playing the preferred role for the self-describing web is not something I can live with
... There is a ton of stuff out there for self description based on XML and schema
TBL/TVR/DO discuss merits of XML/Schema and RDF approaches to self description
NM: I think I had a request from the TAG to ensure that use of triples was the preferred way to do self description
<timbl> TBL: For RDF systems, the namespace documents are in RDF.
<DanC_lap> (I don't remember these instructions to noah; I'd appreciate a pointer to a record)
NM: I'd like a clear instruction about this from the TAG as a whole, because I'm hearing both.
TBL: Why not actually give the story of both RDF and XML/Schema and show the pros and cons
<DanC_lap> (a microformats example would be a counter-example.)
DO: 4.3 and RDFa was helpful, and I'm looking forward to a RDDL example and a microformats example
... We do need to say something about microformats and that they probably are ok.
TVR: Need to factor out the business of triples from the business of getting them into bits
<timbl> For any level of (meta^n) discussion, it is (sometimes^n) necessary to have a (meta^(n+1)) discussion.
TVR: Problem is that when you say RDF, people hear RDF-XML and all the baggage rather than triples
NM: Seems to be that most of the room says that it's the right story but there are problems, rather than being the wrong message
SW: What is the message of the document?
NM: The web aspires to achieve certain things about exploring and combining information in various ways, and that to do this a combination of linking and self description is very valuable.
... Will describe this and show how it is possible to do this.
<Zakim> DanC_lap, you wanted to note that RDFa is only consistent with a self-describing web if the HTML spec is rev'd to reference it.
TBL: Propose that the goal should be that at the end the reader should know to use self description and to use high level not low level descriptions
DC: Who are we writing for?
TBL: People about to write new ontologies, people about to create a new rel='...'
... Extend by making a new ontology rather than a new XML language.
HT: That's Tim's comment but I would be unhappy if that is how it makes it into the document because it's too broad brush
NM: I thought the audience was those plus almost anyone publishing content on the web.
DC: Microformats community says don't reinvent the wheel, and we should treat them as part of the audience
... Agree with their view of not reinventing the wheel, but not how they achieve that.
... The writing won't appeal to that kind of audience
DO: There is an old-school view of how the web works now, and also the new-school view of where this is going in future
NM: I have a view that the audience is very broad. Everyone creating a page should be asking themselves whether they should be adding RDFa to a page they are creating
DC: But virtually no one actually even thinks about whether they should put lang=en on pages
NM: Ok, actually I took it as implicit that this happens via the tools creators
... Objective is to get the content on the web in general better.
DC: Maybe the term self-describing could get in the way.
... The clearest way to talk about this might be from the point of URI. Microformats don't do this. Others do.
SW: When we get to Unicode, what counts as self describing at these very fine grained levels?
General agreement that there will be a point at which everyone just has to agree
<DanC_lap> (saying it [don't label it text/html] again here is... maybe counter-productive)
NM: Sometimes, the answer is that the stuff is so simple that you just know. These low level formats are the bootstrap points for the use of these higher level technologies
... Intentionally not trying to do all the turtles. At some point you have to say 'let there be turtles'
HT: TBL said that you should create an on
... I'd be unhappy to see this go into the document. There are situations where creating a new XML language is the right thing to do
TBL: Sometimes there are benefits in creating new languages, but where a new language fits the RDF data model you should do an ontology
NM: So the interesting question is when to do XML vs. RDF and is the XML/GRDDL approach second class or is it a first class answer
TBL: Second class
<DanC_lap> [-- discussion is curtailed, noting substantial disagreement -- ]
HT: Thinks that this is a genuine disagreement point between us.
... Example is xproc specification. Reason is that there are lots of tools out there for it and clean mechanism for describing what is and is not valid
DC: All true for RDF
HT: Not true. 1) can't use an XSLT, 2) Can't prevalidate (e.g. via pellet) and 3) Can't use an XML parser as a front end to the implementation
... RDF syntax is very flexible, which is good, but it prevents you writing a stylesheet
TBL: why would you want to do that
HT: So that a human being can get a readable version of the information tailored to that information and audience
... Can produce an SVG diagram from an xproc, which is hugely valuable. Possible with a constrained XML vocabulary but not with a particular RDF graph
TBL: Writes a table of xml vs RDF and shows equivalent of parsing, stylesheet and xslt for the two technologies
Scribe notes that we need a picture of the table that is on the board
TBL: describes the table
HT: So does that mean the W3C should never create a new markup language?
TBL: No, there are cases where new markup should be used.
HT: I think xproc is an example. For example because sequence is important and
SW: adds treehugger and rdftwig as entries and DC moves them to stylesheet row
TBL: asks if DC knows of an XSLT with SPARQL embedded
NM: Want to add to the list of reasons for using XML. Today, it could be a question of cost, since the XML parsers and processors are available today. This might tempt people to use XML today. Not offering an opinion, just that if a customer asked today, the cost is something that needs to be considered.
... Also the tools are production quality today.
<DanC_lap> [yes, I think it would have been better for XML Schema to follow DSD, i.e. the RDF-based submission, though the cost of RDF was pretty high back then. It would have been premature standardization. But heck... that didn't seem to stop XSD in many other ways.]
NM: I would like balanced input on this particular topic. However, I did hear two people who seemed not to see value in proceeding. So should we continue with this?
DC: I just want to pick an audience and focus on them. Needs surgery before I would hand it to the audience I envisage
NM: I think the audience I envisage is broader than the one Dan envisages
... I would add a comment to that effect to the document
DC: How do we get to them?
NM: we go to the product groups in the big companies who create the tools that create this stuff.
SW: Want a pithy message, but didn't find it.
NM: Do you think it's in there but unclear, or that it just is not there?
SW: I think it's encouragement to use self describing approaches
NM: About right and then to give the specific guidance.
TBL, SW, RL, confirm that they think NM should continue
DO: Would like to see a bit more on XML in this finding.
NM: Can you give an example?
DO: Take an example and show how to progress it using both GRDDL and XML
NM: I think that a lot of people would see that as trying to help the RDF community.
DO: Could be that the level of self description in RDF is deeper than in XML
... I see more in the table around programming languages rather than just XSLT, and so the happy face should be a happy face.
<DanC_lap> indeed, the main point of this finding should be uri-based extensibility vs just using <part-number />
TBL: Microformats have to be well known to be useful
NM: Microformats go through a single distribution point.
DC: Draft doesn't say that RDFa will become part of the HTML specification.
NM: Should I write about both paths?
DC: Yes
<DanC_lap> 2nd the proposal to thank the hosts
SW: Extend our thanks to Google for hospitality and meeting support
Applause
SW: Thanks to Raman
<Norm> Safe travels everyone
<ht_google> rssagent, bye