TAG f2f -- 7 Mar 2007 (morning)

NamespaceDocument-8

<Norm> namespaceDocument-8 background NDW 5 Mar

NW: We've gone around on this several times, decided we would benefit by resetting and considering requirements from the ground up

NW: NS documents are sometimes used by agents (people and mechanical)
... Competing demands on what should be there
... No single thing (XML Schema, RDF, ...) will satisfy everything
... When we first looked at this, we thought maybe a single XML format (e.g. RDDL1) would do the job
... But the world has moved on, and maybe we don't need to do anything

Tutti: We should try to do something

NM: But maybe we'll decide we can't

NW: Two options at least: Do some form of RDDL 2 (a single XML vocabulary) designed for the purpose; pursue the route set out in the original finding (focus on an RDF design)

TBL: Let's standardise on three things: For the self-desc. web, Semantic Web wants RDF-XML, OFWeb wants XML or DTD or Schema....
... If you stop putting schema doc as the NS doc't, you raise the bar for applications to have to understand more than you would otherwise
... Imagine web-based python -- would we want to say they all have to have XML parser

DC: XML Schema have already raised the bar -- they already use RDDL, the cost is paid already

TBL: Opposite for SemWeb

NM: Fundamental question is that if I have in my hand an NS URI, will I always want one thing from it, or different things in different situations
... Yes, maybe content negotiation would help
... So, e.g., sometimes a schema and sometimes the GRDDL-produced RDF
... The value of the purpose/nature stuff allowed additional functionality -- how important is it to support that?
... There's also the performance issues
... Must not be a requirement that every time you do a validation you must do a GET

RL: Conneg got mentioned, if we thought of the different metadata as all being representations of the same resource

NM, HT: Seems stretching to say that wrt e.g. a spec., a schema and some GRDDL

TBL: It's a violation of the architecture

DC: Works for GRDDL itself (RDF and HTML)

TBL: It can be OK, but it likely won't always be

NW: Consider the DocBook NS -- when we get to the NS document, we're going to find stylesheets for using it, DTD, RelaxNG schema, User guide
... RDDL is sort of working for much of that, and with GRDDL coming, the SemWeb software should be able to win too

<Noah> Hmm. Norm says some ISPs don't support mod_negotiation. Interesting. I wonder whether we should juice up the Generic Resources finding to say: those who administer servers likely to be used to host generic resources SHOULD support conneg, perhaps with a Norm/Nadia story telling how Norm couldn't use conneg because his ISP wouldn't let him.

TBL: So GRDDL would extract from the DocBook NS doc't a bunch of triples telling me where to find what
... But I don't expect to ever point a SemWeb tool at the DocBook NS

NW: Consider the VCard namespace -- I expect in due course there will be lots of stuff there, including RDF, tutorials and HTML description

TBL: That's fine with conneg

HST: I don't agree --- this is like DocBook, not GRDDL -- lots of different kinds of resource, conneg not appropriate

TBL: The tabulator will find NS documents, would have to indirect via GRDDL -- I don't want to have to do that

NW: For some namespaces that's OK

DC: This is an argument for doing nothing

HST: I think the current XML Schema situation is a good model -- if an NS owner has only one thing to say, or a few things which fit with conneg, no need for indirection, just put it/them in place
... but if need to do more than that can handle, TAG provides a story via an RDF vocab. and so on (per the draft)
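A minimal Python sketch of the conneg half of this proposal, assuming a namespace document that is available as HTML and as RDF/XML. The function names and the REPRESENTATIONS list are hypothetical, and the Accept parsing is deliberately simplified (no type/* ranges, no specificity rules).

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(accept_header, available):
    """Return the first available media type the client accepts, else None."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*" and available:
            return available[0]
    return None


# Hypothetical: one namespace document, two representations of the same thing
REPRESENTATIONS = ["text/html", "application/rdf+xml"]
```

A None result is the case where conneg alone cannot serve the client, which is where the indirection story would have to take over.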

DC: That works for me, indirection is a necessary evil
... Consider XQuery F&O -- they designed it for their needs, but with my SemWeb hat on I get lots of value out of their URIs
... currently there's an HTML doc't at that NS, NW will turn that into (G)RDDL -- how will I get what I want, which is label, description, domain and range for each ID in that namespace
... How will I do that?

<DanC_lap> . http://www.w3.org/2006/xpath-functions#

NW: Not clear that the current story can do this

HST: What we can do is for {NS}#foo, use (G)RDDL to say -- look at {metaforNS}#foo for RDF
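A minimal Python sketch of the mapping HST describes, assuming the metadata document's URI has already been discovered (for example via (G)RDDL). The function name and the example.org URIs in the usage note are hypothetical.

```python
from urllib.parse import urldefrag


def metadata_uri(term_uri, ns_uri, meta_uri):
    """Map {NS}#foo to {metaforNS}#foo; return None if the term is not in NS."""
    base, fragment = urldefrag(term_uri)
    if base != urldefrag(ns_uri)[0] or not fragment:
        return None
    # Keep the fragment, swap the namespace document for the metadata document
    return urldefrag(meta_uri)[0] + "#" + fragment
```

For instance, metadata_uri("http://example.org/ns#abs", "http://example.org/ns#", "http://example.org/ns-meta") gives "http://example.org/ns-meta#abs".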

<DanC_lap> fn:abs rdfs:range something:numeric; rdfs:label "abs"; rdf:type owl:FunctionalProperty.

DC: Look at the XPath f&o NS Doc't [URI above]
... The HTML tells me what I want, but I want it RDF

NM, TBL: [rathole wrt I18N]

TBL: The label is different from the URI

[rathole continues]

NM: I remain unconvinced why anyone would want to spell 'sum' any way other than 'sum' -- it's like programming language

<Zakim> Stuart, you wanted to mutter about suspension of disbelief on NS and NSDocument being different resources

NW: I could give you what you want

DC, NW: via conneg? yes, via conneg.

SW: I think we're talking about NS and NS doc't as if they were the same thing
... but they're not

DC: So they haven't given you any way to distinguish namespaces and namespace documents

SW: I can suspend disbelief about this, but I'm surprised TBL is

NW: I think [the DocBook NS URI] identifies (the concept of) DocBook

HST: That will upset NM

NM: It's a bad idea in general to assume that NS names identify languages as well as namespaces
... An NS owner may say that, but don't build software that assumes that all NSes are like that
... because, among other things, languages and NSes are not one-to-one
... HST made a proposal 15 minutes ago -- use conneg for a broad range of things

HST: [interrupts] that's not what I proposed, rather, use conneg iff all the things you have to offer are reprs of the same thing

NM: Fine.

TBL: This all just convinces me that we need to get on to the Arch of the SemWeb, because the answer to this topic comes at least in part directly from there
... So, as per SW Best Practices, put RDF at NS URIs
... for SW purposes
... and for non-SW purposes, do something else

NW: So what do I do for DocBook?

TBL: Well, I don't use e.g. XML Schemas, so I'm not sure what they want

NW: So, we decided that for the sake of interop

TBL: Interop with whom?

NW: Across the board
... Are you happy with what HST said
... And in particular, using conneg between an HTML doc't and RDF?

TBL: Yes
... What you couldn't do is conneg between an XML Schema and an RDF ontology.

NM: If you have an OWL ontology, it's not describing a schema, for sure
... Consider DocBook -- we retrieve XML, we use GRDDL, we get triples
... Could you ever want a schema for the docbook language and RDF assertions about those GRDDL-generated triples
... GRDDL is a bridgepoint

HST: The only bridgepoint

TBL: There's another one -- quoted XML in the midst of RDF

NM: So GRDDL allows us to get triples from a purchase order

HST: [confusion]

TBL: You can get GRDDL from the schema, for instance

NM: So, from a purchase order in XML, I can find the GRDDL and apply it, to get triples I can use
... I also have a schema, which describes the syntax of the language
... So shouldn't I be able to get the ontology for the language from somewhere nearby the schema?

TBL: Maybe you could, but my assumption is that the RDF you get from GRDDLing the po will draw on many ontologies, e.g. amount of money, lat/long, etc.

NM: So my story about languages is exactly the same, that languages may be made of multiple namespaces

<DanC_lap> (pun: namespace document vs namespace)

HT: What is the nature of the resource identified by a namespace URI?
... If it's not an information resource, you should not return it with a 200.
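A minimal Python sketch of the rule HT states here, read alongside the TAG's httpRange-14 resolution (2xx for information resources, 303 pointing at a description otherwise). The function name and its arguments are hypothetical, and the 404 fallback is one possible choice rather than something the meeting discussed.

```python
def status_for(is_information_resource, description_uri=None):
    """Pick an HTTP status and headers for a GET on a namespace URI."""
    if is_information_resource:
        # A document can be returned directly
        return 200, {}
    if description_uri is not None:
        # Not an information resource: redirect to a document about it
        return 303, {"Location": description_uri}
    # Nothing to say about the resource
    return 404, {}
```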

DC: Discussing this in the abstract is challenging.

HT: Let's pick the XML Schema namespace URI
... At the moment, if you retrieve from that you get a 200, you will (soon) get a document with anchors for every name in the schema namespace.
... It's what we might call English metadata, I.e. summaries or pointers to what the Schema rec says about them.
... It isn't the namespace, clearly.
... I am assuming that the things identified by NS URIs are not information resources.

DC: That's inconsistent with what data you're presenting.

TBL: Using your email address to identify you doesn't mean we think you're an email address

<Zakim> timbl, you wanted to say it is an information resource and to

TBL: I like the pun, because I like getting 200s, because I want information
... there's this additional convention/engineering approach, involving #
... So wrt the XML Schema case, I'd say the URI identifies a document

DC: But it's a namespace URI

TBL: It's called a namespace name

<DanC_lap> ("namespace URI" is one standard term; "namespace name" is another. cf http://www.w3.org/TR/2006/REC-xml-names11-20060816/ . "http://www.w3.org/2001/XMLSchema" is a namespace name. Tim stipulated that it's a namespace URI; that's pretty darn close. I can buy the punning story he told, but we should also stipulate that it's a very liberal reading of the namespace rec )

HST: It says in the XML Schema spec. "The namespace URI for the XML Schema namespace is 'http://www.w3.org/2001/XMLSchema'"

TBL: We use the word 'namespace' to refer to either the set of terms defined in the namespace, or in RDF to refer to a set of URIs which share the namespace URI

TBL: but the concept of the set doesn't come up in the architecture, so the fact that we don't treat the namespace URI as identifying that set isn't a problem

DO: In dealing with purchase orders in the real world, getting the schema isn't hard, but isn't interesting either
... the hard part is getting at the meaning and significance of the terms in the po
... there's also all the contractual implications
... So I think going from schemas to the semantics of POs is too simplistic

<Zakim> Noah, you wanted to say I think I would prefer to say namespaces are information resources, probably a set of names

NM: I think there's semweb info at both levels
... We can get some GRDDL-derived triples from a PO document, which simply say locally what the PO says, whereas the stuff about the implications of that is global and, indeed, much more complex
... But both are useful, and the first can be associated with the schema
... TBL, it feels wrong for me to say the NS URI identifies a document
... I think we can say that the Namespace is an information resource, and it's a set of names
... because we say an info resource is something which can be faithfully represented in a document
... So returning 200 plus a document is OK, because you can faithfully represent that set in a document

<timbl> NM, do not confuse information about a set with the set. Info about the set is an IR, the set isn't IMHO

NM: So as long as a document is a representation of the set, it can participate in conneg wrt the NS URI
... I think that's simple and robust

NW: Back to the finding
... We were close to agreement on "sometimes conneg will do, sometimes you need indirection" and "if you need indirection, here's how"

<Norm> Snapshot of whiteboard

TBL: I want to exempt the RDF Ontologies

HST: You don't need to, it's covered by the first clause -- if you have only one thing to say, just say it

That is, only one representation is trivially using conneg

NW: Consensus?

NM: Reserve my position wrt the final bit, i.e. our approach to providing the indirection

NW: So, I think I understand the reason my proposal on the indirection failed last time -- there was an asymmetry between [x] and [y]

<Stuart> http://lists.w3.org/Archives/Public/www-tag/2007Mar/0012.html

NW: [summarises the above email]

DC: You asked for RDDL docs from the community, did you get any?

<Norm> Summary of RDDL usage in the wild

HST, DC: [try to work through what a vNext schema processor would do]

NW: I'll try to come up with a specific example over the break

<Norm> BPEL RDDL doc't

DC, NW: [working on whiteboard wrt the RDDL for bpel, and for RDDL itself]

<timbl_> The example discussed in the break was RDDL for RDDL

<timbl_> DanC suggested that that could be mapped to the RDF:

DC: Norm found fifteen RDDL docs in the wild, and summarised their usage of RDDL (see above email)

DC: Looked at the BPEL NS document
... in particular for
rddl:nature="http://www.w3.org/2001/XMLSchema" rddl:purpose="http://www.rddl.org/purposes#schema-validation" href="....xsd"
... I'd want this RDF:
<bpel> rddl:schema-validation <...xsd> [in N3]
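A minimal Python sketch of the mapping DC asks for, in which the rddl:purpose URI becomes the predicate linking the namespace to the referenced resource. The function names and the example.org URIs in the usage note are hypothetical; the rddl:nature value is simply dropped here, as in DC's one-line example.

```python
def rddl_to_triple(namespace_uri, purpose_uri, href):
    """Use the rddl:purpose URI as the predicate from namespace to resource."""
    return (namespace_uri, purpose_uri, href)


def to_ntriples(triple):
    """Serialise a (subject, predicate, object) triple of URIs as N-Triples."""
    return "<%s> <%s> <%s> ." % triple


SCHEMA_VALIDATION = "http://www.rddl.org/purposes#schema-validation"
```

With a hypothetical namespace http://example.org/ns/bpel and schema http://example.org/ns/bpel.xsd, to_ntriples(rddl_to_triple(...)) yields one line with the purpose URI in predicate position.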

<timbl_> ______________
<bpel> rddl:schemaValidation <http://.....bpel...xsd>.
<http://.....bpel...xsd> docRootEltName ( "http://www../" "XMLSchema"). # A pair, the XML element name
<bpel> rddl:schemaValidation <http://.....bpel...rrg>.
<http://.....bpel...rrg> docRootEltName ( "http://www.....rng..." "grammar"). # A pair
<http://.....bpel...html> rddl:normativeReference <http://www.w3.org/TR/HTML40>
_____________

DC: for all the above, assume rddl: bound to http://www.rddl.org/purposes#
... @prefix rddl: <http://www.rddl.org/purposes#>

<timbl_> ( "http://www.....rng..." "grammar") is N3 shorthand for [rdf:first "http://www.....rng..."; rdf:rest [ rdf:first "grammar"; rdf:rest rdf:nil ]].
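A minimal Python sketch of the expansion timbl describes, turning a list into rdf:first/rdf:rest triples that end in rdf:nil. The function name and the blank node labels (_:b0, _:b1, ...) are hypothetical.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def expand_list(items):
    """Expand a list into rdf:first/rdf:rest triples; return (head, triples)."""
    if not items:
        # The empty list is just rdf:nil
        return RDF + "nil", []
    nodes = ["_:b%d" % i for i in range(len(items))]
    triples = []
    for i, item in enumerate(items):
        rest = nodes[i + 1] if i + 1 < len(items) else RDF + "nil"
        triples.append((nodes[i], RDF + "first", item))
        triples.append((nodes[i], RDF + "rest", rest))
    return nodes[0], triples
```

A two-item pair such as the (namespace name, local name) pairs on the whiteboard expands to four triples, two per list cell.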

NW: What about adding a .rnc reference, i.e. a relaxng compact syntax document

NM: Broken, in part because not participating in the self-describing web (no media type)

NW: Stipulate we have a media type

DC: <.....bpel...rnc> httpr2:content-type "text/rnc"

SW: Couldn't we use Schema Component Designator instead of docRootEltName ?

<Norm> The media type for compact syntax is application/relax-ng-compact-syntax

NM: SCDs is all about the bit after the '#', but have not yet any story about the bit before that

TBL: [starts down the LHS of SCDs rat-hole, pulls back]

NW: I have enough information to attempt the next step, i.e. to try converting my current draft to the examples on the whiteboard

SW: Any other agreements?

DC: docRootEltName is a matter of fact, but normative-reference is a matter of opinion
... I prefer the matters of fact

NM, HST, others: [back to the SCDs LHS rathole]

HST: The outstanding question here is what the best predicate is for identifying things as W3C XML Schemas -- docRootEltName doesn't seem great

NM: And won't work if the xs:schema element isn't the root of a document. . .

<Noah> The back and forth between me and Dan suggests to me that what we need to solve the Schema Component Designator problem is in fact something very similar to namespace description documents, but these would instead be "language" description documents, that would bring together the declarations for a root element, along with the constraints that allow you to designate which documents are in your language. I believe that, when schema is used, that language desc are resolved.

tagsoup

SW: TVR and DC are on point

DC: TBL has just published something relevant, at [see below]

<timbl_> http://www.w3.org/2007/03/vision

DC: W3C just announced the new HTML/XHTML working groups

The above URI is TBL's background about this

<timbl_> This is linked from http://www.w3.org/html/

<DanC_lap> also, http://esw.w3.org/topic/HTMLAsSheAreSpoke has a survey of tagsoup, beautiful soup, ...

DC: This document sets out three options, similar to ones TV set out in a recent telcon:
... 1, Require XML at XHTML 1
... 2, Require XML, at XHTML 2
... 3, Incremental transition
... The WGs have been set up on the assumption that Incremental Transition is the way to go

NM: Where does this doc't fit in

TV: As input on tagSoupIntegration-54

DC: What Incremental Transition means to me is to look at cases until a pattern emerges

TV: In discussion to date, we have brainstormed a lot of use cases
... I think the TAG can help by writing things down in one place, which hasn't been done
... for instance Ian Hickson's email on the problems with XHTML
... We should look in particular for the tension points
... for example the extensibility issue which is raised at the end of the 'vision' document
... I'm prepared to put my own opinions to one side and collect issues

TBL: Support that idea
... The TAG should take apart the idea of 'Incremental Transition' and look carefully at it
... What it means is instead of saying "messy browser reality" vs "pure XHTML2 coolness" -- you must choose
... We look at the differences which are actually out there between tagsoup and well-formed XHTML
... and for each of them, we'll try to motivate change by analyzing how the deviations harm the community
... W3C has committed to trying to produce a validator -- the TAG can help spec. the behaviour of that validator

TV: [scribe missed]
... we know that if you're writing HTML, you can leave off end tags
... and browsers know how to cope

<timbl_> We should for each of the changes motivate them appropriately and with due priority.

<DanC_lap> (some people think quoting html inside RSS is a feature. I have a hard time appreciating that position.)

TV: but if someone does this in the content of RSS and Atom, the well is poisoned, that is, to protect XML well-formedness the XML has to be quoted
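A minimal Python sketch of the point TV makes, using only the standard library. The TAG_SOUP string and the description element are made-up examples; the sketch shows raw tag soup breaking well-formedness of the enclosing XML, while the quoted (escaped) form parses and round-trips as text.

```python
import xml.etree.ElementTree as ET
from html import escape

# HTML that browsers cope with, but which is not well-formed XML
TAG_SOUP = "<p>first paragraph<p>second, with no end tags"


def is_well_formed(xml_text):
    """Return True if xml_text parses as a well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Embedding the markup directly poisons the feed item ...
raw_item = "<description>" + TAG_SOUP + "</description>"
# ... so it has to be quoted, and arrives as an opaque string
quoted_item = "<description>" + escape(TAG_SOUP) + "</description>"
```

The cost of quoting is visible in the parsed result: the consumer gets a single text node and has to run a second, HTML-aware parser over it.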

<Zakim> DanC_lap, you wanted to ask whether we actually have a useful connection to the RDDL community

<Zakim> ht_mit, you wanted to ask why anyone would ever serialise to the non-XML syntax. . .

HST: Does this mean W3C will standardise two different ways of serializing a DOM

DC: No, perhaps two syntaxes would have been a better phrase

TBL: People have asked if we're heading towards XML2
... that's one possible interpretation of the analysis of the motivation for the differences

TV: We've got a range of phenomena -- close tags, quotes, etc.
... We have to distinguish which of these are really dangerous and which are not so bad

NM: What about new software -- what advice are we giving
... or new versions of old software

NM: I'm not sure about the idea of defending the idea of e.g. leaving out quotes in new software just to save a few bytes
... That's the wrong place to look for efficiency

TV: The point is that given that it works == the browser shows the right thing, it makes sense for Google and Yahoo! to exploit the shortcuts
... When Google ships to mobile, we ship very carefully valid HTML

TBL: Move that the TAG recommend that browsers should change to Save As and View Source by default to do cleanup first

<timbl_> TV seconded

TV: How much cleaned up?

<Rhys> Rhys notes that when Volantis ships to mobile, which it does a lot, it ships markup carefully matched to the device, warts and all

NW: The whole thing -- no choices, only full XML

DC: I opposed the motion because the TAG can't make this happen

TV: This is in the same spirit as the final observation in the 'vision' document about XML and extensibility

<timbl_> I wonder which of all the things the TAG has recommended can be done by the TAG directly.

NM: We should put before the community the stories that moving to XML really helps with

<DanC_lap> I support fact-finding in the form of writing stories/use-cases in the direction of the view-source clean-up

<Zakim> DanC_lap, you wanted to contrast the google homepage, where device-specific renditions are cost-effective, vs the state of kansas tax web sites, where it's probably not

NM: Helping people think about these things is as important as just giving them the answers

DC: The 'one web' principle is very important to me
... My target audience is not Google, but the contractor who works for the state of Kansas tax office
... and he needs clearcut advice about the one thing he should do.

TBL: There's also the long tail of novices who leave out the quotes because it's easier to write and read w/o them, and they have no need to

<DanC_lap> (if the TAG wants to have a software-development management discussion about the validator, I'd appreciate that too.)

TBL: [Validates a document pulled off the web, finds a spurious ';' between attributes in a start tag]

SW: [Validates a document pulled off the web, finds unquoted background=#ffffff attribute]

TV: We should start building a list of incremental issues

<DanC_lap> TVR makes an interesting observation, that tidy/tagsoup processes don't obey f(x)=x for all well-formed x
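A minimal Python sketch of what failing f(x)=x looks like. The toy_cleanup function is a hypothetical stand-in, not the behaviour of HTML Tidy or TagSoup; it assumes only that an HTML-oriented tool folds tag names to lower case, which is enough to change a well-formed XML input.

```python
import re
import xml.etree.ElementTree as ET


def toy_cleanup(markup):
    """Stand-in for a tidy-like tool: fold tag names to lower case."""
    return re.sub(
        r"(</?)([A-Za-z][A-Za-z0-9]*)",
        lambda m: m.group(1) + m.group(2).lower(),
        markup,
    )


def is_fixed_point(f, x):
    """True if cleaning x leaves it unchanged, i.e. f(x) == x."""
    return f(x) == x


def is_well_formed(xml_text):
    """Return True if xml_text parses as a well-formed XML document."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


LOWER = "<doc><item/></doc>"
MIXED = "<Doc><Item/></Doc>"
```

Both inputs are well-formed, but only the lower-case one survives the toy cleanup unchanged; the mixed-case one comes back as a different document.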

NM: Will this be helpful, DanC?

DC: Can't tell

[break for lunch]
