IRC log of tagmem on 2007-03-07

Timestamps are in UTC.

00:24:27 [Zakim]
Zakim has left #tagmem
02:24:03 [timbl]
timbl has joined #tagmem
03:40:50 [DanC_lap]
DanC_lap has joined #tagmem
03:41:03 [DanC_lap]
DanC_lap has joined #tagmem
03:41:37 [Norm]
Norm has joined #tagmem
14:05:11 [RRSAgent]
RRSAgent has joined #tagmem
14:05:11 [RRSAgent]
logging to
14:05:30 [Zakim]
Zakim has joined #tagmem
14:05:38 [Stuart]
zakim, this is TAG
14:05:38 [Zakim]
Stuart, I see TAG_f2f()9:00AM in the schedule but not yet started. Perhaps you mean "this will be TAG".
14:05:54 [Stuart]
zakim, this will be TAG
14:05:54 [Zakim]
ok, Stuart; I see TAG_f2f()9:00AM scheduled to start 5 minutes ago
14:06:43 [Noah]
Noah has joined #tagmem
14:08:05 [Rhys]
Rhys has joined #tagmem
14:13:08 [Vincent]
Vincent has joined #tagmem
14:13:17 [ht_mit]
ht_mit has joined #tagmem
14:13:31 [ht_mit]
Meeting: TAG f2f
14:13:37 [ht_mit]
Chair: Stuart Williams
14:13:45 [ht_mit]
Scribe: Henry S. Thompson
14:13:52 [ht_mit]
ScribeNick: ht_mit
14:14:04 [ht_mit]
Date: 2007-03-07
14:15:07 [ht_mit]
14:15:46 [DanC_lap]
DanC_lap has joined #tagmem
14:15:47 [Norm]
14:15:51 [ht_mit]
Topic: NamespaceDocument-8
14:16:32 [ht_mit]
NW: We've gone around on this several times, decided we would benefit by resetting and considering requirements from the ground up
14:16:47 [Raman]
Raman has joined #tagmem
14:16:50 [DanC_lap]
-> namespaceDocument-8 background NDW 5 Mar
14:17:01 [ht_mit]
... NS documents are sometimes used by agents (people and mechanical)
14:17:11 [ht_mit]
... Competing demands on what should be there
14:17:27 [ht_mit]
... No single thing (XML Schema, RDF, ...) will satisfy everything
14:17:57 [ht_mit]
... When we first looked at this, we thought maybe a single XML format (e.g. RDDL1) would do the job
14:18:10 [ht_mit]
... But the world has moved on, and maybe we don't need to do anything
14:18:21 [ht_mit]
Tutti: We should try to do something
14:18:32 [ht_mit]
NM: But maybe we'll decide we can't
14:19:32 [ht_mit]
NW: Two options at least: Do some form of RDDL 2 (a single XML vocabulary) designed for the purpose; pursue the route set out in the original finding (focus on an RDF design)
14:20:53 [ht_mit]
TBL: Let's standardise on three things: For the self-desc. web, Semantic Web wants RDF-XML, OFWeb wants XML or DTD or Schema....
14:21:25 [Noah]
14:21:42 [ht_mit]
... If you stop putting schema doc as the NS doc't, you raise the bar for applications to have to understand more than you would otherwise
14:21:44 [Noah]
q+ to ask why you don't want to find GRDDL
14:22:29 [ht_mit]
TBL: Imagine web-based python -- would we want to say they all have to have an XML parser?
14:22:49 [Rhys]
14:22:54 [Stuart]
ack dan
14:22:55 [Zakim]
DanC_lap, you wanted to note that the XML Schema practitioners have chosen to raise the bar above "just put a schema there"
14:23:04 [ht_mit]
DC: XML Schema have already raised the bar -- they already use RDDL, the cost is paid already
14:23:19 [Norm]
q+ to ask Tim about GRDDL
14:23:21 [ht_mit]
TBL: Opposite for SemWeb
14:23:30 [Stuart]
ack noak
14:23:32 [Stuart]
ack noa
14:23:32 [Zakim]
Noah, you wanted to ask why you don't want to find GRDDL
14:24:05 [ht_mit]
NM: Fundamental question is that if I have in my hand an NS URI, will I always want _one_ thing from it, or different things in different situations
14:24:34 [ht_mit]
... Yes, maybe content negotiation would help
14:24:51 [ht_mit]
... So, e.g., sometimes a schema and sometimes the GRDDL-produced RDF
14:25:15 [timbl]
timbl has joined #tagmem
14:25:19 [timbl]
14:25:23 [ht_mit]
... The value of the purpose/nature stuff allowed additional functionality -- how important is it to support that?
14:25:36 [ht_mit]
q+ to mention multiple schemas
14:25:42 [DanC_lap]
q+ to observe that the XQuery terms like fn:doc are handy in the Semantic Web world too, and to ask if what's there, an HTML document, is good enough for the semantic web? And to note that the semweb practitioners aren't happy with "just stick RDF there" either, since they value human readability of HTML too
14:26:13 [ht_mit]
... There's also the performance issues
14:26:13 [timbl]
q+ ted to point out that you should definitely long-term cache namespace documents
14:26:35 [ht_mit]
... Must not be a requirement that every time you do a validation you must do a GET
14:27:09 [Stuart]
ack Rhys
14:28:22 [ht_mit]
RL: Conneg got mentioned, if we thought of the different metadata as all being representations of the same resource
14:28:49 [Stuart]
q+ to mutter about suspension of disbelief on NS and NSDocument being different resources
14:29:19 [ht_mit]
NM, HT: Seems stretching to say that wrt e.g. a spec., a schema and some GRDDL
14:29:32 [ht_mit]
TBL: It's a violation of the architecture
14:29:37 [Stuart]
14:29:48 [ht_mit]
DC: Works for GRDDL itself (RDF and HTML)
14:30:01 [Stuart]
ack Norm
14:30:01 [Zakim]
Norm, you wanted to ask Tim about GRDDL
14:30:09 [ht_mit]
TBL: It _can_ be OK, but it is likely to not be always
14:30:53 [ht_mit]
NW: Consider the DocBook NS -- when we get to the NS document, we're going to find stylesheets for using it, DTD, RelaxNG schema, User guide
14:31:27 [ht_mit]
... RDDL is sort of working for much of that, and with GRDDL coming, the SemWeb software should be able to win too
14:32:01 [Noah]
Hmm. Norm says some ISPs don't support mod_negotiation. Interesting. I wonder whether we should juice up the Generic Resources finding to say: those who administer servers likely to be used to host generic resources SHOULD support conneg, perhaps with a Norm/Nadia story telling how Norm couldn't use conneg because his ISP wouldn't let him.
14:32:09 [timbl]
14:32:16 [ht_mit]
TBL: So GRDDL would extract from the DocBook NS doc't a bunch of triples telling me where to find what
14:32:44 [ht_mit]
... But I don't expect to ever point a SemWeb tool at the DocBook NS
14:34:03 [ht_mit]
NW: Consider the VCard namespace -- I expect in due course there will be lots of stuff there, including RDF, tutorials and HTML description
14:34:27 [ht_mit]
TBL: That's fine with conneg
14:34:52 [ht_mit]
HST: I don't agree --- this is like DocBook, not GRDDL -- lots of different kinds of resource, conneg not appropriate
14:35:27 [Noah]
14:35:27 [ht_mit]
TBL: The tabulator will find NS documents, would have to indirect via GRDDL -- I don't want to have to do that
14:35:50 [ht_mit]
NW: For some namespaces that's OK
14:36:05 [ht_mit]
DC: This is an argument for doing nothing
14:36:05 [Stuart]
ack ht
14:36:05 [Zakim]
ht_mit, you wanted to mention multiple schemas
14:36:14 [timbl]
q+ to say no don't do nothing
14:38:03 [Noah]
q+ to ask about implications of Henry's proposal with respect to the nature of namespace resources.
14:39:44 [Stuart]
q- ted
14:41:16 [ht_mit]
HST: I think the current XML Schema situation is a good model -- if an NS owner has only one thing to say, or a few things which fit with conneg, no need for indirection, just put it/them in place
14:42:00 [ht_mit]
... but if need to do more than that can handle, TAG provides a story via an RDF vocab. and so on (per the draft)
14:42:04 [DanC_lap]
ack danc
14:42:04 [Zakim]
DanC_lap, you wanted to observe that the XQuery terms like fn:doc are handy in the Semantic Web world too, and to ask if what's there, an HTML document, is good enough for the
14:42:04 [Stuart]
q- timbl
14:42:07 [Zakim]
... semantic web? And to note that the semweb practitioners aren't happy with "just stick RDF there" either, since they value human readability of HTML too
14:42:10 [timbl]
14:42:24 [ht_mit]
DC: That works for me, indirection is a necessary evil
14:43:13 [ht_mit]
DC: Consider XQuery F&O -- they designed it for their needs, but with my SemWeb hat on I get lots of value out of their URIs
14:44:23 [ht_mit]
... currently there's an HTML doc't at that NS, NW will turn that into (G)RDDL -- how will I get what I want, which is label, description, domain and range for each ID in that namespace
14:44:35 [ht_mit]
... How will I do that?
14:46:44 [ht_mit]
NW: Not clear that the current story can do this
14:46:57 [ht_mit]
HST: What we can do is for {NS}
14:47:01 [DanC_lap]
14:47:03 [ht_mit]
14:47:38 [ht_mit]
HST: What we can do is for {NS}#foo, use (G)RDDL to say -- look at {metaforNS}#foo for RDF
14:47:59 [DanC_lap]
fn:abs rdfs:range something:numeric; rdfs:label "abs"; rdf:type owl:FunctionalProperty.
14:48:10 [ht_mit]
DC: Look at the XPath f&o NS Doc't [URI above]
14:48:33 [ht_mit]
... The HTML tells me what I want, but I want it RDF
14:48:53 [ht_mit]
NM, TBL: [rathole wrt I18N]
14:49:30 [ht_mit]
TBL: The label is different from the URI
14:50:51 [ht_mit]
[rathole continues]
14:51:33 [Stuart]
14:51:52 [ht_mit]
NM: I remain unconvinced why anyone would want to spell 'sum' any way other than 'sum' -- it's like a programming language
14:52:25 [Stuart]
ack Stuart
14:52:25 [Zakim]
Stuart, you wanted to mutter about suspension of disbelief on NS and NSDocument being different resources
14:52:28 [ht_mit]
NW: I could give you what you want
14:52:38 [ht_mit]
DC, NW: via conneg? yes, via conneg.
14:53:08 [ht_mit]
SW: I think we're talking about NS and NS doc't as if they were the same thing
14:53:15 [ht_mit]
... but they're not
14:55:02 [ht_mit]
DC: So they haven't given you any way to distinguish namespaces and namespace documents
14:55:29 [ht_mit]
q+ to ask about httpRange-14
14:55:58 [ht_mit]
SW: I can suspend disbelief about this, but I'm surprised TBL is
14:56:08 [ht_mit]
TBL: We get one level of indirection for free
14:56:27 [timbl]
14:57:00 [ht_mit]
NW: I think [the DocBook NS URI] identifies (the concept of) DocBook
14:57:02 [Stuart]
14:57:10 [ht_mit]
HST: That will upset NM
14:57:13 [timbl]
s/We get one level of indirection for free/A document whih desribes by indirection describes
14:57:16 [Stuart]
ack Noah
14:57:16 [Zakim]
Noah, you wanted to ask about implications of Henry's proposal with respect to the nature of namespace resources.
14:58:20 [ht_mit]
NM: It's a bad idea in general to assume that NS names identify languages as well as namespaces
14:58:42 [ht_mit]
... An NS owner _may_ say that, but don't build software that _assumes_ that _all_ NSes are like that
14:59:18 [ht_mit]
... because, among other things, languages and NSes are not one-to-one
15:00:26 [Noah]
15:00:43 [Stuart]
ack timbl
15:00:48 [ht_mit]
NM: HST made a proposal 15 minutes ago -- use conneg for a broad range of things
15:01:25 [ht_mit]
HST: [interrupts] that's not what I proposed, rather, use conneg iff all the things you have to offer are reprs of the same thing
15:01:30 [ht_mit]
NM: Fine.
15:02:34 [ht_mit]
TBL: This all just convinces me that we need to get on to the Arch of the SemWeb, because the answer to this topic comes at least in part directly from there
15:03:06 [Stuart]
15:03:07 [ht_mit]
... So, as per SW Best Practices, put RDF at NS URIs
15:03:17 [ht_mit]
... for SW purposes
15:03:26 [ht_mit]
... and for non-SW purposes, do something else
15:03:41 [ht_mit]
NW: So what do I do for DocBook?
15:04:11 [Noah]
q+ to say we need to think about Norm's case, where the same URI is used for the namespace and the language.
15:04:16 [ht_mit]
TBL: Well, I don't use e.g. XML Schemas, so I'm not sure what they want
15:04:40 [ht_mit]
NW: So, we decided that for the sake of interop
15:04:48 [ht_mit]
TBL: Interop with whom?
15:04:58 [ht_mit]
NW: Across the board
15:05:13 [ht_mit]
... Are you happy with what HST said
15:05:59 [ht_mit]
... And in particular, using conneg between an HTML doc't and RDF?
15:06:03 [ht_mit]
TBL: Yes
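The agreement just recorded (conneg between an HTML doc't and RDF at one NS URI, but not between an XML Schema and an RDF ontology) can be sketched as below. This is only an illustration under stated assumptions: the names parse_accept, choose_representation and NS_REPRESENTATIONS are hypothetical, the matching is a simplification of real HTTP content negotiation (it ignores the specificity rules and media type parameters other than q), and nothing here describes any particular server's behaviour.

```python
# Hypothetical sketch: pick a representation of a namespace document from an
# HTTP Accept header. Names and the simplified matching are assumptions.

# Assumed set of representations of the *same* resource, per the discussion:
# an HTML description and RDF. No schema is offered via conneg.
NS_REPRESENTATIONS = ["text/html", "application/rdf+xml"]


def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_type = parts[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparsable q as "not acceptable"
        prefs.append((media_type, q))
    # sorted() is stable, so equal q values keep their header order
    return sorted(prefs, key=lambda pair: pair[1], reverse=True)


def choose_representation(accept_header, available):
    """Return the first available media type the client accepts, else None."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*" and available:
            return available[0]
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # e.g. "text/"
            for candidate in available:
                if candidate.startswith(prefix):
                    return candidate
    return None  # a real server might answer 406 Not Acceptable here
```

A SemWeb client sending "application/rdf+xml, text/html;q=0.5" would get the RDF, a browser sending "text/html" would get the HTML, and a client asking only for "application/xml" would get nothing from this set, which matches the point that a schema belongs behind indirection rather than conneg.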
15:06:52 [ht_mit]
TBL: What you couldn't do is conneg between an XML Schema and RDF.
15:07:34 [timbl]
s/RDF/an RDF ontology
15:08:12 [ht_mit]
s/XML Schema and RDF/XML Schema and RDF ontology/
15:08:40 [ht_mit]
NM: If you have an OWL ontology, it's not describing a schema, for sure
15:09:04 [ht_mit]
... Consider DocBook -- we retrieve XML, we use GRDDL, we get triples
15:09:26 [timbl]
15:09:55 [ht_mit]
... Could you ever want a schema for the docbook language and RDF assertions about those GRDDL-generated triples
15:10:16 [ht_mit]
TBL: We shouldn't be mixing technologies
15:10:56 [ht_mit]
TBL: There's another one -- quoted XML in the midst of RDF
15:11:26 [ht_mit]
NM: So GRDDL allows us to get triples from a purchase order
15:12:36 [Zakim]
TAG_f2f()9:00AM has now started
15:12:43 [Zakim]
15:13:05 [ht_mit]
HST: [confusion]
15:13:16 [Zakim]
15:13:19 [dorchard]
dorchard has joined #tagmem
15:13:20 [ht_mit]
TBL: You can get GRDDL from the schema, for instance
15:13:43 [ht_mit]
NM: So, from a purchase order in XML, I can find the GRDDL and apply it, to get triples I can use
15:14:02 [ht_mit]
... I also have a schema, which describes the syntax of the language
15:14:07 [Stuart]
15:14:24 [DanC_lap]
DanC_lap has changed the topic to: TAG meeting in Cambridge, MA. topic: nsDocument-8
15:14:32 [ht_mit]
... So shouldn't I be able to get the ontology for the language from somewhere nearby the schema?
15:14:41 [dorchard]
15:15:00 [dorchard]
q+ to talk about complexities of meaning of purchase order
15:15:20 [ht_mit]
TBL: Maybe you could, but my assumption is that the RDF you get from GRDDLing the po will draw on many ontologies, e.g. amount of money, lat/long, etc.
15:15:44 [Stuart]
q- noah
15:16:27 [ht_mit]
NM: So my story about languages is exactly the same, that languages may be made of multiple namespaces
15:16:39 [ht_mit]
15:16:43 [ht_mit]
ack ht_mit
15:16:43 [Zakim]
ht_mit, you wanted to ask about httpRange-14
15:17:36 [DanC_lap]
(pun: namespace document vs namespace)
15:17:54 [Noah]
scribenick: noah
15:18:05 [Noah]
HT: What is the nature of the resource identified by a namespace URI?
15:18:15 [Noah]
HT: If it's not an information resource, you should not return it with a 200.
15:18:24 [timbl]
q+ to say it is an information resource
15:18:31 [Noah]
DC: Discussing this in the abstract is challenging.
15:18:45 [Noah]
HT: Let's pick the XML Schema namespace URI
15:19:22 [Noah]
HT: At the moment, if you retrieve from that you get a 200, you will (soon) get a document with anchors for every name in the schema namespace.
15:19:47 [Noah]
HT: It's what we might call English metadata, i.e. summaries or pointers to what the Schema rec says about them.
15:19:58 [Noah]
HT: It isn't the namespace, clearly.
15:20:11 [Noah]
HT: I am assuming that documents are not information resources.
15:20:20 [timbl]
15:20:21 [Noah]
DC: That's inconsistent with what we're saying.
15:20:44 [DanC_lap]
s/we're saying/data you're presenting/
15:20:47 [ht_mit]
scribenick: ht_mit
15:21:00 [Noah]
q+ to say I think I would prefer to say namespaces are information resources, probably a set of names
15:21:07 [Stuart]
q- ht
15:21:09 [ht_mit]
s/that documents/that the things identified by NS URIs/
15:21:24 [Stuart]
q- ht
15:21:39 [ht_mit]
TBL: Using your email address to identify you doesn't mean we think you're an email address
15:21:50 [Stuart]
ack timbl
15:21:50 [Zakim]
timbl, you wanted to say it is an information resource and to
15:21:59 [ht_mit]
... I like the pun, because I like getting 200s, because I want information
15:22:38 [ht_mit]
... there's this additional convention/engineering approach, involving #
15:23:07 [ht_mit]
... So wrt the XML Schema case, I'd say the URI identifies a document
15:23:10 [Noah]
15:23:16 [ht_mit]
DC: But it's a namespace URI
15:23:46 [ht_mit]
TBL: It's called a namespace name
15:24:20 [ht_mit]
HST: It says in the XML Schema spec. "The namespace URI for the XML Schema namespace is ''"
15:25:08 [ht_mit]
TBL: We use the word 'namespace' to refer to either the set of terms defined in the namespace, or in RDF to refer to a set of URIs which share the namespace URI
15:25:18 [Stuart]
15:25:29 [Stuart]
ack dor
15:25:29 [Zakim]
dorchard, you wanted to talk about complexities of meaning of purchase order
15:25:40 [ht_mit]
... but the concept of the set doesn't come up in the architecture, so the fact that we don't treat the namespace URI as identifying that set isn't a problem
15:25:57 [timbl]
s/TBL: It's called a namespace name/TBL: It's called a namespace URI
15:26:05 [timbl]
s/TBL: It's called a namespace name/TBL: It's called a namespace URI/g
15:26:08 [timbl]
15:26:29 [ht_mit]
DO: In dealing with purchase orders in the real world, getting the schema isn't hard, but isn't interesting either
15:26:51 [ht_mit]
... the hard part is getting at the meaning and significance of the terms in the po
15:27:03 [ht_mit]
... there's also all the contractual implications
15:27:28 [Stuart]
15:27:28 [ht_mit]
... So I think going from schemas to the semantics of POs is too simplistic
15:27:29 [DanC_lap]
("namespace URI" is one standard term; "namespace name" is another. cf . "" is a namespace name. Tim stipulated that it's a namespace URI; that's pretty darn close. I can buy the punning story he told, but we should also stipulate that it's a very liberal reading of the namespace rec )
15:27:53 [ht_mit]
ack noah
15:27:53 [Zakim]
Noah, you wanted to say I think I would prefer to say namespaces are information resources, probably a set of names
15:28:21 [ht_mit]
NM: I think there's semweb info at both levels
15:29:23 [ht_mit]
... If we get some GRDDL-derived triples from a PO document, they simply say locally what the PO says, whereas the complex stuff about the _implications_ of that is global and, indeed, much more complex
15:29:43 [ht_mit]
... But both are useful, and the first can be associated with the schema
15:30:16 [ht_mit]
NM: TBL, it feels wrong for me to say the NS URI identifies a document
15:30:42 [ht_mit]
... I think we _can_ say that the Namespace _is_ an information resource, and it's a set of names
15:30:46 [Norm]
15:31:17 [ht_mit]
... because we say an info resource is something which can be faithfully represented in a document
15:31:41 [timbl]
15:31:41 [ht_mit]
... So returning 200 plus a document is OK, because you can faithfully represent that set in a document
15:32:25 [timbl]
NM, do not confuse information about a set with the set. Info about the set is an IR, the set isn't IMHO
15:32:29 [ht_mit]
... So as long as a document is a representation of the set, it can participate in conneg wrt the NS URI
15:32:37 [ht_mit]
... I think that's simple and robust
15:32:45 [Stuart]
ack Norm
15:32:54 [ht_mit]
NW: Back to the finding
15:33:56 [ht_mit]
... We were close to agreement on "sometimes conneg will do, sometimes you need indirection" and "if you need indirection, here's how"
15:34:10 [Norm]
15:34:12 [ht_mit]
TBL: I want to exempt the RDF Ontologies
15:34:31 [ht_mit]
HST: You don't need to, it's covered by the first clause -- if you have only one thing to say, just say it
15:34:55 [ht_mit]
That is, only one representation is trivially using conneg
15:35:05 [ht_mit]
NW: Consensus?
15:35:18 [Rhys]
Rhys has joined #tagmem
15:35:29 [ht_mit]
NM: Reserve my position wrt the final bit, i.e. our approach to providing the indirection
15:36:14 [ht_mit]
NW: So, I think I understand the reason my proposal on the indirection failed last time -- there was an asymmetry between [x] and [y]
15:37:10 [Stuart]
15:37:29 [ht_mit]
NW: [summarises the above email]
15:37:50 [ht_mit]
DC: You asked for RDDL docs from the community, did you get any?
15:37:53 [Norm]
15:38:13 [Noah]
15:38:50 [Noah]
q+ to remind that the concern I had with the design in Dec was that having natures being namespace documents = category error
15:42:03 [ht_mit]
HST, DC: [try to work through what a vNext schema processor would do]
15:42:18 [ht_mit]
NW: I'll try to come up with a specific example over the break
15:44:51 [Stuart]
david we are back again at the top of the hour.
16:01:35 [Norm]
16:01:41 [Norm]
16:11:10 [timbl_]
timbl_ has joined #tagmem
16:15:38 [ht_mit]
DC, NW: [working on whiteboard wrt the RDDL for bpel, and for RDDL itself]
16:15:48 [timbl_]
The example discussed in the break was RDDL for RDDL
16:15:50 [Norm]
16:16:06 [timbl_]
DanC suggested that that could be mapped to the RDF:
16:16:26 [ht_mit]
DC: Norm found fifteen RDDL docs in the wild, and summarised their usage of RDDL (see above email)
16:17:02 [Norm]
16:17:05 [ht_mit]
... Looked at the BPEL NS document
16:18:59 [ht_mit]
... in particular, rddl:nature="" rddl:purpose="" href="....xsd"
16:20:05 [ht_mit]
... I'd want this RDF: <bpel> rddl:schema-validation <...xsd> [in N3]
16:20:15 [timbl_]
16:20:16 [timbl_]
<bpel> rddl:schemaValidation <http://.....bpel...xsd>.
16:20:16 [timbl_]
<http://.....bpel...xsd> docRootEleName ( "http://www....." "XMLSchema"). # A pair, the XML element name
16:20:16 [timbl_]
<bpel> rddl:schemaValidation <http://.....bpel...rrg>.
16:20:16 [timbl_]
<http://.....bpel...rrg> docRootEleName ( "http://www.....rng..." "grammar"). # A pair
16:20:17 [timbl_]
<http://.....bpel...html> pup:normativeReference <>
16:20:19 [timbl_]
16:21:00 [ht_mit]
... for all the above, assume rddl: bound to
16:21:55 [ht_mit]
DC: @prefix rddl: <>
16:22:11 [ht_mit]
16:22:20 [timbl_]
( "http://www.....rng..." "grammar") is N3 shorthand for [rdf:first "http://www.....rng..."; rdf:rest [ rdf:first "grammar"; rdf:rest rdf:nil ]].
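The N3 list shorthand just described can be unrolled mechanically. The sketch below is an illustration only: expand_list and the blank node labels (_:b0, _:b1, ...) are hypothetical names, triples are plain Python tuples rather than any RDF library's types, and the elided namespace string is copied from the log as-is rather than guessed.

```python
# Hypothetical sketch: expand an N3 list "( a b ... )" into the
# rdf:first / rdf:rest / rdf:nil triples it abbreviates.

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_FIRST = RDF_NS + "first"
RDF_REST = RDF_NS + "rest"
RDF_NIL = RDF_NS + "nil"


def expand_list(items):
    """Return (head, triples) for the RDF collection holding `items`.

    Each list cell is a blank node with one rdf:first triple (the item)
    and one rdf:rest triple (the next cell, or rdf:nil at the end).
    The blank node labels are made up for this illustration.
    """
    nodes = [f"_:b{i}" for i in range(len(items))]
    triples = []
    for i, item in enumerate(items):
        rest = nodes[i + 1] if i + 1 < len(nodes) else RDF_NIL
        triples.append((nodes[i], RDF_FIRST, item))
        triples.append((nodes[i], RDF_REST, rest))
    head = nodes[0] if nodes else RDF_NIL  # the empty list is rdf:nil itself
    return head, triples


# The pair from the whiteboard example; the namespace string is elided in
# the log and is kept elided here.
head, triples = expand_list(["http://www.....rng...", "grammar"])
for triple in triples:
    print(triple)
```

The returned head is what would stand as the object of docRootEleName in the whiteboard triples above.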
16:24:43 [ht_mit]
NW: What about adding a .rnc reference, i.e. a relaxng compact syntax document
16:25:12 [ht_mit]
NM: Broken, in part because not participating in the self-describing web (no media type)
16:25:26 [ht_mit]
NW: Stipulate we have a media type
16:27:06 [ht_mit]
DC: <.....bpel...rnc> httpr2:content-type "text/rnc"
16:27:36 [ht_mit]
SW: Couldn't we use Schema Component Designator instead of docRootEltName ?
16:28:21 [Norm]
The media type for compact syntax is application/relax-ng-compact-syntax
16:28:37 [ht_mit]
NM: SCDs are all about the bit after the '#', but do not yet have any story about the bit before that
16:29:10 [ht_mit]
TBL: [starts down the LHS of SCDs rat-hole, pulls back]
16:29:55 [ht_mit]
NW: I have enough information to attempt the next step, i.e. to try converting my current draft to the examples on the whiteboard
16:30:15 [ht_mit]
SW: Any other agreements?
16:30:58 [ht_mit]
DC: docRootEltName is a matter of fact, but normative-reference is a matter of opinion
16:31:07 [ht_mit]
... I prefer the matters of fact
16:33:42 [ht_mit]
NM, HST, others: [back to the SCDs LHS rathole]
16:35:31 [ht_mit]
HST: The outstanding question here is what the best predicate is for identifying things as W3C XML Schemas -- docRootEltName doesn't seem great
16:35:54 [ht_mit]
NM: And won't work if the xs:schema element isn't the root of a document. . .
16:36:04 [Noah]
The back and forth between me and Dan suggests to me that what we need to solve the Schema Component Designator problem is in fact something very similar to namespace description documents, but these would instead be "language" description documents, that would bring together the declarations for a root element, along with the constraints that allow you to designate which documents are in your language. I believe that, when schema is used, that language desc
16:36:06 [Noah]
are resolved.
16:37:10 [ht_mit]
Topic: tagsoup
16:37:24 [ht_mit]
SW: TVR and DC are on point
16:37:56 [ht_mit]
DC: TBL has just published something relevant, at xyzzy
16:38:00 [timbl_]
16:41:06 [ht_mit]
... W3C just announced the new HTML/XHTML working groups
16:44:33 [ht_mit]
The above URI is TBL's background about this
16:45:18 [timbl_]
This is linked from
16:45:37 [Noah]
Typo: The architectural directions which the community is moving along now the result of much input
16:45:50 [Noah]
I think that should be "are the result of"
16:46:02 [Noah]
Typo: "will hve"
16:46:10 [Noah]
Spelling of "have"
16:46:25 [Noah]
Typo: relatity -> reality
16:47:26 [ht_mit]
are not made which would not preclude an XML serialization --> are not made which would preclude an XML serialization
16:49:03 [Rhys]
typo no space between 'only' and 'a tag-soup ...' in paragraph 2 of numbered section 3 in 'XML-based Architecture and tag soup'
16:49:49 [timbl_]
I didn't have Chris's OK
16:49:51 [Raman]
typo: words is now may have gotten transposed.
16:49:51 [ht_mit]
q+ to ask why anyone would ever serialise to the _non_-XML syntax. . .
16:49:58 [Raman]
reality is spelt relatity
16:50:28 [Noah]
Possible typo: "This ensures that decisions are not made which would not preclude an XML serialization. "
16:50:38 [Noah]
I suspect the second "not" in the above is a mistake.
16:50:43 [Noah]
16:54:16 [Norm]
16:54:48 [DanC_lap]
also, has a survey of tagsoup, beautiful soup, ...
16:54:50 [Raman]
suggest: "Experience with technologies like Dave Raggett's HTML Tidy and John Callow's work on TagSoup shows that ..."
16:55:02 [ht_mit]
16:55:11 [Raman]
copy-edit suggestion: the rest of that sentence may want breaking up
16:55:51 [Noah]
typo: chartering of these groups,and (missing space after comma)
16:56:17 [ht_mit]
DC: This document sets out three options, similar to ones TV set out in a recent telcon:
16:56:40 [ht_mit]
... 1, Require XML
16:57:34 [ht_mit]
s/Require XML/Require XML at XHTML 1/
16:57:47 [ht_mit]
... 2, Require XML, at XHTML 2
16:57:58 [ht_mit]
... 3, Gentle transition
16:58:08 [ht_mit]
16:58:34 [Stuart]
16:59:29 [ht_mit]
DC: The WGs have been set up on the assumption that Incremental Transition is the way to go
17:00:07 [ht_mit]
NM: Where does this doc't fit in
17:00:18 [ht_mit]
TV: As input on tagSoupIntegration-54
17:00:53 [ht_mit]
DC: What Incremental Transition means to me is to look at cases until a pattern emerges
17:00:55 [timbl_]
q+ to talk about errors and validators
17:01:15 [ht_mit]
TV: In discussion to date, we have brainstormed a lot of use cases
17:01:30 [Stuart]
17:01:34 [ht_mit]
... I think the TAG can help by writing things down in one place, which hasn't been done
17:02:07 [ht_mit]
... for instance Ian Hickson's email on the problems with XHTML
17:03:21 [ht_mit]
... We should look in particular for the tension points
17:03:39 [ht_mit]
... for example the extensibility issue which is raised at the end of the 'vision' document
17:04:10 [ht_mit]
... I'm prepared to put my own opinions to one side and collect issues
17:04:24 [ht_mit]
TBL: Support that idea
17:04:50 [ht_mit]
... The TAG should take apart the idea of 'Incremental Transition' and look carefully at it
17:05:33 [ht_mit]
... What it means is instead of saying "messy browser reality" vs "pure XHTML2 coolness" -- you must choose
17:06:12 [ht_mit]
... We look at the differences which are actually out there between tagsoup and well-formed XHTML
17:06:35 [ht_mit]
... and for each of them, we'll try to motivate change by analyzing how the deviations harm the community
17:07:02 [timbl_]
17:07:20 [ht_mit]
... W3C has committed to trying to produce a validator -- the TAG can help spec. the behaviour of that validator
17:07:22 [timbl_]
17:07:28 [ht_mit]
ack timbl
17:07:51 [ht_mit]
TV: [scribe missed]
17:07:56 [Stuart]
17:08:10 [ht_mit]
TV: we know that if you're writing HTML, you can leave off end tags
17:08:21 [ht_mit]
... and browsers know how to cope
17:08:35 [Stuart]
17:08:46 [timbl_]
We should for each of the changes motivate them appropriately and with due priority.
17:09:12 [DanC_lap]
(some people think quoting html inside RSS is a feature. I have a hard time appreciating that position.)
17:09:20 [ht_mit]
... but if someone does this in the content of RSS and Atom, the well is poisoned, that is, to protect XML well-formedness the XML has to be quoted
17:09:24 [timbl_]
17:09:31 [DanC_lap]
ack danc
17:09:31 [Zakim]
DanC_lap, you wanted to ask whether we actually have a useful connection to the RDDL community
17:09:43 [ht_mit]
ack ht_mit
17:09:43 [Zakim]
ht_mit, you wanted to ask why anyone would ever serialise to the _non_-XML syntax. . .
17:14:03 [ht_mit]
HST: Does this mean W3C will standardise two different ways of serializing a DOM
17:14:38 [ht_mit]
DC: No, perhaps two syntaxes would have been a better phrase
17:15:47 [Noah]
17:15:48 [ht_mit]
TBL: People have asked if we're heading towards XML2
17:16:20 [ht_mit]
... that's one possible interpretation of the analysis of the motivation for the differences
17:16:56 [ht_mit]
TV: We've got a range of phenomena -- close tags, quotes, etc.
17:17:26 [timbl_]
17:17:29 [ht_mit]
... We have to distinguish which of these are really dangerous and which are not so bad
17:18:02 [ht_mit]
NM: What about new software -- what advice are we giving?
17:18:11 [ht_mit]
17:18:20 [timbl_]
q+ to propose view source cleanup
17:18:36 [ht_mit]
... or new versions of old software
17:19:56 [timbl_]
17:20:01 [ht_mit]
NM: I'm not sure about the idea of defending the idea of e.g. leaving out quotes in new software just to save a few bytes
17:20:20 [ht_mit]
... That's the wrong place to look for efficiency
17:22:00 [ht_mit]
TV: The point is that given that it works == the browser shows the right thing, it makes _sense_ for Google and Yahoo! to exploit the shortcuts
17:23:14 [DanC_lap]
q+ to contrast the google homepage, where device-specific renditions are cost-effective, vs the state of kansas tax web sites, where it's probably not
17:23:16 [ht_mit]
... When Google ships to mobile, we ship very carefully valid HTML
17:23:20 [timbl_]
ack timbl
17:23:20 [Zakim]
timbl_, you wanted to propose view source cleanup
17:24:12 [ht_mit]
TBL: Move that the TAG recommend that browsers change Save As and View Source to do cleanup first by default
17:24:24 [timbl_]
TV seconded
17:24:58 [ht_mit]
TV: How much cleaned up?
17:25:05 [Rhys]
Rhys notes that when Volantis ships to mobile, which it does a lot, it ships markup carefully matched to the device, warts and all
17:25:15 [ht_mit]
NW: The whole thing -- no choices, only full XML
17:25:31 [ht_mit]
DC: I opposed the motion because the TAG can't make this happen
17:26:12 [ht_mit]
TV: This is in the same spirit as the final observation in the 'vision' document about XML and extensibility
17:26:24 [timbl_]
I wonder which of all the things the TAG has recommended can be done by the TAG directly.
17:26:50 [ht_mit]
NM: We should put before the community the stories that moving to XML really helps with
17:27:38 [DanC_lap]
I support fact-finding in the form of writing stories/use-cases in the direction of the view-source clean-up
17:27:58 [Stuart]
17:28:40 [DanC_lap]
ack danc
17:28:40 [Zakim]
DanC_lap, you wanted to contrast the google homepage, where device-specific renditions are cost-effective, vs the state of kansas tax web sites, where it's probably not
17:28:52 [ht_mit]
NM: Helping people think about these things is as important as just giving them the answers
17:28:57 [timbl_]
17:29:20 [ht_mit]
DC: The 'one web' principle is very important to me
17:29:51 [ht_mit]
... My target audience is not Google, but the contractor who works for the state of Kansas tax office
17:30:00 [Rhys]
q+ to say that DIAL is the way to look at that for authoring
17:30:09 [ht_mit]
... and he needs clearcut advice about the _one_ thing he should do.
17:31:46 [ht_mit]
TBL: There's also the long tail of novices who leave out the quotes because it's easier to write and read w/o them, and they have no need to
17:33:00 [DanC_lap]
(if the TAG wants to have a software-development management discussion about the validator, I'd appreciate that too.)
17:33:59 [ht_mit]
TBL: Validates a document pulled off the web, finds a spurious ';' between attributes in a start tag
17:35:56 [ht_mit]
SW: Validates a document pulled off the web, finds unquoted background=#ffffff attribute
17:36:43 [ht_mit]
TV: We should start building a list of incremental issues
17:36:59 [DanC_lap]
TVR makes an interesting observation, that tidy/tagsoup processes don't obey f(x)=x for all well-formed x
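The f(x)=x property mentioned here can be checked mechanically for any cleanup function. The sketch below is a toy under loud assumptions: naive_cleanup is a hypothetical regex-based attribute quoter written for this illustration, not HTML Tidy or TagSoup, and is_well_formed leans on Python's standard xml.etree parser. It is only meant to show how a repair step that fixes the unquoted background=#ffffff case can still alter input that was already well-formed.

```python
# Hypothetical sketch: test whether a cleanup function leaves well-formed
# input untouched. naive_cleanup is a toy, NOT Tidy or TagSoup.
import re
import xml.etree.ElementTree as ET

# whitespace + attribute-like name, "=", then an unquoted value
UNQUOTED_ATTR = re.compile(r'(\s[\w:-]+)=([^\s"\'>]+)')


def naive_cleanup(markup):
    """Wrap anything that looks like an unquoted attribute value in quotes."""
    return UNQUOTED_ATTR.sub(r'\1="\2"', markup)


def is_well_formed(markup):
    """True if the string parses as a well-formed XML document."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


def violates_identity(cleanup, markup):
    """True if markup is well-formed yet cleanup(markup) differs from it."""
    return is_well_formed(markup) and cleanup(markup) != markup


# The toy repairs the unquoted attribute from the validator example...
print(naive_cleanup('<body background=#ffffff></body>'))
# ...leaves an already-quoted attribute alone...
print(violates_identity(naive_cleanup, '<body background="#ffffff"></body>'))
# ...but rewrites text content that merely looks like an attribute.
print(violates_identity(naive_cleanup, '<p>x a=b</p>'))
```

The last case is the interesting one: the input is well-formed, yet the toy changes it, which is the kind of behaviour a list of incremental issues would need to record for real tools.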
17:38:20 [Rhys]
17:38:37 [ht_mit]
NM: Will this be helpful, DanC?
17:38:48 [ht_mit]
DC: Can't tell
17:39:08 [ht_mit]
17:53:49 [timbl]
timbl has joined #tagmem
17:57:13 [Zakim]
18:34:41 [timbl_]
timbl_ has joined #tagmem
18:44:11 [Noah]
zakim, who is here?
18:44:11 [Zakim]
On the phone I see [MIT]
18:44:12 [Zakim]
On IRC I see timbl_, Rhys, dorchard, Raman, DanC_lap, ht_mit, Noah, Zakim, RRSAgent, Stuart, Norm
18:44:34 [Noah]
scribenick: noah
18:44:37 [Noah]
scribe: Noah Mendelsohn
18:44:57 [DanC_lap]
I'm afraid I need to tend to some requests to subscribe to the public-html mailing lists rather urgently; I'll be there as soon as I can
18:45:08 [Noah]
meeting: W3C Technical Architecture Group (TAG) Face to Face Meeting
18:45:34 [Noah]
date: 7 March 2007 (Afternoon)
18:45:42 [Noah]
chair: Stuart Williams
18:46:44 [Noah]
present: Vincent Quint, Tim Berners-Lee, Stuart Williams, T.V. Raman, Norm Walsh, Rhys Lewis, Noah Mendelsohn, Henry Thompson, Dan Connolly
18:46:59 [Noah]
topic: Tag Soup (continued)
18:49:12 [Noah]
DC: I'm tempted to look at one particular issue, XHTML role and RDFa are both things people are trying to add without actually having to be in the loop with the main HTML work.
18:49:45 [Noah]
DC: Beneficiaries are people who want to create/use accessible AJAX applications, without having to get in the queue to get lots of features added to HTML
18:50:10 [Noah]
DC: Are those good focal points for the discussion of extensibility?
18:50:47 [DanC_lap]
("which group?" is a non-trivial question)
18:50:52 [Stuart]
zakim, who is on the phone
18:50:52 [Zakim]
I don't understand 'who is on the phone', Stuart
18:50:56 [Stuart]
zakim, who is on the phone?
18:50:56 [Zakim]
On the phone I see [MIT]
18:51:12 [Noah]
TVR: The Web Accessibility group has taken an interesting tack. The role mechanism was initially proposed by me in the XHTML groups, and they are defining particular roles.
18:51:33 [Noah]
TVR: There is now a question of how to do this in HTML, particularly given that the values of the attributes are QNames.
18:51:33 [Vincent]
Vincent has joined #tagmem
18:52:29 [Noah]
TVR: What's being done could be viewed as a hack, or could be viewed as the way we should have done the transition. Make class="xxx;role", and there's a standard Javascript library that rewrites the DOM.
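A rough sketch of the kind of rewrite TVR describes, written in Python over an ElementTree rather than in Javascript over a browser DOM. It assumes the text after the ";" in the class value is the role and that it should become a role attribute; the log does not spell out the exact convention or name the library, so both the syntax and the function name promote_roles are assumptions.

```python
import xml.etree.ElementTree as ET


def promote_roles(root):
    """Split class="name;role" into separate class and role attributes.

    Assumed convention: everything after the first ';' is the role value.
    """
    for el in root.iter():
        cls = el.get("class")
        if cls and ";" in cls:
            name, role = cls.split(";", 1)
            el.set("class", name)
            el.set("role", role)
    return root


# Hypothetical input document.
doc = ET.fromstring(
    '<div><span class="menu;navigation">Home</span>'
    '<p class="plain">text</p></div>'
)
promote_roles(doc)
span = doc.find("span")
```

A reader that never runs the rewrite only ever sees the combined class value.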
18:52:51 [Noah]
NM: So content is different if you're not Javascript enabled.
18:52:58 [Noah]
DC: Yes, but at least they've shown the need is serious.
18:53:15 [Noah]
DC: So, maybe we just say to the HTML WG: this is important, go standardize it.
18:53:31 [Noah]
TVR: The fact that value is a QName will definitely cause concern.
18:53:37 [timbl_]
Thanks, Noah, I have folded in the comments you sent in email.
18:53:43 [timbl_]
(i think)
18:53:58 [Noah]
Sure, hope they're helpful. Definitely were done in a rush, so check carefully before adopting.
18:54:39 [Noah]
TVR: I can do anything I want using <div> and <span>
18:55:00 [timbl_]
(and css)
18:55:31 [Noah]
RL: I think I read in the vision document that extensibility was not considered appropriate for the HTML working group goals.
18:55:48 [Stuart]
ack Rhys
18:55:48 [Zakim]
Rhys, you wanted to say that DIAL is the way to look at that for authoring and to
18:55:52 [DanC_lap]
(re role use cases, this demo is interesting . see also ...)
18:55:59 [timbl_]
ack timbl
18:56:07 [Rhys]
ack rhys
18:56:12 [Stuart]
ack Dan
18:56:12 [Zakim]
DanC_lap, you wanted to suggest that the cost of the quotes is dominated by the norms in your community
18:56:12 [Noah]
q+ to ask about goals of for HTML
18:56:54 [Noah]
DC: Cost of putting quotes is not about the space taken, typically, but social issues, what people in your community accept, etc.
18:57:13 [Noah]
DC: Interesting you read extensibility as a non-goal for the HTML WG. I don't think I was trying to say that.
18:58:10 [timbl_]
Noah: What I read the vision document to say is that two working groups will share a model.
18:58:13 [Rhys]
nm read the vision document to say that there were two working groups that share a model
18:58:15 [timbl_]
DC: I expect them to share XML.
18:58:40 [timbl_]
the pen
18:58:56 [Rhys]
DanC I don't expect the groups to share a model
18:59:21 [Rhys]
HT who is responsible for XHTML 1.1 maintenance
18:59:29 [Rhys]
DanC its a shared responsibility
19:00:25 [Rhys]
NM quotes the section of the vision document about the serialisations
19:00:41 [Rhys]
DanC The two serialisations are in the HTML working group
19:00:48 [timbl_]
q+ to ask about "Instead, the charter calls for two equivalent serializations to be developed for HTML"
19:01:02 [Noah]
NM: The vision document says:
19:01:04 [Noah]
"Instead, the charter calls for two equivalent serializations to be developed, corresponding to a single DOM (or infoset, though tag soup cannot be considered to have an infoset currently, while it can have a DOM). This ensures that decisions are not made which would not preclude an XML serialization. It allows the two serializations to be inter-converted automatically. Having new language features, there is an incentive for content authors to use it; and ha
19:01:04 [Noah]
19:01:25 [Noah]
NM: I read that as saying that for any given abstract document, at least in the 80% case, the DOM would be the same.
19:01:30 [Stuart]
ack Noah
19:01:30 [Zakim]
Noah, you wanted to ask about goals of for HTML
19:01:33 [Stuart]
ack Rhys
19:01:35 [Noah]
DC: No, it didn't mean to say that.
19:02:54 [Noah]
HT: I'm unclear in the vision document, when it says HTML without qualification, it means all of the several variants, vs. specifically one (scribe infers the tag soup version)
19:03:06 [Noah]
DC: We'll know in a few months.
19:03:15 [DanC_lap]
(I hope)
19:03:27 [ht_mit]
q+ to share some information about John Cowan's tagsoup project
19:03:41 [Noah]
(TimBL edits to say "Instead, the charter calls for two equivalent serializations to be developed by the HTML WG, ")
19:04:33 [Noah]
RL: Before lunch we were talking about mobile, especially device independence markup. DIAL is one approach to solving that problem, not having to do redirects, having one representation, etc.
19:04:45 [Noah]
DC: What do we have on DIAL?
19:04:51 [Noah]
RL: There's a WD?
19:05:08 [Noah]
DC: Does it tell a story?
19:05:13 [Noah]
RL: There's a primer.
19:06:44 [Noah]
RL: It's XHTML2 + XForms + Some other modules
19:06:52 [timbl_]
XHTML2 + XFORMS + some other modules
19:07:07 [ht_mit]
DIAL stands for Device Independent Authoring Language
19:07:08 [Noah]
RL: These stories are based on actual commercial usage, from vendors, network operators, content providers, etc.
19:07:26 [Noah]
RL: Dirk wishes to create a web site viewable on, say, any web device including mobile.
19:07:33 [Noah]
DC: Which kind of org. does he work for?
19:08:07 [Noah]
RL: Say, network operators, content partners (e.g. Disney/eBay), other sites. Maybe it helps you get ringtones for your device.
19:08:27 [Noah]
RL: Dirk writes some DIAL as his markup. It's constructed to avoid device dependency?
19:08:32 [Noah]
DC: He uses emacs?
19:08:42 [Noah]
RL: Probably dial-ware tools.
19:08:47 [Noah]
DC: available?
19:08:56 [Noah]
RL: Yes, e.g. from Volantis (chuckle)
19:09:01 [Noah]
DC: Direct manipulation.
19:09:13 [Noah]
RL: Typically mixed, xml editor with some help around the edges.
19:09:40 [Noah]
RL: If the markup is only the device independent stuff, then the device-specific stuff has to go somewhere, and has to be worth the incremental trouble.
19:09:57 [Noah]
RL: Example: companies may not trust transcoding of their logo images.
19:10:21 [Noah]
RL: So, there are ways of linking to the device dependent stuff. This is generic resources.
19:10:37 [Noah]
RL: The reference is device independent, but the infrastructure serves the right thing.
19:10:44 [Noah]
DC: Mostly deployed server side.
19:10:49 [Noah]
RL: Yes, mostly.
19:11:45 [Noah]
RL: Opera mobile is an interesting example. It can do some level of rearrangement and transcoding on the device for a standard HTML page, but it can tend to be less successful insofar as the HTML they're starting with has already lost some information about the intent.
19:12:06 [Noah]
RL: Eventually, more will happen on the client, but there's a risk you send the device images, etc. it won't need.
19:12:21 [Noah]
RL: I see XHTML2 as being important for doing those things server-side.
19:12:40 [Noah]
q+ to comment on server-side only XHTML2
19:13:24 [Noah]
RL: Forms is the main thing.
19:13:45 [Noah]
TVR: XHTML2 has some things like navigation lists, and ???
19:14:00 [Noah]
s/???/section stuff/
19:14:06 [Noah]
DC: How does section stuff work?
19:14:12 [Noah]
TVR: lets you open and close a tree.
19:14:48 [Noah]
RL: What's really crucial is the XML for extensibility, and BTW we'd like to do that using CDF.
19:14:57 [Stuart]
ack timbl
19:14:57 [Zakim]
timbl_, you wanted to ask about "Instead, the charter calls for two equivalent serializations to be developed for HTML"
19:15:07 [Stuart]
ack ht
19:15:07 [Zakim]
ht_mit, you wanted to share some information about John Cowan's tagsoup project
19:15:27 [Noah]
HT: Reminding that I have 5-10 minutes of intro on tag soup and how it works.
19:15:55 [Stuart]
ack Noah
19:15:55 [Zakim]
Noah, you wanted to comment on server-side only XHTML2
19:16:03 [Stuart]
q+ ht
19:16:36 [Rhys]
NM Rhys said that we care about this stuff on the server. The discussion changes when you move to the server
19:17:05 [Rhys]
NM: Insofar as we have these compositions only at the server, we've lost
19:17:32 [Noah]
TVR: I disagree that this is similar to JSP or ASP pages, because those will never run on the client.
19:17:43 [Noah]
TVR: Running it only on the server is a bootstrapping mechanism.
19:17:59 [Noah]
TVR: I was several months ago against tag soup because it kills that story.
19:19:16 [Noah]
TVR: The notion that it can move from server to client is what matters.
19:20:26 [Noah]
TBL: Lots of content is moved on the wire as part of the server-side business of assembling content.
19:21:52 [Noah]
NM: I agree. The risk is that, if tag soup is the only thing that can go beyond the servers, then you will only get composition and extensibility at the server, which indeed would be unfortunate.
19:22:04 [Stuart]
ack Rhys
19:22:13 [Noah]
RL: BTW, I've offered to talk in future on Ubiquitous Web Applications work.
19:22:20 [Stuart]
ack tim
19:23:17 [Noah]
TVR: Before lunch, we talked about writing a document about transition issues.
19:23:30 [Noah]
Raman shows a list of proposed topics.
19:23:36 [Noah]
1. Quotes around attributes.
19:23:41 [Noah]
TBL: This is a bug
19:23:59 [Raman]
19:23:59 [Raman]
* TagSoup Issues
19:23:59 [Raman]
19:23:59 [Raman]
This document will explore the issues that rise at the
19:24:02 [Raman]
intersection of the TAG Soup and XML Web.
19:24:05 [Raman]
As TagSoup evolves to enable incremental transition to XML, we
19:24:09 [Raman]
identify individual differences in traditional XML 1.0
19:24:12 [Raman]
serialization and TagSoup, and for each such instance, enumerate
19:24:16 [Raman]
the pros and cons (carrot vs stick)
19:24:19 [Raman]
driving that issue, how it affects various issues of deployment,
19:24:22 [Raman]
and who might benefit from us writing down such a document. In
19:24:25 [Raman]
addition, it would be useful for the TAG to arrive at a pithy
19:24:28 [Raman]
conclusion for each point analogous to the assertion
19:24:32 [Raman]
- If you're interested in extensibility, use XML serialization.
19:24:32 [Raman]
19:24:35 [Raman]
19:24:38 [Raman]
19:24:42 [Raman]
* Topic List
19:24:44 [Raman]
19:24:48 [Raman]
1. Quotes around attributes.
19:24:51 [Raman]
1. Example use cases.
19:24:53 [Noah]
TVR: What I'd imagine is a matrix that says, e.g. if you don't put quotes around attributes, you won't be able to mix it with SVG, except that in this case you can clean things up.
19:24:54 [Raman]
2. Situations that justify deviation.
19:24:57 [Raman]
3. Possible drawbacks with use of this deviation.
19:25:00 [Noah]
TVR: I'll refactor the list as you suggest.
19:25:00 [Raman]
4. Suggested best practice.
19:25:04 [Raman]
2. Some tags are special =img= doesn't need close tag.
19:25:07 [Raman]
3. XML or HTML serialization from /show source/
19:25:10 [Raman]
4. Cut and paste between HTML and XML
19:25:13 [Raman]
5. Points on the HTML TAGSoup <-> XML continuum.
19:25:16 [Raman]
6. Integration of SVG, MathML etc into Web pages
19:25:20 [Raman]
7. Integration of HTML into RSS, ATOM.
19:25:20 [Raman]
8. Connection and impact on one-web.
19:25:23 [Raman]
19:25:26 [Raman]
19:26:09 [Noah]
HT: Missing end tags fall into 2-3 categories: elements known to be empty; end tags that were optional in the old SGML DTD; end tags known not to be optional
19:26:25 [Noah]
TBL: Unknown tags, possibly with namespaces.
19:26:31 [DanC_lap]
the high-level things like "Integration of HTML into RSS, ATOM" are more appealing to me than "Quotes around attributes."
19:26:35 [Noah]
HT: Hierarchically: unknown start tag.
19:26:45 [Noah]
HT: Under that, unknown namespace qualified start tag.
19:26:59 [Noah]
TVR: And lest we forget, free floating end tags not corresponding to a start.
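The categories HT and TBL list above could be captured in a small lookup, sketched here. The EMPTY and OPTIONAL_END sets follow the HTML 4.01 DTD (EMPTY is a subset); REQUIRED_END is a small illustrative sample, and the category labels and the classify function are assumptions made for this sketch.

```python
# Elements declared EMPTY in the HTML 4.01 DTD (subset).
EMPTY = frozenset(
    {"area", "base", "br", "col", "hr", "img", "input", "link", "meta", "param"}
)

# Elements whose end tag the HTML 4.01 DTD marks as optional.
OPTIONAL_END = frozenset(
    {"body", "colgroup", "dd", "dt", "head", "html", "li", "option",
     "p", "tbody", "td", "tfoot", "th", "thead", "tr"}
)

# Illustrative sample of known elements whose end tag is required.
REQUIRED_END = frozenset({"a", "div", "em", "ol", "span", "strong", "table", "ul"})


def classify(name):
    """Place a start tag whose end tag is missing into one of the categories."""
    n = name.lower()
    if n in EMPTY:
        return "known-empty"
    if n in OPTIONAL_END:
        return "end-tag-optional"
    if n in REQUIRED_END:
        return "end-tag-required"
    if ":" in n:
        return "unknown-namespace-qualified"
    return "unknown"
```

For example, classify("IMG") gives "known-empty" and classify("svg:rect") gives "unknown-namespace-qualified". Free-floating end tags, TVR's case, would need a separate check against the stack of open elements.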
19:27:18 [Noah]
HT: This is a good template, at least as a general model, but let's not fill it in in detail for now.
19:27:22 [DanC_lap]
(I realize why I have angst around TAG discussion of missing quotes and end tags... all these great examples and nobody's capturing them for the test suite.)
19:27:32 [Noah]
TVR: For the first bullet I gave subcategories. Can you think of subcats. for others?
19:28:19 [Noah]
HT: Yes, I'd like to see something that says at least hypothetically: "best possible argument in favor -- why do people do this?"
19:28:43 [Noah]
HT: e.g., I'd guess that most missing ";" at end of entity references are just typos, but others are done with conviction.
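A small sketch of the missing-";" case, assuming Python's html.unescape (which follows the HTML5 rules for character references) as the lenient reader and xml.etree.ElementTree as the strict one. The sample strings and the helper name xml_accepts are illustrative assumptions.

```python
import html
import xml.etree.ElementTree as ET


def xml_accepts(markup):
    """True if a strict XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples: the same text with and without the ';'.
SLOPPY = "fish &amp chips"
STRICT = "fish &amp; chips"

# The lenient decoder resolves both spellings to the same text.
decoded_sloppy = html.unescape(SLOPPY)
decoded_strict = html.unescape(STRICT)
```

The XML parser rejects the reference without the semicolon, so the same typo that a browser quietly repairs becomes a fatal error in the XML serialization.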
19:29:15 [timbl_]
q+ to say well there was markup minimization
19:29:17 [Noah]
SW: Question: am I right that this tag soup thing was not an intentional design, except as a consequence of the "be liberal in what you accept" philosophy?
19:29:35 [Noah]
HT: Not quite, the SGML DTD said "you may omit the following end tags..."
19:30:02 [Noah]
SW: In these charters, there's a common DOM, an XML serialization, and a tag soup serialization.
19:30:02 [timbl_]
You could also omit quotes, no?
19:30:38 [Noah]
TVR: It's all well and good if you can clean up soupy input, but why would you reserialize as soup?
19:30:46 [Noah]
SW: Are we doing some of what the WG will do?
19:31:22 [Noah]
TVR: We are learning on our feet. What I want us to focus on is: how will anything we do in the soup world affect the intersection. I want to see ample communication with the TAG.
19:31:33 [Noah]
DC: Group will do similar things, but with different focus and logistics.
19:32:18 [Stuart]
ack Stuart
19:35:10 [Noah]
NM: Some workgroups have been very effective in taking more time than is sometimes convenient to be very crisp about articulating use cases, getting everyone to agree on what was important about those use cases, and make sure the mechanisms supported the use cases.
19:35:27 [Noah]
NM: That, ideally, would be a good way to get people to make conscious decisions about where extensibility is of value and where not.
19:35:46 [Noah]
TVR: The functions and operators stuff was very well done that way, even though XForms didn't use it in the end.
19:37:35 [Noah]
TBL: One of the very important questions is whether valid XML with namespaces is a subset of the tag soup serialization.
19:37:54 [Noah]
DC: With namespaces?
19:38:01 [Noah]
TBL: Hmm, maybe using the default namespace.
19:38:33 [Noah]
TVR: Does it mean that a browser that consumes soup can necessarily consume valid XHTML with MathML?
19:38:58 [Noah]
TBL: Yes, especially if HTML is default namespace, and the math stuff may not render right
19:39:04 [Noah]
TVR: There's debate about that.
19:39:26 [Noah]
TBL: Today what's happening is that they'll ignore the namespaces and the math markup, but the math content will render, perhaps messily.
19:39:41 [Noah]
DC: The question is not ignoring unknown tags, it's what can you get at from Javascript.
19:39:56 [Noah]
DC: Sure you can stick in namespace decls, but can you get at them from Javascript?
19:40:05 [Noah]
TVR: Yes, what's in the DOM.
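The distinction DC and TVR draw, between namespace declarations being present in the text and namespaces being visible in the DOM, can be illustrated with two standard-library readers. This is only an analogy: Python's html.parser is a tokenizer, not a browser, and the sample page and helper names are assumptions.

```python
from html.parser import HTMLParser
from xml.dom import minidom

MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# Hypothetical XHTML page embedding a MathML fragment.
PAGE = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<math xmlns="' + MATHML_NS + '"><mi>x</mi></math>'
    "</body></html>"
)


def xml_view(markup):
    """Namespace-aware DOM: (localName, namespaceURI) of MathML mi elements."""
    doc = minidom.parseString(markup)
    return [
        (node.localName, node.namespaceURI)
        for node in doc.getElementsByTagNameNS(MATHML_NS, "mi")
    ]


class StartTags(HTMLParser):
    """Collects start tags as (name, attribute dict) pairs."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append((tag, dict(attrs)))


def soup_view(markup):
    """Lenient tokenizer view: xmlns shows up as an ordinary attribute."""
    parser = StartTags()
    parser.feed(markup)
    parser.close()
    return parser.tags
```

In the XML view the mi element carries the MathML namespace; in the lenient view the declaration is just an attribute named xmlns on math, and mi has no namespace information at all.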
19:44:08 [Noah]
NM: I was confused. You have now explained that in addition to the work being done on XHTML2, the HTML WG will take responsibility for two serializations, one XML-based and one soupy.
19:44:12 [Noah]
Several: yes
19:44:38 [Noah]
NM: Thank you, I was confused. That's very helpful.
19:45:04 [Noah]
TBL: I think you'd probably need to use the XML serialization for namespace qualified stuff.
19:45:26 [Noah]
DC: I'm not convinced folks in the HTML WG are fully bought into supporting namespaces at all.
19:45:38 [Noah]
HT: I think the existing drafts suggest it's possible.
19:49:23 [ht_mit]
HTML5 current draft
19:49:53 [ht_mit]
Web Applications 1.0
19:49:53 [ht_mit]
Working Draft — 6 March 2007
19:50:14 [Noah]
Working draft of HTML 5 (Web Applications 1.0):
19:50:24 [timbl]
timbl has joined #tagmem
19:50:41 [Stuart]
ack ht
19:52:09 [Noah]
From that: "Implementations that support XHTML5 must support some version of XML, as well as its corresponding namespaces specification, because XHTML5 uses an XML serialisation with namespaces. [XML] [XMLNAMES]"
19:52:55 [Noah]
We are discussing John Cowan's "TagSoup: A SAX parser in Java
19:52:55 [Noah]
for nasty, ugly HTML"
19:53:21 [Noah]
TVR: Recovers from lots of "errors" in the markup.
19:54:06 [Noah]
"The HTML Scanner
19:54:06 [Noah]
• DOCTYPE declarations are ignored completely
19:54:06 [Noah]
• Consequently, external DTDs are not read
19:54:06 [Noah]
• Comments and processing instructions (ending in
19:54:06 [Noah]
>, not ?>) are passed through to the application
19:54:07 [Noah]
• Entity references are expanded or turned into
19:54:09 [Noah]
19:54:28 [Noah]
"Element Rectification
19:54:28 [Noah]
• Rectification takes the incoming stream of starttags,
19:54:28 [Noah]
end-tags, and character data and makes it
19:54:28 [Noah]
19:54:28 [Noah]
• TagSoup is essentially an HTML scanner plus a
19:54:29 [Noah]
schema-driven element rectifier
19:54:31 [Noah]
• TagSoup uses its own schema language compiled
19:54:33 [Noah]
into Schema objects"
19:55:09 [Noah]
"Parent Element Types
19:55:09 [Noah]
• Parent element types represent the most
19:55:09 [Noah]
conservative possible parent of an element
19:55:09 [Noah]
• The schema gives a parent element type for each
19:55:09 [Noah]
element type:
19:55:10 [Noah]
–The parent of BODY is HTML
19:55:12 [Noah]
–The parent of LI is UL
19:55:14 [Noah]
–The parent of #PCDATA is BODY"
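A minimal sketch of how a parent-element table like the one quoted above could drive rectification. The three entries for body, li and #pcdata come from the quoted slide; the entry for ul and both function names are assumptions, and this is not TagSoup's actual algorithm.

```python
# Most conservative parent for each element type.
PARENT = {
    "body": "html",     # from the quoted slide
    "li": "ul",         # from the quoted slide
    "#pcdata": "body",  # from the quoted slide
    "ul": "body",       # assumption, added so the chain for li reaches the root
}


def implied_ancestors(name):
    """Follow the parent table from name up to the root, nearest first."""
    chain = []
    current = name
    while current in PARENT:
        current = PARENT[current]
        chain.append(current)
    return chain


def open_with_ancestors(stack, name):
    """Push any ancestors missing from the open-element stack, then name."""
    missing = []
    for ancestor in implied_ancestors(name):
        if ancestor in stack:
            break
        missing.append(ancestor)
    stack.extend(reversed(missing))
    stack.append(name)
    return stack
```

With an empty stack, a bare li start tag yields html, body, ul, li, so the missing containers are supplied from the table alone.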
19:55:47 [Noah]
HT: I think there's a meta annotation explicitly in the schema to declare most conservative possible parent, at least in some cases where it can't be inferred.
19:57:04 [Noah]
DC: It's a bit like LEX and YACC, in that there is a scanner table and a parser table you can fool with.
19:57:17 [Noah]
HT: But the fixups are built into the Java code, though key'd off the schema
19:58:08 [Noah]
HT: It so happened that the very first document I looked at happened to be one that neither John's Tag Soup nor Dave Raggett's tidy could successfully handle.
19:58:23 [Noah]
HT: It was <center><tr> ... </center>
19:58:46 [Noah]
HT: Both tools made the "mistake" of closing the table (in earlier discussions, it was noted that browsers
19:58:54 [Noah]
ignore the <center>)
19:59:11 [Noah]
***Scribe note to self - Henry didn't say that last bit...scribe did ***
19:59:20 [Noah]
HT: Possible fix is look ahead.
19:59:34 [Noah]
TVR: Maybe just throw away the center tag.
19:59:46 [Noah]
HT: Yes, you could probably do that with John's model.
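TVR's suggestion of simply discarding the center tag can be sketched as a filter over a token stream. This uses Python's html.parser as the tokenizer rather than John Cowan's scanner, and the enclosing table and cell in the sample are assumptions, since the log only shows the center and tr tags.

```python
from html.parser import HTMLParser


class DropTagFilter(HTMLParser):
    """Emits start/end/text events, skipping tags named in `dropped`."""

    def __init__(self, dropped):
        super().__init__()
        self.dropped = set(dropped)
        self.events = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.dropped:
            self.events.append(("start", tag))

    def handle_endtag(self, tag):
        if tag not in self.dropped:
            self.events.append(("end", tag))

    def handle_data(self, data):
        if data.strip():
            self.events.append(("text", data))


def events_without(markup, dropped=("center",)):
    """Tokenize markup and return the events with the dropped tags removed."""
    tag_filter = DropTagFilter(dropped)
    tag_filter.feed(markup)
    tag_filter.close()
    return tag_filter.events


# Hypothetical reconstruction of the problem document.
SAMPLE = "<table><center><tr><td>x</td></tr></center></table>"
```

With the center events removed before any tree building, the remaining stream is a properly nested table, so the rectifier never has a reason to close the table early.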
20:00:18 [Noah]
HT: I'm thinking of something like a shift/reduce parser, but instead you get shift/ship-as-sax-event
20:01:25 [DanC_lap]
(I wonder if ht is going to connect this to extensibility or something else beyond straight HTML parser design.)
20:01:53 [Noah]
HT: I'm experimenting with a system that uses John's tokenizer, and my own upper level. Wondering whether it can reconstruct the HTML 5 English language spec.
20:02:14 [Noah]
HT: Since that is sometimes described as a way of capturing the error recovery of today's browsers.
20:02:39 [Noah]
TVR: Sounds appealing. Trouble is likely to be where HTML 5 does backtracking...almost does an "unshift".
20:02:55 [Noah]
HT: I asked on Tag Soup list whether John has a regression test suite.
20:03:30 [Noah]
HT: Elliotte suggested John has things, but got them from the Web, and it's likely there would be copyright problems in sharing it.
20:03:51 [Stuart]
ack Dan
20:04:18 [Noah]
DC: Was waiting for you to relate this to extensibility. Our job is not to do a better job on HTML 5 than the WG is going to do.
20:04:44 [Noah]
HT: The TAG has a least power finding.
20:05:04 [Noah]
TVR: It's been suggested we should right a validator.
20:05:21 [Noah]
DC: Do you acknowledge that this could be seen as being rude, in that it's not our business as a workgroup to do this?
20:05:48 [Noah]
HT: Well, they have gone far down the road.
20:06:43 [ht_mit]
s/right a/write a/
20:06:45 [Rhys]
NM: I think there is a line to be walked and that we need to acknowledge Dan's concern about ownership.
20:07:10 [Rhys]
NM: Reasonable for people to be circumspect about the role of the TAG in this particular case, and others actually.
20:07:57 [Rhys]
NM: TAG should be careful and either contribute as individuals or learn from what is happening in particular working groups.
20:08:17 [Rhys]
NM: Good for us to discuss it because it helps us learn the what the issues are
20:09:10 [Noah]
HT: In the statement of tagSoupIntegration-54 it says "Treat it "as if" it had been processed by [some formalization
20:09:10 [Noah]
of] 'tidy -asxhtml';
20:09:10 [Noah]
20:09:16 [Noah]
HT: I feel I'm exploring that.
20:09:43 [Noah]
HT: I think the reasoning is closely related to least power, but trying to make the story as declarative as possible.
20:09:48 [Stuart]
ack Danc
20:09:48 [Zakim]
DanC_lap, you wanted to ask if ht's explorations suggest anything about the @role situation or other extensibility cases
20:09:52 [Noah]
DC: Any new insights into what to do about role attribute?
20:10:02 [Noah]
HT: Don't think so, the Tag Soup program predates that.
20:10:34 [Noah]
DC: Suppose I want to use something like role without waiting to go through HTML WG. Want to get at it from Javascript.
20:11:06 [Noah]
HT: Posit that it's not in the HTML spec. I don't know what he does with unknown attributes. Seems to me that you should be able to control that in the formalization of the mapping.
20:11:20 [Noah]
DC: Good thing to study. Also think about simpler stuff like SVG elements.
20:11:57 [Noah]
HT: Think it's passed through. Philosophy of Tag Soup is to pass through when possible. Suspect he passes through.
20:12:02 [Noah]
NW: My experience is a bit different.
20:12:20 [Noah]
NW: I had trouble with a bunch of RDDL. It munged the namespaces.
20:12:32 [Noah]
HT: There's something about that, and a switch.
20:12:50 [Noah]
DC: Sam Ruby, Ian Hixie and others are building a parsing library and 200 tests.
20:13:15 [Noah]
DC: Something like 2% of web documents use <image> spelled that way.
20:14:22 [ht_mit]
s/munged the namespaces/munged the namespace declarations/
20:15:13 [DanC_lap]
or 0.2%
20:15:18 [Noah]
SW: We have 15 mins to go.
20:15:35 [Noah]
SW: We have a set of points Raman has set down, now need a strategy moving forward.
20:15:49 [Noah]
NM: What are the success criteria for the list Raman is working on?
20:19:17 [Noah]
TVR: I would like it to be the place holder document for tag soup issue 54
20:19:32 [Noah]
NM: And is it the list of answers to some question? Things to worry about?
20:19:40 [Noah]
DC: Potential table of contents for a document.
20:19:43 [Noah]
NM: Works for me.
20:19:53 [Noah]
TVR: And a framework to govern our work.
20:20:09 [DanC_lap]
TVR asked for DanC to work on it with him. DanC agreed.
20:20:45 [Noah]
TVR: Happy to do an initial draft, as long as people view it as fodder for discussion, not something to shred.
20:21:46 [Noah]
ACTION: T.V. Raman to draft initial discussion material on tag soup for discussion on 26 March, draft on the 19th or so.
20:23:51 [Noah]
TVR: Public or private?
20:24:18 [Noah]
NM: Public. Just make sure it's clear that we're trying to come up to speed, not tread on other peoples' toes.
20:24:32 [Noah]
topic: Future meetings
20:24:39 [Noah]
SW: Next telcon on the 12th
20:24:43 [Noah]
DC: regrets
20:24:46 [DanC_lap]
I'm at risk for 12 March; travelling to SxSWi
20:24:57 [Noah]
SW: won't have time for agenda work until just after arriving.
20:25:40 [Noah]
DC: What about discussing XML chunk whatever.
20:25:57 [Noah]
NW: Was going to ask to just close it.
20:26:27 [Noah]
NW: xmlChunk-44 was an attempt to tackle deep equals for XML. I now think we can't do better than XML Functions and Operators.
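For readers unfamiliar with the problem, here is a naive sketch of a deep-equals over ElementTree nodes. It is not the semantics of fn:deep-equal from XML Functions and Operators and not the approach of the abandoned finding; the choice to ignore surrounding whitespace and attribute order is an assumption made for illustration, and it is exactly the sort of choice that makes a general definition hard.

```python
import xml.etree.ElementTree as ET


def naive_deep_equal(a, b):
    """Compare two elements by tag, attributes, trimmed text, and children."""
    if a.tag != b.tag or a.attrib != b.attrib:
        return False
    # Assumption: leading and trailing whitespace is not significant.
    if (a.text or "").strip() != (b.text or "").strip():
        return False
    if (a.tail or "").strip() != (b.tail or "").strip():
        return False
    if len(a) != len(b):
        return False
    return all(naive_deep_equal(x, y) for x, y in zip(a, b))


# Hypothetical samples: same content, different attribute order and layout.
first = ET.fromstring('<a x="1" y="2"><b>t</b></a>')
second = ET.fromstring('<a y="2" x="1">\n  <b>t</b>\n</a>')
third = ET.fromstring('<a x="1" y="2"><b>u</b></a>')
```

Under these assumptions first and second compare equal while first and third do not; comments, processing instructions, and namespace prefixes are not considered at all.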
20:26:36 [Noah]
TBL: No communication from us.
20:26:45 [Noah]
NW: We always write a note when closing the issue.
20:26:56 [Noah]
DC: Garbage collect or endorse the draft.
20:27:01 [Noah]
NW: collect it.
20:27:04 [Noah]
SW: Objections?
20:27:06 [Noah]
20:28:02 [Noah]
ACTION: Norm to mark as abandoned the finding on deep equals and announce xmlChunk-44 is being closed without further action, with reason
20:28:26 [Noah]
RESOLUTION: close issue xmlChunk-44
20:30:59 [DanC_lap]
sounds like 12 March call is cancelled
20:31:13 [DanC_lap]
RESOLVED: to meet next 19 March
20:31:55 [Noah]
SW: Adjourned
20:37:28 [DanC_lap]
DanC_lap has joined #tagmem
20:40:43 [Zakim]
TAG_f2f()9:00AM has ended
20:40:44 [Zakim]
Attendees were DOrchard, [MIT]
20:43:39 [timbl]
timbl has joined #tagmem
21:08:40 [Norm]
Norm has joined #tagmem
21:09:50 [Noah]
Noah has joined #tagmem
21:10:20 [timbl]
timbl has left #tagmem
22:40:00 [Zakim]
Zakim has left #tagmem